Sample records for statistical model consistent

  1. Evaluating statistical consistency in the ocean model component of the Community Earth System Model (pyCECT v2.0)

    NASA Astrophysics Data System (ADS)

    Baker, Allison H.; Hu, Yong; Hammerling, Dorit M.; Tseng, Yu-heng; Xu, Haiying; Huang, Xiaomeng; Bryan, Frank O.; Yang, Guangwen

    2016-07-01

    The Parallel Ocean Program (POP), the ocean model component of the Community Earth System Model (CESM), is widely used in climate research. Most current work in CESM-POP focuses on improving the model's efficiency or accuracy, such as improving numerical methods, advancing parameterization, porting to new architectures, or increasing parallelism. Since ocean dynamics are chaotic in nature, achieving bit-for-bit (BFB) identical results in ocean solutions cannot be guaranteed for even tiny code modifications, and determining whether modifications are admissible (i.e., statistically consistent with the original results) is non-trivial. In recent work, an ensemble-based statistical approach was shown to work well for software verification (i.e., quality assurance) on atmospheric model data. The general idea of ensemble-based statistical consistency testing is to use a quantitative measure of the variability of an ensemble of simulations as a metric against which future simulations are compared to determine statistical distinguishability. The capability to determine consistency without BFB results boosts model confidence and provides the flexibility needed, for example, for more aggressive code optimizations and the use of heterogeneous execution environments. Since ocean and atmosphere models have differing characteristics in terms of dynamics, spatial variability, and timescales, we present a new statistical method to evaluate ocean model simulation data that requires the evaluation of ensemble means and deviations in a spatial manner. In particular, the statistical distribution from an ensemble of CESM-POP simulations is used to determine the standard score of any new model solution at each grid point. The percentage of points with scores greater than a specified threshold then indicates whether the new model simulation is statistically distinguishable from the ensemble simulations. Both ensemble size and composition are important. Our experiments indicate that the new POP ensemble consistency test (POP-ECT) tool is capable of distinguishing cases that should be statistically consistent with the ensemble from those that should not, as well as providing a simple, objective and systematic way to detect errors in CESM-POP due to the hardware or software stack, positively contributing to quality assurance for the CESM-POP code.
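
    A minimal sketch of the grid-point scoring idea described in this abstract, assuming the ensemble members and the new run are already on a common grid; the function name, array shapes, and the z-score threshold are illustrative assumptions, not the tool's actual defaults.

    ```python
    import numpy as np

    def pop_ect_fraction(ensemble, new_run, z_threshold=3.0):
        """Fraction of grid points where the new run's standard score
        exceeds z_threshold relative to the ensemble spread.

        ensemble : array of shape (n_members, ny, nx)
        new_run  : array of shape (ny, nx)
        """
        mean = ensemble.mean(axis=0)
        std = ensemble.std(axis=0, ddof=1)
        std = np.where(std > 0, std, np.inf)   # ignore points with no ensemble spread
        z = np.abs(new_run - mean) / std
        return float(np.mean(z > z_threshold))

    # A new simulation would be flagged as statistically distinguishable when
    # this fraction exceeds a failure threshold calibrated on held-out members.
    ```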

  2. Landau's statistical mechanics for quasi-particle models

    NASA Astrophysics Data System (ADS)

    Bannur, Vishnu M.

    2014-04-01

    Landau's formalism of statistical mechanics [following L. D. Landau and E. M. Lifshitz, Statistical Physics (Pergamon Press, Oxford, 1980)] is applied to the quasi-particle model of quark-gluon plasma. Here, one starts from the expression for the pressure and develops all of thermodynamics. It is a general formalism and is consistent with our earlier studies [V. M. Bannur, Phys. Lett. B647, 271 (2007)] based on Pathria's formalism [following R. K. Pathria, Statistical Mechanics (Butterworth-Heinemann, Oxford, 1977)]. In Pathria's formalism, one starts from the expression for the energy density and develops thermodynamics. Both formalisms are consistent with thermodynamics and statistical mechanics. Under certain conditions, which are incorrectly called the thermodynamic consistency relation, we recover other formalisms for quasi-particle systems, such as that of M. I. Gorenstein and S. N. Yang, Phys. Rev. D52, 5206 (1995), widely studied in quark-gluon plasma.
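
    For concreteness, the standard route from the pressure to the remaining thermodynamic quantities that such a formalism relies on can be summarised as follows (generic relations at zero chemical potential, not specific to the quasi-particle details of the paper):

    ```latex
    % Starting from the pressure p(T), the rest of thermodynamics follows:
    s(T) = \frac{\partial p}{\partial T}, \qquad
    \varepsilon(T) = T\,s(T) - p(T), \qquad
    c_v(T) = \frac{\partial \varepsilon}{\partial T}.
    ```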

  3. Modeling Statistics of Fish Patchiness and Predicting Associated Influence on Statistics of Acoustic Echoes

    DTIC Science & Technology

    2012-09-30

    data collected by Paramo and Gerlotto. The data were consistent with the Anderson model in that both the data and model had a mode in the...10.1098/rsfs.2012.0027 [published, refereed] Bhatia, S., T.K. Stanton, J. Paramo , and F. Gerlotto (submitted), “Modeling statistics of fish school

  4. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Science Inventory

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...

  5. Statistics of the geomagnetic secular variation for the past 5Ma

    NASA Technical Reports Server (NTRS)

    Constable, C. G.; Parker, R. L.

    1986-01-01

    A new statistical model is proposed for the geomagnetic secular variation over the past 5 Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole, and probability density functions and cumulative distribution functions can be computed for the declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5 Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.

  6. Statistics of the geomagnetic secular variation for the past 5 m.y

    NASA Technical Reports Server (NTRS)

    Constable, C. G.; Parker, R. L.

    1988-01-01

    A new statistical model is proposed for the geomagnetic secular variation over the past 5 Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole, and probability density functions and cumulative distribution functions can be computed for the declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5 Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.

  7. Modeling Statistics of Fish Patchiness and Predicting Associated Influence on Statistics of Acoustic Echoes

    DTIC Science & Technology

    2013-09-30

    data. The Niwa and Anderson models were compared with 3-D multi-beam data collected by Paramo and Gerlotto. The data were consistent with the...Bhatia, S., T.K. Stanton, J. Paramo , and F. Gerlotto (under revision), “Modeling statistics of fish school dimensions using 3-D data from a

  8. Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

    NASA Astrophysics Data System (ADS)

    Qi, D.; Majda, A.

    2017-12-01

    A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in the principal model directions with the largest variability in high-dimensional turbulent systems and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve optimal model performance. The reduced-order method builds on a self-consistent mathematical framework for general systems with quadratic nonlinearity, in which crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections that replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor that characterizes the higher-order moments in a consistent way and improves model sensitivity. Stringent test models of barotropic and baroclinic turbulence are used to demonstrate the feasibility of the reduced-order methods. Principal statistical responses in the mean and variance can be captured by the reduced-order models with accuracy and efficiency. In addition, the reduced-order models are used to capture the crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities, such as the tracer spectrum and the fat tails in the tracer probability density functions at the most important large scales, can be captured efficiently and accurately using the reduced-order tracer model in various dynamical regimes of the flow field with distinct statistical structures.

  9. Data free inference with processed data products

    DOE PAGES

    Chowdhary, K.; Najm, H. N.

    2014-07-12

    Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the data themselves are unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate data consistent with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.
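
    A minimal sketch of the pooling idea under simplifying assumptions (a scalar model output with a Gaussian error bar, a one-parameter model, a flat prior, and a grid approximation of each posterior); the forward model and all numbers are placeholders, not the paper's application.

    ```python
    import numpy as np

    def model_output(theta):
        return theta ** 2            # placeholder forward model

    y_nominal, y_sigma = 4.0, 0.5    # reported summary statistics (value, 1-sigma error bar)

    theta = np.linspace(0.0, 5.0, 501)
    dtheta = theta[1] - theta[0]
    rng = np.random.default_rng(0)

    posteriors = []
    for _ in range(200):                                   # synthetic data sets consistent with the summary
        y_synth = rng.normal(y_nominal, y_sigma)
        like = np.exp(-0.5 * ((model_output(theta) - y_synth) / y_sigma) ** 2)
        posteriors.append(like / (like.sum() * dtheta))    # posterior under a flat prior

    # Pool the per-data-set posteriors into a single averaged parameter density.
    pooled = np.mean(posteriors, axis=0)
    pooled /= pooled.sum() * dtheta
    ```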

  10. Hydrologic consistency as a basis for assessing complexity of monthly water balance models for the continental United States

    NASA Astrophysics Data System (ADS)

    Martinez, Guillermo F.; Gupta, Hoshin V.

    2011-12-01

    Methods to select parsimonious and hydrologically consistent model structures are useful for evaluating dominance of hydrologic processes and representativeness of data. While information criteria (appropriately constrained to obey underlying statistical assumptions) can provide a basis for evaluating appropriate model complexity, it is not sufficient to rely upon the principle of maximum likelihood (ML) alone. We suggest that one must also call upon a "principle of hydrologic consistency," meaning that selected ML structures and parameter estimates must be constrained (as well as possible) to reproduce desired hydrological characteristics of the processes under investigation. This argument is demonstrated in the context of evaluating the suitability of candidate model structures for lumped water balance modeling across the continental United States, using data from 307 snow-free catchments. The models are constrained to satisfy several tests of hydrologic consistency, a flow space transformation is used to ensure better consistency with underlying statistical assumptions, and information criteria are used to evaluate model complexity relative to the data. The results clearly demonstrate that the principle of consistency provides a sensible basis for guiding selection of model structures and indicate strong spatial persistence of certain model structures across the continental United States. Further work to untangle reasons for model structure predominance can help to relate conceptual model structures to physical characteristics of the catchments, facilitating the task of prediction in ungaged basins.

  11. Nine time steps: ultra-fast statistical consistency testing of the Community Earth System Model (pyCECT v3.0)

    NASA Astrophysics Data System (ADS)

    Milroy, Daniel J.; Baker, Allison H.; Hammerling, Dorit M.; Jessup, Elizabeth R.

    2018-02-01

    The Community Earth System Model Ensemble Consistency Test (CESM-ECT) suite was developed as an alternative to requiring bitwise identical output for quality assurance. This objective test provides a statistical measurement of consistency between an accepted ensemble created by small initial temperature perturbations and a test set of CESM simulations. In this work, we extend the CESM-ECT suite with an inexpensive and robust test for ensemble consistency that is applied to Community Atmospheric Model (CAM) output after only nine model time steps. We demonstrate that adequate ensemble variability is achieved with instantaneous variable values at the ninth step, despite rapid perturbation growth and heterogeneous variable spread. We refer to this new test as the Ultra-Fast CAM Ensemble Consistency Test (UF-CAM-ECT) and demonstrate its effectiveness in practice, including its ability to detect small-scale events and its applicability to the Community Land Model (CLM). The new ultra-fast test facilitates CESM development, porting, and optimization efforts, particularly when used to complement information from the original CESM-ECT suite of tools.

  12. The space of ultrametric phylogenetic trees.

    PubMed

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce, and that the choice between them requires additional properties to be considered. In particular, the summary tree minimising the squared distance to the trees in the sample may differ between the two parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods.

  13. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors.

  14. A Conditional Curie-Weiss Model for Stylized Multi-group Binary Choice with Social Interaction

    NASA Astrophysics Data System (ADS)

    Opoku, Alex Akwasi; Edusei, Kwame Owusu; Ansah, Richard Kwame

    2018-04-01

    This paper proposes a conditional Curie-Weiss model as a model for decision making in a stylized society made up of binary decision makers that face a particular dichotomous choice between two options. Following Brock and Durlauf (Discrete choice with social interaction I: theory, 1955), we set up both socio-economic and statistical mechanical models for the choice problem. We point out when both the socio-economic and statistical mechanical models give rise to the same self-consistent equilibrium mean choice level(s). The phase diagram of the associated statistical mechanical model and its socio-economic implications are discussed.
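
    As an illustration of the kind of self-consistent mean-choice equation such models produce, the sketch below iterates the standard (single-group, unconditional) Curie-Weiss fixed-point equation m = tanh(beta * (J*m + h)); the conditional, multi-group structure of the paper is not reproduced, and all parameter values are arbitrary.

    ```python
    import numpy as np

    def mean_choice(beta, J, h, m0=0.1, tol=1e-10, max_iter=10_000):
        """Solve m = tanh(beta * (J*m + h)) by damped fixed-point iteration."""
        m = m0
        for _ in range(max_iter):
            m_new = np.tanh(beta * (J * m + h))
            if abs(m_new - m) < tol:
                return m_new
            m = 0.5 * m + 0.5 * m_new      # damping helps convergence near the critical point
        return m

    # Below the critical temperature (beta*J > 1, h = 0) two non-zero solutions
    # coexist; starting from +/- m0 picks out each equilibrium branch.
    print(mean_choice(beta=1.5, J=1.0, h=0.0, m0=+0.1))
    print(mean_choice(beta=1.5, J=1.0, h=0.0, m0=-0.1))
    ```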

  15. Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses

    PubMed Central

    Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy

    2015-01-01

    Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
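
    A minimal sketch of the weighting step described above: after bins of combinable genes are formed and a tree is re-estimated per bin, each bin tree is handed to the summary method with multiplicity equal to the bin size. Names are illustrative; the binning heuristic and the summary method itself are not shown.

    ```python
    def weight_bin_trees(bin_trees, bin_sizes):
        """Replicate each re-estimated bin tree in proportion to its bin size,
        so the summary method sees one (identical) tree per original gene.

        bin_trees : list of tree objects (e.g. Newick strings), one per bin
        bin_sizes : list of int, number of genes assigned to each bin
        """
        weighted = []
        for tree, size in zip(bin_trees, bin_sizes):
            weighted.extend([tree] * size)
        return weighted

    # The weighted list is then given to a coalescent-based summary method
    # (e.g. MP-EST or ASTRAL) exactly as an ordinary set of gene trees would be.
    ```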

  16. A fragmentation model of earthquake-like behavior in internet access activity

    NASA Astrophysics Data System (ADS)

    Paguirigan, Antonino A.; Angco, Marc Jordan G.; Bantang, Johnrob Y.

    We present a fragmentation model that generates almost any inverse power-law size distribution, including dual-scaled versions, consistent with the underlying dynamics of systems with earthquake-like behavior. We apply the model to explain the dual-scaled power-law statistics observed in an Internet access dataset that covers more than 32 million requests. The non-Poissonian statistics of the requested data sizes m and the amount of time τ needed for complete processing are consistent with the Gutenberg-Richter law. Inter-event times δt between subsequent requests are also shown to exhibit power-law distributions consistent with the generalized Omori law. Thus, the dataset is similar to earthquake data except that two power-law regimes are observed. Using the proposed model, we are able to identify the underlying dynamics responsible for generating the observed dual power-law distributions. The model is general enough to apply to any physical or human dynamics that is limited by finite resources such as space, energy, time or opportunity.

  17. A Frequency Domain Approach to Pretest Analysis Model Correlation and Model Updating for the Mid-Frequency Range

    DTIC Science & Technology

    2009-02-01

    range of modal analysis and the high frequency region of statistical energy analysis, is referred to as the mid-frequency range. The corresponding...predictions. The averaging process is consistent with the averaging done in statistical energy analysis for stochastic systems. The FEM will always

  18. Quest for consistent modelling of statistical decay of the compound nucleus

    NASA Astrophysics Data System (ADS)

    Banerjee, Tathagata; Nath, S.; Pal, Santanu

    2018-01-01

    A statistical model description of heavy ion induced fusion-fission reactions is presented where shell effects, collective enhancement of level density, tilting away effect of compound nuclear spin and dissipation are included. It is shown that the inclusion of all these effects provides a consistent picture of fission where fission hindrance is required to explain the experimental values of both pre-scission neutron multiplicities and evaporation residue cross-sections in contrast to some of the earlier works where a fission hindrance is required for pre-scission neutrons but a fission enhancement for evaporation residue cross-sections.

  19. Searching for hidden unexpected features in the SnIa data

    NASA Astrophysics Data System (ADS)

    Shafieloo, A.; Perivolaropoulos, L.

    2010-06-01

    It is known that the χ² statistic and likelihood analysis may not be sensitive to all features of the data. Although the χ² statistic measures the overall goodness of fit of a model confronted with a data set, some specific features of the data can remain undetected. For instance, it has been pointed out that there is an unexpected brightness of the SnIa data at z > 1 in the Union compilation. We quantify this statement by constructing a new statistic, called the Binned Normalized Difference (BND) statistic, which is applicable directly to the Type Ia Supernova (SnIa) distance moduli. This statistic is designed to pick up systematic brightness trends of SnIa data points with respect to a best fit cosmological model at high redshifts. According to this statistic, the consistency between the spatially flat ΛCDM model and the Gold06, Union08 and Constitution09 data is 2.2%, 5.3% and 12.6%, respectively, when the real data are compared with many realizations of simulated Monte Carlo datasets. The corresponding realization probability in the context of a (w0,w1) = (-1.4,2) model is more than 30% for all the mentioned datasets, indicating a much better consistency for this model with respect to the BND statistic. The unexpected high-z brightness of SnIa can be interpreted either as a trend towards more deceleration at high z than expected in the context of ΛCDM, or as a statistical fluctuation, or finally as a systematic effect, perhaps due to a mild SnIa evolution at high z.
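
    A minimal sketch of the binning-and-comparison logic behind such a statistic, assuming distance-modulus residuals with respect to a best-fit model are already available; the exact binning and normalisation of the published BND statistic may differ from this illustration.

    ```python
    import numpy as np

    def binned_normalized_difference(z, residuals, errors, n_bins=10):
        """Mean error-normalised residual in redshift bins of equal occupancy."""
        order = np.argsort(z)
        z, residuals, errors = z[order], residuals[order], errors[order]
        bins = np.array_split(np.arange(z.size), n_bins)
        return np.array([np.mean(residuals[b] / errors[b]) for b in bins])

    def consistency_probability(z, residuals, errors, n_mc=2000, rng=None):
        """Fraction of Monte Carlo realisations whose largest |BND| value
        exceeds the one observed in the real data."""
        rng = rng or np.random.default_rng(0)
        observed = np.max(np.abs(binned_normalized_difference(z, residuals, errors)))
        count = 0
        for _ in range(n_mc):
            fake = rng.normal(0.0, errors)   # residuals expected if the model were true
            stat = np.max(np.abs(binned_normalized_difference(z, fake, errors)))
            count += stat >= observed
        return count / n_mc
    ```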

  20. Are well functioning civil registration and vital statistics systems associated with better health outcomes?

    PubMed

    Phillips, David E; AbouZahr, Carla; Lopez, Alan D; Mikkelsen, Lene; de Savigny, Don; Lozano, Rafael; Wilmoth, John; Setel, Philip W

    2015-10-03

    In this Series paper, we examine whether well functioning civil registration and vital statistics (CRVS) systems are associated with improved population health outcomes. We present a conceptual model connecting CRVS to wellbeing, and describe an ecological association between CRVS and health outcomes. The conceptual model posits that the legal identity that civil registration provides to individuals is key to access entitlements and services. Vital statistics produced by CRVS systems provide essential information for public health policy and prevention. These outcomes benefit individuals and societies, including improved health. We use marginal linear models and lag-lead analysis to measure ecological associations between a composite metric of CRVS performance and three health outcomes. Results are consistent with the conceptual model: improved CRVS performance coincides with improved health outcomes worldwide in a temporally consistent manner. Investment to strengthen CRVS systems is not only an important goal for individuals and societies, but also a development imperative that is good for health.

  1. Finding the Root Causes of Statistical Inconsistency in Community Earth System Model Output

    NASA Astrophysics Data System (ADS)

    Milroy, D.; Hammerling, D.; Baker, A. H.

    2017-12-01

    Baker et al. (2015) developed the Community Earth System Model Ensemble Consistency Test (CESM-ECT) to provide a metric for software quality assurance by determining statistical consistency between an ensemble of CESM outputs and new test runs. The test has proved useful for detecting statistical differences caused by compiler bugs and errors in physical modules. However, detection is only the necessary first step in finding the causes of statistical difference. The CESM is a vastly complex model, comprising millions of lines of code, which is developed and maintained by a large community of software engineers and scientists. Any root cause analysis is correspondingly challenging. We propose a new capability for CESM-ECT: identifying the sections of code that cause statistical distinguishability. The first step is to discover the CESM variables that cause CESM-ECT to classify new runs as statistically distinct, which we achieve via Randomized Logistic Regression. Next we use a tool developed to identify the CESM components that define or compute the variables found in the first step. Finally, we employ the application Kernel GENerator (KGEN) created in Kim et al. (2016) to detect fine-grained floating point differences. We demonstrate an example of the procedure and advance a plan to automate this process in our future work.
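
    A minimal sketch of the variable-selection step, implemented here as stability selection over random subsamples with an L1-penalised logistic regression, which approximates the randomized-logistic-regression idea using scikit-learn's ordinary LogisticRegression; the data arrays, labels, and parameter choices are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def selection_frequencies(X, y, n_rounds=100, subsample=0.75, C=0.1, seed=0):
        """Fraction of subsampled fits in which each output variable receives a
        non-zero L1-penalised coefficient.

        X : array (n_runs, n_variables) of per-run variable summaries
        y : array (n_runs,) of labels, e.g. 1 = classified distinct, 0 = consistent
        """
        rng = np.random.default_rng(seed)
        n, p = X.shape
        counts = np.zeros(p)
        for _ in range(n_rounds):
            idx = rng.choice(n, size=int(subsample * n), replace=False)
            clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
            clf.fit(X[idx], y[idx])
            counts += (np.abs(clf.coef_[0]) > 1e-8)
        return counts / n_rounds

    # Variables with high selection frequency are candidates for the ones most
    # responsible for new runs being classified as statistically distinct.
    ```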

  2. Multiplicative point process as a model of trading activity

    NASA Astrophysics Data System (ADS)

    Gontis, V.; Kaulakys, B.

    2004-11-01

    Signals consisting of a sequence of pulses show that an inherent origin of the 1/f noise is Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting power spectral density S(f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such a model system exhibits power-law spectral density S(f) ∼ 1/f^β for various values of β, including β = 1/2, 1 and 3/2. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events are analyzed analytically and numerically as well. The specific interest of our analysis is related to the financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces the spectral properties of real markets and explains the mechanism behind the power-law distribution of trading activity. The study provides evidence that the statistical properties of the financial markets are contained in the statistics of the time intervals between trades. A multiplicative point process serves as a consistent model for generating these statistics.

  3. Inflationary tensor fossils in large-scale structure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dimastrogiovanni, Emanuela; Fasiello, Matteo; Jeong, Donghui

    Inflation models make specific predictions for a tensor-scalar-scalar three-point correlation, or bispectrum, between one gravitational-wave (tensor) mode and two density-perturbation (scalar) modes. This tensor-scalar-scalar correlation leads to a local power quadrupole, an apparent departure from statistical isotropy in our Universe, as well as characteristic four-point correlations in the current mass distribution in the Universe. So far, the predictions for these observables have been worked out only for single-clock models in which certain consistency conditions between the tensor-scalar-scalar correlation and tensor and scalar power spectra are satisfied. Here we review the requirements on inflation models for these consistency conditions to be satisfied. We then consider several examples of inflation models, such as non-attractor and solid-inflation models, in which these conditions are put to the test. In solid inflation the simplest consistency conditions are already violated, whilst in the non-attractor model we find that, contrary to the standard scenario, the tensor-scalar-scalar correlator probes directly relevant model-dependent information. We work out the predictions for observables in these models. For non-attractor inflation we find an apparent local quadrupolar departure from statistical isotropy in large-scale structure but that this power quadrupole decreases very rapidly at smaller scales. The consistency of the CMB quadrupole with statistical isotropy then constrains the distance scale that corresponds to the transition from the non-attractor to attractor phase of inflation to be larger than the currently observable horizon. Solid inflation predicts clustering fossil signatures in the current galaxy distribution that may be large enough to be detectable with forthcoming, and possibly even current, galaxy surveys.

  4. New approach in the quantum statistical parton distribution

    NASA Astrophysics Data System (ADS)

    Sohaily, Sozha; Vaziri (Khamedi), Mohammad

    2017-12-01

    An attempt to find simple parton distribution functions (PDFs) based on a quantum statistical approach is presented. The PDFs described by the statistical model have very interesting physical properties which help to understand the structure of partons. The longitudinal part of the distribution functions is given by applying the maximum entropy principle. An interesting and simple approach to determine the statistical variables exactly, without fitting and fixing parameters, is surveyed. Analytic expressions for the x-dependent PDFs are obtained in the whole x region [0, 1], and the computed distributions are consistent with the experimental observations. The agreement with experimental data gives a robust confirmation of the simple statistical model presented here.

  5. Modeling Soot Oxidation and Gasification with Bayesian Statistics

    DOE PAGES

    Josephson, Alexander J.; Gaffin, Neal D.; Smith, Sean T.; ...

    2017-08-22

    This paper presents a statistical method for model calibration using data collected from the literature. The method is used to calibrate parameters for global models of soot consumption in combustion systems. This consumption is broken into two different submodels: the first for oxidation, where soot particles are attacked by certain oxidizing agents; the second for gasification, where soot particles are attacked by H2O or CO2 molecules. Rate data were collected from 19 studies in the literature and evaluated using Bayesian statistics to calibrate the model parameters. Bayesian statistics are valued for their ability to quantify uncertainty in modeling. The calibrated consumption model with quantified uncertainty is presented here along with a discussion of associated implications. The oxidation results are found to be consistent with previous studies. Significant variation is found in the CO2 gasification rates.
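
    A minimal sketch of calibrating a global rate expression against literature rate data with a random-walk Metropolis sampler, to illustrate the kind of Bayesian calibration described above; the Arrhenius-type form, priors, proposal widths, and data arrays are placeholders, not the paper's actual submodels or data.

    ```python
    import numpy as np

    def log_rate(T, logA, Ea):
        """Arrhenius-type global rate, log k = log A - Ea / (R T)."""
        R = 8.314  # J / (mol K)
        return logA - Ea / (R * T)

    # Placeholder "literature" data: temperatures, measured log-rates, 1-sigma errors.
    T_obs = np.array([1200.0, 1400.0, 1600.0, 1800.0])
    logk_obs = np.array([-2.1, -1.4, -0.9, -0.5])
    sigma = np.array([0.3, 0.3, 0.2, 0.2])

    def log_posterior(params):
        logA, Ea = params
        if not (0 < logA < 20 and 0 < Ea < 5e5):        # broad uniform priors
            return -np.inf
        resid = (log_rate(T_obs, logA, Ea) - logk_obs) / sigma
        return -0.5 * np.sum(resid ** 2)

    rng = np.random.default_rng(1)
    chain, current = [], np.array([5.0, 1.0e5])
    lp = log_posterior(current)
    for _ in range(20_000):                              # random-walk Metropolis
        proposal = current + rng.normal(0.0, [0.2, 5e3])
        lp_prop = log_posterior(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:
            current, lp = proposal, lp_prop
        chain.append(current.copy())
    chain = np.array(chain)   # posterior samples quantify parameter uncertainty
    ```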

  6. Modeling Soot Oxidation and Gasification with Bayesian Statistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Josephson, Alexander J.; Gaffin, Neal D.; Smith, Sean T.

    This paper presents a statistical method for model calibration using data collected from the literature. The method is used to calibrate parameters for global models of soot consumption in combustion systems. This consumption is broken into two different submodels: the first for oxidation, where soot particles are attacked by certain oxidizing agents; the second for gasification, where soot particles are attacked by H2O or CO2 molecules. Rate data were collected from 19 studies in the literature and evaluated using Bayesian statistics to calibrate the model parameters. Bayesian statistics are valued for their ability to quantify uncertainty in modeling. The calibrated consumption model with quantified uncertainty is presented here along with a discussion of associated implications. The oxidation results are found to be consistent with previous studies. Significant variation is found in the CO2 gasification rates.

  7. Modeling Statistics of Fish Patchiness and Predicting Associated Influence on Statistics of Acoustic Echoes

    DTIC Science & Technology

    2014-09-30

    were compared with 3-D multi-beam data collected by Paramo and Gerlotto. The data were consistent with the Anderson model in that both the data and...column of a random, oceanic waveguide,” J. Acoust. Soc. Am., DOI 10.1121/1.4881925 [published, refereed] Stanton, T.K., Bhatia, S., J. Paramo , and F

  8. Detecting changes in dynamic and complex acoustic environments

    PubMed Central

    Boubenec, Yves; Lawlor, Jennifer; Górska, Urszula; Shamma, Shihab; Englitz, Bernhard

    2017-01-01

    Natural sounds, such as wind or rain, are characterized by the statistical occurrence of their constituents. Despite their complexity, listeners readily detect changes in these contexts. Here we address the neural basis of statistical decision-making using a combination of psychophysics, EEG and modelling. In a texture-based, change-detection paradigm, human performance and reaction times improved with longer pre-change exposure, consistent with improved estimation of baseline statistics. Change-locked and decision-related EEG responses were found at a centro-parietal scalp location, whose slope depended on change size, consistent with sensory evidence accumulation. The potential's amplitude scaled with the duration of pre-change exposure, suggesting a time-dependent decision threshold. Auditory cortex-related potentials showed no response to the change. A dual-timescale, statistical estimation model accounted for subjects' performance. Furthermore, a decision-augmented auditory cortex model accounted for performance and reaction times, suggesting that the primary cortical representation requires little post-processing to enable change-detection in complex acoustic environments. DOI: http://dx.doi.org/10.7554/eLife.24910.001 PMID:28262095

  9. Suspended Draft: Effects on the Composition and Quality of the Military Workforce in the German Armed Forces

    DTIC Science & Technology

    2016-06-01

    Table 2. Summary of Statistics from GGSS Data ... Table 3. Summary of Statistics from... similar approach are unsurprisingly quite consistent in outcomes within statistical variance. The model is used to estimate the effects of exogenous... of German residents (~82 million), excluding diplomats, foreign military and homeless persons. (German Federal Office of Statistics, 2013, p. 475

  10. Localized Smart-Interpretation

    NASA Astrophysics Data System (ADS)

    Lundh Gulbrandsen, Mats; Mejer Hansen, Thomas; Bach, Torben; Pallesen, Tom

    2014-05-01

    The complex task of setting up a geological model consists not only of combining available geological information into a conceptually plausible model, but also requires consistency with available data, e.g. geophysical data. However, in many cases the direct geological information, e.g. borehole samples, is very sparse, so in order to create a geological model, the geologist needs to rely on the geophysical data. The problem, however, is that the amount of geophysical data is in many cases so vast that it is practically impossible to integrate all of it in the manual interpretation process. This means that much of the information available from the geophysical surveys goes unexploited, which is a problem because the resulting geological model does not fulfill its full potential and hence is less trustworthy. We suggest an approach to geological modeling that (1) allows all geophysical data to be considered when building the geological model, (2) is fast, and (3) allows quantification of the geological modeling. The method is constructed to build a statistical model, f(d,m), describing the relation between what the geologist interprets, d, and what the geologist knows, m. The parameter m reflects any available information that can be quantified, such as geophysical data, the result of a geophysical inversion, elevation maps, etc. The parameter d reflects an actual interpretation, such as, for example, the depth to the base of a groundwater reservoir. First we infer a statistical model f(d,m) by examining sets of actual interpretations made by a geological expert, [d1, d2, ...], and the information used to perform the interpretation, [m1, m2, ...]. This makes it possible to quantify how the geological expert performs interpretation through f(d,m). As the geological expert proceeds with interpreting, the number of interpreted data points from which the statistical model is inferred increases, and therefore the accuracy of the statistical model increases. When a model f(d,m) has successfully been inferred, we are able to simulate how the geological expert would perform an interpretation given some external information m, through f(d|m). We will demonstrate this method applied to geological interpretation and densely sampled airborne electromagnetic data. In short, our goal is to build a statistical model describing how a geological expert performs geological interpretation given some geophysical data. We then wish to use this statistical model to perform semi-automatic interpretation, everywhere such geophysical data exist, in a manner consistent with the choices made by a geological expert. The benefits of such a statistical model are that (1) it provides a quantification of how a geological expert performs interpretation based on available diverse data, (2) all available geophysical information can be used, and (3) it allows much faster interpretation of large data sets.
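
    A minimal sketch of the idea of learning f(d, m) from accumulating expert picks and then proposing interpretations elsewhere, using a k-nearest-neighbour regressor as a stand-in for whatever statistical model is actually inferred; the class, variable names, and feature choices are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    class SmartInterpreter:
        """Learn the relation between auxiliary information m (e.g. a local
        resistivity profile from inversion) and an expert pick d (e.g. depth to
        a reservoir base), then propose picks where only m is available."""

        def __init__(self, n_neighbors=5):
            self.model = KNeighborsRegressor(n_neighbors=n_neighbors)
            self.M, self.D = [], []

        def add_interpretation(self, m, d):
            # Refit as expert picks accumulate; accuracy grows with the sample.
            self.M.append(m)
            self.D.append(d)
            self.model.fit(np.asarray(self.M), np.asarray(self.D))

        def propose(self, m_new):
            # Simulate how the expert would interpret, given only the data m.
            return self.model.predict(np.atleast_2d(m_new))
    ```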

  11. From Weakly Chaotic Dynamics to Deterministic Subdiffusion via Copula Modeling

    NASA Astrophysics Data System (ADS)

    Nazé, Pierre

    2018-03-01

    Copula modeling consists in finding a probabilistic distribution, called a copula, whereby its coupling with the marginal distributions of a set of random variables produces their joint distribution. The present work aims to use this technique to connect the statistical distributions of weakly chaotic dynamics and deterministic subdiffusion. More precisely, we decompose the jump distribution of the Geisel-Thomae map into a bivariate one and determine the marginal and copula distributions respectively by infinite ergodic theory and statistical inference techniques. We therefore verify that the characteristic tail distribution of subdiffusion is an extreme value copula coupling Mittag-Leffler distributions. We also present a method to calculate the exact copula and joint distributions in the case where the weakly chaotic dynamics and deterministic subdiffusion statistical distributions are already known. Numerical simulations and consistency with the dynamical aspects of the map support our results.

  12. Statistical analysis of weigh-in-motion data for bridge design in Vermont.

    DOT National Transportation Integrated Search

    2014-10-01

    This study investigates the suitability of the HL-93 live load model recommended by AASHTO LRFD Specifications for its use in the analysis and design of bridges in Vermont. The method of approach consists in performing a statistical analysis of w...

  13. Alpha1 LASSO data bundles Lamont, OK

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Krishna, Bhargavi (ORCID: 0000-0001-8828-528X)

    2016-08-03

    A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES input includes model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consists of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  14. Quantum theory of multiscale coarse-graining.

    PubMed

    Han, Yining; Jin, Jaehyeok; Wagner, Jacob W; Voth, Gregory A

    2018-03-14

    Coarse-grained (CG) models serve as a powerful tool to simulate molecular systems at much longer temporal and spatial scales. Previously, CG models and methods have been built upon classical statistical mechanics. The present paper develops a theory and numerical methodology for coarse-graining in quantum statistical mechanics, by generalizing the multiscale coarse-graining (MS-CG) method to quantum Boltzmann statistics. A rigorous derivation of the sufficient thermodynamic consistency condition is first presented via imaginary time Feynman path integrals. It identifies the optimal choice of CG action functional and effective quantum CG (qCG) force field to generate a quantum MS-CG (qMS-CG) description of the equilibrium system that is consistent with the quantum fine-grained model projected onto the CG variables. A variational principle then provides a class of algorithms for optimally approximating the qMS-CG force fields. Specifically, a variational method based on force matching, which was also adopted in the classical MS-CG theory, is generalized to quantum Boltzmann statistics. The qMS-CG numerical algorithms and practical issues in implementing this variational minimization procedure are also discussed. Then, two numerical examples are presented to demonstrate the method. Finally, as an alternative strategy, a quasi-classical approximation for the thermal density matrix expressed in the CG variables is derived. This approach provides an interesting physical picture for coarse-graining in quantum Boltzmann statistical mechanics in which the consistency with the quantum particle delocalization is obviously manifest, and it opens up an avenue for using path integral centroid-based effective classical force fields in a coarse-graining methodology.

  15. A complete sample of double-lobed radio quasars for VLBI tests of source models - Definition and statistics

    NASA Technical Reports Server (NTRS)

    Hough, D. H.; Readhead, A. C. S.

    1989-01-01

    A complete, flux-density-limited sample of double-lobed radio quasars is defined, with nuclei bright enough to be mapped with the Mark III VLBI system. It is shown that the statistics of linear size, nuclear strength, and curvature are consistent with the assumption of random source orientations and simple relativistic beaming in the nuclei. However, these statistics are also consistent with the effects of interaction between the beams and the surrounding medium. The distribution of jet velocities in the nuclei, as measured with VLBI, will provide a powerful test of physical theories of extragalactic radio sources.

  16. Analyzing the Statistical Reasoning Levels of Pre-Service Elementary School Teachers in the Context of a Model Eliciting Activity

    ERIC Educational Resources Information Center

    Alkas Ulusoy, Cigdem; Kayhan Altay, Mesture

    2017-01-01

    The purpose of this study is to analyze the statistical reasoning levels of pre-service elementary school teachers. To this end, pre-service teachers working in 29 groups completed a model eliciting activity (MEA) as part of an elective course they were taking. At the end of the class, they were asked to present their solutions while…

  17. A comparison of large-scale climate signals and the North American Multi-Model Ensemble (NMME) for drought prediction in China

    NASA Astrophysics Data System (ADS)

    Xu, Lei; Chen, Nengcheng; Zhang, Xiang

    2018-02-01

    Drought is an extreme natural disaster that can lead to huge socioeconomic losses. Drought prediction months ahead is helpful for early drought warning and preparations. In this study, we developed a statistical model, two weighted dynamic models and a statistical-dynamic (hybrid) model for 1-6 month lead drought prediction in China. Specifically, the statistical component weights large-scale climate signals using support vector regression (SVR); the dynamic components consist of the ensemble mean (EM) and Bayesian model averaging (BMA) of the North American Multi-Model Ensemble (NMME) climate models; and the hybrid component combines the statistical and dynamic components by assigning weights based on their historical performance. The results indicate that the statistical and hybrid models give better rainfall predictions than the NMME-EM and NMME-BMA models, which have good predictability only in southern China. In the 2011 China winter-spring drought event, the statistical model predicted the spatial extent and severity of drought nationwide well, although the severity was underestimated in the mid-lower reaches of the Yangtze River (MLRYR) region. The NMME-EM and NMME-BMA models largely overestimated rainfall in northern and western China in the 2011 drought. In the 2013 China summer drought, the NMME-EM model forecasted the drought extent and severity in eastern China well, while the statistical and hybrid models falsely detected a negative precipitation anomaly (NPA) in some areas. Model ensembles such as multiple statistical approaches, multiple dynamic models or multiple hybrid models for drought prediction are highlighted. These conclusions may be helpful for drought prediction and early drought warning in China.
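
    A minimal sketch of combining a statistical forecast with dynamic-model forecasts using weights based on historical skill (inverse mean-squared error here); the arrays, the weighting rule, and the example numbers are illustrative assumptions, not the paper's actual scheme.

    ```python
    import numpy as np

    def skill_weights(hindcasts, observations):
        """Weight each model by the inverse of its historical mean squared error.

        hindcasts    : array of shape (n_models, n_times)
        observations : array of shape (n_times,)
        """
        mse = np.mean((hindcasts - observations) ** 2, axis=1)
        w = 1.0 / mse
        return w / w.sum()

    def hybrid_forecast(forecasts, weights):
        """Weighted combination of the current statistical and dynamic forecasts."""
        return np.dot(weights, forecasts)

    # Example: one SVR-based statistical forecast plus two dynamic-model members.
    hind = np.array([[1.0, 0.8, 1.2], [1.3, 0.7, 1.5], [0.9, 1.1, 1.0]])
    obs = np.array([1.1, 0.9, 1.2])
    w = skill_weights(hind, obs)
    print(hybrid_forecast(np.array([1.05, 1.20, 0.95]), w))
    ```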

  18. Statistical, economic and other tools for assessing natural aggregate

    USGS Publications Warehouse

    Bliss, J.D.; Moyle, P.R.; Bolm, K.S.

    2003-01-01

    Quantitative aggregate resource assessment provides resource estimates useful for explorationists, land managers and those who make decisions about land allocation, which may have long-term implications concerning the cost and availability of aggregate resources. Aggregate assessment needs to be systematic and consistent, yet flexible enough to allow updating without invalidating other parts of the assessment. Evaluators need to use standard or consistent aggregate classifications and statistical distributions or, in other words, models with geological, geotechnical and economic variables or interrelationships between these variables. These models can be used with subjective estimates, if needed, to estimate how much aggregate may be present in a region or country using distributions generated by Monte Carlo computer simulations.

  19. LIDT-DD: A New Self-Consistent Debris Disc Model Including Radiation Pressure and Coupling Dynamical and Collisional Evolution

    NASA Astrophysics Data System (ADS)

    Kral, Q.; Thebault, P.; Charnoz, S.

    2014-01-01

    The first attempt at developing a fully self-consistent code coupling dynamics and collisions to study debris discs (Kral et al. 2013) is presented. So far, these two crucial mechanisms were studied separately, with N-body and statistical collisional codes respectively, because of stringent computational constraints. We present a new model named LIDT-DD which is able to follow over long timescales the coupled evolution of dynamics (including radiation forces) and collisions in a self-consistent way.

  20. Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08

    PubMed Central

    Casey, Martin F.; Neidell, Matthew

    2013-01-01

    Background: Bisphenol A (BPA), a high production chemical commonly found in plastics, has drawn great attention from researchers due to the substance's potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA's reported effects on coronary heart disease and diabetes. Methods and Findings: We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions: Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205

  1. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Gonzalez, Oscar; MacKinnon, David P.

    2018-01-01

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to…

  2. Self-consistent assessment of Englert-Schwinger model on atomic properties

    NASA Astrophysics Data System (ADS)

    Lehtomäki, Jouko; Lopez-Acevedo, Olga

    2017-12-01

    Our manuscript investigates a self-consistent solution of the statistical atom model proposed by Berthold-Georg Englert and Julian Schwinger (the ES model) and benchmarks it against atomic Kohn-Sham and two orbital-free models of the Thomas-Fermi-Dirac (TFD)-λvW family. Results show that the ES model generally offers the same accuracy as the well-known TFD-1/5 vW model; however, the ES model corrects the failure in the Pauli potential near-nucleus region. We also point to the inability of describing low-Z atoms as the foremost concern in improving the present model.

  3. Self-consistent assessment of Englert-Schwinger model on atomic properties.

    PubMed

    Lehtomäki, Jouko; Lopez-Acevedo, Olga

    2017-12-21

    Our manuscript investigates a self-consistent solution of the statistical atom model proposed by Berthold-Georg Englert and Julian Schwinger (the ES model) and benchmarks it against atomic Kohn-Sham and two orbital-free models of the Thomas-Fermi-Dirac (TFD)-λvW family. Results show that the ES model generally offers the same accuracy as the well-known TFD-1/5 vW model; however, the ES model corrects the failure in the Pauli potential near-nucleus region. We also point to the inability of describing low-Z atoms as the foremost concern in improving the present model.

  4. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    USGS Publications Warehouse

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for the Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression.
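
    A minimal sketch of fitting a modified Hack's-law relation of the form ln L = a + b ln A + c ω by ordinary least squares; the basin data arrays below are placeholders, not the study's data.

    ```python
    import numpy as np

    # Placeholder basin data: drainage area A (km^2), mainstream length L (km),
    # and Strahler order omega.
    A = np.array([12.0, 55.0, 130.0, 480.0, 1500.0])
    L = np.array([5.1, 12.0, 19.5, 41.0, 78.0])
    omega = np.array([2, 3, 3, 4, 5], dtype=float)

    # Design matrix for ln L = a + b ln A + c * omega.
    X = np.column_stack([np.ones_like(A), np.log(A), omega])
    coef, *_ = np.linalg.lstsq(X, np.log(L), rcond=None)
    a, b, c = coef
    # b plays the role of the classical Hack exponent; a non-zero c would indicate
    # that Strahler order carries additional, statistically separate information.
    print(a, b, c)
    ```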

  5. Non-parametric model selection for subject-specific topological organization of resting-state functional connectivity.

    PubMed

    Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J

    2011-06-01

    In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibit (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r² correlation coefficient. However, the r² statistic is not an optimal estimator of explained variance when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, for the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support for the evidence previously presented in the literature for a truncated power-law model of resting-state functional connectivity.
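
    A sketch of comparing a pure power law with an exponentially truncated one for a voxel-degree distribution, assuming the third-party `powerlaw` package (Alstott et al.) is installed; the input file name is hypothetical, and the package's KS-based xmin choice and likelihood-ratio comparison stand in for the paper's exact procedure.

    ```python
    import numpy as np
    import powerlaw

    degrees = np.loadtxt("voxel_degrees.txt")          # hypothetical per-voxel degree counts

    fit = powerlaw.Fit(degrees, discrete=True)         # xmin chosen by minimising the KS distance
    R, p = fit.distribution_compare("power_law", "truncated_power_law")
    # R < 0 with small p favours the truncated power law, in line with the
    # subject-wise result reported in the abstract above.
    print(fit.power_law.alpha, fit.xmin, R, p)
    ```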

  6. Infinitely divisible cascades to model the statistics of natural images.

    PubMed

    Chainais, Pierre

    2007-12-01

    We propose to model the statistics of natural images thanks to the large class of stochastic processes called Infinitely Divisible Cascades (IDC). IDC were first introduced in one dimension to provide multifractal time series to model the so-called intermittency phenomenon in hydrodynamical turbulence. We have extended the definition of scalar infinitely divisible cascades from 1 to N dimensions and commented on the relevance of such a model in fully developed turbulence in [1]. In this article, we focus on the particular 2 dimensional case. IDC appear as good candidates to model the statistics of natural images. They share most of their usual properties and appear to be consistent with several independent theoretical and experimental approaches of the literature. We point out the interest of IDC for applications to procedural texture synthesis.

  7. Assessing the Accuracy and Consistency of Language Proficiency Classification under Competing Measurement Models

    ERIC Educational Resources Information Center

    Zhang, Bo

    2010-01-01

    This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…

  8. Inference of reaction rate parameters based on summary statistics from experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khalil, Mohammad; Chowdhary, Kamaljit Singh; Safta, Cosmin

    Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain-branching reaction H + O2 → OH + O. Available published data are in the form of summary statistics, namely nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters and are computationally efficient enough to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation, resulting in orders-of-magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty. The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.
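
    A heavily simplified sketch of the Bayesian ingredient of such a workflow is given below: a random-walk Metropolis sampler recovers modified-Arrhenius parameters (ln A, n, E) from synthetic rate-coefficient summary statistics. The ABC data-generation step, the Padé/polynomial surrogates, and the Gauss-Hermite quadrature described above are not reproduced; all numbers are illustrative assumptions, not the published shock-tube data.

```python
import numpy as np

rng = np.random.default_rng(2)
R = 8.314  # J/(mol K)

# Illustrative "summary statistics": nominal ln k and error bars at a few
# temperatures, generated from a known modified-Arrhenius law.
T = np.array([1000.0, 1200.0, 1500.0, 2000.0, 2500.0])
lnA_true, n_true, E_true = 32.0, -0.7, 70e3
ln_k_obs = lnA_true + n_true * np.log(T) - E_true / (R * T)
sigma = np.full_like(T, 0.15)

def log_post(theta):
    lnA, n, E = theta
    ln_k = lnA + n * np.log(T) - E / (R * T)          # modified Arrhenius form
    loglik = -0.5 * np.sum(((ln_k - ln_k_obs) / sigma) ** 2)
    logprior = (-0.5 * ((lnA - 30.0) / 10.0) ** 2
                - 0.5 * (n / 2.0) ** 2
                - 0.5 * ((E - 60e3) / 30e3) ** 2)     # weak Gaussian priors
    return loglik + logprior

# Random-walk Metropolis over (ln A, n, E).
theta = np.array([30.0, 0.0, 60e3])
lp = log_post(theta)
step = np.array([0.3, 0.1, 2e3])
chain = []
for _ in range(20000):
    prop = theta + step * rng.normal(size=3)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta.copy())
chain = np.array(chain[5000:])                        # discard burn-in
print("posterior means (lnA, n, E):", np.round(chain.mean(axis=0), 3))
print("corr(lnA, E):", round(np.corrcoef(chain[:, 0], chain[:, 2])[0, 1], 2))
```

    The strong posterior correlation between ln A and E in even this toy setup illustrates why the abstract stresses accounting for correlation among the Arrhenius parameters of a reaction.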

  9. Inference of reaction rate parameters based on summary statistics from experiments

    DOE PAGES

    Khalil, Mohammad; Chowdhary, Kamaljit Singh; Safta, Cosmin; ...

    2016-10-15

    Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain-branching reaction H + O2 → OH + O. Available published data are in the form of summary statistics, namely nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters and are computationally efficient enough to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation, resulting in orders-of-magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty. The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kogalovskii, M.R.

    This paper presents a review of problems related to statistical database systems, which are widespread in various fields of activity. Statistical databases (SDBs) are databases whose contents are used primarily for statistical analysis. Topics under consideration include: SDB peculiarities, properties of data models adequate for SDB requirements, metadata functions, null-value problems, SDB compromise protection problems, stored data compression techniques, and means of statistical data representation. Also examined is whether present database management systems (DBMSs) satisfy SDB requirements. Current research directions in SDB systems are considered.

  11. Energy-density field approach for low- and medium-frequency vibroacoustic analysis of complex structures using a statistical computational model

    NASA Astrophysics Data System (ADS)

    Kassem, M.; Soize, C.; Gagliardini, L.

    2009-06-01

    In this paper, an energy-density field approach applied to the vibroacoustic analysis of complex industrial structures in the low- and medium-frequency ranges is presented. This approach uses a statistical computational model. The analyzed system consists of an automotive vehicle structure coupled with its internal acoustic cavity. The objective of this paper is to make use of the statistical properties of the frequency response functions of the vibroacoustic system observed from previous experimental and numerical work. The frequency response functions are expressed in terms of a dimensionless matrix which is estimated using the proposed energy approach. Using this dimensionless matrix, a simplified vibroacoustic model is proposed.

  12. An astronomer's guide to period searching

    NASA Astrophysics Data System (ADS)

    Schwarzenberg-Czerny, A.

    2003-03-01

    We concentrate on the analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on the classical statistical principles of Fisher and his successors. Except for the discussion of resolution issues, it is best for the reader to temporarily forget about Fourier transforms and to concentrate on the problem of fitting a time series with a model curve. According to their statistical content, we divide the issues into several sections: (ii) statistical and numerical aspects of model fitting, (iii) evaluation of fitted models as hypothesis testing, (iv) the role of orthogonal models in signal detection, (v) conditions for equivalence of periodograms, and (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from a formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluating the performance of periodograms and in the quantitative design of large variability surveys.
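
    Fitting an unevenly sampled series with a sinusoidal model curve over a grid of trial frequencies is what the Lomb-Scargle periodogram does, which makes it a convenient way to illustrate the "model fitting" view of period searching advocated here. The gapped synthetic data and the chosen period are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(3)

# Unevenly sampled, gap-ridden time series (illustrative): a sinusoid with
# period P = 0.7 d observed only during "nights", plus noise.
t = np.sort(rng.uniform(0, 60, 600))
t = t[(t % 1.0) < 0.4]                       # keep night-time samples -> daily gaps
y = np.sin(2 * np.pi * t / 0.7 + 0.3) + rng.normal(0, 0.8, t.size)

# Least-squares fit of a sinusoid at each trial frequency; peaks mark candidate periods.
periods = np.linspace(0.2, 2.0, 4000)
power = lombscargle(t, y - y.mean(), 2 * np.pi / periods)
print("best-fit period:", round(periods[np.argmax(power)], 3), "d")
```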

  13. Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies

    ERIC Educational Resources Information Center

    Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre

    2018-01-01

    Purpose: Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate for analyzing such data, because duration data usually consist of nonnegative and skew-distributed variables. Therefore, we recommend the use of a statistical methodology specific to duration data. Method: We propose a…

  14. Non-convex Statistical Optimization for Sparse Tensor Graphical Model

    PubMed Central

    Sun, Wei; Wang, Zhaoran; Liu, Han; Cheng, Guang

    2016-01-01

    We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, a result not obtained in previous work. Our theoretical results are backed by thorough numerical studies. PMID:28316459

  15. REVIEW OF THE ATTRIBUTES AND PERFORMANCE OF SIX URBAN DIFFUSION MODELS

    EPA Science Inventory

    The American Meteorological Society conducted a scientific review of a set of six urban diffusion models. TRC Environmental Consultants, Inc. calculated and tabulated a uniform set of statistics for all the models. The report consists of a summary and copies of the three independ...

  16. Statistical prescission point model of fission fragment angular distributions

    NASA Astrophysics Data System (ADS)

    John, Bency; Kataria, S. K.

    1998-03-01

    In light of recent developments in fission studies such as slow saddle to scission motion and spin equilibration near the scission point, the theory of fission fragment angular distribution is examined and a new statistical prescission point model is developed. The conditional equilibrium of the collective angular bearing modes at the prescission point, which is guided mainly by their relaxation times and population probabilities, is taken into account in the present model. The present model gives a consistent description of the fragment angular and spin distributions for a wide variety of heavy and light ion induced fission reactions.

  17. Philosophy and the practice of Bayesian statistics

    PubMed Central

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2015-01-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575
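
    The kind of model checking advocated here can be made concrete with a minimal posterior predictive check: simulate replicated data sets from the fitted posterior and ask whether a chosen test statistic of the observed data looks typical of the replications. The model, data, and test statistic below are assumptions chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(4)

# "Observed" counts that are in fact overdispersed relative to a Poisson model.
y_obs = rng.negative_binomial(n=5, p=0.3, size=100)

# Deliberately simple model: Poisson rate with a conjugate Gamma(a, b) prior.
a, b = 1.0, 0.1
a_post, b_post = a + y_obs.sum(), b + y_obs.size

# Posterior predictive check using the variance/mean ratio (sensitive to overdispersion).
T_obs = y_obs.var() / y_obs.mean()
T_rep = []
for _ in range(2000):
    lam = rng.gamma(a_post, 1.0 / b_post)          # draw a rate from the posterior
    y_rep = rng.poisson(lam, size=y_obs.size)      # replicate the data set
    T_rep.append(y_rep.var() / y_rep.mean())

p_ppc = np.mean(np.array(T_rep) >= T_obs)
print(f"posterior predictive p-value: {p_ppc:.3f}")   # near 0 flags model misfit
```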

  18. Philosophy and the practice of Bayesian statistics.

    PubMed

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2013-02-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.

  19. “Plateau”-related summary statistics are uninformative for comparing working memory models

    PubMed Central

    van den Berg, Ronald; Ma, Wei Ji

    2014-01-01

    Performance on visual working memory tasks decreases as more items need to be remembered. Over the past decade, a debate has unfolded between proponents of slot models and slotless models of this phenomenon. Zhang and Luck (2008) and Anderson, Vogel, and Awh (2011) noticed that as more items need to be remembered, “memory noise” seems to first increase and then reach a “stable plateau.” They argued that three summary statistics characterizing this plateau are consistent with slot models, but not with slotless models. Here, we assess the validity of their methods. We generated synthetic data both from a leading slot model and from a recent slotless model and quantified model evidence using log Bayes factors. We found that the summary statistics provided, at most, 0.15% of the expected model evidence in the raw data. In a model recovery analysis, a total of more than a million trials were required to achieve 99% correct recovery when models were compared on the basis of summary statistics, whereas fewer than 1,000 trials were sufficient when raw data were used. At realistic numbers of trials, plateau-related summary statistics are completely unreliable for model comparison. Applying the same analyses to subject data from Anderson et al. (2011), we found that the evidence in the summary statistics was, at most, 0.12% of the evidence in the raw data and far too weak to warrant any conclusions. These findings call into question claims about working memory that are based on summary statistics. PMID:24719235

  20. Self-organization, the cascade model, and natural hazards.

    PubMed

    Turcotte, Donald L; Malamud, Bruce D; Guzzetti, Fausto; Reichenbach, Paola

    2002-02-19

    We consider the frequency-size statistics of two natural hazards, forest fires and landslides. Both appear to satisfy power-law (fractal) distributions to a good approximation under a wide variety of conditions. Two simple cellular-automata models have been proposed as analogs for this observed behavior, the forest fire model for forest fires and the sand pile model for landslides. The behavior of these models can be understood in terms of a self-similar inverse cascade. For the forest fire model the cascade consists of the coalescence of clusters of trees; for the sand pile model the cascade consists of the coalescence of metastable regions.
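
    For concreteness, the sketch below implements the simpler of the two analog models, a sandpile (Bak-Tang-Wiesenfeld) cellular automaton, and tallies the frequency-size statistics of its avalanches. Grid size, run length, and the log binning are arbitrary choices for illustration; the forest fire model would be coded analogously.

```python
import numpy as np

rng = np.random.default_rng(5)
L = 24
grid = np.zeros((L, L), dtype=int)

def add_grain_and_relax(grid):
    """Drop one grain at a random site and topple until stable; return avalanche size."""
    i, j = rng.integers(0, L, size=2)
    grid[i, j] += 1
    size = 0
    while True:
        unstable = np.argwhere(grid >= 4)
        if unstable.size == 0:
            return size
        for i, j in unstable:
            grid[i, j] -= 4
            size += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < L and 0 <= nj < L:    # grains toppled off the edge are lost
                    grid[ni, nj] += 1

sizes = np.array([add_grain_and_relax(grid) for _ in range(10000)])
sizes = sizes[3000:]                               # discard the transient
sizes = sizes[sizes > 0]

# Rough frequency-size statistics on logarithmic bins.
bins = np.unique(np.logspace(0, np.log10(sizes.max() + 1), 12).astype(int))
hist, _ = np.histogram(sizes, bins=bins)
print(list(zip(bins[:-1].tolist(), hist.tolist())))
```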

  1. Self-organization, the cascade model, and natural hazards

    PubMed Central

    Turcotte, Donald L.; Malamud, Bruce D.; Guzzetti, Fausto; Reichenbach, Paola

    2002-01-01

    We consider the frequency-size statistics of two natural hazards, forest fires and landslides. Both appear to satisfy power-law (fractal) distributions to a good approximation under a wide variety of conditions. Two simple cellular-automata models have been proposed as analogs for this observed behavior, the forest fire model for forest fires and the sand pile model for landslides. The behavior of these models can be understood in terms of a self-similar inverse cascade. For the forest fire model the cascade consists of the coalescence of clusters of trees; for the sand pile model the cascade consists of the coalescence of metastable regions. PMID:11875206

  2. Single, Complete, Probability Spaces Consistent With EPR-Bohm-Bell Experimental Data

    NASA Astrophysics Data System (ADS)

    Avis, David; Fischer, Paul; Hilbert, Astrid; Khrennikov, Andrei

    2009-03-01

    We show that paradoxical consequences of violations of Bell's inequality are induced by the use of an unsuitable probabilistic description for the EPR-Bohm-Bell experiment. The conventional description (due to Bell) is based on a combination of statistical data collected for different settings of polarization beam splitters (PBSs). In fact, such data consists of some conditional probabilities which only partially define a probability space. Ignoring this conditioning leads to apparent contradictions in the classical probabilistic model (due to Kolmogorov). We show how to make a completely consistent probabilistic model by taking into account the probabilities of selecting the settings of the PBSs. Our model both matches the experimental data and is consistent with classical probability theory.

  3. An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

    ERIC Educational Resources Information Center

    Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.

    2013-01-01

    Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…

  4. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media.

    PubMed

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-06-21

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties in its predictability, but a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement of hydrofracking, fast recovery decay, and ultra-low permeability (inherent to their nanoporosity) are specificities of these reservoirs that challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated, with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short-time decay, consistent with a Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water with CO2 or propane eliminates the barriers, raising hopes for clean and efficient recovery.

  5. Performance Analysis of Live-Virtual-Constructive and Distributed Virtual Simulations: Defining Requirements in Terms of Temporal Consistency

    DTIC Science & Technology

    2009-12-01

    events. Work associated with aperiodic tasks has the same statistical behavior and the same timing requirements. The timing deadlines are soft. • Sporadic...answers, but it is possible to calculate how precise the estimates are. Simulation-based performance analysis of a model includes a statistical ...to evaluate all possible states in a timely manner. This is the principal reason for resorting to simulation and statistical analysis to evaluate

  6. The Equivalence of Regression Models Using Difference Scores and Models Using Separate Scores for Each Informant: Implications for the Study of Informant Discrepancies

    ERIC Educational Resources Information Center

    Laird, Robert D.; Weems, Carl F.

    2011-01-01

    Research on informant discrepancies has increasingly utilized difference scores. This article demonstrates the statistical equivalence of regression models using difference scores (raw or standardized) and regression models using separate scores for each informant to show that interpretations should be consistent with both models. First,…

  7. A Multiple Group Measurement Model of Children's Reports of Parental Socioeconomic Status. Discussion Papers No. 531-78.

    ERIC Educational Resources Information Center

    Mare, Robert D.; Mason, William M.

    An important class of applications of measurement error or constrained factor analytic models consists of comparing models for several populations. In such cases, it is appropriate to make explicit statistical tests of model similarity across groups and to constrain some parameters of the models to be equal across groups using a priori substantive…

  8. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance.

    PubMed

    Brase, Gary L; Vasserman, Eugene Y; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings.
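
    The numerical-format manipulation that drove the largest effect can be illustrated with the classic base-rate computation, rendered both as natural frequencies and as Bayes' rule with percentages; the numbers below are illustrative, not those used in the study's task.

```python
# Natural-frequency framing of a Bayesian reasoning problem (illustrative numbers).
population = 10_000
base_rate = 0.01          # 1% of e-mails are actually malicious
hit_rate = 0.90           # the filter flags 90% of malicious e-mails
false_alarm_rate = 0.05   # ...and 5% of benign ones

malicious = int(population * base_rate)                              # 100
true_alarms = int(malicious * hit_rate)                              # 90
false_alarms = int((population - malicious) * false_alarm_rate)      # 495

# Natural-frequency answer: of all flagged e-mails, what fraction is malicious?
p_frequencies = true_alarms / (true_alarms + false_alarms)

# The same answer via Bayes' rule stated with percentages.
p_bayes = (hit_rate * base_rate) / (
    hit_rate * base_rate + false_alarm_rate * (1 - base_rate))

print(round(p_frequencies, 3), round(p_bayes, 3))   # both ~0.154
```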

  9. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance

    PubMed Central

    Brase, Gary L.; Vasserman, Eugene Y.; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings. PMID:29163304

  10. The impact of alcohol taxation on liver cirrhosis mortality.

    PubMed

    Ponicki, William R; Gruenewald, Paul J

    2006-11-01

    The objective of this study is to investigate the impact of distilled spirits, wine, and beer taxes on cirrhosis mortality using a large-panel data set and statistical models that control for various other factors that may affect that mortality. The analyses were performed on a panel of 30 U.S. license states during the period 1971-1998 (N = 840 state-by-year observations). Exogenous measures included current and lagged versions of beverage taxes and income, as well as controls for states' age distribution, religion, race, health care availability, urbanity, tourism, and local bans on alcohol sales. Regression analyses were performed using random-effects models with corrections for serial autocorrelation and heteroscedasticity among states. Cirrhosis rates were found to be significantly related to taxes on distilled spirits but not to taxation of wine and beer. Consistent results were found using different statistical models and model specifications. Consistent with prior research, cirrhosis mortality in the United States appears more closely linked to consumption of distilled spirits than to that of other alcoholic beverages.

  11. "Plateau"-related summary statistics are uninformative for comparing working memory models.

    PubMed

    van den Berg, Ronald; Ma, Wei Ji

    2014-10-01

    Performance on visual working memory tasks decreases as more items need to be remembered. Over the past decade, a debate has unfolded between proponents of slot models and slotless models of this phenomenon (Ma, Husain, & Bays, Nature Neuroscience 17, 347-356, 2014). Zhang and Luck (Nature 453(7192), 233-235, 2008) and Anderson, Vogel, and Awh (Attention, Perception, & Psychophysics 74(5), 891-910, 2011) noticed that as more items need to be remembered, "memory noise" seems to first increase and then reach a "stable plateau." They argued that three summary statistics characterizing this plateau are consistent with slot models, but not with slotless models. Here, we assess the validity of their methods. We generated synthetic data both from a leading slot model and from a recent slotless model and quantified model evidence using log Bayes factors. We found that the summary statistics provided at most 0.15% of the expected model evidence in the raw data. In a model recovery analysis, a total of more than a million trials were required to achieve 99% correct recovery when models were compared on the basis of summary statistics, whereas fewer than 1,000 trials were sufficient when raw data were used. Therefore, at realistic numbers of trials, plateau-related summary statistics are highly unreliable for model comparison. Applying the same analyses to subject data from Anderson et al. (Attention, Perception, & Psychophysics 74(5), 891-910, 2011), we found that the evidence in the summary statistics was at most 0.12% of the evidence in the raw data and far too weak to warrant any conclusions. The evidence in the raw data, in fact, strongly favored the slotless model. These findings call into question claims about working memory that are based on summary statistics.

  12. Stochastic modeling of Lagrangian accelerations

    NASA Astrophysics Data System (ADS)

    Reynolds, Andy

    2002-11-01

    It is shown how Sawford's second-order Lagrangian stochastic model (Phys. Fluids A 3, 1577-1586, 1991) for fluid-particle accelerations can be combined with a model for the evolution of the dissipation rate (Pope and Chen, Phys. Fluids A 2, 1437-1449, 1990) to produce a Lagrangian stochastic model that is consistent with both the measured distribution of Lagrangian accelerations (La Porta et al., Nature 409, 1017-1019, 2001) and Kolmogorov's similarity theory. The latter condition is found not to be satisfied when a constant dissipation rate is employed and consistency with prescribed acceleration statistics is enforced through fulfilment of a well-mixed condition.

  13. 'Chain pooling' model selection as developed for the statistical analysis of a rotor burst protection experiment

    NASA Technical Reports Server (NTRS)

    Holms, A. G.

    1977-01-01

    A statistical decision procedure called chain pooling had been developed for model selection when fitting the results of a two-level fixed-effects full or fractional factorial experiment without replication. The basic strategy included the use of one nominal level of significance for a preliminary test and a second nominal level of significance for the final test. The subject has been reexamined from the point of view of using as many as three successive statistical model deletion procedures in fitting the results of a single experiment. The investigation consisted of random-number studies intended to simulate the results of a proposed aircraft turbine-engine rotor-burst-protection experiment. As a conservative approach, population model coefficients were chosen to represent a saturated 2^4 experiment with a distribution of parameter values unfavorable to the decision procedures. Three model selection strategies were developed.

  14. Artificial neural network study on organ-targeting peptides

    NASA Astrophysics Data System (ADS)

    Jung, Eunkyoung; Kim, Junhyoung; Choi, Seung-Hoon; Kim, Minkyoung; Rhee, Hokyoung; Shin, Jae-Min; Choi, Kihang; Kang, Sang-Kee; Lee, Nam Kyung; Choi, Yun-Jaie; Jung, Dong Hyun

    2010-01-01

    We report a new approach to studying organ targeting of peptides on the basis of peptide sequence information. The positive control data sets consist of organ-targeting peptide sequences identified by the peroral phage-display technique for four organs, and the negative control data are prepared from random sequences. The capacity of our models to make appropriate predictions is validated by statistical indicators including sensitivity, specificity, enrichment curve, and the area under the receiver operating characteristic (ROC) curve (the ROC score). The VHSE descriptor produces statistically significant training models, and models with simple neural network architectures show slightly greater predictive power than those with complex ones. The training and test set statistics indicate that our models could discriminate between organ-targeting and random sequences. We anticipate that our models will be applicable to the selection of organ-targeting peptides for generating peptide drugs or peptidomimetics.
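
    The modelling step can be sketched with a small feed-forward network and ROC-based evaluation, using random numerical features as a stand-in for VHSE-style sequence descriptors. Feature dimensions, class separations, and the network size are assumptions for illustration, not the study's data or architecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(6)

# Stand-in for per-peptide numerical descriptors (e.g., 8 VHSE-like values x 7 residues).
n_pos, n_neg, n_feat = 300, 300, 56
X = np.vstack([rng.normal(0.3, 1.0, (n_pos, n_feat)),    # "organ-targeting" peptides
               rng.normal(0.0, 1.0, (n_neg, n_feat))])   # random-sequence controls
y = np.r_[np.ones(n_pos), np.zeros(n_neg)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

# A deliberately small architecture, echoing the observation that simple
# networks generalise at least as well as complex ones on this kind of data.
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X_tr, y_tr)
print("test ROC score:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```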

  15. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    NASA Astrophysics Data System (ADS)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, while dust events (i.e., dust storms) affect yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently discover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food-producing regions.
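
    The comparison at the heart of this approach (a linear yield model versus a random forest on the same predictors) can be sketched on synthetic data with built-in threshold behaviour, an aerosol penalty, and a technology trend; all variables and coefficients below are illustrative assumptions, not the crop-climate data set described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)

# Synthetic yields responding to temperature and precipitation thresholds,
# a dust (aerosol optical depth) penalty, and a positive technology trend.
n = 600
temp = rng.normal(27, 3, n)
precip = rng.gamma(4, 120, n)
dust = rng.gamma(2, 0.15, n)
year = rng.integers(0, 40, n)
yields = (5.0
          - 0.4 * np.clip(temp - 30, 0, None)      # heat stress above ~30 C
          + 0.004 * np.clip(precip, None, 600)     # water-limited below ~600 mm
          - 1.5 * dust                             # mineral-aerosol penalty
          + 0.05 * year                            # technology trend
          + rng.normal(0, 0.5, n))
X = np.column_stack([temp, precip, dust, year])

for name, model in [("linear regression", LinearRegression()),
                    ("random forest", RandomForestRegressor(n_estimators=300, random_state=0))]:
    r2 = cross_val_score(model, X, yields, cv=5, scoring="r2").mean()
    print(f"{name:>17s}: cross-validated R^2 = {r2:.2f}")

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, yields)
print("importances (temp, precip, dust, year):", np.round(rf.feature_importances_, 2))
```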

  16. Statistical label fusion with hierarchical performance models

    PubMed Central

    Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.

    2014-01-01

    Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally – fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy. PMID:24817809

  17. CONSISTENCY UNDER SAMPLING OF EXPONENTIAL RANDOM GRAPH MODELS.

    PubMed

    Shalizi, Cosma Rohilla; Rinaldo, Alessandro

    2013-04-01

    The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling , or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM's expressive power. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses.

  18. CONSISTENCY UNDER SAMPLING OF EXPONENTIAL RANDOM GRAPH MODELS

    PubMed Central

    Shalizi, Cosma Rohilla; Rinaldo, Alessandro

    2015-01-01

    The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling, or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM’s expressive power. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses. PMID:26166910

  19. Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits.

    PubMed

    Bernhardt, Paul W; Wang, Huixia J; Zhang, Daowen

    2015-05-01

    Censored observations are a common occurrence in biomedical data sets. Although a large amount of research has been devoted to estimation and inference for data with censored responses, very little research has focused on proper statistical procedures when predictors are censored. In this paper, we consider statistical methods for dealing with multiple predictors subject to detection limits within the context of generalized linear models. We investigate and adapt several conventional methods and develop a new multiple imputation approach for analyzing data sets with predictors censored due to detection limits. We establish the consistency and asymptotic normality of the proposed multiple imputation estimator and suggest a computationally simple and consistent variance estimator. We also demonstrate that the conditional mean imputation method often leads to inconsistent estimates in generalized linear models, while several other methods are either computationally intensive or lead to parameter estimates that are biased or more variable compared to the proposed multiple imputation estimator. In an extensive simulation study, we assess the bias and variability of different approaches within the context of a logistic regression model and compare variance estimation methods for the proposed multiple imputation estimator. Lastly, we apply several methods to analyze the data set from a recently-conducted GenIMS study.
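
    A bare-bones version of the multiple imputation idea for a left-censored predictor is sketched below: draw the censored values from a truncated distribution, refit the logistic regression for each completed data set, and pool with Rubin's rules. The imputation model here is deliberately crude (it ignores the outcome), so it illustrates the mechanics rather than the paper's estimator; all data are synthetic.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(8)

# Synthetic biomarker x, left-censored at a detection limit dl, and binary outcome y.
n, dl = 500, 0.5
x = rng.lognormal(0.0, 0.8, n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * np.log(x)))))
observed = x >= dl                       # below dl, x is only known to be < dl

# Impute censored log-values from a normal truncated above at log(dl),
# with moments estimated from the observed part of the sample.
mu, sd = np.log(x[observed]).mean(), np.log(x[observed]).std()
upper = (np.log(dl) - mu) / sd           # truncation point in standard units
M, betas, variances = 20, [], []
for _ in range(M):
    x_imp = x.copy()
    draws = stats.truncnorm.rvs(-np.inf, upper, loc=mu, scale=sd,
                                size=int((~observed).sum()), random_state=rng)
    x_imp[~observed] = np.exp(draws)
    fit = sm.Logit(y, sm.add_constant(np.log(x_imp))).fit(disp=0)
    betas.append(fit.params[1])
    variances.append(fit.cov_params()[1, 1])

# Rubin's rules: pooled point estimate and total variance.
betas, variances = np.array(betas), np.array(variances)
total_var = variances.mean() + (1 + 1 / M) * betas.var(ddof=1)
print(f"pooled slope = {betas.mean():.3f}  (SE = {np.sqrt(total_var):.3f})")
```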

  20. Safety performance functions.

    DOT National Transportation Integrated Search

    2014-10-01

    This project developed safety performance functions for roadway segments and intersections for two-lane rural highways in : Pennsylvania. The statistical modeling methodology was consistent with that used in the first edition of the American : Associ...

  1. Football goal distributions and extremal statistics

    NASA Astrophysics Data System (ADS)

    Greenhough, J.; Birch, P. C.; Chapman, S. C.; Rowlands, G.

    2002-12-01

    We analyse the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001. The probability density functions (PDFs) of goals scored are too heavy-tailed to be fitted over their entire ranges by Poisson or negative binomial distributions, which would be expected for uncorrelated processes. Log-normal distributions cannot include zero scores, and here we find that the PDFs are consistent with those arising from extremal statistics. In addition, we show that it is sufficient to model English top division and FA Cup matches in the seasons 1970/71-2000/01 with Poisson or negative binomial distributions, as reported in analyses of earlier seasons, and that these are not consistent with extremal statistics.
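
    The distribution-fitting part of such an analysis is easy to sketch: fit Poisson and negative binomial models to goal counts by maximum likelihood and compare log-likelihoods and tail probabilities. The counts below are synthetic and mildly overdispersed; they stand in for the match data only as an illustration.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(9)

# Synthetic goals-per-match counts with mean ~1.4 and some overdispersion.
goals = rng.negative_binomial(n=4, p=4 / (4 + 1.4), size=5000)

# Poisson fit: the MLE of the rate is the sample mean.
lam = goals.mean()
ll_pois = np.sum(stats.poisson.logpmf(goals, lam))

# Negative binomial fit: maximise the log-likelihood over (r, p).
def nb_nll(params):
    r, p = params
    if r <= 0 or not 0 < p < 1:
        return np.inf
    return -np.sum(stats.nbinom.logpmf(goals, r, p))

ll_nb = -optimize.minimize(nb_nll, x0=[2.0, 0.5], method="Nelder-Mead").fun

# Heavy tails show up as a better negative binomial likelihood and as upper-tail
# counts that the Poisson fit underpredicts.
print("log-likelihoods (Poisson, neg. binomial):", round(ll_pois, 1), round(ll_nb, 1))
print("P(goals >= 6): empirical =", round(float(np.mean(goals >= 6)), 4),
      " Poisson =", round(float(1 - stats.poisson.cdf(5, lam)), 4))
```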

  2. [Statistical validity of the Mexican Food Security Scale and the Latin American and Caribbean Food Security Scale].

    PubMed

    Villagómez-Ornelas, Paloma; Hernández-López, Pedro; Carrasco-Enríquez, Brenda; Barrios-Sánchez, Karina; Pérez-Escamilla, Rafael; Melgar-Quiñónez, Hugo

    2014-01-01

    This article validates the statistical consistency of two food security scales: the Mexican Food Security Scale (EMSA) and the Latin American and Caribbean Food Security Scale (ELCSA). Validity tests were conducted to verify that both scales are consistent instruments, composed of independent, properly calibrated and adequately ordered items arranged along a continuum of severity. The following tests were carried out: ordering of items; Cronbach's alpha analysis; parallelism of prevalence curves; Rasch models; and sensitivity analysis through hypothesis tests of mean differences. The tests showed that both scales meet the required attributes and are robust statistical instruments for food security measurement. This is relevant given that the lack-of-access-to-food indicator, included in multidimensional poverty measurement in Mexico, is calculated with the EMSA.
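
    Two of the checks named above, the ordering of items by severity and Cronbach's alpha, are simple to compute; the sketch below does so on synthetic binary responses generated from a one-dimensional latent severity. The item count, endorsement rates, and sample size are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)

# Synthetic binary responses of 1000 households to 8 items ordered from least
# to most severe (endorsement probability decreases with severity).
n, n_items = 1000, 8
endorsement = np.linspace(0.6, 0.05, n_items)
latent = rng.normal(0, 1, n)
logit_item = np.log(endorsement / (1 - endorsement))
p = 1.0 / (1.0 + np.exp(-(latent[:, None] + logit_item[None, :])))
X = (rng.random((n, n_items)) < p).astype(int)

def cronbach_alpha(X):
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum() / X.sum(axis=1).var(ddof=1))

print("item endorsement rates (should decrease):", X.mean(axis=0).round(2))
print("Cronbach's alpha:", round(cronbach_alpha(X), 3))
```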

  3. Virtual Model Validation of Complex Multiscale Systems: Applications to Nonlinear Elastostatics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oden, John Tinsley; Prudencio, Ernest E.; Bauman, Paul T.

    We propose a virtual statistical validation process as an aid to the design of experiments for the validation of phenomenological models of the behavior of material bodies, with focus on those cases in which knowledge of the fabrication process used to manufacture the body can provide information on the micro-molecular-scale properties underlying macroscale behavior. One example is given by models of elastomeric solids fabricated using polymerization processes. We describe a framework for model validation that involves Bayesian updates of parameters in statistical calibration and validation phases. The process enables the quantification of uncertainty in quantities of interest (QoIs) and the determination of model consistency using tools of statistical information theory. We assert that microscale information drawn from molecular models of the fabrication of the body provides a valuable source of prior information on parameters as well as a means for estimating model bias and designing virtual validation experiments to provide information gain over calibration posteriors.

  4. BIG BANG NUCLEOSYNTHESIS WITH A NON-MAXWELLIAN DISTRIBUTION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bertulani, C. A.; Fuqua, J.; Hussein, M. S.

    The abundances of light elements based on the big bang nucleosynthesis model are calculated using Tsallis non-extensive statistics. The impact of a variation of the non-extensive parameter q from unity is compared to observations and to the abundance yields from the standard big bang model. We find large differences between the reaction rates and the abundance of light elements calculated with the extensive and the non-extensive statistics. We find that the observations are consistent with a non-extensive parameter q = 1^{+0.05}_{-0.12}, indicating that a large deviation from Boltzmann-Gibbs statistics (q = 1) is highly unlikely.

  5. Impact of a statistical bias correction on the projected simulated hydrological changes obtained from three GCMs and two hydrology models

    NASA Astrophysics Data System (ADS)

    Hagemann, Stefan; Chen, Cui; Haerter, Jan O.; Gerten, Dieter; Heinke, Jens; Piani, Claudio

    2010-05-01

    Future climate model scenarios depend crucially on their adequate representation of the hydrological cycle. Within the European project "Water and Global Change" (WATCH), special care is taken to couple state-of-the-art climate model output to a suite of hydrological models. This coupling is expected to lead to a better assessment of changes in the hydrological cycle. However, due to the systematic model errors of climate models, their output is often not directly applicable as input for hydrological models. Thus, a statistical bias correction methodology has been developed that can be used to correct climate model output so that it produces internally consistent fields with the same statistical intensity distribution as the observations. As observations, globally re-analysed daily precipitation and temperature data obtained within the WATCH project are used. We will apply the bias correction to global climate model data of precipitation and temperature from the GCMs ECHAM5/MPIOM, CNRM-CM3 and LMDZ-4, and intercompare the bias-corrected data to the original GCM data and the observations. Then, the original and the bias-corrected GCM data will be used to force two global hydrology models: (1) the hydrological model of the Max Planck Institute for Meteorology (MPI-HM), consisting of the Simplified Land surface (SL) scheme and the Hydrological Discharge (HD) model, and (2) the dynamic vegetation model LPJmL operated by the Potsdam Institute for Climate Impact Research. The impact of the bias correction on the projected simulated hydrological changes will be analysed, and the resulting behaviour of the two hydrology models will be compared.
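
    The WATCH bias correction fits parametric transfer functions; as a hedged stand-in, the sketch below applies plain empirical quantile mapping, which conveys the same idea of forcing the model's intensity distribution onto the observed one. All "observed" and "GCM" series are synthetic gamma samples.

```python
import numpy as np

rng = np.random.default_rng(11)

# Daily precipitation intensities (illustrative): the "GCM" drizzles too often
# and underestimates extremes relative to the "observations".
obs = rng.gamma(shape=0.6, scale=8.0, size=10000)
gcm = rng.gamma(shape=1.2, scale=3.0, size=10000)
gcm_future = rng.gamma(shape=1.2, scale=3.6, size=10000)   # scenario to correct

# Empirical quantile mapping: map each model value to the observed value with
# the same non-exceedance probability in the calibration period.
q = np.linspace(0.001, 0.999, 999)
gcm_q, obs_q = np.quantile(gcm, q), np.quantile(obs, q)
correct = lambda x: np.interp(x, gcm_q, obs_q)

print("mean:   obs / raw GCM / corrected GCM =",
      round(obs.mean(), 2), round(gcm.mean(), 2), round(correct(gcm).mean(), 2))
print("99th pct: obs / raw future / corrected future =",
      round(np.quantile(obs, 0.99), 1), round(np.quantile(gcm_future, 0.99), 1),
      round(np.quantile(correct(gcm_future), 0.99), 1))
```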

  6. Consistent Partial Least Squares Path Modeling via Regularization.

    PubMed

    Jung, Sunho; Park, JaeHong

    2018-01-01

    Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc has yet no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present.

  7. The Analysis of Organizational Diagnosis on Based Six Box Model in Universities

    ERIC Educational Resources Information Center

    Hamid, Rahimi; Siadat, Sayyed Ali; Reza, Hoveida; Arash, Shahin; Ali, Nasrabadi Hasan; Azizollah, Arbabisarjou

    2011-01-01

    Purpose: The analysis of organizational diagnosis based on the six box model at universities. Research method: The research method was a descriptive survey. The statistical population consisted of 1544 university faculty members, from whom 218 persons were chosen as the sample through stratified random sampling. The research instruments were organizational…

  8. Comparison of Artificial Neural Networks and ARIMA statistical models in simulations of target wind time series

    NASA Astrophysics Data System (ADS)

    Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas

    2015-04-01

    The wind is the result of complex interactions of numerous mechanisms taking place at small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. In the literature there is a considerable number of models, either physical or statistical, dealing with the simulation and prediction of wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for wind forecasting and, in the great majority of cases, outperform conventional statistical models. In this study, a number of ANNs with different architectures, created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds from a wind farm in a hilly region of Greece and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a significant point (target). Goodness-of-fit statistics are computed for the comparison of the different methods. In general, the ANNs showed the best performance in the estimation of wind speed, prevailing over the ARIMA models.
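
    The ARIMA half of such a comparison is straightforward to sketch with statsmodels on a synthetic hourly wind series; the AR coefficients, diurnal cycle, model order, and forecast horizon are assumptions for illustration, not the wind-farm data.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(12)

# Synthetic hourly wind speeds: an AR(2) anomaly plus a diurnal cycle, kept non-negative.
n = 24 * 200
t = np.arange(n)
anom = np.zeros(n)
for i in range(2, n):
    anom[i] = 0.75 * anom[i - 1] + 0.15 * anom[i - 2] + rng.normal(0, 0.8)
wind = np.clip(6.0 + 1.5 * np.sin(2 * np.pi * t / 24) + anom, 0, None)

train, test = wind[:-48], wind[-48:]
fit = ARIMA(train, order=(2, 0, 1)).fit()
forecast = fit.forecast(steps=48)
rmse = float(np.sqrt(np.mean((forecast - test) ** 2)))
print("AIC:", round(fit.aic, 1), "  48-h forecast RMSE:", round(rmse, 2))
```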

  9. Efficient detection of wound-bed and peripheral skin with statistical colour models.

    PubMed

    Veredas, Francisco J; Mesa, Héctor; Morente, Laura

    2015-04-01

    A pressure ulcer is a clinical pathology of localised damage to the skin and underlying tissue caused by pressure, shear or friction. Reliable diagnosis supported by precise wound evaluation is crucial for successful treatment decisions. This paper presents a computer-vision approach to wound-area detection based on statistical colour models. Starting with a training set consisting of 113 real wound images, colour histogram models are created for four different tissue types. Back-projections of colour pixels on those histogram models are used, from a Bayesian perspective, to estimate the posterior probability that a pixel belongs to each of those tissue classes. Performance measures obtained from contingency tables based on a gold standard of segmented images supplied by experts have been used for model selection. The resulting fitted model has been validated on a set consisting of 322 wound images manually segmented and labelled by expert clinicians. The final fitted segmentation model shows robustness and gives high mean performance rates [AUC: .9426 (SD .0563); accuracy: .8777 (SD .0799); F-score: .7389 (SD .1550); Cohen's kappa: .6585 (SD .1787)] when segmenting significant wound areas that include healing tissues.
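
    The back-projection step can be sketched with plain NumPy: build a smoothed colour histogram per tissue class from training pixels, then convert histogram look-ups into posterior probabilities with Bayes' rule. The two Gaussian "tissue" colour clusters, the bin count, and the equal priors are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(13)

# Illustrative training pixels (RGB in [0, 255]) for two tissue classes.
granulation = rng.normal([180, 60, 60], 25, (5000, 3)).clip(0, 255)    # reddish
skin = rng.normal([210, 170, 150], 25, (5000, 3)).clip(0, 255)         # pale

bins = 16
edges = np.linspace(0, 256, bins + 1)

def colour_hist(pixels):
    """Smoothed, normalised 3-D colour histogram (class-conditional likelihood)."""
    h, _ = np.histogramdd(pixels, bins=(edges, edges, edges))
    return (h + 1e-6) / (h.sum() + 1e-6 * h.size)

hists = {"granulation": colour_hist(granulation), "skin": colour_hist(skin)}
priors = {"granulation": 0.5, "skin": 0.5}

def posterior(pixels, cls):
    """Back-project pixels onto each class histogram and apply Bayes' rule."""
    idx = np.clip((pixels // (256 // bins)).astype(int), 0, bins - 1)
    lik = {c: h[idx[:, 0], idx[:, 1], idx[:, 2]] for c, h in hists.items()}
    evidence = sum(priors[c] * lik[c] for c in hists)
    return priors[cls] * lik[cls] / evidence

# Unseen reddish pixels should mostly be labelled granulation tissue.
test = rng.normal([175, 65, 65], 25, (1000, 3)).clip(0, 255)
print("fraction labelled granulation:",
      float(np.mean(posterior(test, "granulation") > 0.5)))
```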

  10. Financial Stylized Facts in the Word of Mouth Model

    NASA Astrophysics Data System (ADS)

    Misawa, Tadanobu; Watanabe, Kyoko; Shimokawa, Tetsuya

    Recently, we proposed an agent-based model called the word of mouth model to analyze the influence of an information transmission process on price formation in financial markets. In particular, we focused on the short-term predictability of asset returns and provided an information-transmission explanation of why this predictability is most clearly observed in small-sized stocks. To extend the previous study, this paper demonstrates that the word of mouth model is also consistent with other important financial stylized facts. This strengthens the possibility that information transmission among investors plays a crucial role in price formation. Concretely, this paper addresses two famous statistical features of returns: the leptokurtic distribution of returns and the autocorrelation of return volatility. These facts receive special attention among the financial stylized facts because of their statistical robustness and their practical importance, for example in derivative pricing problems.
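
    The two stylized facts can be measured with a few lines of code once a return series is available; the sketch below generates returns from a generic GARCH(1,1)-type volatility recursion (an assumption standing in for the word of mouth model's output) and reports excess kurtosis and the autocorrelation of absolute returns.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(14)

# Synthetic daily returns from a GARCH(1,1)-like volatility process.
n, omega, alpha, beta = 4000, 1e-5, 0.08, 0.90
r = np.zeros(n)
sigma2 = omega / (1 - alpha - beta)
for t in range(1, n):
    sigma2 = omega + alpha * r[t - 1] ** 2 + beta * sigma2
    r[t] = np.sqrt(sigma2) * rng.normal()

def autocorr(x, lag):
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Fact 1: leptokurtic return distribution -> positive excess kurtosis.
print("excess kurtosis of returns:", round(float(stats.kurtosis(r)), 2))
# Fact 2: volatility clustering -> slowly decaying autocorrelation of |r|,
# while raw returns are close to uncorrelated.
for lag in (1, 5, 20):
    print(f"lag {lag:2d}: acf(r) = {autocorr(r, lag):+.3f}   acf(|r|) = {autocorr(np.abs(r), lag):+.3f}")
```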

  11. Statistical thermodynamics of a two-dimensional relativistic gas.

    PubMed

    Montakhab, Afshin; Ghodrat, Malihe; Barati, Mahmood

    2009-03-01

    In this paper we study a fully relativistic model of a two-dimensional hard-disk gas. This model avoids the general problems associated with relativistic particle collisions and is therefore an ideal system for studying relativistic effects in statistical thermodynamics. We study this model using molecular-dynamics simulation, concentrating on the velocity distribution functions. We obtain results for the x and y components of velocity in the rest frame (Gamma) as well as the moving frame (Gamma'). Our results confirm that the Jüttner distribution is the correct generalization of the Maxwell-Boltzmann distribution. We obtain the same "temperature" parameter beta for both frames, consistent with a recent study of a limited one-dimensional model. We also address the controversial topic of temperature transformation. We show that while local thermal equilibrium holds in the moving frame, statistical methods such as distribution functions or the equipartition theorem are ultimately inconclusive in deciding on a correct temperature transformation law (if any).

  12. Validation of a Statistical Methodology for Extracting Vegetation Feedbacks: Focus on North African Ecosystems in the Community Earth System Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Yan; Notaro, Michael; Wang, Fuyao

    Generalized equilibrium feedback assessment (GEFA) is a potentially valuable multivariate statistical tool for extracting vegetation feedbacks to the atmosphere in either observations or coupled Earth system models. The reliability of GEFA at capturing the terrestrial impacts on regional climate is demonstrated in this paper using the National Center for Atmospheric Research Community Earth System Model (CESM), with focus on North Africa. The feedback is assessed statistically by applying GEFA to output from a fully coupled control run. To reduce the sampling error caused by short data records, the traditional or full GEFA is refined through stepwise GEFA by dropping unimportant forcings. Two ensembles of dynamical experiments are developed for the Sahel or West African monsoon region against which GEFA-based vegetation feedbacks are evaluated. In these dynamical experiments, regional leaf area index (LAI) is modified either alone or in conjunction with soil moisture, with the latter runs motivated by strong regional soil moisture–LAI coupling. Stepwise GEFA boasts higher consistency between statistically and dynamically assessed atmospheric responses to land surface anomalies than full GEFA, especially with short data records. GEFA-based atmospheric responses are more consistent with the coupled soil moisture–LAI experiments, indicating that GEFA is assessing the combined impacts of coupled vegetation and soil moisture. Finally, both the statistical and dynamical assessments reveal a negative vegetation–rainfall feedback in the Sahel associated with an atmospheric stability mechanism in CESM versus a weaker positive feedback in the West African monsoon region associated with a moisture recycling mechanism in CESM.

  13. Validation of a Statistical Methodology for Extracting Vegetation Feedbacks: Focus on North African Ecosystems in the Community Earth System Model

    DOE PAGES

    Yu, Yan; Notaro, Michael; Wang, Fuyao; ...

    2018-02-05

    Generalized equilibrium feedback assessment (GEFA) is a potentially valuable multivariate statistical tool for extracting vegetation feedbacks to the atmosphere in either observations or coupled Earth system models. The reliability of GEFA at capturing the terrestrial impacts on regional climate is demonstrated in this paper using the National Center for Atmospheric Research Community Earth System Model (CESM), with focus on North Africa. The feedback is assessed statistically by applying GEFA to output from a fully coupled control run. To reduce the sampling error caused by short data records, the traditional or full GEFA is refined through stepwise GEFA by dropping unimportant forcings. Two ensembles of dynamical experiments are developed for the Sahel or West African monsoon region against which GEFA-based vegetation feedbacks are evaluated. In these dynamical experiments, regional leaf area index (LAI) is modified either alone or in conjunction with soil moisture, with the latter runs motivated by strong regional soil moisture–LAI coupling. Stepwise GEFA boasts higher consistency between statistically and dynamically assessed atmospheric responses to land surface anomalies than full GEFA, especially with short data records. GEFA-based atmospheric responses are more consistent with the coupled soil moisture–LAI experiments, indicating that GEFA is assessing the combined impacts of coupled vegetation and soil moisture. Finally, both the statistical and dynamical assessments reveal a negative vegetation–rainfall feedback in the Sahel associated with an atmospheric stability mechanism in CESM versus a weaker positive feedback in the West African monsoon region associated with a moisture recycling mechanism in CESM.

  14. Radiation detection method and system using the sequential probability ratio test

    DOEpatents

    Nelson, Karl E [Livermore, CA]; Valentine, John D [Redwood City, CA]; Beauchamp, Brock R [San Ramon, CA]

    2007-07-17

    A method and system using the Sequential Probability Ratio Test to enhance the detection of an elevated level of radiation, by determining whether a set of observations is consistent with a specified model within given bounds of statistical significance. In particular, the SPRT is used in the present invention to maximize the range of detection, by providing processing mechanisms for estimating the dynamic background radiation, adjusting the models to reflect the amount of background knowledge at the current point in time, analyzing the current sample using the models to determine statistical significance, and determining when the sample has returned to the expected background conditions.
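
    As a concrete illustration of the general SPRT idea (not the patented method; the Poisson rates, error targets, and function name below are assumptions made for this sketch), a Wald-style test for an elevated count rate against a known background could look like this:

    ```python
    import math

    def sprt_radiation(counts, bkg_rate, elevated_rate, alpha=0.01, beta=0.01):
        """Sequential probability ratio test for Poisson count data.

        counts        : iterable of detector counts per time bin
        bkg_rate      : expected background counts per bin (H0)
        elevated_rate : expected counts per bin under an elevated source (H1)
        alpha, beta   : target false-alarm and missed-detection probabilities
        """
        upper = math.log((1 - beta) / alpha)   # decide "elevated" when exceeded
        lower = math.log(beta / (1 - alpha))   # decide "background" when undershot
        llr = 0.0
        for n, c in enumerate(counts, start=1):
            # Poisson log-likelihood ratio contribution of this observation
            llr += c * math.log(elevated_rate / bkg_rate) - (elevated_rate - bkg_rate)
            if llr >= upper:
                return "elevated", n
            if llr <= lower:
                return "background", n
        return "undecided", len(counts)

    # Example: background of 5 counts per bin, test for a source doubling the rate
    print(sprt_radiation([6, 9, 11, 12, 14], bkg_rate=5.0, elevated_rate=10.0))
    ```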

  15. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors. The performance measures include error analysis, the coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics, however, only provide information about overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components, and the reconstruction back to time series, provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the tool
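
    One of the goodness-of-fit measures listed above is the Nash-Sutcliffe efficiency. A minimal sketch of its computation (variable names are ours, not MPESA's interface):

    ```python
    import numpy as np

    def nash_sutcliffe(observed, simulated):
        """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model is
        no better than predicting the observed mean, negative means worse."""
        observed = np.asarray(observed, dtype=float)
        simulated = np.asarray(simulated, dtype=float)
        return 1.0 - np.sum((observed - simulated) ** 2) / np.sum(
            (observed - observed.mean()) ** 2
        )

    # Example with a short streamflow-like series
    obs = [3.2, 4.1, 5.0, 4.4, 3.8]
    sim = [3.0, 4.3, 4.8, 4.6, 3.9]
    print(round(nash_sutcliffe(obs, sim), 3))
    ```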

  16. Statistical Model of Dynamic Markers of the Alzheimer's Pathological Cascade.

    PubMed

    Balsis, Steve; Geraci, Lisa; Benge, Jared; Lowe, Deborah A; Choudhury, Tabina K; Tirso, Robert; Doody, Rachelle S

    2018-05-05

    Alzheimer's disease (AD) is a progressive disease reflected in markers across assessment modalities, including neuroimaging, cognitive testing, and evaluation of adaptive function. Identifying a single continuum of decline across assessment modalities in a single sample is statistically challenging because of the multivariate nature of the data. To address this challenge, we implemented advanced statistical analyses designed specifically to model complex data across a single continuum. We analyzed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 1,056), focusing on indicators from the assessments of magnetic resonance imaging (MRI) volume, fluorodeoxyglucose positron emission tomography (FDG-PET) metabolic activity, cognitive performance, and adaptive function. Item response theory was used to identify the continuum of decline. Then, through a process of statistical scaling, indicators across all modalities were linked to that continuum and analyzed. Findings revealed that measures of MRI volume, FDG-PET metabolic activity, and adaptive function added measurement precision beyond that provided by cognitive measures, particularly in the relatively mild range of disease severity. More specifically, MRI volume and FDG-PET metabolic activity become compromised in the very mild range of severity, followed by cognitive performance and finally adaptive function. Our statistically derived models of the AD pathological cascade are consistent with existing theoretical models.
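
    The paper does not state which item response model was fitted, so purely as an illustration of the approach, a two-parameter logistic item characteristic curve for a dichotomized indicator i on the latent severity continuum θ has the form

    ```latex
    P(X_i = 1 \mid \theta) = \frac{1}{1 + \exp\!\left[-a_i\,(\theta - b_i)\right]},
    ```

    where a_i is the indicator's discrimination and b_i the severity at which the indicator becomes informative; placing the b_i of imaging, cognitive, and functional indicators on the same θ scale is what allows their ordering along the cascade to be compared.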

  17. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media

    PubMed Central

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-01-01

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties in its predictability. But a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement of hydrofracking, fast recovery decay, and ultra-low permeability—inherent to their nanoporosity—are distinctive features of these reservoirs, which challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short-time decay consistent with a Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water by CO2 or propane eliminates the barriers, therefore raising hopes for clean/efficient recovery. PMID:27327254

  18. High Variability in Cellular Stoichiometry of Carbon, Nitrogen, and Phosphorus Within Classes of Marine Eukaryotic Phytoplankton Under Sufficient Nutrient Conditions.

    PubMed

    Garcia, Nathan S; Sexton, Julie; Riggins, Tracey; Brown, Jeff; Lomas, Michael W; Martiny, Adam C

    2018-01-01

    Current hypotheses suggest that cellular elemental stoichiometry of marine eukaryotic phytoplankton such as the ratios of cellular carbon:nitrogen:phosphorus (C:N:P) vary between phylogenetic groups. To investigate how phylogenetic structure, cell volume, growth rate, and temperature interact to affect the cellular elemental stoichiometry of marine eukaryotic phytoplankton, we examined the C:N:P composition in 30 isolates across 7 classes of marine phytoplankton that were grown with a sufficient supply of nutrients and nitrate as the nitrogen source. The isolates covered a wide range in cell volume (5 orders of magnitude), growth rate (<0.01-0.9 d⁻¹), and habitat temperature (2-24°C). Our analysis indicates that C:N:P is highly variable, with statistical model residuals accounting for over half of the total variance and no relationship between phylogeny and elemental stoichiometry. Furthermore, our data indicated that variability in C:P, N:P, and C:N within Bacillariophyceae (diatoms) was as high as that among all of the isolates that we examined. In addition, a linear statistical model identified a positive relationship between diatom cell volume and C:P and N:P. Among all of the isolates that we examined, the statistical model identified temperature as a significant factor, consistent with the temperature-dependent translation efficiency model, but temperature only explained 5% of the total statistical model variance. While some of our results support data from previous field studies, the high variability of elemental ratios within Bacillariophyceae contradicts previous work that suggests that this cosmopolitan group of microalgae has consistently low C:P and N:P ratios in comparison with other groups.

  19. Heavy residues from very mass asymmetric heavy ion reactions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hanold, Karl Alan

    1994-08-01

    The isotopic production cross sections and momenta of all residues with nuclear charge (Z) greater than 39 from the reaction of 26, 40, and 50 MeV/nucleon ¹²⁹Xe + Be, C, and Al were measured. The isotopic cross sections, the momentum distribution for each isotope, and the cross section as a function of nuclear charge and momentum are presented here. The new cross sections are consistent with previous measurements of the cross sections from similar reaction systems. The shape of the cross section distribution, when considered as a function of Z and velocity, was found to be qualitatively consistent with that expected from an incomplete fusion reaction mechanism. An incomplete fusion model coupled to a statistical decay model is able to reproduce many features of these reactions: the shapes of the elemental cross section distributions, the emission velocity distributions for the intermediate mass fragments, and the Z versus velocity distributions. This model gives a less satisfactory prediction of the momentum distribution for each isotope. A very different model, based on the Boltzmann-Nordheim-Vlasov equation and also coupled to a statistical decay model, reproduces many features of these reactions: the shapes of the elemental cross section distributions, the intermediate mass fragment emission velocity distributions, and the Z versus momentum distributions. Both model calculations overestimate the average mass for each element by two mass units and underestimate the isotopic and isobaric widths of the experimental distributions. It is shown that the predicted average mass for each element can be brought into agreement with the data by small, but systematic, variation of the particle emission barriers used in the statistical model. The predicted isotopic and isobaric widths of the cross section distributions cannot be brought into agreement with the experimental data using reasonable parameters for the statistical model.

  20. The Development of Statistics Textbook Supported with ICT and Portfolio-Based Assessment

    NASA Astrophysics Data System (ADS)

    Hendikawati, Putriaji; Yuni Arini, Florentina

    2016-02-01

    This research was development research that aimed to develop and produce a Statistics textbook model supported with information and communication technology (ICT) and Portfolio-Based Assessment. The book was designed for mathematics students at the college level to improve their ability in mathematical connection and communication. There were three stages in this research, i.e. define, design, and develop. The textbook consists of 10 chapters, each containing an introduction, core material, examples, and exercises. The development phase began with the initial design of the book (draft 1), which was then validated by experts. Revision of draft 1 produced draft 2, which underwent a limited readability test. Revision of draft 2 then produced draft 3, which was trialled on a small sample to produce a valid model textbook. The data were analysed with descriptive statistics. The analysis showed that the Statistics textbook model supported with ICT and Portfolio-Based Assessment is valid and fulfils the criteria of practicality.

  1. A Data Analytical Framework for Improving Real-Time, Decision Support Systems in Healthcare

    ERIC Educational Resources Information Center

    Yahav, Inbal

    2010-01-01

    In this dissertation we develop a framework that combines data mining, statistics and operations research methods for improving real-time decision support systems in healthcare. Our approach consists of three main concepts: data gathering and preprocessing, modeling, and deployment. We introduce the notion of offline and semi-offline modeling to…

  2. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition.

    PubMed

    Falgreen, Steffen; Laursen, Maria Bach; Bødker, Julie Støve; Kjeldsen, Malene Krag; Schmitz, Alexander; Nyegaard, Mette; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin

    2014-06-05

    In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves' dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Time independent summary statistics may aid the understanding of drugs' action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies.
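
    Step 2 of the suggested workflow fits a monotone dose-response curve by isotonic regression. A minimal sketch using scikit-learn on synthetic viability data (the data, and the trapezoidal area used as a summary, are illustrative assumptions, not the paper's exact statistics):

    ```python
    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    # Synthetic viability fractions measured at increasing log10 doses
    log_dose = np.array([-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0])
    viability = np.array([1.02, 0.97, 0.90, 0.70, 0.45, 0.20, 0.15])

    # Viability can only decrease with dose, so fit a non-increasing curve
    iso = IsotonicRegression(increasing=False, y_min=0.0, y_max=1.0)
    fitted = iso.fit_transform(log_dose, viability)

    # One possible time-free summary: area under the monotone curve
    auc = np.trapz(fitted, log_dose)
    print(fitted.round(2), round(float(auc), 3))
    ```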

  3. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition

    PubMed Central

    2014-01-01

    Background In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves’ dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. Results First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Conclusion Time independent summary statistics may aid the understanding of drugs’ action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies. PMID:24902483

  4. Bird-landscape relations in the Chihuahuan Desert: Coping with uncertainties about predictive models

    USGS Publications Warehouse

    Gutzwiller, K.J.; Barrow, W.C.

    2001-01-01

    During the springs of 1995-1997, we studied birds and landscapes in the Chihuahuan Desert along part of the Texas-Mexico border. Our objectives were to assess bird-landscape relations and their interannual consistency and to identify ways to cope with associated uncertainties that undermine confidence in using such relations in conservation decision processes. Bird distributions were often significantly associated with landscape features, and many bird-landscape models were valid and useful for predictive purposes. Differences in early spring rainfall appeared to influence bird abundance, but there was no evidence that annual differences in bird abundance affected model consistency. Model consistency for richness (42%) was higher than mean model consistency for 26 focal species (mean 30%, range 0-67%), suggesting that relations involving individual species are, on average, more subject to factors that cause variation than are richness-landscape relations. Consistency of bird-landscape relations may be influenced by such factors as plant succession, exotic species invasion, bird species' tolerances for environmental variation, habitat occupancy patterns, and variation in food density or weather. The low model consistency that we observed for most species indicates the high variation in bird-landscape relations that managers and other decision makers may encounter. The uncertainty of interannual variation in bird-landscape relations can be reduced by using projections of bird distributions from different annual models to determine the likely range of temporal and spatial variation in a species' distribution. Stochastic simulation models can be used to incorporate the uncertainty of random environmental variation into predictions of bird distributions based on bird-landscape relations and to provide probabilistic projections with which managers can weigh the costs and benefits of various decisions. Uncertainty about the true structure of bird-landscape relations (structural uncertainty) can be reduced by ensuring that models meet important statistical assumptions, designing studies with sufficient statistical power, validating the predictive ability of models, and improving model accuracy through continued field sampling and model fitting. Uncertainty associated with sampling variation (partial observability) can be reduced by ensuring that sample sizes are large enough to provide precise estimates of both bird and landscape parameters. By decreasing the uncertainty due to partial observability, managers will improve their ability to reduce structural uncertainty.

  5. Implication of correlations among some common stability statistics - a Monte Carlo simulations.

    PubMed

    Piepho, H P

    1995-03-01

    Stability analysis of multilocation trials is often based on a mixed two-way model. Two stability measures in frequent use are the environmental variance (S_i^2) and the ecovalence (W_i). Under the two-way model the rank orders of the expected values of these two statistics are identical for a given set of genotypes. By contrast, empirical rank correlations among these measures are consistently low. This suggests that the two-way mixed model may not be appropriate for describing real data. To check this hypothesis, a Monte Carlo simulation was conducted. It revealed that the low empirical rank correlation among S_i^2 and W_i is most likely due to sampling errors. It is concluded that the observed low rank correlation does not invalidate the two-way model. The paper also discusses tests for homogeneity of S_i^2 as well as implications of the two-way model for the classification of stability statistics.
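
    For reference, the two stability statistics are straightforward to compute from a genotype-by-environment table of yields; a short sketch on synthetic data (not the paper's simulation design):

    ```python
    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    # Synthetic yields: rows = genotypes, columns = environments
    y = rng.normal(loc=5.0, scale=1.0, size=(8, 6))

    # Environmental variance S_i^2: variance of genotype i across environments
    s2 = y.var(axis=1, ddof=1)

    # Ecovalence W_i: genotype i's contribution to the GxE interaction sum of squares
    g_mean = y.mean(axis=1, keepdims=True)   # genotype means
    e_mean = y.mean(axis=0, keepdims=True)   # environment means
    grand = y.mean()
    w = ((y - g_mean - e_mean + grand) ** 2).sum(axis=1)

    # Empirical rank correlation between the two stability measures
    rho, p = spearmanr(s2, w)
    print(round(rho, 2))
    ```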

  6. Prediction of crime occurrence from multi-modal data using deep learning

    PubMed Central

    Kang, Hyeon-Woo

    2017-01-01

    In recent years, various studies have been conducted on the prediction of crime occurrences. This predictive capability is intended to assist in crime prevention by facilitating effective implementation of police patrols. Previous studies have used data from multiple domains such as demographics, economics, and education. Their prediction models treat data from different domains equally. These methods have problems in crime occurrence prediction, such as difficulty in discovering highly nonlinear relationships, redundancies, and dependencies between multiple datasets. In order to enhance crime prediction models, we consider environmental context information, such as broken windows theory and crime prevention through environmental design. In this paper, we propose a feature-level data fusion method with environmental context based on a deep neural network (DNN). Our dataset consists of data collected from various online databases of crime statistics, demographic and meteorological data, and images in Chicago, Illinois. Prior to generating training data, we select crime-related data by conducting statistical analyses. Finally, we train our DNN, which consists of the following four kinds of layers: spatial, temporal, environmental context, and joint feature representation layers. Coupled with crucial data extracted from various domains, our fusion DNN is a product of an efficient decision-making process that statistically analyzes data redundancy. Experimental performance results show that our DNN model is more accurate in predicting crime occurrence than other prediction models. PMID:28437486
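
    A minimal sketch of the feature-level fusion idea in PyTorch, with one branch per data domain joined in a shared representation (layer sizes, names, and the single-logit output are illustrative assumptions, not the authors' architecture):

    ```python
    import torch
    import torch.nn as nn

    class FusionDNN(nn.Module):
        """Feature-level fusion: one branch per data domain, then a joint layer."""
        def __init__(self, n_spatial, n_temporal, n_context, hidden=32):
            super().__init__()
            self.spatial = nn.Sequential(nn.Linear(n_spatial, hidden), nn.ReLU())
            self.temporal = nn.Sequential(nn.Linear(n_temporal, hidden), nn.ReLU())
            self.context = nn.Sequential(nn.Linear(n_context, hidden), nn.ReLU())
            self.joint = nn.Sequential(
                nn.Linear(3 * hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),          # logit for crime / no crime
            )

        def forward(self, x_spatial, x_temporal, x_context):
            feats = torch.cat(
                [self.spatial(x_spatial), self.temporal(x_temporal), self.context(x_context)],
                dim=1,
            )
            return self.joint(feats)

    model = FusionDNN(n_spatial=10, n_temporal=4, n_context=6)
    logit = model(torch.randn(2, 10), torch.randn(2, 4), torch.randn(2, 6))
    print(logit.shape)  # torch.Size([2, 1])
    ```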

  7. Prediction of crime occurrence from multi-modal data using deep learning.

    PubMed

    Kang, Hyeon-Woo; Kang, Hang-Bong

    2017-01-01

    In recent years, various studies have been conducted on the prediction of crime occurrences. This predictive capability is intended to assist in crime prevention by facilitating effective implementation of police patrols. Previous studies have used data from multiple domains such as demographics, economics, and education. Their prediction models treat data from different domains equally. These methods have problems in crime occurrence prediction, such as difficulty in discovering highly nonlinear relationships, redundancies, and dependencies between multiple datasets. In order to enhance crime prediction models, we consider environmental context information, such as broken windows theory and crime prevention through environmental design. In this paper, we propose a feature-level data fusion method with environmental context based on a deep neural network (DNN). Our dataset consists of data collected from various online databases of crime statistics, demographic and meteorological data, and images in Chicago, Illinois. Prior to generating training data, we select crime-related data by conducting statistical analyses. Finally, we train our DNN, which consists of the following four kinds of layers: spatial, temporal, environmental context, and joint feature representation layers. Coupled with crucial data extracted from various domains, our fusion DNN is a product of an efficient decision-making process that statistically analyzes data redundancy. Experimental performance results show that our DNN model is more accurate in predicting crime occurrence than other prediction models.

  8. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method

    PubMed Central

    Roux, Benoît; Weare, Jonathan

    2013-01-01

    An issue of general interest in computer simulations is to incorporate information from experiments into a structural model. An important caveat in pursuing this goal is to avoid corrupting the resulting model with spurious and arbitrary biases. While the problem of biasing thermodynamic ensembles can be formulated rigorously using the maximum entropy method introduced by Jaynes, the approach can be cumbersome in practical applications with the need to determine multiple unknown coefficients iteratively. A popular alternative strategy to incorporate the information from experiments is to rely on restrained-ensemble molecular dynamics simulations. However, the fundamental validity of this computational strategy remains in question. Here, it is demonstrated that the statistical distribution produced by restrained-ensemble simulations is formally consistent with the maximum entropy method of Jaynes. This clarifies the underlying conditions under which restrained-ensemble simulations will yield results that are consistent with the maximum entropy method. PMID:23464140

  9. Assessment of hi-resolution multi-ensemble statistical downscaling regional climate scenarios over Japan

    NASA Astrophysics Data System (ADS)

    Dairaku, K.

    2017-12-01

    The Asia-Pacific regions are increasingly threatened by large-scale natural disasters. There are growing concerns that losses and damages from natural disasters will be further exacerbated by climate and socio-economic change. Climate information and services for risk assessments are therefore of great concern. Fundamental regional climate information is indispensable for understanding a changing climate and making decisions on when and how to act. To meet the needs of stakeholders such as national/local governments, spatio-temporally comprehensive and consistent information is necessary and useful for decision making. Multi-model ensemble regional climate scenarios with 1 km horizontal grid spacing over Japan are developed using 37 CMIP5 GCMs (RCP8.5) and a statistical downscaling method (Bias Corrected Spatial Disaggregation, BCSD) to investigate the uncertainty of projected change associated with structural differences of the GCMs for the historical (1950-2005) and near-future (2026-2050) climate periods. The statistically downscaled regional climate scenarios show good performance for annual and seasonal averages of precipitation and temperature. The regional climate scenarios show systematic underestimation of extreme events, such as hot days over 35°C and annual maximum daily precipitation, because of the interpolation processes in the BCSD method. The models project different responses in the near-future climate because of structural differences, although most of the 37 CMIP5 models show a qualitatively consistent increase of average and extreme temperature and precipitation. The added value of statistical/dynamical downscaling methods is also investigated for locally forced nonlinear phenomena and extreme events.
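
    The bias-correction step of BCSD maps model quantiles onto observed quantiles before spatial disaggregation. A minimal empirical quantile-mapping sketch on synthetic data (not the operational BCSD implementation):

    ```python
    import numpy as np

    def quantile_map(model_hist, obs_hist, model_future):
        """Empirical quantile mapping: replace each future model value with the
        observed value at the same quantile of the historical model distribution."""
        quantiles = np.linspace(0.0, 1.0, 101)
        model_q = np.quantile(model_hist, quantiles)
        obs_q = np.quantile(obs_hist, quantiles)
        # Position of each future value within the historical model CDF
        ranks = np.interp(model_future, model_q, quantiles)
        # Map those quantiles onto the observed distribution
        return np.interp(ranks, quantiles, obs_q)

    rng = np.random.default_rng(1)
    obs = rng.gamma(2.0, 3.0, 1000)          # "observed" daily precipitation
    gcm_hist = rng.gamma(2.0, 4.0, 1000)     # biased model, historical period
    gcm_future = rng.gamma(2.2, 4.0, 1000)   # model, future period
    print(quantile_map(gcm_hist, obs, gcm_future)[:5].round(2))
    ```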

  10. Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies.

    PubMed

    Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre

    2018-03-15

    Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. We propose a methodology based on Cox mixed models and written under the R language. This semiparametric model is indeed flexible enough to fit duration data. To compare log-linear and Cox mixed models in terms of goodness-of-fit on real data sets, we also provide a procedure based on simulations and quantile-quantile plots. We present two examples from a data set of speech and gesture interactions, which illustrate the limitations of linear and log-linear mixed models, as compared to Cox models. The linear models are not validated on our data, whereas Cox models are. Moreover, in the second example, the Cox model exhibits a significant effect that the linear model does not. We provide methods to select the best-fitting models for repeated duration data and to compare statistical methodologies. In this study, we show that Cox models are best suited to the analysis of our data set.
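
    The paper fits Cox mixed models in R; as a simplified fixed-effects illustration of the Cox approach to duration data, here is a sketch using the Python lifelines package (the data and column names are invented):

    ```python
    import pandas as pd
    from lifelines import CoxPHFitter

    # Toy repeated-duration data: gesture durations (s) with a binary condition.
    # A real analysis would add a random effect (frailty) per speaker, as in the
    # paper's Cox mixed models; this sketch only fits fixed effects.
    df = pd.DataFrame({
        "duration": [0.42, 0.55, 0.31, 0.80, 0.47, 0.66, 0.29, 0.74],
        "observed": [1, 1, 1, 1, 1, 1, 1, 1],   # no censoring in this toy set
        "condition": [0, 0, 0, 0, 1, 1, 1, 1],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="duration", event_col="observed")
    cph.print_summary()
    ```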

  11. Planck 2015 results. XVII. Constraints on primordial non-Gaussianity

    NASA Astrophysics Data System (ADS)

    Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Arroja, F.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Ballardini, M.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Basak, S.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Church, S.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Couchot, F.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Fergusson, J.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Galeotta, S.; Galli, S.; Ganga, K.; Gauthier, C.; Ghosh, T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hamann, J.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Heavens, A.; Helou, G.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huang, Z.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kim, J.; Kisner, T. S.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lacasa, F.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Lewis, A.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Marinucci, D.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; McGehee, P.; Meinhold, P. R.; Melchiorri, A.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mitra, S.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Münchmeyer, M.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Patanchon, G.; Peiris, H. V.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Popa, L.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Racine, B.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Shiraishi, M.; Smith, K.; Spencer, L. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutter, P.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Troja, A.; Tucci, M.; Tuovinen, J.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zonca, A.

    2016-09-01

    The Planck full mission cosmic microwave background (CMB) temperature and E-mode polarization maps are analysed to obtain constraints on primordial non-Gaussianity (NG). Using three classes of optimal bispectrum estimators - separable template-fitting (KSW), binned, and modal - we obtain consistent values for the primordial local, equilateral, and orthogonal bispectrum amplitudes, quoting as our final result from temperature alone ƒ_NL^local = 2.5 ± 5.7, ƒ_NL^equil = -16 ± 70, and ƒ_NL^ortho = -34 ± 32 (68% CL, statistical). Combining temperature and polarization data we obtain ƒ_NL^local = 0.8 ± 5.0, ƒ_NL^equil = -4 ± 43, and ƒ_NL^ortho = -26 ± 21 (68% CL, statistical). The results are based on comprehensive cross-validation of these estimators on Gaussian and non-Gaussian simulations, are stable across component separation techniques, pass an extensive suite of tests, and are consistent with estimators based on measuring the Minkowski functionals of the CMB. The effect of time-domain de-glitching systematics on the bispectrum is negligible. In spite of these test outcomes we conservatively label the results including polarization data as preliminary, owing to a known mismatch of the noise model in simulations and the data. Beyond estimates of individual shape amplitudes, we present model-independent, three-dimensional reconstructions of the Planck CMB bispectrum and derive constraints on early universe scenarios that generate primordial NG, including general single-field models of inflation, axion inflation, initial state modifications, models producing parity-violating tensor bispectra, and directionally dependent vector models. We present a wide survey of scale-dependent feature and resonance models, accounting for the "look elsewhere" effect in estimating the statistical significance of features. We also look for isocurvature NG, and find no signal, but we obtain constraints that improve significantly with the inclusion of polarization. The primordial trispectrum amplitude in the local model is constrained to be g_NL^local = (-0.9 ± 7.7) × 10^4 (68% CL, statistical), and we perform an analysis of trispectrum shapes beyond the local case. The global picture that emerges is one of consistency with the premises of the ΛCDM cosmology, namely that the structure we observe today was sourced by adiabatic, passive, Gaussian, and primordial seed perturbations.

  12. Planck 2015 results. XVII. Constraints on primordial non-Gaussianity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ade, P. A. R.; Aghanim, N.; Arnaud, M.

    We report that the Planck full mission cosmic microwave background (CMB) temperature and E-mode polarization maps are analysed to obtain constraints on primordial non-Gaussianity (NG). Using three classes of optimal bispectrum estimators – separable template-fitting (KSW), binned, and modal – we obtain consistent values for the primordial local, equilateral, and orthogonal bispectrum amplitudes, quoting as our final result from temperature alone ƒ_NL^local = 2.5 ± 5.7, ƒ_NL^equil = -16 ± 70, and ƒ_NL^ortho = -34 ± 32 (68% CL, statistical). Combining temperature and polarization data we obtain ƒ_NL^local = 0.8 ± 5.0, ƒ_NL^equil = -4 ± 43, and ƒ_NL^ortho = -26 ± 21 (68% CL, statistical). The results are based on comprehensive cross-validation of these estimators on Gaussian and non-Gaussian simulations, are stable across component separation techniques, pass an extensive suite of tests, and are consistent with estimators based on measuring the Minkowski functionals of the CMB. The effect of time-domain de-glitching systematics on the bispectrum is negligible. In spite of these test outcomes we conservatively label the results including polarization data as preliminary, owing to a known mismatch of the noise model in simulations and the data. Beyond estimates of individual shape amplitudes, we present model-independent, three-dimensional reconstructions of the Planck CMB bispectrum and derive constraints on early universe scenarios that generate primordial NG, including general single-field models of inflation, axion inflation, initial state modifications, models producing parity-violating tensor bispectra, and directionally dependent vector models. We present a wide survey of scale-dependent feature and resonance models, accounting for the “look elsewhere” effect in estimating the statistical significance of features. We also look for isocurvature NG, and find no signal, but we obtain constraints that improve significantly with the inclusion of polarization. The primordial trispectrum amplitude in the local model is constrained to be g_NL^local = (-0.9 ± 7.7) × 10^4 (68% CL, statistical), and we perform an analysis of trispectrum shapes beyond the local case. The global picture that emerges is one of consistency with the premises of the ΛCDM cosmology, namely that the structure we observe today was sourced by adiabatic, passive, Gaussian, and primordial seed perturbations.

  13. Planck 2015 results: XVII. Constraints on primordial non-Gaussianity

    DOE PAGES

    Ade, P. A. R.; Aghanim, N.; Arnaud, M.; ...

    2016-09-20

    We report that the Planck full mission cosmic microwave background (CMB) temperature and E-mode polarization maps are analysed to obtain constraints on primordial non-Gaussianity (NG). Using three classes of optimal bispectrum estimators – separable template-fitting (KSW), binned, and modal – we obtain consistent values for the primordial local, equilateral, and orthogonal bispectrum amplitudes, quoting as our final result from temperature alone ƒ_NL^local = 2.5 ± 5.7, ƒ_NL^equil = -16 ± 70, and ƒ_NL^ortho = -34 ± 32 (68% CL, statistical). Combining temperature and polarization data we obtain ƒ_NL^local = 0.8 ± 5.0, ƒ_NL^equil = -4 ± 43, and ƒ_NL^ortho = -26 ± 21 (68% CL, statistical). The results are based on comprehensive cross-validation of these estimators on Gaussian and non-Gaussian simulations, are stable across component separation techniques, pass an extensive suite of tests, and are consistent with estimators based on measuring the Minkowski functionals of the CMB. The effect of time-domain de-glitching systematics on the bispectrum is negligible. In spite of these test outcomes we conservatively label the results including polarization data as preliminary, owing to a known mismatch of the noise model in simulations and the data. Beyond estimates of individual shape amplitudes, we present model-independent, three-dimensional reconstructions of the Planck CMB bispectrum and derive constraints on early universe scenarios that generate primordial NG, including general single-field models of inflation, axion inflation, initial state modifications, models producing parity-violating tensor bispectra, and directionally dependent vector models. We present a wide survey of scale-dependent feature and resonance models, accounting for the “look elsewhere” effect in estimating the statistical significance of features. We also look for isocurvature NG, and find no signal, but we obtain constraints that improve significantly with the inclusion of polarization. The primordial trispectrum amplitude in the local model is constrained to be g_NL^local = (-0.9 ± 7.7) × 10^4 (68% CL, statistical), and we perform an analysis of trispectrum shapes beyond the local case. The global picture that emerges is one of consistency with the premises of the ΛCDM cosmology, namely that the structure we observe today was sourced by adiabatic, passive, Gaussian, and primordial seed perturbations.

  14. Micro-foundations for macroeconomics: New set-up based on statistical physics

    NASA Astrophysics Data System (ADS)

    Yoshikawa, Hiroshi

    2016-12-01

    Modern macroeconomics is built on "micro foundations": the optimization of micro agents such as consumers and firms is explicitly analyzed in the model. Toward this goal, the standard model presumes "the representative" consumer/firm and analyzes its behavior in detail. However, the macroeconomy consists of 10^7 consumers and 10^6 firms. For the purpose of analyzing such a macro system, it is meaningless to pursue micro behavior in detail. In this respect, there is no essential difference between economics and physics. The method of statistical physics can be usefully applied to the macroeconomy, and provides Keynesian economics with correct micro-foundations.

  15. Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

    PubMed Central

    Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

    2009-01-01

    Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660
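
    Schematically (the notation is ours and the exact parameterization is an assumption), a nested random-effects decomposition with the four variance sources named above can be written for the log-ratio x of scan s of peptide p from protein q as

    ```latex
    x_{qps} = \mu_q + \pi_{qp} + \delta_{qps} + \varepsilon_{qps},
    \qquad
    \pi_{qp} \sim N(0, \sigma^2_{\text{pep}}),\quad
    \delta_{qps} \sim N(0, \sigma^2_{\text{scan}}),\quad
    \varepsilon_{qps} \sim N(0, \sigma^2_{\text{fit}}),
    ```

    where μ_q is the protein-level expression change and the protein-level variance enters as the dispersion of the μ_q across proteins under the null hypothesis.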

  16. Statistical shear lag model - unraveling the size effect in hierarchical composites.

    PubMed

    Wei, Xiaoding; Filleter, Tobin; Espinosa, Horacio D

    2015-05-01

    Numerous experimental and computational studies have established that the hierarchical structures encountered in natural materials, such as the brick-and-mortar structure observed in sea shells, are essential for achieving defect tolerance. Due to this hierarchy, the mechanical properties of natural materials have a different size dependence compared to that of typical engineered materials. This study aimed to explore size effects on the strength of bio-inspired staggered hierarchical composites and to define the influence of the geometry of constituents in their outstanding defect tolerance capability. A statistical shear lag model is derived by extending the classical shear lag model to account for the statistics of the constituents' strength. A general solution emerges from rigorous mathematical derivations, unifying the various empirical formulations for the fundamental link length used in previous statistical models. The model shows that the staggered arrangement of constituents grants composites a unique size effect on mechanical strength in contrast to homogenous continuous materials. The model is applied to hierarchical yarns consisting of double-walled carbon nanotube bundles to assess its predictive capabilities for novel synthetic materials. Interestingly, the model predicts that yarn gauge length does not significantly influence the yarn strength, in close agreement with experimental observations. Copyright © 2015 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
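
    For context, and as a standard result rather than the paper's derivation: in a homogeneous brittle material whose strength follows weakest-link Weibull statistics with modulus m, the characteristic strength decreases with gauge length L as

    ```latex
    \sigma(L) = \sigma_0 \left( \frac{L_0}{L} \right)^{1/m},
    ```

    whereas the staggered, hierarchical arrangement analysed here yields a much weaker dependence on gauge length, consistent with the yarn observations reported above.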

  17. The invariant statistical rule of aerosol scattering pulse signal modulated by random noise

    NASA Astrophysics Data System (ADS)

    Yan, Zhen-gang; Bian, Bao-Min; Yang, Juan; Peng, Gang; Li, Zhen-hua

    2010-11-01

    A model of random background noise acting on particle signals is established to study how the background noise of the photoelectric sensor in a laser airborne particle counter affects the statistical character of aerosol scattering pulse signals. The results show that the noise broadens the statistical distribution of the particle measurements. Further numerical study shows that the output signal amplitude retains a lognormal distribution when airborne-particle signals with a lognormal distribution are modulated by random noise that is also lognormally distributed; that is, the distribution is statistically invariant. Based on this model, the photoelectric sensor's background noise and the counting distributions of the aerosol scattering pulse signals are measured and analyzed using a high-speed data acquisition card (PCI-9812). The experimental and simulation results are found to be in good agreement.
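
    The invariance has a simple probabilistic basis if the modulation is multiplicative: the product of independent lognormal variables is itself lognormal. A quick numerical check (the parameters are arbitrary, and multiplicative modulation is an assumption of this sketch):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    signal = rng.lognormal(mean=1.0, sigma=0.4, size=100_000)   # pulse amplitudes
    noise = rng.lognormal(mean=0.0, sigma=0.2, size=100_000)    # multiplicative noise
    modulated = signal * noise

    # The log of a lognormal variable is normal, so test normality of log(modulated)
    stat, p = stats.normaltest(np.log(modulated))
    print(f"normality test on log(signal*noise): p = {p:.3f}")  # large p: consistent
    ```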

  18. Linking statistically-and physically-based models for improved streamflow simulation in gaged and ungaged watersheds

    Treesearch

    Jacob LaFontaine; Lauren Hay; Stacey Archfield; William Farmer; Julie Kiang

    2016-01-01

    The U.S. Geological Survey (USGS) has developed a National Hydrologic Model (NHM) to support coordinated, comprehensive and consistent hydrologic model development, and facilitate the application of hydrologic simulations within the continental US. The portion of the NHM located within the Gulf Coastal Plains and Ozarks Landscape Conservation Cooperative (GCPO LCC) is...

  19. SGR-like behaviour of the repeating FRB 121102

    NASA Astrophysics Data System (ADS)

    Wang, F. Y.; Yu, H.

    2017-03-01

    Fast radio bursts (FRBs) are millisecond-duration radio signals occurring at cosmological distances. However, the physical mechanism of FRBs remains a mystery, and many models have been proposed. Here we study the frequency distributions of peak flux, fluence, duration and waiting time for the repeating FRB 121102. The cumulative distributions of peak flux, fluence and duration show power-law forms. The waiting-time distribution also shows a power-law form and is consistent with a non-stationary Poisson process. These distributions are similar to those of soft gamma repeaters (SGRs). We also use the statistical results to test the proposed models for FRBs. These distributions are consistent with the predictions of avalanche models of slowly driven nonlinear dissipative systems.

  20. Alpha 2 LASSO Data Bundles

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Kim, Jinwon; Krishna, Bhargavi

    2015-08-31

    The Alpha 2 release is the second release from the LASSO Pilot Phase that builds upon the Alpha 1 release. Alpha 2 contains additional diagnostics in the data bundles and focuses on cases from spring-summer 2016. A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES inputs include model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consist of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  1. Digital morphogenesis via Schelling segregation

    NASA Astrophysics Data System (ADS)

    Barmpalias, George; Elwes, Richard; Lewis-Pye, Andrew

    2018-04-01

    Schelling’s model of segregation looks to explain the way in which particles or agents of two types may come to arrange themselves spatially into configurations consisting of large homogeneous clusters, i.e. connected regions consisting of only one type. As one of the earliest agent based models studied by economists and perhaps the most famous model of self-organising behaviour, it also has direct links to areas at the interface between computer science and statistical mechanics, such as the Ising model and the study of contagion and cascading phenomena in networks. While the model has been extensively studied, it has largely resisted rigorous analysis, prior results from the literature generally pertaining to variants of the model which are tweaked so as to be amenable to standard techniques from statistical mechanics or stochastic evolutionary game theory. Brandt et al (2012 Proc. 44th Annual ACM Symp. on Theory of Computing) provided the first rigorous analysis of the unperturbed model, for a specific set of input parameters. Here we provide a rigorous analysis of the model’s behaviour much more generally and establish some surprising forms of threshold behaviour, notably the existence of situations where an increased level of intolerance for neighbouring agents of opposite type leads almost certainly to decreased segregation.
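
    A toy one-dimensional Schelling-style simulation conveys the basic dynamics; the ring geometry, tolerance threshold, and swap rule below are arbitrary choices for the sketch, whereas the paper analyses the model rigorously rather than by simulation:

    ```python
    import random

    def schelling_1d(n=200, radius=2, tolerance=0.5, steps=20000, seed=0):
        """Toy 1-D Schelling dynamics on a ring: two unhappy agents of opposite
        type swap places; repeat and report a crude segregation measure."""
        random.seed(seed)
        grid = [random.choice((0, 1)) for _ in range(n)]

        def unhappy(i):
            neigh = [grid[(i + d) % n] for d in range(-radius, radius + 1) if d != 0]
            same = sum(1 for t in neigh if t == grid[i])
            return same / len(neigh) < tolerance

        for _ in range(steps):
            i, j = random.randrange(n), random.randrange(n)
            if grid[i] != grid[j] and unhappy(i) and unhappy(j):
                grid[i], grid[j] = grid[j], grid[i]

        # Mean length of homogeneous runs as a crude measure of segregation
        runs, length = [], 1
        for a, b in zip(grid, grid[1:]):
            if a == b:
                length += 1
            else:
                runs.append(length)
                length = 1
        runs.append(length)
        return sum(runs) / len(runs)

    print(schelling_1d())
    ```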

  2. Consistent integration of experimental and ab initio data into molecular and coarse-grained models

    NASA Astrophysics Data System (ADS)

    Vlcek, Lukas

    As computer simulations are increasingly used to complement or replace experiments, highly accurate descriptions of physical systems at different time and length scales are required to achieve realistic predictions. The questions of how to objectively measure model quality in relation to reference experimental or ab initio data, and how to transition seamlessly between different levels of resolution are therefore of prime interest. To address these issues, we use the concept of statistical distance to define a measure of similarity between statistical mechanical systems, i.e., a model and its target, and show that its minimization leads to general convergence of the systems' measurable properties. Through systematic coarse-graining, we arrive at appropriate expressions for optimization loss functions consistently incorporating microscopic ab initio data as well as macroscopic experimental data. The design of coarse-grained and multiscale models is then based on factoring the model system partition function into terms describing the system at different resolution levels. The optimization algorithm takes advantage of thermodynamic perturbation expressions for fast exploration of the model parameter space, enabling us to scan millions of parameter combinations per hour on a single CPU. The robustness and generality of the new model optimization framework and its efficient implementation are illustrated on selected examples including aqueous solutions, magnetic systems, and metal alloys.

  3. Estimation and model selection of semiparametric multivariate survival functions under general censorship.

    PubMed

    Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

    2010-07-01

    We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.

  4. Estimation and model selection of semiparametric multivariate survival functions under general censorship

    PubMed Central

    Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

    2013-01-01

    We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286

  5. Consistent Partial Least Squares Path Modeling via Regularization

    PubMed Central

    Jung, Sunho; Park, JaeHong

    2018-01-01

    Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc as yet has no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present. PMID:29515491

  6. Testing the Predictive Power of Coulomb Stress on Aftershock Sequences

    NASA Astrophysics Data System (ADS)

    Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.

    2009-12-01

    Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.

  7. Data Model Performance in Data Warehousing

    NASA Astrophysics Data System (ADS)

    Rorimpandey, G. C.; Sangkop, F. I.; Rantung, V. P.; Zwart, J. P.; Liando, O. E. S.; Mewengkang, A.

    2018-02-01

    Data warehouses have increasingly become important in organizations that have large amounts of data. A data warehouse is not a product but part of a solution for the decision support system in those organizations. The data model is the starting point for designing and developing data warehouse architectures, so it needs stable and consistent interfaces over a long period of time. The aim of this research is to determine which data model in data warehousing has the best performance. The research method is descriptive analysis, which has three main tasks: data collection and organization, data analysis, and data interpretation. The results are examined with statistical analysis, which indicates that there is no statistically significant difference among the data models used in data warehousing. An organization can therefore utilize any of the four proposed data models when designing and developing a data warehouse.

  8. Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.

    PubMed

    Ruhi, S; Karim, M R

    2016-01-01

    Proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object such as a product, plant or infrastructure needs preventive and corrective maintenance. In this paper we look at a real case study dealing with the maintenance of hydraulic pumps used in excavators by a mining company. We obtained the data that the owner had collected and carried out an analysis, building models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competitive mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function and mean time to failure, are estimated to assess the reliability of the pump. The Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are used to select the most suitable models among a set of competitive models. The maximum likelihood estimation method via the EM algorithm is applied for estimating the parameters of the models and reliability-related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits the hydraulic pump failure data set well. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period of a hydraulic pump at minimum cost.
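
    The model-selection step can be sketched as follows: fit several candidate lifetime distributions by maximum likelihood and compare their AIC values. For brevity this sketch uses single distributions on hypothetical complete failure times, ignoring censoring and the threefold mixture structure used in the paper.

```python
import numpy as np
from scipy import stats

# Hypothetical complete (uncensored) failure times in operating hours
t = np.array([1200., 1500., 1800., 2100., 2600., 3000., 3400., 4100.])

candidates = {
    "weibull": stats.weibull_min,
    "lognormal": stats.lognorm,
    "exponential": stats.expon,
}

for name, dist in candidates.items():
    params = dist.fit(t, floc=0)                 # ML fit with location fixed at zero
    loglik = np.sum(dist.logpdf(t, *params))
    k = len(params) - 1                          # free parameters (location is fixed)
    aic = 2 * k - 2 * loglik                     # smaller AIC indicates a better trade-off
    print(f"{name:11s} AIC = {aic:.1f}")
```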

  9. Earthquake likelihood model testing

    USGS Publications Warehouse

    Schorlemmer, D.; Gerstenberger, M.C.; Wiemer, S.; Jackson, D.D.; Rhoades, D.A.

    2007-01-01

    INTRODUCTION: The Regional Earthquake Likelihood Models (RELM) project aims to produce and evaluate alternate models of earthquake potential (probability per unit volume, magnitude, and time) for California. Based on differing assumptions, these models are produced to test the validity of their assumptions and to explore which models should be incorporated in seismic hazard and risk evaluation. Tests based on physical and geological criteria are useful but we focus on statistical methods using future earthquake catalog data only. We envision two evaluations: a test of consistency with observed data and a comparison of all pairs of models for relative consistency. Both tests are based on the likelihood method, and both are fully prospective (i.e., the models are not adjusted to fit the test data). To be tested, each model must assign a probability to any possible event within a specified region of space, time, and magnitude. For our tests the models must use a common format: earthquake rates in specified “bins” with location, magnitude, time, and focal mechanism limits. Seismology cannot yet deterministically predict individual earthquakes; however, it should seek the best possible models for forecasting earthquake occurrence. This paper describes the statistical rules of an experiment to examine and test earthquake forecasts. The primary purposes of the tests described below are to evaluate physical models for earthquakes, assure that source models used in seismic hazard and risk studies are consistent with earthquake data, and provide quantitative measures by which models can be assigned weights in a consensus model or be judged as suitable for particular regions. In this paper we develop a statistical method for testing earthquake likelihood models. A companion paper (Schorlemmer and Gerstenberger 2007, this issue) discusses the actual implementation of these tests in the framework of the RELM initiative. Statistical testing of hypotheses is a common task and a wide range of possible testing procedures exist. Jolliffe and Stephenson (2003) present different forecast verifications from atmospheric science, among them likelihood testing of probability forecasts and testing the occurrence of binary events. Testing binary events requires that for each forecasted event, the spatial, temporal and magnitude limits be given. Although major earthquakes can be considered binary events, the models within the RELM project express their forecasts on a spatial grid and in 0.1 magnitude units; thus the results are a distribution of rates over space and magnitude. These forecasts can be tested with likelihood tests. In general, likelihood tests assume a valid null hypothesis against which a given hypothesis is tested. The outcome is either a rejection of the null hypothesis in favor of the test hypothesis or a nonrejection, meaning the test hypothesis cannot outperform the null hypothesis at a given significance level. Within RELM, there is no accepted null hypothesis and thus the likelihood test needs to be expanded to allow comparable testing of equipollent hypotheses. To test models against one another, we require that forecasts are expressed in a standard format: the average rate of earthquake occurrence within pre-specified limits of hypocentral latitude, longitude, depth, magnitude, time period, and focal mechanisms. Focal mechanisms should either be described as the inclination of P-axis, declination of P-axis, and inclination of the T-axis, or as strike, dip, and rake angles. 
Schorlemmer and Gerstenberger (2007, this issue) designed classes of these parameters such that similar models will be tested against each other. These classes make the forecasts comparable between models. Additionally, we are limited to testing only what is precisely defined and consistently reported in earthquake catalogs. Therefore it is currently not possible to test such information as fault rupture length or area, asperity location, etc. Also, to account for data quality issues, we allow for location and magnitude uncertainties as well as the probability that an event is dependent on another event. As we mentioned above, only models with comparable forecasts can be tested against each other. Our current tests are designed to examine grid-based models. This requires that any fault-based model be adapted to a grid before testing is possible. While this is a limitation of the testing, it is an inherent difficulty in any such comparative testing. Please refer to appendix B for a statistical evaluation of the application of the Poisson hypothesis to fault-based models. The testing suite we present consists of three different tests: L-Test, N-Test, and R-Test. These tests are defined similarly to Kagan and Jackson (1995). The first two tests examine the consistency of the hypotheses with the observations while the last test compares the spatial performances of the models.
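
    As an illustration of one consistency test of this kind, the sketch below implements a simple Poisson number test (N-Test-like) comparing the total forecast rate with the observed event count. The quantile definitions follow the common CSEP convention, but this is a schematic stand-in for the full RELM test suite, with invented rates and counts.

```python
import numpy as np
from scipy.stats import poisson

def n_test(forecast_rates, observed_counts):
    """Poisson number test: quantile scores for the total observed event count
    given the total forecast rate (schematic, CSEP-style convention)."""
    n_fore = float(np.sum(forecast_rates))
    n_obs = int(np.sum(observed_counts))
    delta1 = 1.0 - poisson.cdf(n_obs - 1, n_fore)   # probability of observing at least n_obs
    delta2 = poisson.cdf(n_obs, n_fore)             # probability of observing at most n_obs
    return delta1, delta2

rates = np.array([0.2, 0.5, 1.3, 0.8])   # hypothetical forecast rates per bin
counts = np.array([0, 1, 2, 0])          # hypothetical observed counts per bin
print(n_test(rates, counts))             # very small values in either tail flag inconsistency
```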

  10. The ratio of profile peak separations as a probe of pulsar radio-beam structure

    NASA Astrophysics Data System (ADS)

    Dyks, J.; Pierbattista, M.

    2015-12-01

    The known population of pulsars contains objects with four- and five-component profiles, for which the peak-to-peak separations between the inner and outer components can be measured. These Q- and M-type profiles can be interpreted as a result of sightline cut through a nested-cone beam, or through a set of azimuthal fan beams. We show that the ratio RW of the components' separations provides a useful measure of the beam shape, which is mostly independent of parameters that determine the beam scale and complicate interpretation of simpler profiles. In particular, the method does not depend on the emission altitude and the dipole tilt distribution. The different structures of the radio beam imply manifestly different statistical distributions of RW, with the conal model being several orders of magnitude less consistent with data than the fan-beam model. To bring the conal model into consistency with data, strong effects of observational selection need to be called for, with 80 per cent of Q and M profiles assumed to be undetected because of intrinsic blending effects. It is concluded that the statistical properties of Q and M profiles are more consistent with the fan-shaped beams, than with the traditional nested-cone geometry.

  11. Supernova Driving. IV. The Star-formation Rate of Molecular Clouds

    NASA Astrophysics Data System (ADS)

    Padoan, Paolo; Haugbølle, Troels; Nordlund, Åke; Frimann, Søren

    2017-05-01

    We compute the star-formation rate (SFR) in molecular clouds (MCs) that originate ab initio in a new, higher-resolution simulation of supernova-driven turbulence. Because of the large number of well-resolved clouds with self-consistent boundary and initial conditions, we obtain a large range of cloud physical parameters with realistic statistical distributions, which is an unprecedented sample of star-forming regions to test SFR models and to interpret observational surveys. We confirm the dependence of the SFR per free-fall time, SFRff, on the virial parameter, α vir, found in previous simulations, and compare a revised version of our turbulent fragmentation model with the numerical results. The dependences on Mach number, { M }, gas to magnetic pressure ratio, β, and compressive to solenoidal power ratio, χ at fixed α vir are not well constrained, because of random scatter due to time and cloud-to-cloud variations in SFRff. We find that SFRff in MCs can take any value in the range of 0 ≤ SFRff ≲ 0.2, and its probability distribution peaks at a value of SFRff ≈ 0.025, consistent with observations. The values of SFRff and the scatter in the SFRff-α vir relation are consistent with recent measurements in nearby MCs and in clouds near the Galactic center. Although not explicitly modeled by the theory, the scatter is consistent with the physical assumptions of our revised model and may also result in part from a lack of statistical equilibrium of the turbulence, due to the transient nature of MCs.

  12. Test of the statistical model in ⁹⁶Mo with the BaF₂ γ calorimeter DANCE array

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sheets, S. A.; Mitchell, G. E.; Agvaanluvsan, U.

    2009-02-15

    The γ-ray cascades following the ⁹⁵Mo(n,γ)⁹⁶Mo reaction were studied with the γ calorimeter DANCE (Detector for Advanced Neutron Capture Experiments), consisting of 160 BaF₂ scintillation detectors, at the Los Alamos Neutron Science Center. The γ-ray energy spectra for different multiplicities were measured for s- and p-wave resonances below 2 keV. The shapes of these spectra were found to be in very good agreement with simulations using the DICEBOX statistical model code. The relevant model parameters used for the level density and photon strength functions were identical with those that provided the best fit of the data from a recent measurement of the thermal ⁹⁵Mo(n,γ)⁹⁶Mo reaction with the two-step-cascade method. The reported results strongly suggest that the extreme statistical model works very well in the mass region near A=100.

  13. Long-term evolution of a planetesimal swarm in the vicinity of a protoplanet

    NASA Technical Reports Server (NTRS)

    Kary, David M.; Lissauer, Jack J.

    1991-01-01

    Many models of planet formation involve scenarios in which one or a few large protoplanets interact with a swarm of much smaller planetesimals. In such scenarios, three-body perturbations by the protoplanet as well as mutual collisions and gravitational interactions between the swarm bodies are important in determining the velocity distribution of the swarm. We are developing a model to examine the effects of these processes on the evolution of a planetesimal swarm. The model consists of a combination of numerical integrations of the gravitational influence of one (or a few) massive protoplanets on swarm bodies together with a statistical treatment of the interactions between the planetesimals. Integrating the planetesimal orbits allows us to take into account effects that are difficult to model analytically or statistically, such as three-body collision cross-sections and resonant perturbations by the protoplanet, while using a statistical treatment for the particle-particle interactions allows us to use a large enough sample to obtain meaningful results.

  14. Cloud encounter statistics in the 28.5-43.5 KFT altitude region from four years of GASP observations

    NASA Technical Reports Server (NTRS)

    Jasperson, W. H.; Nastrom, G. D.; Davis, R. E.; Holdeman, J. D.

    1983-01-01

    The results of an analysis of cloud encounter measurements taken at aircraft flight altitudes as part of the Global Atmospheric Sampling Program are summarized. The results can be used in estimating the probability of cloud encounter and in assessing the economic feasibility of laminar flow control aircraft along particular routes. The data presented clearly show the tropical circulation and its seasonal migration; characteristics of the mid-latitude regime, such as the large-scale traveling cyclones in the winter and increased convective activity in the summer, can be isolated in the data. The cloud encounter statistics are shown to be consistent with the mid-latitude cyclone model. A model for TIC (time-in-clouds), a cloud encounter statistic, is presented for several common airline routes.

  15. Self-Consistent Field Lattice Model for Polymer Networks.

    PubMed

    Tito, Nicholas B; Storm, Cornelis; Ellenbroek, Wouter G

    2017-12-26

    A lattice model based on polymer self-consistent field theory is developed to predict the equilibrium statistics of arbitrary polymer networks. For a given network topology, our approach uses moment propagators on a lattice to self-consistently construct the ensemble of polymer conformations and cross-link spatial probability distributions. Remarkably, the calculation can be performed "in the dark", without any prior knowledge on preferred chain conformations or cross-link positions. Numerical results from the model for a test network exhibit close agreement with molecular dynamics simulations, including when the network is strongly sheared. Our model captures nonaffine deformation, mean-field monomer interactions, cross-link fluctuations, and finite extensibility of chains, yielding predictions that differ markedly from classical rubber elasticity theory for polymer networks. By examining polymer networks with different degrees of interconnectivity, we gain insight into cross-link entropy, an important quantity in the macroscopic behavior of gels and self-healing materials as they are deformed.
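
    To give a flavour of the moment-propagator idea, the sketch below propagates chain statistical weights site by site on a one-dimensional lattice under a hypothetical external field. It is a schematic toy of lattice self-consistent field propagation, not the authors' network calculation, and all field values are invented.

```python
import numpy as np

def propagate_chain(w, n_segments):
    """Lattice chain propagator: q[s, i] is the statistical weight of an s-segment
    chain ending at lattice site i in an external field w (1D, reflecting ends)."""
    n_sites = len(w)
    boltz = np.exp(-w)
    q = np.zeros((n_segments + 1, n_sites))
    q[0] = 1.0                                   # free chain end
    for s in range(n_segments):
        # average over the site itself and its two lattice neighbours
        padded = np.pad(q[s], 1, mode="edge")
        neighbour_avg = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
        q[s + 1] = boltz * neighbour_avg
    return q

w = np.linspace(0.0, 0.5, 20)        # hypothetical external field along the lattice
q = propagate_chain(w, n_segments=10)
print(q[-1].round(3))                # end-segment weights at each lattice site
```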

  16. Insights into Corona Formation through Statistical Analyses

    NASA Technical Reports Server (NTRS)

    Glaze, L. S.; Stofan, E. R.; Smrekar, S. E.; Baloga, S. M.

    2002-01-01

    Statistical analysis of an expanded database of coronae on Venus indicates that the populations of Type 1 (with fracture annuli) and Type 2 (without fracture annuli) corona diameters are statistically indistinguishable, and therefore we have no basis for assuming different formation mechanisms. Analysis of the topography and diameters of coronae shows that coronae that are depressions, rimmed depressions, and domes tend to be significantly smaller than those that are plateaus, rimmed plateaus, or domes with surrounding rims. This is consistent with the model of Smrekar and Stofan and inconsistent with predictions of the spreading drop model of Koch and Manga. The diameter range for domes, the initial stage of corona formation, provides a broad constraint on the buoyancy of corona-forming plumes. Coronae are only slightly more likely to be topographically raised than depressed, with Type 1 coronae most frequently occurring as rimmed depressions and Type 2 coronae most frequently occurring with flat interiors and raised rims. Most Type 1 coronae are located along chasmata systems or fracture belts, while Type 2 coronae are found predominantly as isolated features in the plains. Coronae at hotspot rises tend to be significantly larger than coronae in other settings, consistent with a hotter upper mantle at hotspot rises and their active state.

  17. Turbulent scaling laws as solutions of the multi-point correlation equation using statistical symmetries

    NASA Astrophysics Data System (ADS)

    Oberlack, Martin; Rosteck, Andreas; Avsarkisov, Victor

    2013-11-01

    Text-book knowledge proclaims that Lie symmetries such as the Galilean transformation lie at the heart of fluid dynamics. These important properties also carry over to the statistical description of turbulence, i.e. to the Reynolds stress transport equations and their generalization, the multi-point correlation equations (MPCE). Interestingly enough, the MPCE admit a much larger set of symmetries, in fact infinite dimensional, subsequently named statistical symmetries. Most importantly, these new symmetries have important consequences for our understanding of turbulent scaling laws. The symmetries form the essential foundation to construct exact solutions to the infinite set of MPCE, which in turn are identified as classical and new turbulent scaling laws. Examples of various classical and new shear-flow scaling laws, including higher-order moments, will be presented. Even new scaling laws have been forecast from these symmetries and in turn validated by DNS. Turbulence modellers have implicitly recognized at least one of the statistical symmetries, as this is the basis for the usual log-law which has been employed for calibrating essentially all engineering turbulence models. An obvious conclusion is to generally make turbulence models consistent with the new statistical symmetries.

  18. Mapping irrigated lands at 250-m scale by merging MODIS data and National Agricultural Statistics

    USGS Publications Warehouse

    Pervez, Md Shahriar; Brown, Jesslyn F.

    2010-01-01

    Accurate geospatial information on the extent of irrigated land improves our understanding of agricultural water use, local land surface processes, conservation or depletion of water resources, and components of the hydrologic budget. We have developed a method in a geospatial modeling framework that assimilates irrigation statistics with remotely sensed parameters describing vegetation growth conditions in areas with agricultural land cover to spatially identify irrigated lands at 250-m cell size across the conterminous United States for 2002. The geospatial model result, known as the Moderate Resolution Imaging Spectroradiometer (MODIS) Irrigated Agriculture Dataset (MIrAD-US), identified irrigated lands with reasonable accuracy in California and semiarid Great Plains states with overall accuracies of 92% and 75% and kappa statistics of 0.75 and 0.51, respectively. A quantitative accuracy assessment of MIrAD-US for the eastern region has not yet been conducted, and qualitative assessment shows that model improvements are needed for the humid eastern regions where the distinction in annual peak NDVI between irrigated and non-irrigated crops is minimal and county sizes are relatively small. This modeling approach enables consistent mapping of irrigated lands based upon USDA irrigation statistics and should lead to better understanding of spatial trends in irrigated lands across the conterminous United States. An improved version of the model with revised datasets is planned and will employ 2007 USDA irrigation statistics.
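
    The accuracy measures quoted above can be reproduced from a map-versus-reference confusion matrix. The sketch below shows the standard overall-accuracy and Cohen's kappa calculations on a purely illustrative 2x2 table, not the MIrAD-US validation data.

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = reference classes, columns = mapped classes)."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    observed_agreement = np.trace(confusion) / total
    expected_agreement = np.sum(confusion.sum(axis=0) * confusion.sum(axis=1)) / total ** 2
    kappa = (observed_agreement - expected_agreement) / (1.0 - expected_agreement)
    return observed_agreement, kappa

# Hypothetical 2x2 table: irrigated vs non-irrigated validation samples
table = [[460,  40],
         [ 40, 460]]
print(overall_accuracy_and_kappa(table))   # (0.92, 0.84) for this illustrative table
```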

  19. Consistency of the Planck CMB data and ΛCDM cosmology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shafieloo, Arman; Hazra, Dhiraj Kumar, E-mail: shafieloo@kasi.re.kr, E-mail: dhiraj.kumar.hazra@apc.univ-paris7.fr

    We test the consistency between Planck temperature and polarization power spectra and the concordance model of Λ Cold Dark Matter cosmology (ΛCDM) within the framework of Crossing statistics. We find that the Planck TT best fit ΛCDM power spectrum is completely consistent with the EE power spectrum data, while the EE best fit ΛCDM power spectrum is not consistent with the TT data. However, this does not point to any systematic or model-data discrepancy, since in the Planck EE data the uncertainties are much larger compared to the TT data. We also investigate the possibility of any deviation from the ΛCDM model analyzing the Planck 2015 data. Results from TT, TE and EE data analysis indicate that no deviation is required beyond the flexibility of the concordance ΛCDM model. Our analysis thus rules out any strong evidence for physics beyond the concordance model in the Planck spectra data. We also report a mild amplitude difference comparing temperature and polarization data, where the temperature data seem to have slightly lower amplitude than expected (consistently at all multipoles), as we assume both temperature and polarization data are realizations of the same underlying cosmology.

  20. Supervised variational model with statistical inference and its application in medical image segmentation.

    PubMed

    Li, Changyang; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Yin, Yong; Dagan Feng, David

    2015-01-01

    Automated and general medical image segmentation can be challenging because the foreground and the background may have complicated and overlapping density distributions in medical imaging. Conventional region-based level set algorithms often assume piecewise constant or piecewise smooth for segments, which are implausible for general medical image segmentation. Furthermore, low contrast and noise make identification of the boundaries between foreground and background difficult for edge-based level set algorithms. Thus, to address these problems, we suggest a supervised variational level set segmentation model to harness the statistical region energy functional with a weighted probability approximation. Our approach models the region density distributions by using the mixture-of-mixtures Gaussian model to better approximate real intensity distributions and distinguish statistical intensity differences between foreground and background. The region-based statistical model in our algorithm can intuitively provide better performance on noisy images. We constructed a weighted probability map on graphs to incorporate spatial indications from user input with a contextual constraint based on the minimization of contextual graphs energy functional. We measured the performance of our approach on ten noisy synthetic images and 58 medical datasets with heterogeneous intensities and ill-defined boundaries and compared our technique to the Chan-Vese region-based level set model, the geodesic active contour model with distance regularization, and the random walker model. Our method consistently achieved the highest Dice similarity coefficient when compared to the other methods.
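
    The Dice similarity coefficient used for evaluation is straightforward to compute from two binary masks; a minimal sketch with hypothetical arrays follows.

```python
import numpy as np

def dice_coefficient(seg, ref):
    """Dice similarity coefficient between two binary segmentation masks."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(seg, ref).sum()
    return 2.0 * intersection / (seg.sum() + ref.sum())

seg = np.zeros((64, 64), dtype=bool); seg[10:40, 10:40] = True   # hypothetical segmentation result
ref = np.zeros((64, 64), dtype=bool); ref[15:45, 12:42] = True   # hypothetical ground truth
print(round(dice_coefficient(seg, ref), 3))
```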

  1. Quantum-Like Bayesian Networks for Modeling Decision Making

    PubMed Central

    Moreira, Catarina; Wichert, Andreas

    2016-01-01

    In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists of replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive, in contrast to the current state-of-the-art models, which cannot be generalized to more complex decision scenarios and only provide an explanatory account of the observed paradoxes. In the end, the model that we propose amounts to a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data sets from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
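
    The core of the quantum-like construction is that mutually exclusive paths combine through probability amplitudes, which adds an interference term to the classical total probability. The sketch below shows that combination rule with the phase treated as a free parameter; it is a schematic of the general idea, not the authors' full network code, and the probabilities are invented.

```python
import numpy as np

def quantum_like_total(p1, p2, theta):
    """Combine two classical path probabilities via amplitudes:
    |sqrt(p1) + sqrt(p2) * e^{i*theta}|^2 = p1 + p2 + 2*sqrt(p1*p2)*cos(theta).
    theta = pi/2 removes the interference term and recovers the classical sum;
    in a full model the resulting values are renormalized across outcomes."""
    return p1 + p2 + 2.0 * np.sqrt(p1 * p2) * np.cos(theta)

print(quantum_like_total(0.3, 0.4, np.pi / 2))   # 0.7, the classical law of total probability
print(quantum_like_total(0.3, 0.4, 2.5))         # interference suppresses the total
```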

  2. A two-component rain model for the prediction of attenuation and diversity improvement

    NASA Technical Reports Server (NTRS)

    Crane, R. K.

    1982-01-01

    A new model was developed to predict attenuation statistics for a single Earth-satellite or terrestrial propagation path. The model was extended to provide predictions of the joint occurrences of specified or higher attenuation values on two closely spaced Earth-satellite paths. The joint statistics provide the information required to obtain diversity gain or diversity advantage estimates. The new model is meteorologically based. It was tested against available Earth-satellite beacon observations and terrestrial path measurements. The model employs the rain climate region descriptions of the Global rain model. The rms deviation between the predicted and observed attenuation values for the terrestrial path data was 35 percent, a result consistent with the expectations of the Global model when the rain rate distribution for the path is not used in the calculation. Within the United States the rms deviation between measurement and prediction was 36 percent but worldwide it was 79 percent.

  3. Efficient Geological Modelling of Large AEM Surveys

    NASA Astrophysics Data System (ADS)

    Bach, Torben; Martlev Pallesen, Tom; Jørgensen, Flemming; Lundh Gulbrandsen, Mats; Mejer Hansen, Thomas

    2014-05-01

    Combining geological expert knowledge with geophysical observations into a final 3D geological model is, in most cases, not a straightforward process. It typically involves many types of data and requires an understanding of both the data and the geological target. When dealing with very large areas, such as modelling of large AEM surveys, the manual task for the geologist of correctly evaluating and properly utilising all the data available in the survey area becomes overwhelming. In the ERGO project (Efficient High-Resolution Geological Modelling) we address these issues and propose a new modelling methodology enabling fast and consistent modelling of very large areas. The vision of the project is to build a user-friendly expert system that enables the combination of very large amounts of geological and geophysical data with geological expert knowledge. This is done through an "auto-pilot" type functionality, named Smart Interpretation, designed to aid the geologist in the interpretation process. The core of the expert system is a statistical model that describes the relation between data and the geological interpretations made by a geological expert. This facilitates fast and consistent modelling of very large areas. It will enable the construction of high-resolution models, as the system will "learn" the geology of an area directly from interpretations made by a geological expert and instantly apply this knowledge to all hard data in the survey area, ensuring the utilisation of all the data available in the geological model. Another feature is that the statistical model the system creates for one area can be used in another area with similar data and geology. This feature can be useful as an aid for an untrained geologist building a geological model, guided by the experienced geologist's way of interpretation as quantified by the expert system in the core statistical model. In this project presentation we provide some examples of the problems we are aiming to address in the project, and show some preliminary results.

  4. Comparison of Response Surface and Kriging Models in the Multidisciplinary Design of an Aerospike Nozzle

    NASA Technical Reports Server (NTRS)

    Simpson, Timothy W.

    1998-01-01

    The use of response surface models and kriging models are compared for approximating non-random, deterministic computer analyses. After discussing the traditional response surface approach for constructing polynomial models for approximation, kriging is presented as an alternative statistical-based approximation method for the design and analysis of computer experiments. Both approximation methods are applied to the multidisciplinary design and analysis of an aerospike nozzle which consists of a computational fluid dynamics model and a finite element analysis model. Error analysis of the response surface and kriging models is performed along with a graphical comparison of the approximations. Four optimization problems are formulated and solved using both approximation models. While neither approximation technique consistently outperforms the other in this example, the kriging models using only a constant for the underlying global model and a Gaussian correlation function perform as well as the second order polynomial response surface models.
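
    The comparison can be mimicked on a toy deterministic function using off-the-shelf tools: a second-order polynomial response surface versus a Gaussian-process (kriging) surrogate with a constant trend and Gaussian correlation. The sketch below uses scikit-learn and an invented test function, not the aerospike-nozzle analyses.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 2))                  # 30 hypothetical design points
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2              # deterministic "computer experiment"

# Second-order polynomial response surface
rsm = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

# Kriging surrogate: constant global model plus Gaussian (RBF) correlation
gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True).fit(X, y)

X_test = rng.uniform(-2, 2, size=(200, 2))
y_test = np.sin(X_test[:, 0]) + 0.5 * X_test[:, 1] ** 2
print("RSM     RMSE:", np.sqrt(np.mean((rsm.predict(X_test) - y_test) ** 2)))
print("Kriging RMSE:", np.sqrt(np.mean((gp.predict(X_test) - y_test) ** 2)))
```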

  5. Statistically optimal estimation of Greenland Ice Sheet mass variations from GRACE monthly solutions using an improved mascon approach

    NASA Astrophysics Data System (ADS)

    Ran, J.; Ditmar, P.; Klees, R.; Farahani, H. H.

    2018-03-01

    We present an improved mascon approach to transform monthly spherical harmonic solutions based on GRACE satellite data into mass anomaly estimates in Greenland. The GRACE-based spherical harmonic coefficients are used to synthesize gravity anomalies at satellite altitude, which are then inverted into mass anomalies per mascon. The limited spectral content of the gravity anomalies is properly accounted for by applying a low-pass filter as part of the inversion procedure to make the functional model spectrally consistent with the data. The full error covariance matrices of the monthly GRACE solutions are properly propagated using the law of covariance propagation. Using numerical experiments, we demonstrate the importance of a proper data weighting and of the spectral consistency between functional model and data. The developed methodology is applied to process real GRACE level-2 data (CSR RL05). The obtained mass anomaly estimates are integrated over five drainage systems, as well as over entire Greenland. We find that the statistically optimal data weighting reduces random noise by 35-69%, depending on the drainage system. The obtained mass anomaly time-series are de-trended to eliminate the contribution of ice discharge and are compared with de-trended surface mass balance (SMB) time-series computed with the Regional Atmospheric Climate Model (RACMO 2.3). We show that when using a statistically optimal data weighting in GRACE data processing, the discrepancies between GRACE-based estimates of SMB and modelled SMB are reduced by 24-47%.
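
    The "statistically optimal data weighting" amounts to generalized least squares with the full data covariance matrix propagated into the inversion. A minimal sketch of that estimator, with a hypothetical design matrix and covariance, follows.

```python
import numpy as np

def generalized_least_squares(A, y, C):
    """GLS estimate x = (A^T C^-1 A)^-1 A^T C^-1 y with full data covariance C."""
    Ci = np.linalg.inv(C)
    N = A.T @ Ci @ A                       # normal matrix
    x = np.linalg.solve(N, A.T @ Ci @ y)
    cov_x = np.linalg.inv(N)               # formal covariance of the estimate
    return x, cov_x

# Hypothetical example: two mascon parameters observed through a 4x2 design matrix
A = np.array([[1.0, 0.2], [0.8, 0.5], [0.3, 1.0], [0.1, 0.9]])
y = np.array([1.1, 1.4, 1.2, 0.9])
C = np.diag([0.01, 0.04, 0.01, 0.09])      # for correlated errors, use the full matrix instead
print(generalized_least_squares(A, y, C)[0])
```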

  6. The power and robustness of maximum LOD score statistics.

    PubMed

    Yoo, Y J; Mendell, N R

    2008-07-01

    The maximum LOD score statistic is extremely powerful for gene mapping when calculated using the correct genetic parameter value. When the mode of genetic transmission is unknown, the maximum of the LOD scores obtained using several genetic parameter values is reported. This latter statistic requires a higher critical value than the maximum LOD score statistic calculated from a single genetic parameter value. In this paper, we compare the power of maximum LOD scores based on three fixed sets of genetic parameter values with the power of the LOD score obtained after maximizing over the entire range of genetic parameter values. We simulate family data under nine generating models. For generating models with non-zero phenocopy rates, LOD scores maximized over the entire range of genetic parameters yielded greater power than maximum LOD scores for fixed sets of parameter values with zero phenocopy rates. No maximum LOD score was consistently more powerful than the others for generating models with a zero phenocopy rate. The power loss of the LOD score maximized over the entire range of genetic parameters, relative to the maximum LOD score calculated using the correct genetic parameter value, appeared to be robust to the generating models.

  7. An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2015-11-01

    In this report, we propose a framework for the design and implementation of in-situ analyses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model, which we had initially developed for parallel statistical analysis using MPI [PTBM11], from an SPMD to an AMT model. To this end, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation of the statistics engines that requires explicit communication. We then evaluate this proposed scheme in a shared memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.

  8. Statistics of Optical Coherence Tomography Data From Human Retina

    PubMed Central

    de Juan, Joaquín; Ferrone, Claudia; Giannini, Daniela; Huang, David; Koch, Giorgio; Russo, Valentina; Tan, Ou; Bruni, Carlo

    2010-01-01

    Optical coherence tomography (OCT) has recently become one of the primary methods for noninvasive probing of the human retina. The pseudoimage formed by OCT (the so-called B-scan) varies probabilistically across pixels due to complexities in the measurement technique. Hence, sensitive automatic procedures of diagnosis using OCT may exploit statistical analysis of the spatial distribution of reflectance. In this paper, we perform a statistical study of retinal OCT data. We find that the stretched exponential probability density function can model well the distribution of intensities in OCT pseudoimages. Moreover, we show a small but significant correlation between neighboring pixels when measuring OCT intensities with pixels of about 5 µm. We then develop a simple joint probability model for the OCT data consistent with known retinal features. This model fits well the stretched exponential distribution of intensities and their spatial correlation. In normal retinas, fit parameters of this model are relatively constant along retinal layers but vary across layers. However, in retinas with diabetic retinopathy, large spikes of parameter modulation interrupt the constancy within layers, exactly where pathologies are visible. We argue that these results give hope for improvement in statistical pathology-detection methods even when the disease is in its early stages. PMID:20304733
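
    A minimal way to check a stretched-exponential description on one's own data is to bin the intensities and fit A·exp(-(x/s)^β) to the histogram. The sketch below does this with surrogate data and scipy's curve_fit; it is an illustration, not the authors' estimation procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

def stretched_exponential(x, amplitude, scale, beta):
    """Stretched-exponential shape A * exp(-(x / s)**beta)."""
    return amplitude * np.exp(-(x / scale) ** beta)

rng = np.random.default_rng(1)
intensities = rng.exponential(scale=20.0, size=5000)      # surrogate OCT reflectances (a.u.)

counts, edges = np.histogram(intensities, bins=60, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
params, _ = curve_fit(stretched_exponential, centers, counts,
                      p0=(counts.max(), 20.0, 1.0),
                      bounds=([0.0, 1e-3, 0.1], [np.inf, np.inf, 5.0]))
print("amplitude, scale, beta:", np.round(params, 3))     # beta near 1 for this surrogate
```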

  9. Local dependence in random graph models: characterization, properties and statistical inference

    PubMed Central

    Schweinberger, Michael; Handcock, Mark S.

    2015-01-01

    Summary Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142

  10. Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering

    DTIC Science & Technology

    2005-08-04

    Only fragments of this report's abstract were captured. The recoverable fragments describe a four-band magnetic resonance image (MRI) of a brain with a tumor, consisting of 23,712 pixels, used as a test dataset for Bayesian regularization of normal mixture estimation and model-based clustering, and cite related work including Figueiredo and Jain (2002) on unsupervised learning of finite mixture models and an article in the Journal of the Royal Statistical Society, Series B 56, 363-375.

  11. Fully Bayesian Estimation of Data from Single Case Designs

    ERIC Educational Resources Information Center

    Rindskopf, David

    2013-01-01

    Single case designs (SCDs) generally consist of a small number of short time series in two or more phases. The analysis of SCDs statistically fits in the framework of a multilevel model, or hierarchical model. The usual analysis does not take into account the uncertainty in the estimation of the random effects. This not only has an effect on the…

  12. AIDS susceptibility in a migrant population: perception and behavior.

    PubMed

    McBride, D C; Weatherby, N L; Inciardi, J A; Gillespie, S A

    1999-01-01

    Within the framework of the Health Belief Model, this paper examines correlates of perception of AIDS susceptibility among 846 drug-using migrant farm workers and their sex partners. Significant but relatively small differences by ethnicity and gender were found. The data showed a consistent significant statistical relationship between frequency of drug use, high-risk sexual behavior, and perception of AIDS susceptibility. Perception of AIDS susceptibility was significantly related to a subsequent reduction in sexual risk behaviors. Consistent with the Health Belief Model, the data suggest that increasing perception of AIDS susceptibility may be an important motivator in reducing high-risk behaviors.

  13. A new statistical method for transfer coefficient calculations in the framework of the general multiple-compartment model of transport for radionuclides in biological systems.

    PubMed

    Garcia, F; Arruda-Neto, J D; Manso, M V; Helene, O M; Vanin, V R; Rodriguez, O; Mesa, J; Likhachev, V P; Filho, J W; Deppman, A; Perez, G; Guzman, F; de Camargo, S P

    1999-10-01

    A new and simple statistical procedure (STATFLUX) for the calculation of transfer coefficients of radionuclide transport to animals and plants is proposed. The method is based on the general multiple-compartment model, which uses a system of linear equations involving geometrical volume considerations. By using experimentally available curves of radionuclide concentrations versus time, for each animal compartment (organs), flow parameters were estimated by employing a least-squares procedure, whose consistency is tested. Some numerical results are presented in order to compare the STATFLUX transfer coefficients with those from other works and experimental data.
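
    A stripped-down version of the general idea, estimating a transfer-rate matrix from concentration-versus-time curves by linear least squares, is sketched below using gradient matching on simulated two-compartment data. The STATFLUX procedure itself is more elaborate and includes consistency tests; the rate matrix here is invented.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical two-compartment system dC/dt = K C with a known transfer-rate matrix
t = np.linspace(0.0, 10.0, 50)
K_true = np.array([[-0.5,  0.1],
                   [ 0.5, -0.1]])
C0 = np.array([1.0, 0.0])
C = np.array([expm(K_true * ti) @ C0 for ti in t])    # concentration curves (rows = times)

# Gradient matching: dC/dt ~ C @ K.T, solved by ordinary least squares
dCdt = np.gradient(C, t, axis=0)
K_est = np.linalg.lstsq(C, dCdt, rcond=None)[0].T
print(np.round(K_est, 3))                             # close to K_true up to differencing error
```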

  14. Development of a funding, cost, and spending model for satellite projects

    NASA Technical Reports Server (NTRS)

    Johnson, Jesse P.

    1989-01-01

    The need for a predictive budget/funding model is obvious. The current models used by the Resource Analysis Office (RAO) are used to predict the total costs of satellite projects. An effort was conducted to extend the modeling capabilities from analysis of total budgets to analysis of total budgets and budget outlays over time. A statistically based and data-driven methodology was used to derive and develop the model. The budget data for the last 18 GSFC-sponsored satellite projects were analyzed and used to build a funding model which would describe the historical spending patterns. The raw data consisted of dollars spent in each specific year and their 1989-dollar equivalents. These data were converted to the standard format used by the RAO group and placed in a database. A simple statistical analysis was performed to calculate the gross statistics associated with project length and project cost and the conditional statistics on project length and project cost. The modeling approach used is derived from the theory of embedded statistics, which states that properly analyzed data will produce the underlying generating function. The process of funding large scale projects over extended periods of time is described by Life Cycle Cost Models (LCCM). The data were analyzed to find a model in the generic form of an LCCM. The model developed is based on a Weibull function whose parameters are found by both nonlinear optimization and nonlinear regression. In order to use this model it is necessary to transform the problem from a dollar/time space to a percentage-of-total-budget/time space. This transformation is equivalent to moving to a probability space. By using the basic rules of probability, the validity of both the optimization and the regression steps is ensured. This statistically significant model is then integrated and inverted. The resulting output represents a project schedule which relates the amount of money spent to the percentage of project completion.
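
    The Weibull-based spending profile can be sketched as fitting a Weibull cumulative distribution to the fraction of total budget spent over time; the yearly fractions below are invented for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull_cdf(t, shape, scale):
    """Weibull cumulative distribution: fraction of total budget spent by time t."""
    return 1.0 - np.exp(-(t / scale) ** shape)

# Hypothetical project: cumulative fraction of budget spent at the end of each year
years = np.array([1, 2, 3, 4, 5, 6], dtype=float)
spent_fraction = np.array([0.05, 0.20, 0.45, 0.70, 0.90, 1.00])

params, _ = curve_fit(weibull_cdf, years, spent_fraction, p0=(2.0, 3.0))
print("shape, scale:", np.round(params, 3))
```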

  15. An algebraic method for constructing stable and consistent autoregressive filters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harlim, John, E-mail: jharlim@psu.edu; Department of Meteorology, the Pennsylvania State University, University Park, PA 16802; Hong, Hoon, E-mail: hong@ncsu.edu

    2015-02-15

    In this paper, we introduce an algebraic method to construct stable and consistent univariate autoregressive (AR) models of low order for filtering and predicting nonlinear turbulent signals with memory depth. By stable, we refer to the classical stability condition for the AR model. By consistent, we refer to the classical consistency constraints of Adams–Bashforth methods of order-two. One attractive feature of this algebraic method is that the model parameters can be obtained without directly knowing any training data set, as opposed to many standard, regression-based parameterization methods. It takes only long-time average statistics as inputs. The proposed method provides a discretization time step interval which guarantees the existence of a stable and consistent AR model and simultaneously produces the parameters for the AR models. In our numerical examples with two chaotic time series with different characteristics of decaying time scales, we find that the proposed AR models produce significantly more accurate short-term predictive skill and comparable filtering skill relative to the linear regression-based AR models. These encouraging results are robust across wide ranges of discretization times, observation times, and observation noise variances. Finally, we also find that the proposed model produces an improved short-time prediction relative to the linear regression-based AR models in forecasting a data set that characterizes the variability of the Madden–Julian Oscillation, a dominant tropical atmospheric wave pattern.
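
    The classical stability condition referred to above requires that all roots of the AR characteristic polynomial lie outside the unit circle. A small sketch for checking that condition is given below; it does not implement the paper's algebraic construction, and the coefficient values are arbitrary.

```python
import numpy as np

def is_stable_ar(phi):
    """Check the classical stability condition for an AR(p) model
    x_t = phi[0]*x_{t-1} + ... + phi[p-1]*x_{t-p} + noise:
    all roots of 1 - phi[0]*z - ... - phi[p-1]*z^p must lie outside the unit circle."""
    poly = np.concatenate(([1.0], -np.asarray(phi)))   # coefficients in increasing powers of z
    roots = np.roots(poly[::-1])                       # np.roots expects decreasing powers
    return bool(np.all(np.abs(roots) > 1.0))

print(is_stable_ar([0.5, 0.3]))    # True: stationary AR(2)
print(is_stable_ar([1.2, -0.1]))   # False: explosive, condition violated
```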

  16. A new statistical method for characterizing the atmospheres of extrasolar planets

    NASA Astrophysics Data System (ADS)

    Henderson, Cassandra S.; Skemer, Andrew J.; Morley, Caroline V.; Fortney, Jonathan J.

    2017-10-01

    By detecting light from extrasolar planets, we can measure their compositions and bulk physical properties. The technologies used to make these measurements are still in their infancy, and a lack of self-consistency suggests that previous observations have underestimated their systematic errors. We demonstrate a statistical method, newly applied to exoplanet characterization, which uses a Bayesian formalism to account for underestimated error bars. We use this method to compare photometry of a substellar companion, GJ 758b, with custom atmospheric models. Our method produces a probability distribution of atmospheric model parameters including temperature, gravity, cloud model (fsed) and chemical abundance for GJ 758b. This distribution is less sensitive to highly variant data and appropriately reflects a greater uncertainty on parameter fits.
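
    One common way to account for underestimated error bars in a Bayesian fit is to introduce a free inflation factor on the reported uncertainties and include it in the likelihood. The sketch below shows such a likelihood term under a Gaussian noise assumption; it is a generic device, not necessarily the exact formalism used for GJ 758b, and the residuals are invented.

```python
import numpy as np

def log_likelihood_with_inflation(residuals, sigma_reported, log_f):
    """Gaussian log-likelihood with reported error bars inflated by exp(log_f);
    log_f is treated as a free nuisance parameter in the Bayesian fit."""
    sigma2 = (sigma_reported ** 2) * np.exp(2.0 * log_f)
    return -0.5 * np.sum(residuals ** 2 / sigma2 + np.log(2.0 * np.pi * sigma2))

residuals = np.array([0.3, -0.5, 0.8, -0.2])     # model minus photometry (hypothetical)
sigma = np.array([0.1, 0.1, 0.2, 0.1])           # reported (possibly too small) error bars
print(log_likelihood_with_inflation(residuals, sigma, log_f=0.0))   # trust reported errors
print(log_likelihood_with_inflation(residuals, sigma, log_f=1.0))   # inflate them by a factor e
```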

  17. Linking Statistically- and Physically-Based Models for Improved Streamflow Simulation in Gaged and Ungaged Areas

    NASA Astrophysics Data System (ADS)

    Lafontaine, J.; Hay, L.; Archfield, S. A.; Farmer, W. H.; Kiang, J. E.

    2014-12-01

    The U.S. Geological Survey (USGS) has developed a National Hydrologic Model (NHM) to support coordinated, comprehensive and consistent hydrologic model development, and facilitate the application of hydrologic simulations within the continental US. The portion of the NHM located within the Gulf Coastal Plains and Ozarks Landscape Conservation Cooperative (GCPO LCC) is being used to test the feasibility of improving streamflow simulations in gaged and ungaged watersheds by linking statistically- and physically-based hydrologic models. The GCPO LCC covers part or all of 12 states and 5 sub-geographies, totaling approximately 726,000 km2, and is centered on the lower Mississippi Alluvial Valley. A total of 346 USGS streamgages in the GCPO LCC region were selected to evaluate the performance of this new calibration methodology for the period 1980 to 2013. Initially, the physically-based models are calibrated to measured streamflow data to provide a baseline for comparison. An enhanced calibration procedure then is used to calibrate the physically-based models in the gaged and ungaged areas of the GCPO LCC using statistically-based estimates of streamflow. For this application, the calibration procedure is adjusted to address the limitations of the statistically generated time series to reproduce measured streamflow in gaged basins, primarily by incorporating error and bias estimates. As part of this effort, estimates of uncertainty in the model simulations are also computed for the gaged and ungaged watersheds.

  18. A statistical parts-based appearance model of inter-subject variability.

    PubMed

    Toews, Matthew; Collins, D Louis; Arbel, Tal

    2006-01-01

    In this article, we present a general statistical parts-based model for representing the appearance of an image set, applied to the problem of inter-subject MR brain image matching. In contrast with global image representations such as active appearance models, the parts-based model consists of a collection of localized image parts whose appearance, geometry and occurrence frequency are quantified statistically. The parts-based approach explicitly addresses the case where one-to-one correspondence does not exist between subjects due to anatomical differences, as parts are not expected to occur in all subjects. The model can be learned automatically, discovering structures that appear with statistical regularity in a large set of subject images, and can be robustly fit to new images, all in the presence of significant inter-subject variability. As parts are derived from generic scale-invariant features, the framework can be applied in a wide variety of image contexts, in order to study the commonality of anatomical parts or to group subjects according to the parts they share. Experimentation shows that a parts-based model can be learned from a large set of MR brain images, and used to determine parts that are common within the group of subjects. Preliminary results indicate that the model can be used to automatically identify distinctive features for inter-subject image registration despite large changes in appearance.

  19. Application of Semiparametric Spline Regression Model in Analyzing Factors that Influence Population Density in Central Java

    NASA Astrophysics Data System (ADS)

    Sumantari, Y. D.; Slamet, I.; Sugiyanto

    2017-06-01

    Semiparametric regression is a statistical analysis method that combines parametric and nonparametric regression. There are various approach techniques in nonparametric regression; one of them is the spline. Central Java is one of the most densely populated provinces in Indonesia. Population density in this province can be modeled by semiparametric regression because it involves both parametric and nonparametric components. Therefore, the purpose of this paper is to determine the factors that influence population density in Central Java using the semiparametric spline regression model. The result shows that the factors which influence population density in Central Java are the number of active Family Planning (FP) participants and the district minimum wage.
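
    A bare-bones version of such a model, a linear (parametric) term for the minimum wage plus a spline (nonparametric) term for FP participation, can be fitted by ordinary least squares with a truncated-power spline basis. The data, knot positions and coefficient values below are purely illustrative.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=2):
    """Design columns for a spline of given degree with truncated-power knot terms."""
    cols = [x ** d for d in range(1, degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(2)
n = 200
wage = rng.normal(size=n)                 # parametric predictor (standardized minimum wage)
fp = rng.uniform(0, 1, size=n)            # nonparametric predictor (FP participation rate)
density = 1.5 * wage + np.sin(2 * np.pi * fp) + rng.normal(scale=0.3, size=n)

X = np.column_stack([np.ones(n), wage, truncated_power_basis(fp, knots=[0.25, 0.5, 0.75])])
coef, *_ = np.linalg.lstsq(X, density, rcond=None)
print("parametric coefficient for wage:", round(coef[1], 3))
```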

  20. Does transport time help explain the high trauma mortality rates in rural areas? New and traditional predictors assessed by new and traditional statistical methods

    PubMed Central

    Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas

    2015-01-01

    Background Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods Using data from the Norwegian Cause of Death registry, deaths from external causes 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing model complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on choice of statistical model. A multiple GAM or piecewise linear model was superior, and similar, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in GAM or standard linear regression. Conclusions Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
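
    The model-comparison step can be illustrated by fitting a straight-line and a piecewise-linear (single-hinge) model to synthetic data and comparing their AIC values under a Gaussian error model. The breakpoint, effect sizes and data below are invented, not the Norwegian registry results.

```python
import numpy as np

def gaussian_aic(y, yhat, n_params):
    """AIC for a least-squares fit under a Gaussian error model (up to a constant)."""
    n = len(y)
    rss = np.sum((y - yhat) ** 2)
    return n * np.log(rss / n) + 2 * n_params

rng = np.random.default_rng(3)
transport_time = rng.uniform(0, 120, size=300)                        # minutes, hypothetical
mortality = np.where(transport_time < 45, 20.0, 20.0 + 0.08 * (transport_time - 45))
mortality = mortality + rng.normal(scale=2.0, size=300)

# Straight-line model
X1 = np.column_stack([np.ones_like(transport_time), transport_time])
b1, *_ = np.linalg.lstsq(X1, mortality, rcond=None)
aic_linear = gaussian_aic(mortality, X1 @ b1, n_params=3)             # intercept, slope, sigma

# Piecewise-linear model with a hinge at 45 minutes
X2 = np.column_stack([X1, np.clip(transport_time - 45.0, 0.0, None)])
b2, *_ = np.linalg.lstsq(X2, mortality, rcond=None)
aic_piecewise = gaussian_aic(mortality, X2 @ b2, n_params=4)

print("AIC linear vs piecewise:", round(aic_linear, 1), round(aic_piecewise, 1))
```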

  1. A new in silico classification model for ready biodegradability, based on molecular fragments.

    PubMed

    Lombardo, Anna; Pizzo, Fabiola; Benfenati, Emilio; Manganaro, Alberto; Ferrari, Thomas; Gini, Giuseppina

    2014-08-01

    Regulations such as the European REACH (Registration, Evaluation, Authorization and restriction of Chemicals) often require chemicals to be evaluated for ready biodegradability, to assess the potential risk for environmental and human health. Because not all chemicals can be tested, there is an increasing demand for tools for quick and inexpensive biodegradability screening, such as computer-based (in silico) theoretical models. We developed an in silico model starting from a dataset of 728 chemicals with ready biodegradability data (MITI-test Ministry of International Trade and Industry). We used the novel software SARpy to automatically extract, through a structural fragmentation process, a set of substructures statistically related to ready biodegradability. Then, we analysed these substructures in order to build some general rules. The model consists of a rule-set made up of the combination of the statistically relevant fragments and of the expert-based rules. The model gives good statistical performance with 92%, 82% and 76% accuracy on the training, test and external set respectively. These results are comparable with other in silico models like BIOWIN developed by the United States Environmental Protection Agency (EPA); moreover this new model includes an easily understandable explanation. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Neocortical dynamics at multiple scales: EEG standing waves, statistical mechanics, and physical analogs.

    PubMed

    Ingber, Lester; Nunez, Paul L

    2011-02-01

    The dynamic behavior of scalp potentials (EEG) is apparently due to some combination of global and local processes with important top-down and bottom-up interactions across spatial scales. In treating global mechanisms, we stress the importance of myelinated axon propagation delays and periodic boundary conditions in the cortical-white matter system, which is topologically close to a spherical shell. By contrast, the proposed local mechanisms are multiscale interactions between cortical columns via short-ranged non-myelinated fibers. A mechanical model consisting of a stretched string with attached nonlinear springs demonstrates the general idea. The string produces standing waves analogous to large-scale coherent EEG observed in some brain states. The attached springs are analogous to the smaller (mesoscopic) scale columnar dynamics. Generally, we expect string displacement and EEG at all scales to result from both global and local phenomena. A statistical mechanics of neocortical interactions (SMNI) calculates oscillatory behavior consistent with typical EEG, within columns, between neighboring columns via short-ranged non-myelinated fibers, across cortical regions via myelinated fibers, and also derives a string equation consistent with the global EEG model. Copyright © 2010 Elsevier Inc. All rights reserved.

  3. Estimating urban ground-level PM10 using MODIS 3km AOD product and meteorological parameters from WRF model

    NASA Astrophysics Data System (ADS)

    Ghotbi, Saba; Sotoudeheian, Saeed; Arhami, Mohammad

    2016-09-01

    Satellite remote sensing products of AOD from MODIS, along with appropriate meteorological parameters, were used to develop statistical models and estimate ground-level PM10. Most previous studies obtained meteorological data from synoptic weather stations, with rather sparse spatial distribution, and used them along with the 10 km AOD product to develop statistical models applicable to PM variations at regional scale (resolution of ≥10 km). In the current study, meteorological parameters were simulated at 3 km resolution using the WRF model and used along with the rather new 3 km AOD product (launched in 2014). The resulting PM statistical models were assessed for a polluted and highly variable urban area, Tehran, Iran. Despite the critical particulate pollution problem, very few PM studies have been conducted in this area. Direct PM-AOD associations were rather poor, owing to factors such as variations in particle optical properties and the bright-background problem for satellite retrievals, since the study area lies in the semi-arid region of the Middle East. A linear mixed-effects (LME) statistical approach was used, and three types of statistical models were examined: a single-variable LME model (using AOD as the independent variable) and multivariable LME models using meteorological data from the two sources, the WRF model and the synoptic stations. Meteorological simulations were performed using a multiscale approach with a physics configuration appropriate for the studied region, and the results showed rather good agreement with recordings of the synoptic stations. The single-variable LME model was able to explain about 61%-73% of daily PM10 variations, reflecting a rather acceptable performance. Model performance improved when using the multivariable LME and incorporating meteorological data as auxiliary variables, particularly the fine-resolution outputs from WRF (R2 = 0.73-0.81). In addition, PM estimates were mapped at a rather fine resolution for the studied city, and the resulting concentration maps were consistent with PM recordings at the existing stations.
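
    A single-variable LME model of the kind described (PM10 regressed on AOD with a day-specific random intercept) can be sketched with statsmodels on synthetic data. The actual study also used meteorological covariates and its own random-effects structure, so the grouping choice below is an assumption for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical station-day records: PM10, satellite AOD and a day identifier
rng = np.random.default_rng(4)
n_days, n_sites = 60, 8
day = np.repeat(np.arange(n_days), n_sites)
aod = rng.uniform(0.1, 1.2, size=n_days * n_sites)
day_effect = rng.normal(scale=10.0, size=n_days)[day]          # day-to-day random intercept
pm10 = 30.0 + 55.0 * aod + day_effect + rng.normal(scale=8.0, size=n_days * n_sites)
df = pd.DataFrame({"pm10": pm10, "aod": aod, "day": day})

# Single-variable linear mixed-effects model: fixed AOD slope, day-specific random intercept
result = smf.mixedlm("pm10 ~ aod", df, groups=df["day"]).fit()
print(result.params["aod"])                                     # recovered AOD slope
```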

  4. The effects of modeling instruction on high school physics academic achievement

    NASA Astrophysics Data System (ADS)

    Wright, Tiffanie L.

    The purpose of this study was to explore whether Modeling Instruction, compared to traditional lecturing, is an effective instructional method to promote academic achievement in selected high school physics classes at a rural middle Tennessee high school. This study used an ex post facto, quasi-experimental research methodology. The independent variables in this study were the instructional methods of teaching. The treatment variable was Modeling Instruction and the control variable was traditional lecture instruction. The Treatment Group consisted of participants in Physical World Concepts who received Modeling Instruction. The Control Group consisted of participants in Physical Science who received traditional lecture instruction. The dependent variable was gain scores on the Force Concepts Inventory (FCI). The participants for this study were 133 students each in both the Treatment and Control Groups (n = 266), who attended a public high school in rural middle Tennessee. The participants were administered the Force Concepts Inventory (FCI) prior to being taught the mechanics of physics. The FCI data were entered into the computer-based Statistical Package for the Social Sciences (SPSS). Two independent samples t-tests were conducted to answer the research questions. There was a statistically significant difference between the treatment and control groups concerning the instructional method. Modeling Instructional methods were found to be effective in increasing the academic achievement of students in high school physics. There was no statistically significant difference between FCI gain scores for gender. Gender was found to have no effect on the academic achievement of students in high school physics classes. However, even though there was not a statistically significant difference, female students' gain scores were higher than male students' gain scores when Modeling Instructional methods of teaching were used. Based on these findings, it is recommended that high school science teachers should use Modeling Instructional methods of teaching daily in their classrooms. A recommendation for further research is to expand the Modeling Instructional methods of teaching into different content areas (e.g., reading and language arts) to explore academic achievement gains.
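
    A minimal sketch of the independent-samples comparison of gain scores described above, using scipy; the gain-score arrays are simulated placeholders, not the study's data.

      # Welch two-sample t-test on simulated FCI gain scores (placeholder data).
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(1)
      gains_modeling = rng.normal(6.0, 4.0, 133)   # Modeling Instruction group
      gains_lecture = rng.normal(3.0, 4.0, 133)    # traditional lecture group

      t_stat, p_value = stats.ttest_ind(gains_modeling, gains_lecture, equal_var=False)
      print(f"t = {t_stat:.2f}, p = {p_value:.4f}")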

  5. Regionalisation of statistical model outputs creating gridded data sets for Germany

    NASA Astrophysics Data System (ADS)

    Höpp, Simona Andrea; Rauthe, Monika; Deutschländer, Thomas

    2016-04-01

    The goal of the German research program ReKliEs-De (regional climate projection ensembles for Germany, http://reklies.hlug.de) is to distribute robust information about the range and the extremes of future climate for Germany and its neighbouring river catchment areas. This joint research project is supported by the German Federal Ministry of Education and Research (BMBF) and was initiated by the German Federal States. The project results are meant to support the development of adaptation strategies to mitigate the impacts of future climate change. The aim of our part of the project is to adapt and transfer the regionalisation methods of the gridded hydrological data set (HYRAS) from daily station data to the station-based statistical regional climate model output of WETTREG (a regionalisation method based on weather patterns). The WETTREG model output covers the period of 1951 to 2100 with a daily temporal resolution. For this, we generate a gridded data set of the WETTREG output for precipitation, air temperature and relative humidity with a spatial resolution of 12.5 km x 12.5 km, which is common for regional climate models. Thus, this regionalisation allows comparing statistical to dynamical climate model outputs. The HYRAS data set was developed by the German Meteorological Service within the German research program KLIWAS (www.kliwas.de) and consists of daily gridded data for Germany and its neighbouring river catchment areas. It has a spatial resolution of 5 km x 5 km for the entire domain for the hydro-meteorological elements precipitation, air temperature and relative humidity and covers the period of 1951 to 2006. After conservative remapping, the HYRAS data set is also suitable for the validation of climate models. The presentation consists of two parts presenting the current state of the adaptation of the HYRAS regionalisation methods to the statistical regional climate model WETTREG: first, an overview of the HYRAS data set and the regionalisation methods for precipitation (the REGNIE method, based on a combination of multiple linear regression with 5 predictors and inverse distance weighting), air temperature and relative humidity (optimal interpolation) will be given. Finally, results of the regionalisation of the WETTREG model output will be shown.

  6. Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions.

    PubMed

    Canary, Jana D; Blizzard, Leigh; Barry, Ronald P; Hosmer, David W; Quinn, Stephen J

    2016-05-01

    Generalized linear models (GLM) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLM. Their properties make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLM with continuous covariates (GLMCC) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of link function chosen. We generalize the Tsiatis GOF statistic originally developed for logistic GLMCCs (TG) so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J²) statistics can be applied directly. In a simulation study, TG, HL, and J² were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The TG statistic consistently maintained Type I error rates, while those of HL and J² were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, TG had more power than HL or J². © 2015 John Wiley & Sons Ltd/London School of Economics.
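
    A minimal sketch of a grouped, Hosmer-Lemeshow-style goodness-of-fit check for a binary GLM with a noncanonical (probit) link, in the spirit of the statistics discussed above; the simulated data and decile grouping are assumptions, and this is not the paper's TG statistic.

      # Fit a probit GLM and compute an HL-style grouped chi-square GOF statistic.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      from scipy import stats

      rng = np.random.default_rng(2)
      n = 2000
      x = rng.normal(size=n)
      y = rng.binomial(1, stats.norm.cdf(-0.3 + 0.8 * x))   # probit data-generating model

      X = sm.add_constant(x)
      link = sm.families.links.Probit()
      fit = sm.GLM(y, X, family=sm.families.Binomial(link=link)).fit()
      p_hat = fit.fittedvalues

      # Group by deciles of fitted probability; compare observed vs expected events.
      g = pd.qcut(p_hat, 10, labels=False, duplicates="drop")
      d = pd.DataFrame({"y": y, "p": p_hat, "g": g})
      obs = d.groupby("g")["y"].sum()
      exp = d.groupby("g")["p"].sum()
      n_g = d.groupby("g")["y"].count()
      hl = (((obs - exp) ** 2) / (exp * (1 - exp / n_g))).sum()
      print("HL-style statistic:", round(hl, 2), "~ chi2 with", len(obs) - 2, "df")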

  7. Revised Perturbation Statistics for the Global Scale Atmospheric Model

    NASA Technical Reports Server (NTRS)

    Justus, C. G.; Woodrum, A.

    1975-01-01

    Magnitudes and scales of atmospheric perturbations about the monthly mean for the thermodynamic variables and wind components are presented by month at various latitudes. These perturbation statistics are a revision of the random perturbation data required for the global scale atmospheric model program and are from meteorological rocket network statistical summaries in the 22 to 65 km height range and NASA grenade and pitot tube data summaries in the region up to 90 km. The observed perturbations in the thermodynamic variables were adjusted to make them consistent with constraints required by the perfect gas law and the hydrostatic equation. Vertical scales were evaluated by Buell's depth of pressure system equation and from vertical structure function analysis. Tables of magnitudes and vertical scales are presented for each month at latitudes of 10, 30, 50, 70, and 90 degrees.
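
    As a worked form of the consistency constraints mentioned above, the standard linearizations of the perfect gas law (p = ρRT) and of hydrostatic balance relate the pressure, density, and temperature perturbations to the mean state; these are textbook first-order forms, not equations quoted from the report.

      % Linearized perfect gas law and hydrostatic balance for the perturbations
      % (standard first-order forms; overbars denote monthly means, primes perturbations)
      \frac{p'}{\bar{p}} = \frac{\rho'}{\bar{\rho}} + \frac{T'}{\bar{T}},
      \qquad
      \frac{\partial p'}{\partial z} = -\rho' g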

  8. Detecting Latent Heterogeneity

    ERIC Educational Resources Information Center

    Pearl, Judea

    2017-01-01

    We address the task of determining, from statistical averages alone, whether a population under study consists of several subpopulations, unknown to the investigator, each responding to a given treatment markedly differently. We show that such determination is feasible in three cases: (1) randomized trials with binary treatments, (2) models where…

  9. Potential Costs of Veterans’ Health Care

    DTIC Science & Technology

    2010-10-01

    coverage, there is no rigid mathematical relationship among those proportions because veterans enrolled in Part A may choose to enroll in either Part B or...assumption is consistent with the statistical analysis by an actuarial firm with which VA contracted when developing its model for projecting

  10. Observation of an Exotic Baryon with S=+1 in Photoproduction from the Proton

    NASA Astrophysics Data System (ADS)

    Kubarovsky, V.; Guo, L.; Weygand, D. P.; Stoler, P.; Battaglieri, M.; Devita, R.; Adams, G.; Li, Ji; Nozar, M.; Salgado, C.; Ambrozewicz, P.; Anciant, E.; Anghinolfi, M.; Asavapibhop, B.; Audit, G.; Auger, T.; Avakian, H.; Bagdasaryan, H.; Ball, J. P.; Barrow, S.; Beard, K.; Bektasoglu, M.; Bellis, M.; Benmouna, N.; Berman, B. L.; Bianchi, N.; Biselli, A. S.; Boiarinov, S.; Bouchigny, S.; Bradford, R.; Branford, D.; Briscoe, W. J.; Brooks, W. K.; Burkert, V. D.; Butuceanu, C.; Calarco, J. R.; Carman, D. S.; Carnahan, B.; Cetina, C.; Chen, S.; Ciciani, L.; Cole, P. L.; Connelly, J.; Cords, D.; Corvisiero, P.; Crabb, D.; Crannell, H.; Cummings, J. P.; de Sanctis, E.; Degtyarenko, P. V.; Denizli, H.; Dennis, L.; Dharmawardane, K. V.; Djalali, C.; Dodge, G. E.; Doughty, D.; Dragovitsch, P.; Dugger, M.; Dytman, S.; Dzyubak, O. P.; Egiyan, H.; Egiyan, K. S.; Elouadrhiri, L.; Empl, A.; Eugenio, P.; Farhi, L.; Fatemi, R.; Feuerbach, R. J.; Ficenec, J.; Forest, T. A.; Frolov, V.; Funsten, H.; Gaff, S. J.; Garçon, M.; Gavalian, G.; Gilfoyle, G. P.; Giovanetti, K. L.; Girard, P.; Gothe, R.; Gordon, C. I.; Griffioen, K.; Guidal, M.; Guillo, M.; Gyurjyan, V.; Hadjidakis, C.; Hakobyan, R. S.; Hancock, D.; Hardie, J.; Heddle, D.; Heimberg, P.; Hersman, F. W.; Hicks, K.; Holtrop, M.; Hu, J.; Ilieva, Y.; Ito, M. M.; Jenkins, D.; Joo, K.; Juengst, H. G.; Kelley, J. H.; Khandaker, M.; Kim, K. Y.; Kim, K.; Kim, W.; Klein, F. J.; Klimenko, A. V.; Klusman, M.; Kossov, M.; Kramer, L. H.; Kuhn, S. E.; Kuhn, J.; Lachniet, J.; Laget, J. M.; Langheinrich, J.; Lawrence, D.; Longhi, A.; Lukashin, K.; Major, R. W.; Manak, J. J.; Marchand, C.; McAleer, S.; McNabb, J. W.; Mecking, B. A.; Mehrabyan, S.; Melone, J. J.; Mestayer, M. D.; Meyer, C. A.; Mikhailov, K.; Minehart, R.; Mirazita, M.; Miskimen, R.; Mokeev, V.; Morand, L.; Morrow, S. A.; Mozer, M. U.; Muccifora, V.; Mueller, J.; Mutchler, G. S.; Napolitano, J.; Nasseripour, R.; Nelson, S. O.; Niccolai, S.; Niculescu, G.; Niculescu, I.; Niczyporuk, B. B.; Niyazov, R. A.; O'Brien, J. T.; O'Rielly, G. V.; Opper, A. K.; Osipenko, M.; Park, K.; Pasyuk, E.; Peterson, G.; Philips, S. A.; Pivnyuk, N.; Pocanic, D.; Pogorelko, O.; Polli, E.; Pozdniakov, S.; Preedom, B. M.; Price, J. W.; Prok, Y.; Protopopescu, D.; Qin, L. M.; Raue, B. A.; Riccardi, G.; Ripani, M.; Ritchie, B. G.; Ronchetti, F.; Rossi, P.; Rowntree, D.; Rubin, P. D.; Sabatié, F.; Sabourov, K.; Santoro, J. P.; Sapunenko, V.; Sargsyan, M.; Schumacher, R. A.; Serov, V. S.; Shafi, A.; Sharabian, Y. G.; Shaw, J.; Simionatto, S.; Skabelin, A. V.; Smith, E. S.; Smith, T.; Smith, L. C.; Sober, D. I.; Spraker, M.; Stavinsky, A.; Stepanyan, S.; Strakovsky, I. I.; Strauch, S.; Taiuti, M.; Taylor, S.; Tedeschi, D. J.; Thoma, U.; Thompson, R.; Todor, L.; Tur, C.; Ungaro, M.; Vineyard, M. F.; Vlassov, A. V.; Wang, K.; Weinstein, L. B.; Weisberg, A.; Whisnant, C. S.; Wolin, E.; Wood, M. H.; Yegneswaran, A.; Yun, J.

    2004-01-01

    The reaction γp→π+K-K+n was studied at Jefferson Laboratory using a tagged photon beam with an energy range of 3–5.47 GeV. A narrow baryon state with strangeness S=+1 and mass M=1555±10 MeV/c² was observed in the nK+ invariant mass spectrum. The peak’s width is consistent with the CLAS resolution (FWHM=26 MeV/c²), and its statistical significance is (7.8±1.0)σ. A baryon with positive strangeness has exotic structure and cannot be described in the framework of the naive constituent quark model. The mass of the observed state is consistent with the mass predicted by the chiral soliton model for the Θ+ baryon. In addition, the pK+ invariant mass distribution was analyzed in the reaction γp→K-K+p with high statistics in search of doubly charged exotic baryon states. No resonance structures were found in this spectrum.

  11. Efficient bootstrap estimates for tail statistics

    NASA Astrophysics Data System (ADS)

    Breivik, Øyvind; Aarnes, Ole Johan

    2017-03-01

    Bootstrap resamples can be used to investigate the tail of empirical distributions as well as return value estimates from the extremal behaviour of the sample. Specifically, the confidence intervals on return value estimates or bounds on in-sample tail statistics can be obtained using bootstrap techniques. However, non-parametric bootstrapping from the entire sample is expensive. It is shown here that it suffices to bootstrap from a small subset consisting of the highest entries in the sequence to make estimates that are essentially identical to bootstraps from the entire sample. Similarly, bootstrap estimates of confidence intervals of threshold return estimates are found to be well approximated by using a subset consisting of the highest entries. This has practical consequences in fields such as meteorology, oceanography and hydrology where return values are calculated from very large gridded model integrations spanning decades at high temporal resolution or from large ensembles of independent and identically distributed model fields. In such cases the computational savings are substantial.
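
    A toy numpy comparison of a full-sample bootstrap against a bootstrap restricted to the largest order statistics, illustrating the point above; the 5% tail fraction, the target quantile, and the Gumbel sample are assumptions for this sketch, not the authors' setup.

      # Compare bootstrap CIs for a high quantile: full sample vs top-5% subset.
      import numpy as np

      rng = np.random.default_rng(3)
      x = rng.gumbel(size=50_000)          # synthetic sample
      q, n_boot = 0.999, 500

      full = [np.quantile(rng.choice(x, x.size, replace=True), q) for _ in range(n_boot)]

      top = np.sort(x)[-int(0.05 * x.size):]     # keep only the highest 5% of entries
      q_sub = 1 - (1 - q) / 0.05                 # same quantile, expressed within the subset
      sub = [np.quantile(rng.choice(top, top.size, replace=True), q_sub) for _ in range(n_boot)]

      print("full-sample 95% CI:", np.percentile(full, [2.5, 97.5]))
      print("tail-subset 95% CI:", np.percentile(sub, [2.5, 97.5]))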

  12. Constraints on the near-Earth asteroid obliquity distribution from the Yarkovsky effect

    NASA Astrophysics Data System (ADS)

    Tardioli, C.; Farnocchia, D.; Rozitis, B.; Cotto-Figueroa, D.; Chesley, S. R.; Statler, T. S.; Vasile, M.

    2017-12-01

    Aims: From light curve and radar data we know the spin axis of only 43 near-Earth asteroids. In this paper we attempt to constrain the spin axis obliquity distribution of near-Earth asteroids by leveraging the Yarkovsky effect and its dependence on an asteroid's obliquity. Methods: By modeling the physical parameters driving the Yarkovsky effect, we solve an inverse problem where we test different simple parametric obliquity distributions. Each distribution results in a predicted Yarkovsky effect distribution that we compare with a χ2 test to a dataset of 125 Yarkovsky estimates. Results: We find different obliquity distributions that are statistically satisfactory. In particular, among the considered models, the best-fit solution is a quadratic function, which only depends on two parameters, favors extreme obliquities consistent with the expected outcomes from the YORP effect, has a 2:1 ratio between retrograde and direct rotators, which is in agreement with theoretical predictions, and is statistically consistent with the distribution of known spin axes of near-Earth asteroids.

  13. Brain tissues volume measurements from 2D MRI using parametric approach

    NASA Astrophysics Data System (ADS)

    L'vov, A. A.; Toropova, O. A.; Litovka, Yu. V.

    2018-04-01

    The purpose of this paper is to propose a fully automated method for assessing the volume of structures within the human brain. Our statistical approach uses the maximum-interdependency principle in the decision-making process for measurement consistency and unequal observations. Outliers are detected using the maximum normalized residual test. We propose a statistical model that utilizes knowledge of the tissue distribution in the human brain and applies partial data restoration to improve precision. The proposed approach is computationally efficient and independent of the segmentation algorithm used in the application.
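
    A minimal sketch of an iterative maximum-normalized-residual (Grubbs-type) outlier screen of the kind named above; the significance level, toy measurements, and iterative removal are assumptions for illustration, not the authors' implementation.

      # Iteratively flag the most extreme value while it exceeds the Grubbs critical value.
      import numpy as np
      from scipy import stats

      def grubbs_screen(values, alpha=0.05):
          x = np.asarray(values, dtype=float)
          outliers = []
          while x.size > 2:
              z = np.abs(x - x.mean()) / x.std(ddof=1)
              i = int(np.argmax(z))
              n = x.size
              t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
              g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
              if z[i] <= g_crit:
                  break
              outliers.append(x[i])
              x = np.delete(x, i)
          return x, outliers

      clean, dropped = grubbs_screen([12.1, 11.8, 12.3, 12.0, 19.5, 11.9])
      print("flagged:", dropped)    # the inconsistent measurement is detected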

  14. Risk estimation using probability machines

    PubMed Central

    2014-01-01

    Background Logistic regression has been the de facto, and often the only, model used in the description and analysis of relationships between a binary outcome and observed features. It is widely used to obtain the conditional probabilities of the outcome given predictors, as well as predictor effect size estimates using conditional odds ratios. Results We show how statistical learning machines for binary outcomes, provably consistent for the nonparametric regression problem, can be used to provide both consistent conditional probability estimation and conditional effect size estimates. Effect size estimates from learning machines leverage our understanding of counterfactual arguments central to the interpretation of such estimates. We show that, if the data generating model is logistic, we can recover accurate probability predictions and effect size estimates with nearly the same efficiency as a correct logistic model, both for main effects and interactions. We also propose a method using learning machines to scan for possible interaction effects quickly and efficiently. Simulations using random forest probability machines are presented. Conclusions The models we propose make no assumptions about the data structure, and capture the patterns in the data by just specifying the predictors involved and not any particular model structure. So they do not run the same risks of model mis-specification and the resultant estimation biases as a logistic model. This methodology, which we call a “risk machine”, will share properties from the statistical machine that it is derived from. PMID:24581306

  15. Risk estimation using probability machines.

    PubMed

    Dasgupta, Abhijit; Szymczak, Silke; Moore, Jason H; Bailey-Wilson, Joan E; Malley, James D

    2014-03-01

    Logistic regression has been the de facto, and often the only, model used in the description and analysis of relationships between a binary outcome and observed features. It is widely used to obtain the conditional probabilities of the outcome given predictors, as well as predictor effect size estimates using conditional odds ratios. We show how statistical learning machines for binary outcomes, provably consistent for the nonparametric regression problem, can be used to provide both consistent conditional probability estimation and conditional effect size estimates. Effect size estimates from learning machines leverage our understanding of counterfactual arguments central to the interpretation of such estimates. We show that, if the data generating model is logistic, we can recover accurate probability predictions and effect size estimates with nearly the same efficiency as a correct logistic model, both for main effects and interactions. We also propose a method using learning machines to scan for possible interaction effects quickly and efficiently. Simulations using random forest probability machines are presented. The models we propose make no assumptions about the data structure, and capture the patterns in the data by just specifying the predictors involved and not any particular model structure. So they do not run the same risks of model mis-specification and the resultant estimation biases as a logistic model. This methodology, which we call a "risk machine", will share properties from the statistical machine that it is derived from.
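
    A minimal sketch of a "probability machine" in the sense described above: a random forest used for conditional probability estimation, plus a counterfactual risk-difference effect size for one binary predictor; the simulated data, forest settings, and variable names are assumptions, not the authors' code.

      # Random forest probability machine with a counterfactual effect-size estimate.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      rng = np.random.default_rng(4)
      n = 5000
      exposure = rng.binomial(1, 0.5, n)
      covariate = rng.normal(size=n)
      p = 1 / (1 + np.exp(-(-0.5 + 1.0 * exposure + 0.7 * covariate)))   # logistic truth
      y = rng.binomial(1, p)

      X = np.column_stack([exposure, covariate])
      rf = RandomForestClassifier(n_estimators=500, min_samples_leaf=25, random_state=0).fit(X, y)

      # Flip the exposure for everyone and average the change in predicted probability.
      X1, X0 = X.copy(), X.copy()
      X1[:, 0], X0[:, 0] = 1, 0
      risk_diff = (rf.predict_proba(X1)[:, 1] - rf.predict_proba(X0)[:, 1]).mean()
      print("estimated average risk difference:", round(risk_diff, 3))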

  16. Bright high z SnIa: A challenge for ΛCDM

    NASA Astrophysics Data System (ADS)

    Perivolaropoulos, L.; Shafieloo, A.

    2009-06-01

    It has recently been pointed out by Kowalski et al. [Astrophys. J. 686, 749 (2008), doi:10.1086/589937] that there is “an unexpected brightness of the SnIa data at z>1.” We quantify this statement by constructing a new statistic which is applicable directly to the type Ia supernova (SnIa) distance moduli. This statistic is designed to pick up systematic brightness trends of SnIa data points with respect to a best fit cosmological model at high redshifts. It is based on binning the normalized differences between the SnIa distance moduli and the corresponding best fit values in the context of a specific cosmological model (e.g. ΛCDM). These differences are normalized by the standard errors of the observed distance moduli. We then focus on the highest redshift bin and extend its size toward lower redshifts until the binned normalized difference (BND) changes sign (crosses 0) at a redshift zc (bin size Nc). The bin size Nc of this crossing (the statistical variable) is then compared with the corresponding crossing bin size Nmc for Monte Carlo data realizations based on the best fit model. We find that the crossing bin size Nc obtained from the Union08 and Gold06 data with respect to the best fit ΛCDM model is anomalously large compared to Nmc of the corresponding Monte Carlo data sets obtained from the best fit ΛCDM in each case. In particular, only 2.2% of the Monte Carlo ΛCDM data sets are consistent with the Gold06 value of Nc while the corresponding probability for the Union08 value of Nc is 5.3%. Thus, according to this statistic, the probability that the high redshift brightness bias of the Union08 and Gold06 data sets is realized in the context of a (w0,w1)=(-1,0) model (ΛCDM cosmology) is less than 6%. The corresponding realization probability in the context of a (w0,w1)=(-1.4,2) model is more than 30% for both the Union08 and the Gold06 data sets indicating a much better consistency for this model with respect to the BND statistic.
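
    A minimal sketch of the binned normalized difference (BND) crossing statistic as described above: grow a bin from the highest redshift downward until the mean normalized residual changes sign. The synthetic residuals are placeholders, and a full analysis would compare N_c against Monte Carlo realizations of the best-fit model.

      # Bin size N_c at which the high-z binned mean normalized residual changes sign.
      import numpy as np

      def bnd_crossing_size(z, norm_resid):
          order = np.argsort(z)[::-1]            # highest redshift first
          r = np.asarray(norm_resid)[order]
          sign0 = np.sign(r[0])
          for n in range(2, r.size + 1):
              if np.sign(r[:n].mean()) != sign0:
                  return n
          return r.size                          # no crossing within the sample

      rng = np.random.default_rng(5)
      z = rng.uniform(0.01, 1.7, 300)
      resid = rng.normal(0.0, 1.0, 300)          # residuals already normalized by their errors
      print("N_c =", bnd_crossing_size(z, resid))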

  17. Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon – the reversal paradox

    PubMed Central

    Tu, Yu-Kang; Gunnell, David; Gilthorpe, Mark S

    2008-01-01

    This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. This article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon – the reversal paradox – depending on whether the outcome and explanatory variables are categorical, continuous or a combination of both; this renders the issues and remedies for any one to be similar for all three. Although the three statistical paradoxes occur in different types of variables, they share the same characteristic: the association between two variables can be reversed, diminished, or enhanced when another variable is statistically controlled for. Understanding the concepts and theory behind these paradoxes provides insights into some controversial or contradictory research findings. These paradoxes show that prior knowledge and underlying causal theory play an important role in the statistical modelling of epidemiological data, where incorrect use of statistical models might produce consistent, replicable, yet erroneous results. PMID:18211676

  18. A new statistical approach to climate change detection and attribution

    NASA Astrophysics Data System (ADS)

    Ribes, Aurélien; Zwiers, Francis W.; Azaïs, Jean-Marc; Naveau, Philippe

    2017-01-01

    We propose here a new statistical approach to climate change detection and attribution that is based on additive decomposition and simple hypothesis testing. Most current statistical methods for detection and attribution rely on linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. Climate modelling uncertainty is difficult to take into account with regression-based methods and is almost never treated explicitly. As an alternative to this approach, our statistical model is only based on the additivity assumption; the proposed method does not regress observations onto expected response patterns. We introduce estimation and testing procedures based on likelihood maximization, and show that climate modelling uncertainty can easily be accounted for. Some discussion is provided on how to practically estimate the climate modelling uncertainty based on an ensemble of opportunity. Our approach is based on the "models are statistically indistinguishable from the truth" paradigm, where the difference between any given model and the truth has the same distribution as the difference between any pair of models, but other choices might also be considered. The properties of this approach are illustrated and discussed based on synthetic data. Lastly, the method is applied to the linear trend in global mean temperature over the period 1951-2010. Consistent with the last IPCC assessment report, we find that most of the observed warming over this period (+0.65 K) is attributable to anthropogenic forcings (+0.67 ± 0.12 K, 90% confidence range), with a very limited contribution from natural forcings (-0.01 ± 0.02 K).

  19. Ballistic and diffusive dynamics in a two-dimensional ideal gas of macroscopic chaotic Faraday waves.

    PubMed

    Welch, Kyle J; Hastings-Hauss, Isaac; Parthasarathy, Raghuveer; Corwin, Eric I

    2014-04-01

    We have constructed a macroscopic driven system of chaotic Faraday waves whose statistical mechanics, we find, are surprisingly simple, mimicking those of a thermal gas. We use real-time tracking of a single floating probe, energy equipartition, and the Stokes-Einstein relation to define and measure a pseudotemperature and diffusion constant and then self-consistently determine a coefficient of viscous friction for a test particle in this pseudothermal gas. Because of its simplicity, this system can serve as a model for direct experimental investigation of nonequilibrium statistical mechanics, much as the ideal gas epitomizes equilibrium statistical mechanics.

  20. User's manual for the Simulated Life Analysis of Vehicle Elements (SLAVE) model

    NASA Technical Reports Server (NTRS)

    Paul, D. D., Jr.

    1972-01-01

    The simulated life analysis of vehicle elements model was designed to perform statistical simulation studies for any constant loss rate. The outputs of the model consist of the total number of stages required, stages successfully completing their lifetime, and average stage flight life. This report contains a complete description of the model. Users' instructions and interpretation of input and output data are presented such that a user with little or no prior programming knowledge can successfully implement the program.

  1. Computer-aided auditing of prescription drug claims.

    PubMed

    Iyengar, Vijay S; Hermiz, Keith B; Natarajan, Ramesh

    2014-09-01

    We describe a methodology for identifying and ranking candidate audit targets from a database of prescription drug claims. The relevant audit targets may include various entities such as prescribers, patients and pharmacies, who exhibit certain statistical behavior indicative of potential fraud and abuse over the prescription claims during a specified period of interest. Our overall approach is consistent with related work in statistical methods for detection of fraud and abuse, but has a relative emphasis on three specific aspects: first, based on the assessment of domain experts, certain focus areas are selected and data elements pertinent to the audit analysis in each focus area are identified; second, specialized statistical models are developed to characterize the normalized baseline behavior in each focus area; and third, statistical hypothesis testing is used to identify entities that diverge significantly from their expected behavior according to the relevant baseline model. The application of this overall methodology to a prescription claims database from a large health plan is considered in detail.
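
    A minimal sketch of the flag-and-rank idea described above: a crude baseline for each prescriber's expected count of a monitored drug class and a one-sided Poisson test to rank divergence. The entity type, the Poisson baseline, and all numbers are assumptions for illustration, not the authors' focus areas or models.

      # Rank prescribers by how surprising their monitored-drug counts are under a baseline.
      import numpy as np
      import pandas as pd
      from scipy import stats

      rng = np.random.default_rng(6)
      totals = rng.integers(50, 2000, 500)            # claims per prescriber
      base_rate = 0.08                                # baseline share of the monitored class
      counts = rng.binomial(totals, base_rate)
      counts[:5] = rng.binomial(totals[:5], 0.25)     # a few anomalous prescribers

      expected = base_rate * totals
      p_val = stats.poisson.sf(counts - 1, expected)  # P(X >= observed | baseline)
      ranked = pd.DataFrame({"observed": counts, "expected": expected, "p": p_val})
      print(ranked.sort_values("p").head(10))         # candidate audit targets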

  2. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an application to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
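
    A minimal sketch of quantile regression on data with many non-detects, using statsmodels; the synthetic data, the variable names, and the convention of placing non-detects at the detection limit are assumptions for this sketch, not the paper's clinical-trial analysis.

      # Median and 75th-percentile regression on a variable with many non-detects.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(7)
      n = 300
      dose = rng.uniform(0, 1, n)
      level = np.exp(0.5 + 1.2 * dose + rng.normal(0, 1.0, n))   # synthetic immunological level
      lod = 2.0
      observed = np.where(level < lod, lod, level)               # non-detects set to the limit

      df = pd.DataFrame({"level": observed, "dose": dose})
      for q in (0.5, 0.75):
          fit = smf.quantreg("level ~ dose", df).fit(q=q)
          print(f"quantile {q}: dose slope = {fit.params['dose']:.2f}")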

  3. An Econometric Model for Estimating IQ Scores and Environmental Influences on the Pattern of IQ Scores Over Time.

    ERIC Educational Resources Information Center

    Kadane, Joseph B.; And Others

    This paper offers a preliminary analysis of the effects of a semi-segregated school system on the IQ's of its students. The basic data consist of IQ scores for fourth, sixth, and eighth grades and associated environmental data obtained from their school records. A statistical model is developed to analyze longitudinal data when both process error…

  4. Structural uncertainty of downscaled climate model output in a difficult-to-resolve environment: data sparseness and parameterization error contribution to statistical and dynamical downscaling output in the U.S. Caribbean region

    NASA Astrophysics Data System (ADS)

    Terando, A. J.; Grade, S.; Bowden, J.; Henareh Khalyani, A.; Wootten, A.; Misra, V.; Collazo, J.; Gould, W. A.; Boyles, R.

    2016-12-01

    Sub-tropical island nations may be particularly vulnerable to anthropogenic climate change because of predicted changes in the hydrologic cycle that would lead to significant drying in the future. However, decision makers in these regions have seen their adaptation planning efforts frustrated by the lack of island-resolving climate model information. Recently, two investigations have used statistical and dynamical downscaling techniques to develop climate change projections for the U.S. Caribbean region (Puerto Rico and U.S. Virgin Islands). We compare the results from these two studies with respect to three commonly downscaled CMIP5 global climate models (GCMs). The GCMs were dynamically downscaled at a convective-permitting scale using two different regional climate models. The statistical downscaling approach was conducted at locations with long-term climate observations and then further post-processed using climatologically aided interpolation (yielding two sets of projections). Overall, both approaches face unique challenges. The statistical approach suffers from a lack of observations necessary to constrain the model, particularly at the land-ocean boundary and in complex terrain. The dynamically downscaled model output has a systematic dry bias over the island despite ample availability of moisture in the atmospheric column. Notwithstanding these differences, both approaches are consistent in projecting a drier climate that is driven by the strong global-scale anthropogenic forcing.

  5. Assessing the specificity of posttraumatic stress disorder's dysphoric items within the dysphoria model.

    PubMed

    Armour, Cherie; Shevlin, Mark

    2013-10-01

    The factor structure of posttraumatic stress disorder (PTSD) currently used by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), has received limited support. A four-factor dysphoria model is widely supported. However, the dysphoria factor of this model has been hailed as a nonspecific factor of PTSD. The present study investigated the specificity of the dysphoria factor within the dysphoria model by conducting a confirmatory factor analysis while statistically controlling for the variance attributable to depression. The sample consisted of 429 individuals who met the diagnostic criteria for PTSD in the National Comorbidity Survey. The results concluded that there was no significant attenuation in any of the PTSD items. This finding is pertinent given several proposals for the removal of dysphoric items from the diagnostic criteria set of PTSD in the upcoming DSM-5.

  6. A statistic-thermodynamic model for the DOM degradation in the estuary

    NASA Astrophysics Data System (ADS)

    Zheng, Quanan; Chen, Qin; Zhao, Haihong; Shi, Jiuxin; Cao, Yong; Wang, Dan

    2008-03-01

    This study aims to clarify the role that dissolved salts play in the degradation of terrestrial dissolved organic matter (DOM) at the scale of molecular motion. Molecular thermal motion is perpetual, and in a multi-molecular system this random motion causes collisions between molecules. Seawater is a multi-molecular system consisting of water, salt, and terrestrial DOM molecules. This study attributes DOM degradation in the estuary to inelastic collisions of DOM molecules with charged salt ions. From statistic-thermodynamic theories of molecular collision, a DOM degradation model and a DOM distribution model are derived. The models are validated against field observations and satellite data. We conclude that inelastic collisions between terrestrial DOM molecules and dissolved salt ions in seawater are a decisive dynamic mechanism for the rapid loss of terrestrial DOM.

  7. Prevalence of consistent condom use with various types of sex partners and associated factors among money boys in Changsha, China.

    PubMed

    Wang, Lian-Hong; Yan, Jin; Yang, Guo-Li; Long, Shuo; Yu, Yong; Wu, Xi-Lin

    2015-04-01

    Money boys with inconsistent condom use (less than 100% of the time) are at high risk of infection by human immunodeficiency virus (HIV) or sexually transmitted infection (STI), but relatively little research has examined their risk behaviors. We investigated the prevalence of consistent condom use (100% of the time) and associated factors among money boys. A cross-sectional study using a structured questionnaire was conducted among money boys in Changsha, China, between July 2012 and January 2013. Independent variables included socio-demographic data, substance abuse history, work characteristics, and self-reported HIV and STI history. Dependent variables included consistent condom use with different types of sex partners. Among the participants, 82.4% used condoms consistently with male clients, 80.2% with male sex partners, and 77.1% with female sex partners in the past 3 months. A multiple stepwise logistic regression model identified four statistically significant factors associated with a lower likelihood of consistent condom use with male clients: age group, substance abuse, lack of an "employment" arrangement, and having no HIV test within the prior 6 months. In a similar model, only one factor significantly associated with a lower likelihood of consistent condom use with male sex partners was identified: having no HIV test within the prior 6 months. As for female sex partners, two variables were statistically significant in the multiple stepwise logistic regression analysis: having no HIV test within the prior 6 months and having an STI history. Interventions linked with more realistic and acceptable HIV prevention methods are greatly needed and should increase risk awareness and consistent condom use in both commercial and personal relationships. © 2015 International Society for Sexual Medicine.

  8. Sandpile-based model for capturing magnitude distributions and spatiotemporal clustering and separation in regional earthquakes

    NASA Astrophysics Data System (ADS)

    Batac, Rene C.; Paguirigan, Antonino A., Jr.; Tarun, Anjali B.; Longjas, Anthony G.

    2017-04-01

    We propose a cellular automata model for earthquake occurrences patterned after the sandpile model of self-organized criticality (SOC). By incorporating a single parameter describing the probability to target the most susceptible site, the model successfully reproduces the statistical signatures of seismicity. The energy distributions closely follow power-law probability density functions (PDFs) with a scaling exponent of around -1.6, consistent with the expectations of the Gutenberg-Richter (GR) law, for a wide range of the targeted triggering probability values. Additionally, for targeted triggering probabilities within the range 0.004-0.007, we observe spatiotemporal distributions that show bimodal behavior, which was not previously observed for the original sandpile. For this critical range of probability values, the model statistics show remarkable agreement with long-period empirical data from earthquakes from different seismogenic regions. The proposed model has key advantages, the foremost of which is the fact that it simultaneously captures the energy, space, and time statistics of earthquakes by introducing just a single parameter, while adding minimal parameters to the simple rules of the sandpile. We believe that the critical targeting probability parameterizes the memory that is inherently present in earthquake-generating regions.
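
    A minimal sketch of a sandpile cellular automaton in which each new grain targets the most loaded site with probability p_target and a random site otherwise, with standard toppling and open boundaries; the lattice size, threshold, and grain count are assumptions, and these are not the authors' exact rules.

      # Sandpile with a single targeted-triggering parameter; avalanche sizes as event "energies".
      import numpy as np

      def sandpile_events(L=32, p_target=0.005, n_grains=20_000, z_crit=4, seed=0):
          rng = np.random.default_rng(seed)
          z = np.zeros((L, L), dtype=int)
          events = []
          for _ in range(n_grains):
              if rng.random() < p_target:
                  i, j = np.unravel_index(np.argmax(z), z.shape)   # most susceptible site
              else:
                  i, j = rng.integers(L, size=2)                   # random site
              z[i, j] += 1
              topples = 0
              while (unstable := np.argwhere(z >= z_crit)).size:
                  for a, b in unstable:
                      z[a, b] -= z_crit
                      for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                          na, nb = a + da, b + db
                          if 0 <= na < L and 0 <= nb < L:          # open boundaries lose grains
                              z[na, nb] += 1
                      topples += 1
              if topples:
                  events.append(topples)
          return np.array(events)

      sizes = sandpile_events()
      print("events:", sizes.size, "largest avalanche:", sizes.max())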

  9. Reversibility in Quantum Models of Stochastic Processes

    NASA Astrophysics Data System (ADS)

    Gier, David; Crutchfield, James; Mahoney, John; James, Ryan

    Natural phenomena such as time series of neural firing, orientation of layers in crystal stacking and successive measurements in spin-systems are inherently probabilistic. The provably minimal classical models of such stochastic processes are ɛ-machines, which consist of internal states, transition probabilities between states and output values. The topological properties of the ɛ-machine for a given process characterize the structure, memory and patterns of that process. However ɛ-machines are often not ideal because their statistical complexity (Cμ) is demonstrably greater than the excess entropy (E) of the processes they represent. Quantum models (q-machines) of the same processes can do better in that their statistical complexity (Cq) obeys the relation Cμ >= Cq >= E. q-machines can be constructed to consider longer lengths of strings, resulting in greater compression. With code-words of sufficiently long length, the statistical complexity becomes time-symmetric - a feature apparently novel to this quantum representation. This result has ramifications for compression of classical information in quantum computing and quantum communication technology.

  10. Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

    PubMed

    Lin, Feng-Chang; Zhu, Jun

    2012-01-01

    We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.

  11. Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics

    NASA Astrophysics Data System (ADS)

    Lazarus, S. M.; Holman, B. P.; Splitt, M. E.

    2017-12-01

    A computationally efficient method is developed that performs gridded post processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations are generated to provide physically consistent high resolution winds over a coastal domain characterized by an intricate land / water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicate the post processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
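
    A minimal sketch of scalar Gaussian EMOS, fitted by minimizing the negative log score, with predictive mean a + b·(ensemble mean) and variance c + d·(ensemble variance); the synthetic ensemble and the use of the log score rather than the CRPS are assumptions for this sketch, not the paper's wind-vector implementation.

      # Univariate Gaussian EMOS calibration on a synthetic ensemble.
      import numpy as np
      from scipy import optimize, stats

      rng = np.random.default_rng(8)
      n, m = 2000, 20                                   # forecast cases, ensemble members
      truth = rng.gamma(2.0, 3.0, n)
      spread = rng.gamma(2.0, 1.0, n)
      ens = truth[:, None] + 1.0 + spread[:, None] * rng.normal(size=(n, m))   # biased ensemble
      ens_mean, ens_var = ens.mean(axis=1), ens.var(axis=1, ddof=1)

      def neg_log_score(params):
          a, b, c, d = params
          mu = a + b * ens_mean
          sigma = np.sqrt(np.maximum(c + d * ens_var, 1e-6))
          return -stats.norm.logpdf(truth, loc=mu, scale=sigma).sum()

      res = optimize.minimize(neg_log_score, x0=[0.0, 1.0, 1.0, 1.0], method="Nelder-Mead")
      print("EMOS coefficients a, b, c, d:", np.round(res.x, 3))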

  12. Solar granulation and statistical crystallography: A modeling approach using size-shape relations

    NASA Technical Reports Server (NTRS)

    Noever, D. A.

    1994-01-01

    The irregular polygonal pattern of solar granulation is analyzed for size-shape relations using statistical crystallography. In contrast to previous work, which has assumed perfectly hexagonal patterns for granulation, more realistic accounting of cell (granule) shapes reveals a broader basis for quantitative analysis. Several features emerge as noteworthy: (1) a linear correlation between the number of cell sides and neighboring shapes (Aboav-Weaire's law); (2) a linear correlation between both average cell area and perimeter and the number of cell sides (Lewis's law and a perimeter law, respectively); and (3) a linear correlation between cell area and squared perimeter (the convolution index). This statistical picture of granulation is consistent with a finding of no correlation in cell shapes beyond nearest neighbors. A comparative calculation between existing model predictions taken from luminosity data and the present analysis shows substantial agreement for cell-size distributions. A model for understanding grain lifetimes is proposed which links convective times to cell shape using crystallographic results.

  13. Statistical Forecasting of Current and Future Circum-Arctic Ground Temperatures and Active Layer Thickness

    NASA Astrophysics Data System (ADS)

    Aalto, J.; Karjalainen, O.; Hjort, J.; Luoto, M.

    2018-05-01

    Mean annual ground temperature (MAGT) and active layer thickness (ALT) are key to understanding the evolution of the ground thermal state across the Arctic under climate change. Here a statistical modeling approach is presented to forecast current and future circum-Arctic MAGT and ALT in relation to climatic and local environmental factors, at spatial scales unreachable with contemporary transient modeling. After deploying an ensemble of multiple statistical techniques, distance-blocked cross validation between observations and predictions suggested excellent and reasonable transferability of the MAGT and ALT models, respectively. The MAGT forecasts indicated currently suitable conditions for permafrost to prevail over an area of 15.1 ± 2.8 × 10⁶ km². This extent is likely to dramatically contract in the future, as the results showed consistent, but region-specific, changes in ground thermal regime due to climate change. The forecasts provide new opportunities to assess future Arctic changes in ground thermal state and biogeochemical feedback.

  14. Assessment of Multiple Daily Precipitation Statistics in ERA-Interim Driven Med-CORDEX and EURO-CORDEX Experiments Against High Resolution Observations

    NASA Astrophysics Data System (ADS)

    Coppola, E.; Fantini, A.; Raffaele, F.; Torma, C. Z.; Bacer, S.; Giorgi, F.; Ahrens, B.; Dubois, C.; Sanchez, E.; Verdecchia, M.

    2017-12-01

    We assess the statistics of different daily precipitation indices in ensembles of Med-CORDEX and EURO-CORDEX experiments at high resolution (grid spacing of ~0.11°, or RCM11) and medium resolution (grid spacing of ~0.44°, or RCM44) with regional climate models (RCMs) driven by the ERA-Interim reanalysis of observations for the period 1989-2008. The assessment is carried out by comparison with a set of high resolution observation datasets for 9 European subregions. The statistics analyzed include quantitative metrics for mean precipitation, daily precipitation Probability Density Functions (PDFs), daily precipitation intensity, frequency, 95th percentile and 95th percentile of dry spell length. We assess both an ensemble including all Med-CORDEX and EURO-CORDEX models and one including the Med-CORDEX models alone. For the all-models ensembles, the RCM11 one shows a remarkable performance in reproducing the spatial patterns and seasonal cycle of mean precipitation over all regions, with a consistent and marked improvement compared to the RCM44 ensemble and the ERA-Interim reanalysis. A good consistency with observations by the RCM11 ensemble (and a substantial improvement compared to RCM44 and ERA-Interim) is found also for the daily precipitation PDFs, mean intensity and, to a lesser extent, the 95th percentile. In fact, for some regions the RCM11 ensemble overestimates the occurrence of very high intensity events, while for one region the models underestimate the occurrence of the largest extremes. The RCM11 ensemble still shows a general tendency to underestimate the dry day frequency and the 95th percentile of dry spell length over wetter regions, with only a marginal improvement compared to the lower resolution models. This indicates that the problem of the excessive production of low precipitation events found in many climate models persists also at relatively high resolutions, at least in wet climate regimes. Concerning the Med-CORDEX model ensembles, we find that their performance is of similar quality to that of the all-models ensembles over the Mediterranean regions analyzed. Finally, we stress the need for consistent, quality-checked fine scale observation datasets for the assessment of RCMs run at increasingly high horizontal resolutions.

  15. Estimating Preferential Flow in Karstic Aquifers Using Statistical Mixed Models

    PubMed Central

    Anaya, Angel A.; Padilla, Ingrid; Macchiavelli, Raul; Vesper, Dorothy J.; Meeker, John D.; Alshawabkeh, Akram N.

    2013-01-01

    Karst aquifers are highly productive groundwater systems often associated with conduit flow. These systems can be highly vulnerable to contamination, resulting in a high potential for contaminant exposure to humans and ecosystems. This work develops statistical models to spatially characterize flow and transport patterns in karstified limestone and determines the effect of aquifer flow rates on these patterns. A laboratory-scale Geo-HydroBed model is used to simulate flow and transport processes in a karstic limestone unit. The model consists of stainless-steel tanks containing a karstified limestone block collected from a karst aquifer formation in northern Puerto Rico. Experimental work involves making a series of flow and tracer injections, while monitoring hydraulic and tracer response spatially and temporally. Statistical mixed models are applied to hydraulic data to determine likely pathways of preferential flow in the limestone units. The models indicate a highly heterogeneous system with dominant, flow-dependent preferential flow regions. Results indicate that regions of preferential flow tend to expand at higher groundwater flow rates, suggesting a greater volume of the system being flushed by flowing water at higher rates. Spatial and temporal distribution of tracer concentrations indicates the presence of conduit-like and diffuse flow transport in the system, supporting the notion of both combined transport mechanisms in the limestone unit. The temporal response of tracer concentrations at different locations in the model coincide with, and confirms the preferential flow distribution generated with the statistical mixed models used in the study. PMID:23802921

  16. Antimicrobial susceptibility of Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from a diagnostic veterinary laboratory and recommendations for a surveillance system

    PubMed Central

    Glass-Kaastra, Shiona K.; Pearl, David L.; Reid-Smith, Richard J.; McEwen, Beverly; Slavic, Durda; McEwen, Scott A.; Fairles, Jim

    2014-01-01

    Antimicrobial susceptibility data on Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from Ontario swine (January 1998 to October 2010) were acquired from a comprehensive diagnostic veterinary laboratory in Ontario, Canada. In relation to the possible development of a surveillance system for antimicrobial resistance, data were assessed for ease of management, completeness, consistency, and applicability for temporal and spatial statistical analyses. Limited farm location data precluded spatial analyses and missing demographic data limited their use as predictors within multivariable statistical models. Changes in the standard panel of antimicrobials used for susceptibility testing reduced the number of antimicrobials available for temporal analyses. Data consistency and quality could improve over time in this and similar diagnostic laboratory settings by encouraging complete reporting with sample submission and by modifying database systems to limit free-text data entry. These changes could make more statistical methods available for disease surveillance and cluster detection. PMID:24688133

  17. Antimicrobial susceptibility of Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from a diagnostic veterinary laboratory and recommendations for a surveillance system.

    PubMed

    Glass-Kaastra, Shiona K; Pearl, David L; Reid-Smith, Richard J; McEwen, Beverly; Slavic, Durda; McEwen, Scott A; Fairles, Jim

    2014-04-01

    Antimicrobial susceptibility data on Escherichia coli F4, Pasteurella multocida, and Streptococcus suis isolates from Ontario swine (January 1998 to October 2010) were acquired from a comprehensive diagnostic veterinary laboratory in Ontario, Canada. In relation to the possible development of a surveillance system for antimicrobial resistance, data were assessed for ease of management, completeness, consistency, and applicability for temporal and spatial statistical analyses. Limited farm location data precluded spatial analyses and missing demographic data limited their use as predictors within multivariable statistical models. Changes in the standard panel of antimicrobials used for susceptibility testing reduced the number of antimicrobials available for temporal analyses. Data consistency and quality could improve over time in this and similar diagnostic laboratory settings by encouraging complete reporting with sample submission and by modifying database systems to limit free-text data entry. These changes could make more statistical methods available for disease surveillance and cluster detection.

  18. ON THE FERMI-GBM EVENT 0.4 s AFTER GW150914

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Greiner, J.; Yu, H.-F.; Burgess, J. M.

    In view of the recent report by Connaughton et al., we analyze continuous time-tagged event (TTE) data of the Fermi Gamma-ray Burst Monitor (GBM) around the time of the gravitational-wave event GW150914. We find that after proper accounting for low-count statistics, the GBM transient event at 0.4 s after GW150914 is likely not due to an astrophysical source, but consistent with a background fluctuation, removing the tension between the INTEGRAL/ACS non-detection and GBM. Additionally, reanalysis of other short GRBs shows that without proper statistical modeling the fluence of faint events is over-predicted, as verified for some joint GBM–ACS detections of short GRBs. We detail the statistical procedure to correct these biases. As a result, faint short GRBs, verified by ACS detections, with significances in the broadband light curve even smaller than that of the GBM–GW150914 event are recovered as proper non-zero sources, while the GBM–GW150914 event is consistent with zero fluence.

  19. Can Bose condensation of alpha particles be observed in heavy ion collisions?

    NASA Technical Reports Server (NTRS)

    Tripathi, Ram K.; Townsend, Lawrence W.

    1993-01-01

    Using a fully self-consistent quantum statistical model, we demonstrate the possibility of Bose condensation of alpha particles with a concomitant phase transition in heavy ion collisions. Suggestions for the experimental observation of the signature of the onset of this phenomenon are made.

  20. Local sensitivity analysis for inverse problems solved by singular value decomposition

    USGS Publications Warehouse

    Hill, M.C.; Nolan, B.T.

    2010-01-01

    Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) interpretation with CSS/PCC can be more awkward because sensitivity and interdependence are considered separately, and (2) the identifiability statistic requires a choice of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
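
    A minimal sketch of the process-model statistics named above, computed from a weighted Jacobian: composite scaled sensitivities (CSS), parameter correlation coefficients (PCC), and the singular values that underlie the SVD parameters. The synthetic Jacobian, weights, and parameter values are placeholders, not RZWQM or PEST output.

      # CSS, PCC, and singular values from a synthetic weighted Jacobian.
      import numpy as np

      rng = np.random.default_rng(9)
      n_obs, n_par = 200, 6
      J = rng.normal(size=(n_obs, n_par))
      J[:, 1] = 0.95 * J[:, 0] + 0.05 * rng.normal(size=n_obs)   # two highly correlated parameters
      params = np.abs(rng.normal(1.0, 0.3, n_par))               # current parameter values
      w = np.ones(n_obs)                                         # observation weights

      dss = J * params[None, :] * np.sqrt(w)[:, None]            # dimensionless scaled sensitivities
      css = np.sqrt((dss ** 2).mean(axis=0))                     # composite scaled sensitivity

      cov = np.linalg.pinv((J * w[:, None]).T @ J)               # parameter (co)variance proxy
      pcc = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))  # parameter correlation coefficients

      sing_vals = np.linalg.svd(J * np.sqrt(w)[:, None], compute_uv=False)
      print("CSS:", np.round(css, 2))
      print("PCC of the correlated pair:", round(pcc[0, 1], 2))
      print("singular values:", np.round(sing_vals, 2))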

  1. Hunting Solomonoff's Swans: Exploring the Boundary Between Physics and Statistics in Hydrological Modeling

    NASA Astrophysics Data System (ADS)

    Nearing, G. S.

    2014-12-01

    Statistical models consistently outperform conceptual models in the short term; however, to account for a nonstationary future (or an unobserved past), scientists prefer to base predictions on unchanging and commutable properties of the universe - i.e., physics. The problem with physically-based hydrology models is, of course, that they aren't really based on physics - they are based on statistical approximations of physical interactions, and we almost uniformly lack an understanding of the entropy associated with these approximations. Thermodynamics is successful precisely because entropy statistics are computable for homogeneous (well-mixed) systems, and ergodic arguments explain the success of Newton's laws to describe systems that are fundamentally quantum in nature. Unfortunately, similar arguments do not hold for systems like watersheds that are heterogeneous at a wide range of scales. Ray Solomonoff formalized the situation in 1968 by showing that given infinite evidence, simultaneously minimizing model complexity and entropy in predictions always leads to the best possible model. The open question in hydrology is about what happens when we don't have infinite evidence - for example, when the future will not look like the past, or when one watershed does not behave like another. How do we isolate stationary and commutable components of watershed behavior? I propose that one possible answer to this dilemma lies in a formal combination of physics and statistics. In this talk I outline my recent analogue of Solomonoff's idea (his theorem was digital) that allows us to quantify the complexity/entropy tradeoff in a way that is intuitive to physical scientists. I show how to formally combine "physical" and statistical methods for model development in a way that allows us to derive the theoretically best possible model for any given physics approximation(s) and available observations. Finally, I apply an analogue of Solomonoff's theorem to evaluate the tradeoff between model complexity and prediction power.

  2. Empirical investigation into depth-resolution of Magnetotelluric data

    NASA Astrophysics Data System (ADS)

    Piana Agostinetti, N.; Ogaya, X.

    2017-12-01

    We investigate the depth-resolution of MT data by comparing reconstructed 1D resistivity profiles with measured resistivity and lithostratigraphy from borehole data. Inversion of MT data has been widely used to reconstruct the 1D fine-layered resistivity structure beneath an isolated Magnetotelluric (MT) station. Uncorrelated noise is generally assumed to be associated with MT data. However, incorrect assumptions about error statistics have been shown to strongly bias the results obtained in geophysical inversions. In particular, the number of resolved layers at depth strongly depends on the error statistics. In this study, we applied a trans-dimensional McMC algorithm for reconstructing the 1D resistivity profile near the location of a 1500 m-deep borehole, using MT data. We solve the MT inverse problem imposing different models for the error statistics associated with the MT data. Following a Hierarchical Bayes approach, we also inverted for the hyper-parameters associated with each error-statistics model. Preliminary results indicate that assuming uncorrelated noise leads to a number of resolved layers larger than expected from the retrieved lithostratigraphy. Moreover, inversion of synthetic resistivity data obtained from the "true" resistivity stratification measured along the borehole shows that a consistent number of resistivity layers can be obtained using a Gaussian model for the error statistics with substantial correlation length.
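
    As a rough illustration of the kind of error model being compared, the sketch below writes a Gaussian log-likelihood for MT residuals with an exponential correlation structure whose amplitude and correlation length would be hyperparameters in a hierarchical Bayes scheme. The periods, residuals, and kernel form are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' algorithm): a Gaussian data likelihood
# for MT residuals with an exponential covariance model, whose amplitude sigma and
# correlation length ell would be treated as hyperparameters in a hierarchical Bayes setting.
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(residuals, periods, sigma, ell):
    """Correlated-noise Gaussian log-likelihood; exponential covariance in log-period."""
    x = np.log10(periods)
    dist = np.abs(x[:, None] - x[None, :])
    cov = sigma**2 * np.exp(-dist / ell)          # correlated error model
    return multivariate_normal(mean=np.zeros(len(x)), cov=cov).logpdf(residuals)

# Toy comparison: effectively uncorrelated vs. correlated error model for the same residuals.
rng = np.random.default_rng(1)
periods = np.logspace(-2, 3, 40)
resid = rng.normal(scale=0.1, size=40)
print("uncorrelated:", log_likelihood(resid, periods, sigma=0.1, ell=1e-6))
print("correlated  :", log_likelihood(resid, periods, sigma=0.1, ell=0.5))
```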

  3. Statistical fluctuations of an ocean surface inferred from shoes and ships

    NASA Astrophysics Data System (ADS)

    Lerche, Ian; Maubeuge, Frédéric

    1995-12-01

    This paper shows that it is possible to roughly estimate some ocean properties using simple time-dependent statistical models of ocean fluctuations. Based on a real incident, the loss by a vessel of a container of Nike shoes in the North Pacific Ocean, a statistical model was tested on data sets consisting of the Nike shoes found by beachcombers a few months later. This statistical treatment of the shoes' motion allows one to infer velocity trends of the Pacific Ocean, together with their fluctuation strengths. The idea is to suppose that there is a mean bulk flow speed that can depend on location on the ocean surface and time. The fluctuations of the surface flow speed are then treated as statistically random. The distribution of shoes is described in space and time using Markov probability processes related to the mean and fluctuating ocean properties. The aim of the exercise is to provide some of the properties of the Pacific Ocean that are otherwise calculated using a sophisticated numerical model, OSCURS, for which numerous data are needed. The relevant quantities are estimated sharply enough to be useful to (1) constrain output results from OSCURS computations, and (2) elucidate the behavior patterns of ocean flow characteristics on long time scales.
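
    A minimal sketch of a Markov random-walk-with-drift model of this kind follows; the drift vector, fluctuation strength, and duration are invented for illustration and are not the values inferred in the paper.

```python
# Minimal sketch (illustrative assumptions only): shoe positions advected by a mean
# surface current plus random fluctuations, i.e., a Markov random-walk-with-drift model.
import numpy as np

rng = np.random.default_rng(2)
n_shoes, n_days = 1000, 240
dt = 1.0                                  # time step (days)
u_mean = np.array([8.0, 1.5])             # assumed mean drift (km/day), eastward and northward
sigma = 12.0                              # assumed fluctuation strength (km/day^0.5)

pos = np.zeros((n_shoes, 2))              # all shoes start at the spill location
for _ in range(n_days):
    pos += u_mean * dt + sigma * np.sqrt(dt) * rng.normal(size=(n_shoes, 2))

# The centroid displacement constrains the mean flow, while the spread of landfall
# positions constrains the fluctuation strength.
print("mean displacement (km):", pos.mean(axis=0))
print("std of positions (km):", pos.std(axis=0))
```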

  4. Assessing Videogrammetry for Static Aeroelastic Testing of a Wind-Tunnel Model

    NASA Technical Reports Server (NTRS)

    Spain, Charles V.; Heeg, Jennifer; Ivanco, Thomas G.; Barrows, Danny A.; Florance, James R.; Burner, Alpheus W.; DeMoss, Joshua; Lively, Peter S.

    2004-01-01

    The Videogrammetric Model Deformation (VMD) technique, developed at NASA Langley Research Center, was recently used to measure displacements and local surface angle changes on a static aeroelastic wind-tunnel model. The results were assessed for consistency, accuracy and usefulness. Vertical displacement measurements and surface angular deflections (derived from vertical displacements) taken at no-wind/no-load conditions were analyzed. For accuracy assessment, angular measurements were compared to those from a highly accurate accelerometer. Shewhart's Variables Control Charts were used in the assessment of consistency and uncertainty. Some bad data points were discovered, and it is shown that the measurement results at certain targets were more consistent than at other targets. Physical explanations for this lack of consistency have not been determined. However, overall the measurements were sufficiently accurate to be very useful in monitoring wind-tunnel model aeroelastic deformation and determining flexible stability and control derivatives. After a structural model component failed during a highly loaded condition, analysis of VMD data clearly indicated progressive structural deterioration as the wind-tunnel condition where failure occurred was approached. As a result, subsequent testing successfully incorporated near-real-time monitoring of VMD data in order to ensure structural integrity. The potential for higher levels of consistency and accuracy through the use of statistical quality control practices is discussed and recommended for future applications.
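
    Shewhart's individuals control chart is simple to reproduce; the sketch below computes control limits from the average moving range and flags out-of-limit points, using invented repeat measurements rather than the actual VMD data.

```python
# Minimal sketch (assumed data, not the VMD measurements): Shewhart individuals
# control chart limits from the average moving range, flagging inconsistent readings.
import numpy as np

def individuals_chart(x):
    """Return center line, lower/upper control limits, and indices outside the limits."""
    x = np.asarray(x, dtype=float)
    mr = np.abs(np.diff(x))                     # moving ranges of consecutive measurements
    sigma_hat = mr.mean() / 1.128               # d2 constant for subgroups of size 2
    center = x.mean()
    lcl, ucl = center - 3 * sigma_hat, center + 3 * sigma_hat
    out = np.where((x < lcl) | (x > ucl))[0]
    return center, lcl, ucl, out

rng = np.random.default_rng(3)
angles = rng.normal(0.02, 0.01, size=50)        # repeated no-wind surface-angle readings (deg)
angles[17] += 0.08                              # one deliberately "bad" data point
print(individuals_chart(angles))
```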

  5. Heterogeneous Structure of Stem Cells Dynamics: Statistical Models and Quantitative Predictions

    PubMed Central

    Bogdan, Paul; Deasy, Bridget M.; Gharaibeh, Burhan; Roehrs, Timo; Marculescu, Radu

    2014-01-01

    Understanding stem cell (SC) population dynamics is essential for developing models that can be used in basic science and medicine to aid in predicting cell fate. These models can be used as tools, e.g., in studying patho-physiological events at the cellular and tissue level, predicting (mal)functions along the developmental course, and personalized regenerative medicine. Using time-lapsed imaging and statistical tools, we show that the dynamics of SC populations involve a heterogeneous structure consisting of multiple sub-population behaviors. Using non-Gaussian statistical approaches, we identify the co-existence of fast and slow dividing subpopulations, and quiescent cells, in stem cells from three species. The mathematical analysis also shows that, instead of developing independently, SCs exhibit a time-dependent fractal behavior as they interact with each other through molecular and tactile signals. These findings suggest that more sophisticated models of SC dynamics should view SC populations as a collective and avoid the simplifying homogeneity assumption by accounting for the presence of more than one dividing sub-population, and their multi-fractal characteristics. PMID:24769917

  6. Using the Johns Hopkins' Aggregated Diagnosis Groups (ADGs) to predict 1-year mortality in population-based cohorts of patients with diabetes in Ontario, Canada.

    PubMed

    Austin, P C; Shah, B R; Newman, A; Anderson, G M

    2012-09-01

    There are limited validated methods to ascertain comorbidities for risk adjustment in ambulatory populations of patients with diabetes using administrative health-care databases. The objective was to examine the ability of the Johns Hopkins' Aggregated Diagnosis Groups to predict mortality in population-based ambulatory samples of both incident and prevalent subjects with diabetes. Retrospective cohorts were constructed using population-based administrative data. The incident cohort consisted of all 346,297 subjects diagnosed with diabetes between 1 April 2004 and 31 March 2008. The prevalent cohort consisted of all 879,849 subjects with pre-existing diabetes on 1 January 2007. The outcome was death within 1 year of the subject's index date. A logistic regression model consisting of age, sex and indicator variables for 22 of the 32 Johns Hopkins' Aggregated Diagnosis Group categories had excellent discrimination for predicting mortality in incident diabetes patients: the c-statistic was 0.87 in an independent validation sample. A similar model had excellent discrimination for predicting mortality in prevalent diabetes patients: the c-statistic was 0.84 in an independent validation sample. Both models demonstrated very good calibration, denoting good agreement between observed and predicted mortality across the range of predicted mortality in which the large majority of subjects lay. For comparative purposes, regression models incorporating the Charlson comorbidity index with age and sex, age and sex alone, and age alone had poorer discrimination than the model that incorporated the Johns Hopkins' Aggregated Diagnosis Groups. Logistic regression models using age, sex and the Johns Hopkins' Aggregated Diagnosis Groups were able to accurately predict 1-year mortality in population-based samples of patients with diabetes. © 2011 The Authors. Diabetic Medicine © 2011 Diabetes UK.
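
    The modeling recipe (logistic regression on age, sex, and diagnosis-group indicators, judged by the c-statistic on an independent validation sample) can be sketched in a few lines. The code below uses simulated data and hypothetical coefficients, so it demonstrates the workflow only, not the reported models.

```python
# Minimal sketch (synthetic data, hypothetical variable names): logistic regression with
# age, sex, and comorbidity-group indicators, evaluated by the c-statistic (ROC AUC)
# on an independent validation sample.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n, n_groups = 20000, 22
age = rng.uniform(20, 90, n)
sex = rng.integers(0, 2, n)
adg = rng.integers(0, 2, size=(n, n_groups))            # indicator variables for diagnosis groups
logit = -9.0 + 0.07 * age + 0.3 * sex + adg @ rng.normal(0.2, 0.1, n_groups)
death = rng.random(n) < 1 / (1 + np.exp(-logit))         # simulated 1-year mortality

X = np.column_stack([age, sex, adg])
X_dev, X_val, y_dev, y_val = train_test_split(X, death, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=2000).fit(X_dev, y_dev)
print("c-statistic (validation):", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
```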

  7. Advancing a Model-Validated Statistical Method for Decomposing the Key Oceanic Drivers of Regional Climate: Focus on Northern and Tropical African Climate Variability in the Community Earth System Model (CESM)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Fuyao; Yu, Yan; Notaro, Michael

    This study advances the practicality and stability of the traditional multivariate statistical method, generalized equilibrium feedback assessment (GEFA), for decomposing the key oceanic drivers of regional atmospheric variability, especially when available data records are short. An advanced stepwise GEFA methodology is introduced, in which unimportant forcings within the forcing matrix are eliminated through stepwise selection. Method validation of stepwise GEFA is performed using the CESM, with a focused application to northern and tropical Africa (NTA). First, a statistical assessment of the atmospheric response to each primary oceanic forcing is carried out by applying stepwise GEFA to a fully coupled control run. Then, a dynamical assessment of the atmospheric response to individual oceanic forcings is performed through ensemble experiments by imposing sea surface temperature anomalies over focal ocean basins. Finally, to quantify the reliability of stepwise GEFA, the statistical assessment is evaluated against the dynamical assessment in terms of four metrics: the percentage of grid cells with consistent response sign, the spatial correlation of atmospheric response patterns, the area-averaged seasonal cycle of response magnitude, and consistency in associated mechanisms between assessments. In CESM, tropical modes, namely El Niño–Southern Oscillation and the tropical Indian Ocean Basin, tropical Indian Ocean dipole, and tropical Atlantic Niño modes, are the dominant oceanic controls of NTA climate. In complementary studies, stepwise GEFA is validated in terms of isolating terrestrial forcings on the atmosphere, and observed oceanic and terrestrial drivers of NTA climate are extracted to establish an observational benchmark for subsequent coupled model evaluation and development of process-based weights for regional climate projections.

  8. Advancing a Model-Validated Statistical Method for Decomposing the Key Oceanic Drivers of Regional Climate: Focus on Northern and Tropical African Climate Variability in the Community Earth System Model (CESM)

    DOE PAGES

    Wang, Fuyao; Yu, Yan; Notaro, Michael; ...

    2017-09-27

    This study advances the practicality and stability of the traditional multivariate statistical method, generalized equilibrium feedback assessment (GEFA), for decomposing the key oceanic drivers of regional atmospheric variability, especially when available data records are short. An advanced stepwise GEFA methodology is introduced, in which unimportant forcings within the forcing matrix are eliminated through stepwise selection. Method validation of stepwise GEFA is performed using the CESM, with a focused application to northern and tropical Africa (NTA). First, a statistical assessment of the atmospheric response to each primary oceanic forcing is carried out by applying stepwise GEFA to a fully coupled control run. Then, a dynamical assessment of the atmospheric response to individual oceanic forcings is performed through ensemble experiments by imposing sea surface temperature anomalies over focal ocean basins. Finally, to quantify the reliability of stepwise GEFA, the statistical assessment is evaluated against the dynamical assessment in terms of four metrics: the percentage of grid cells with consistent response sign, the spatial correlation of atmospheric response patterns, the area-averaged seasonal cycle of response magnitude, and consistency in associated mechanisms between assessments. In CESM, tropical modes, namely El Niño–Southern Oscillation and the tropical Indian Ocean Basin, tropical Indian Ocean dipole, and tropical Atlantic Niño modes, are the dominant oceanic controls of NTA climate. In complementary studies, stepwise GEFA is validated in terms of isolating terrestrial forcings on the atmosphere, and observed oceanic and terrestrial drivers of NTA climate are extracted to establish an observational benchmark for subsequent coupled model evaluation and development of process-based weights for regional climate projections.
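
    Two of the four agreement metrics listed above reduce to a few lines of array arithmetic. The sketch below evaluates the percentage of grid cells with consistent response sign and the spatial pattern correlation on synthetic response fields; the grid size and noise level are assumptions, not values from the study.

```python
# Minimal sketch (synthetic fields only): two of the four agreement metrics used to
# compare a statistical (GEFA) response with a dynamical (SST-anomaly experiment)
# response: percentage of grid cells with consistent sign, and spatial pattern correlation.
import numpy as np

def sign_consistency(a, b):
    """Percentage of grid cells where the two response fields share the same sign."""
    return 100.0 * np.mean(np.sign(a) == np.sign(b))

def pattern_correlation(a, b):
    """Spatial (Pearson) correlation between two flattened response fields."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

rng = np.random.default_rng(5)
dynamical = rng.normal(size=(60, 90))                  # e.g., a precipitation response field
statistical = dynamical + 0.5 * rng.normal(size=(60, 90))
print("sign consistency (%):", round(sign_consistency(statistical, dynamical), 1))
print("pattern correlation :", round(pattern_correlation(statistical, dynamical), 2))
```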

  9. Integrated data management for clinical studies: automatic transformation of data models with semantic annotations for principal investigators, data managers and statisticians.

    PubMed

    Dugas, Martin; Dugas-Breit, Susanne

    2014-01-01

    Design, execution and analysis of clinical studies involves several stakeholders with different professional backgrounds. Typically, principal investigators are familiar with standard office tools, data managers apply electronic data capture (EDC) systems and statisticians work with statistics software. Case report forms (CRFs) specify the data model of study subjects, evolve over time and consist of hundreds to thousands of data items per study. To avoid erroneous manual transformation work, a conversion tool for the different representations of study data models was designed. It can convert between office format, EDC format and statistics format. In addition, it supports semantic annotations, which enable precise definitions for data items. A reference implementation is available as the open source package ODMconverter at http://cran.r-project.org.

  10. Evaluation of neutron total and capture cross sections on 99Tc in the unresolved resonance region

    NASA Astrophysics Data System (ADS)

    Iwamoto, Nobuyuki; Katabuchi, Tatsuya

    2017-09-01

    The long-lived fission product technetium-99 is one of the most important radioisotopes for nuclear transmutation. Reliable nuclear data are indispensable over a wide energy range up to a few MeV in order to develop environmental-load-reducing technology. Statistical analyses of the resolved resonances were performed by using the truncated Porter-Thomas distribution, a coupled-channels optical model, a nuclear level density model and Bayes' theorem on conditional probability. The total and capture cross sections were calculated by the nuclear reaction model code CCONE. The resulting cross sections are statistically consistent between the resolved and unresolved resonance regions. The evaluated capture data reproduce those recently measured at ANNRI of J-PARC/MLF above the resolved resonance region up to 800 keV.

  11. Sensitivity analysis of helicopter IMC decelerating steep approach and landing performance to navigation system parameters

    NASA Technical Reports Server (NTRS)

    Karmali, M. S.; Phatak, A. V.

    1982-01-01

    Results of a study to investigate, by means of a computer simulation, the performance sensitivity of helicopter IMC DSAL operations as a function of navigation system parameters are presented. A mathematical model generically representing a navigation system is formulated. The simulated scenario consists of a straight-in helicopter approach to landing along a 6 deg glideslope. The deceleration magnitude chosen is 0.3 g. The navigation model parameters are varied and the statistics of the total system errors (TSE) computed. These statistics are used to determine the critical navigation system parameters that affect the performance of the closed-loop navigation, guidance and control system of a UH-1H helicopter.

  12. Getting the big picture in community science: methods that capture context.

    PubMed

    Luke, Douglas A

    2005-06-01

    Community science has a rich tradition of using theories and research designs that are consistent with its core value of contextualism. However, a survey of empirical articles published in the American Journal of Community Psychology shows that community scientists utilize a narrow range of statistical tools that are not well suited to assess contextual data. Multilevel modeling, geographic information systems (GIS), social network analysis, and cluster analysis are recommended as useful tools to address contextual questions in community science. An argument for increased methodological consilience is presented, where community scientists are encouraged to adopt statistical methodology that is capable of modeling a greater proportion of the data than is typical with traditional methods.

  13. Modeling the Test-Retest Statistics of a Localization Experiment in the Full Horizontal Plane.

    PubMed

    Morsnowski, André; Maune, Steffen

    2016-10-01

    Two approaches to modeling the test-retest statistics of a localization experiment, one based on Gaussian distributions and one on surrogate data, are introduced. Their efficiency is investigated using different measures describing directional hearing ability. A localization experiment in the full horizontal plane is a challenging task for hearing impaired patients. In clinical routine, we use this experiment to evaluate the progress of our cochlear implant (CI) recipients. Listening and time effort limit the reproducibility. The localization experiment consists of a circle of 12 loudspeakers placed in an anechoic room, a "camera silens". In darkness, HSM sentences are presented at 65 dB pseudo-erratically from all 12 directions with five repetitions. This experiment is modeled by a set of Gaussian distributions with different standard deviations added to a perfect estimator, as well as by surrogate data. Five repetitions per direction are used to produce surrogate data distributions for the sensation directions. To investigate the statistics, we retrospectively use the data of 33 CI patients with 92 pairs of test-retest measurements from the same day. The first model does not take inversions into account (i.e., permutations of the direction from back to front and vice versa are not considered), although they are common for hearing impaired persons, particularly in the rear hemisphere. The second model considers these inversions but does not work with all measures. The introduced models successfully describe the test-retest statistics of directional hearing. However, since their performance differs across the investigated measures, no general recommendation can be provided. The presented test-retest statistics enable pair test comparisons for localization experiments.
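
    A minimal sketch of both test-retest models described above follows, with all parameter values (error standard deviation, number of repetitions) chosen for illustration rather than taken from the patient data.

```python
# Minimal sketch (illustrative parameters): (i) a perfect estimator plus Gaussian directional
# error, and (ii) surrogate data resampled from the five observed responses per direction.
import numpy as np

rng = np.random.default_rng(6)
directions = np.arange(12) * 30.0                       # 12 loudspeakers, 30 deg apart

def gaussian_model(sd_deg, repeats=5):
    """Sensation direction = presented direction + Gaussian error, snapped to the 30-deg grid."""
    presented = np.repeat(directions, repeats)
    sensed = presented + rng.normal(0, sd_deg, presented.size)
    return presented, np.round(sensed / 30.0) * 30.0 % 360.0

def surrogate_model(observed):
    """Resample sensation directions from the observed responses for each direction."""
    return {d: rng.choice(resp, size=resp.size, replace=True) for d, resp in observed.items()}

presented, sensed = gaussian_model(sd_deg=25.0)
observed = {d: sensed[presented == d] for d in directions}
retest = surrogate_model(observed)                      # one surrogate "retest" session

err = (sensed - presented + 180.0) % 360.0 - 180.0      # wrapped angular error
print("RMS localization error (deg):", round(float(np.sqrt(np.mean(err**2))), 1))
```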

  14. Statistical mechanical estimation of the free energy of formation of E. coli biomass for use with macroscopic bioreactor balances.

    PubMed

    Grosz, R; Stephanopoulos, G

    1983-09-01

    The need for the determination of the free energy of formation of biomass in bioreactor second law balances is well established. A statistical mechanical method for the calculation of the free energy of formation of E. coli biomass is introduced. In this method, biomass is modelled to consist of a system of biopolymer networks. The partition function of this system is proposed to consist of acoustic and optical modes of vibration. Acoustic modes are described by Tarasov's model, the parameters of which are evaluated with the aid of low-temperature calorimetric data for the crystalline protein bovine chymotrypsinogen A. The optical modes are described by considering the low-temperature thermodynamic properties of biological monomer crystals such as amino acid crystals. Upper and lower bounds are placed on the entropy to establish the maximum error associated with the statistical method. The upper bound is determined by endowing the monomers in biomass with ideal gas properties. The lower bound is obtained by limiting the monomers to complete immobility. On this basis, the free energy of formation is fixed to within 10%. Proposals are made with regard to experimental verification of the calculated value and extension of the calculation to other types of biomass.

  15. Rock Statistics at the Mars Pathfinder Landing Site, Roughness and Roving on Mars

    NASA Technical Reports Server (NTRS)

    Haldemann, A. F. C.; Bridges, N. T.; Anderson, R. C.; Golombek, M. P.

    1999-01-01

    Several rock counts have been carried out at the Mars Pathfinder landing site producing consistent statistics of rock coverage and size-frequency distributions. These rock statistics provide a primary element of "ground truth" for anchoring remote sensing information used to pick the Pathfinder, and future, landing sites. The observed rock population statistics should also be consistent with the emplacement and alteration processes postulated to govern the landing site landscape. The rock population databases can however be used in ways that go beyond the calculation of cumulative number and cumulative area distributions versus rock diameter and height. Since the spatial parameters measured to characterize each rock are determined with stereo image pairs, the rock database serves as a subset of the full landing site digital terrain model (DTM). Insofar as a rock count can be carried out in a speedier, albeit coarser, manner than the full DTM analysis, rock counting offers several operational and scientific products in the near term. Quantitative rock mapping adds further information to the geomorphic study of the landing site, and can also be used for rover traverse planning. Statistical analysis of the surface roughness using the rock count proxy DTM is sufficiently accurate when compared to the full DTM to compare with radar remote sensing roughness measures, and with rover traverse profiles.

  16. Maintaining Consistency of Spatial Information in the Hippocampal Network: A Combinatorial Geometry Model.

    PubMed

    Dabaghian, Y

    2016-06-01

    Place cells in the rat hippocampus play a key role in creating the animal's internal representation of the world. During active navigation, these cells spike only in discrete locations, together encoding a map of the environment. Electrophysiological recordings have shown that the animal can revisit this map mentally during both sleep and awake states, reactivating the place cells that fired during its exploration in the same sequence in which they were originally activated. Although consistency of place cell activity during active navigation is arguably enforced by sensory and proprioceptive inputs, it remains unclear how a consistent representation of space can be maintained during spontaneous replay. We propose a model that can account for this phenomenon and suggest that a spatially consistent replay requires a number of constraints on the hippocampal network that affect its synaptic architecture and the statistics of synaptic connection strengths.

  17. A hybrid ARIMA and neural network model applied to forecast catch volumes of Selar crumenophthalmus

    NASA Astrophysics Data System (ADS)

    Aquino, Ronald L.; Alcantara, Nialle Loui Mar T.; Addawe, Rizavel C.

    2017-11-01

    Selar crumenophthalmus, known in English as the big-eyed scad and locally as matang-baka, is one of the fishes commonly caught along the waters of La Union, Philippines. The study deals with the forecasting of catch volumes of big-eyed scad for commercial consumption. The data used are quarterly catch volumes of big-eyed scad from 2002 to the first quarter of 2017. These data are available from the open statistics database published by the Philippine Statistics Authority (PSA), whose task is to collect, compile, analyze and publish information concerning different aspects of the Philippine setting. Autoregressive Integrated Moving Average (ARIMA) models, an Artificial Neural Network (ANN) model and a hybrid model consisting of ARIMA and ANN were developed to forecast catch volumes of big-eyed scad. Statistical errors such as the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) were computed and compared to choose the most suitable model for forecasting the catch volume for the next few quarters. A comparison of the results of each model and the corresponding statistical errors reveals that the hybrid model, ARIMA-ANN (2,1,2)(6:3:1), is the most suitable model to forecast the catch volumes of the big-eyed scad for the next few quarters.
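
    A hybrid of this general kind can be sketched compactly: an ARIMA(2,1,2) model for the linear structure and a small neural network fitted to the ARIMA residuals, scored by MAE and RMSE. The series, network size, and residual-forecasting shortcut below are assumptions for illustration, not the study's configuration.

```python
# Minimal sketch (synthetic series, hypothetical settings): a hybrid forecast in the
# spirit of ARIMA-ANN, where ARIMA(2,1,2) models the linear part and a small neural
# network models the ARIMA residuals; skill is summarized by MAE and RMSE.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(7)
t = np.arange(61)                                   # stand-in for quarterly catch volumes
y = 100 + 10 * np.sin(2 * np.pi * t / 4) + 0.5 * t + rng.normal(0, 3, t.size)
train, test = y[:-8], y[-8:]

arima = ARIMA(train, order=(2, 1, 2)).fit()
resid = arima.resid

# ANN on lagged residuals (one common hybrid choice).
lags = 3
Xr = np.column_stack([resid[i:len(resid) - lags + i] for i in range(lags)])
yr = resid[lags:]
ann = MLPRegressor(hidden_layer_sizes=(6,), max_iter=5000, random_state=0).fit(Xr, yr)

arima_fc = arima.forecast(steps=len(test))
resid_fc = ann.predict(Xr[-len(test):])             # crude residual forecast, for illustration only
hybrid_fc = arima_fc + resid_fc

mae = np.mean(np.abs(test - hybrid_fc))
rmse = np.sqrt(np.mean((test - hybrid_fc) ** 2))
print("MAE:", round(mae, 2), "RMSE:", round(rmse, 2))
```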

  18. Permutation glass.

    PubMed

    Williams, Mobolaji

    2018-01-01

    The field of disordered systems in statistical physics provides many simple models in which the competing influences of thermal and nonthermal disorder lead to new phases and nontrivial thermal behavior of order parameters. In this paper, we add a model to the subject by considering a disordered system where the state space consists of various orderings of a list. As in spin glasses, the disorder of such "permutation glasses" arises from a parameter in the Hamiltonian being drawn from a distribution of possible values, thus allowing nominally "incorrect orderings" to have lower energies than "correct orderings" in the space of permutations. We analyze a Gaussian, uniform, and symmetric Bernoulli distribution of energy costs, and, by employing Jensen's inequality, derive a simple condition requiring the permutation glass to always transition to the correctly ordered state at a temperature lower than that of the nondisordered system, provided that this correctly ordered state is accessible. We in turn find that in order for the correctly ordered state to be accessible, the probability that an incorrectly ordered component is energetically favored must be less than the inverse of the number of components in the system. We show that all of these results are consistent with a replica symmetric ansatz of the system. We conclude by arguing that there is no distinct permutation glass phase for the simplest model considered here and by discussing how to extend the analysis to more complex Hamiltonians capable of novel phase behavior and replica symmetry breaking. Finally, we outline an apparent correspondence between the presented system and a discrete-energy-level fermion gas. In all, the investigation introduces a class of exactly soluble models into statistical mechanics and provides a fertile ground to investigate statistical models of disorder.

  19. Nonlinear Wave Chaos and the Random Coupling Model

    NASA Astrophysics Data System (ADS)

    Zhou, Min; Ott, Edward; Antonsen, Thomas M.; Anlage, Steven

    The Random Coupling Model (RCM) has been shown to successfully predict the statistical properties of linear wave chaotic cavities in the highly over-moded regime. It is of interest to extend the RCM to strongly nonlinear systems. To introduce nonlinearity, an active nonlinear circuit is connected to two ports of the wave chaotic 1/4-bowtie cavity. The active nonlinear circuit consists of a frequency multiplier, an amplifier and several passive filters. It acts to double the input frequency in the range from 3.5 GHz to 5 GHz, and operates for microwaves going in only one direction. Measurements are taken between two additional ports of the cavity and we measure the statistics of the second harmonic voltage over an ensemble of realizations of the scattering system. We developed an RCM-based model of this system as two chaotic cavities coupled by means of a nonlinear transfer function. The harmonics received at the output are predicted to be the product of three statistical quantities that describe the three elements, respectively. Statistical results from simulation, RCM-based modeling, and direct experimental measurements will be compared. ONR under Grant No. N000141512134, AFOSR under COE Grant FA9550-15-1-0171, and the Maryland Center for Nanophysics and Advanced Materials.

  20. Welfare Reform in California: Early Results from the Impact Analysis.

    ERIC Educational Resources Information Center

    Klerman, Jacob Alex; Hotz, V. Joseph; Reardon, Elaine; Cox, Amy G.; Farley, Donna O.; Haider, Steven J.; Imbens, Guido; Schoeni, Robert

    The impact of California Work Opportunity and Responsibility to Kids (CalWORKS), which was passed to increase California welfare recipients' participation in welfare-to-work (WTW) activities, was examined. The impact study consisted of a nonexperimental program evaluation that used statistical models to estimate causal effects and a simulation…

  1. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarray is a high throughput technology to measure gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  2. Predicting recreational water quality advisories: A comparison of statistical methods

    USGS Publications Warehouse

    Brooks, Wesley R.; Corsi, Steven R.; Fienen, Michael N.; Carvin, Rebecca B.

    2016-01-01

    Epidemiological studies indicate that fecal indicator bacteria (FIB) in beach water are associated with illnesses among people having contact with the water. In order to mitigate public health impacts, many beaches are posted with an advisory when the concentration of FIB exceeds a beach action value. The most commonly used method of measuring FIB concentration takes 18–24 h before returning a result. In order to avoid the 24 h lag, it has become common to "nowcast" the FIB concentration using statistical regressions on environmental surrogate variables. Most commonly, nowcast models are estimated using ordinary least squares regression, but other regression methods from the statistical and machine learning literature are sometimes used. This study compares 14 regression methods across 7 Wisconsin beaches to identify which consistently produces the most accurate predictions. A random forest model is identified as the most accurate, followed by multiple regression fit using the adaptive LASSO.
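
    The comparison itself is straightforward to reproduce in outline; the sketch below cross-validates a random forest against a LASSO fit on synthetic surrogate predictors of log FIB concentration, so the variables and sample size are stand-ins rather than the Wisconsin beach data.

```python
# Minimal sketch (synthetic surrogates, not the beach data): comparing a random forest
# with a LASSO nowcast of log FIB concentration by cross-validated RMSE.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n = 400
X = rng.normal(size=(n, 6))                 # e.g., turbidity, rainfall, wind, wave height, ...
log_fib = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2] + rng.normal(0, 0.5, n)

models = {
    "random forest": RandomForestRegressor(n_estimators=300, random_state=0),
    "LASSO (CV)": LassoCV(cv=5),
}
for name, model in models.items():
    scores = cross_val_score(model, X, log_fib, cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name:14s} RMSE = {-scores.mean():.2f}")
```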

  3. A Wave Chaotic Study of Quantum Graphs with Microwave Networks

    NASA Astrophysics Data System (ADS)

    Fu, Ziyuan

    Quantum graphs provide a setting to test the hypothesis that all ray-chaotic systems show universal wave chaotic properties. I study quantum graphs using a wave chaotic approach. Here, an experimental setup consisting of a microwave coaxial cable network is used to simulate quantum graphs. Some basic features and the distributions of impedance statistics are analyzed from experimental data on an ensemble of tetrahedral networks. The random coupling model (RCM) is applied in an attempt to uncover the universal statistical properties of the system. Deviations from RCM predictions have been observed in that the statistics of diagonal and off-diagonal impedance elements are different. Waves trapped due to multiple reflections on bonds between nodes in the graph most likely cause the deviations from universal behavior in this finite-size realization of a quantum graph. In addition, I have carried out some investigations of the Random Coupling Model that are useful for further research.

  4. Consistency errors in p-values reported in Spanish psychology journals.

    PubMed

    Caperos, José Manuel; Pardo, Antonio

    2013-01-01

    Recent reviews have drawn attention to frequent consistency errors when reporting statistical results. We have reviewed the statistical results reported in 186 articles published in four Spanish psychology journals. Of these articles, 102 contained at least one of the statistics selected for our study: Fisher's F, Student's t and Pearson's χ². Out of the 1,212 complete statistics reviewed, 12.2% presented a consistency error, meaning that the reported p-value did not correspond to the reported value of the statistic and its degrees of freedom. In 2.3% of the cases, the correct calculation would have led to a different conclusion than the reported one. In terms of articles, 48% included at least one consistency error, and 17.6% would have to change at least one conclusion. In meta-analytical terms, with a focus on effect size, consistency errors can be considered substantial in 9.5% of the cases. These results imply a need to improve the quality and precision with which statistical results are reported in Spanish psychology journals.
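
    The consistency check described here is mechanical: recompute the p-value from the reported statistic and degrees of freedom and compare it with the reported p-value. A minimal sketch, with tolerance and example values chosen arbitrarily, follows.

```python
# Minimal sketch of the consistency check described above: recompute the p-value from
# the reported statistic and degrees of freedom and compare it with the reported p-value.
from scipy import stats

def consistent(test, value, df, reported_p, tol=0.005, two_sided=True):
    """Return (recomputed p, whether it matches the reported p within tol)."""
    if test == "F":
        p = stats.f.sf(value, *df)                    # df = (df1, df2)
    elif test == "t":
        p = stats.t.sf(abs(value), df)
        p = 2 * p if two_sided else p
    elif test == "chi2":
        p = stats.chi2.sf(value, df)
    else:
        raise ValueError("unknown test")
    return p, abs(p - reported_p) <= tol

# Example: a reported "t(28) = 2.10, p = .01" would be flagged as inconsistent.
print(consistent("t", 2.10, 28, reported_p=0.01))
print(consistent("F", 4.35, (2, 57), reported_p=0.017))
```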

  5. Comparing the strength of behavioural plasticity and consistency across situations: animal personalities in the hermit crab Pagurus bernhardus.

    PubMed

    Briffa, Mark; Rundle, Simon D; Fryer, Adam

    2008-06-07

    Many phenotypic traits show plasticity but behaviour is often considered the 'most plastic' aspect of phenotype as it is likely to show the quickest response to temporal changes in conditions or 'situation'. However, it has also been noted that constraints on sensory acuity, cognitive structure and physiological capacities place limits on behavioural plasticity. Such limits to plasticity may generate consistent differences in behaviour between individuals from the same population. It has recently been suggested that these consistent differences in individual behaviour may be adaptive and the term 'animal personalities' has been used to describe them. In many cases, however, a degree of both behavioural plasticity and relative consistency is probable. To understand the possible functions of animal personalities, it is necessary to determine the relative strength of each tendency and this may be achieved by comparison of statistical effect sizes for tests of difference and concordance. Here, we describe a new statistical framework for making such comparisons and investigate cross-situational plasticity and consistency in the duration of startle responses in the European hermit crab Pagurus bernhardus, in the field and the laboratory. The effect sizes of tests for behavioural consistency were greater than for tests of behavioural plasticity, indicating for the first time the presence of animal personalities in a crustacean model.

  6. Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective

    USGS Publications Warehouse

    Barker, Richard J.; Link, William A.

    2015-01-01

    Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self-consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
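
    For readers unfamiliar with AIC weights, the computation is a one-liner; the sketch below turns a set of AIC values (invented for illustration) into normalized weights, which the argument above says may be read as model probabilities only if one model in the set is treated as exactly true.

```python
# Minimal sketch: Akaike weights computed from a set of AIC values (illustrative numbers).
import numpy as np

def akaike_weights(aic):
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()                 # AIC differences relative to the best model
    w = np.exp(-0.5 * delta)
    return w / w.sum()

print(akaike_weights([310.2, 312.5, 318.0]))   # roughly [0.75, 0.24, 0.01]
```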

  7. Alterations in choice behavior by manipulations of world model.

    PubMed

    Green, C S; Benson, C; Kersten, D; Schrater, P

    2010-09-14

    How to compute initially unknown reward values makes up one of the key problems in reinforcement learning theory, with two basic approaches being used. Model-free algorithms rely on the accumulation of substantial amounts of experience to compute the value of actions, whereas in model-based learning, the agent seeks to learn the generative process for outcomes from which the value of actions can be predicted. Here we show that (i) "probability matching"-a consistent example of suboptimal choice behavior seen in humans-occurs in an optimal Bayesian model-based learner using a max decision rule that is initialized with ecologically plausible, but incorrect beliefs about the generative process for outcomes and (ii) human behavior can be strongly and predictably altered by the presence of cues suggestive of various generative processes, despite statistically identical outcome generation. These results suggest human decision making is rational and model based and not consistent with model-free learning.

  8. Alterations in choice behavior by manipulations of world model

    PubMed Central

    Green, C. S.; Benson, C.; Kersten, D.; Schrater, P.

    2010-01-01

    How to compute initially unknown reward values makes up one of the key problems in reinforcement learning theory, with two basic approaches being used. Model-free algorithms rely on the accumulation of substantial amounts of experience to compute the value of actions, whereas in model-based learning, the agent seeks to learn the generative process for outcomes from which the value of actions can be predicted. Here we show that (i) “probability matching”—a consistent example of suboptimal choice behavior seen in humans—occurs in an optimal Bayesian model-based learner using a max decision rule that is initialized with ecologically plausible, but incorrect beliefs about the generative process for outcomes and (ii) human behavior can be strongly and predictably altered by the presence of cues suggestive of various generative processes, despite statistically identical outcome generation. These results suggest human decision making is rational and model based and not consistent with model-free learning. PMID:20805507
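
    The baseline case mentioned above, a Bayesian learner with a max decision rule and simple independent-trial beliefs, is easy to sketch. The code below uses a Beta-Bernoulli learner on a two-option task with invented reward probabilities; under these (correct) beliefs it converges to always choosing the better option, which is precisely the behavior the studies contrast with probability matching under incorrect generative beliefs.

```python
# Minimal sketch (illustrative assumption about the generative beliefs): a Beta-Bernoulli
# learner that always chooses the option with the higher posterior mean (a max rule).
import numpy as np

rng = np.random.default_rng(9)
p_true = np.array([0.7, 0.3])                     # true reward probabilities of two options
alpha = np.ones(2)                                # Beta(1,1) priors for each option
beta = np.ones(2)

choices = []
for trial in range(500):
    post_mean = alpha / (alpha + beta)
    choice = int(np.argmax(post_mean))            # max decision rule
    reward = rng.random() < p_true[choice]
    alpha[choice] += reward                       # Bayesian update of the chosen option
    beta[choice] += 1 - reward
    choices.append(choice)

print("fraction choosing the better option (last 100 trials):",
      np.mean(np.array(choices[-100:]) == 0))
```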

  9. Prediction of pilot reserve attention capacity during air-to-air target tracking

    NASA Technical Reports Server (NTRS)

    Onstott, E. D.; Faulkner, W. H.

    1977-01-01

    Reserve attention capacity of a pilot was calculated using a pilot model that allocates exclusive model attention according to the ranking of task urgency functions whose variables are tracking error and error rate. The modeled task consisted of tracking a maneuvering target aircraft both vertically and horizontally, and when possible, performing a diverting side task which was simulated by the precise positioning of an electrical stylus and modeled as a task of constant urgency in the attention allocation algorithm. The urgency of the single loop vertical task is simply the magnitude of the vertical tracking error, while the multiloop horizontal task requires a nonlinear urgency measure of error and error rate terms. Comparison of model results with flight simulation data verified the computed model statistics of tracking error of both axes, lateral and longitudinal stick amplitude and rate, and side task episodes. Full data for the simulation tracking statistics as well as the explicit equations and structure of the urgency function multiaxis pilot model are presented.

  10. A study of finite mixture model: Bayesian approach on financial time series data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-07-01

    Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method is a statistical approach used to fit the mixture model. Bayesian methods are widely used because their asymptotic properties provide remarkable results. In addition, the Bayesian method shows consistency, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is studied by using the Bayesian Information Criterion. Identifying the number of components is important because a wrong choice may lead to invalid results. The Bayesian method is then utilized to fit the k-component mixture model in order to explore the relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines and Indonesia. Lastly, the results showed that there is a negative relationship between rubber prices and stock market prices for all selected countries.
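
    The component-selection step can be illustrated with a standard EM-fitted Gaussian mixture and BIC, as sketched below on synthetic returns; this is a simplification of the paper's Bayesian fitting, and all data and settings are assumptions.

```python
# Minimal sketch (synthetic data, sklearn's EM fit rather than a full Bayesian fit):
# choosing the number of mixture components with the Bayesian Information Criterion.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(10)
returns = np.concatenate([rng.normal(-0.02, 0.01, 300),     # stand-in for price returns
                          rng.normal(0.01, 0.03, 700)]).reshape(-1, 1)

bics = {k: GaussianMixture(n_components=k, random_state=0).fit(returns).bic(returns)
        for k in range(1, 6)}
best_k = min(bics, key=bics.get)
print("BIC by k:", {k: round(v, 1) for k, v in bics.items()})
print("selected number of components:", best_k)
```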

  11. A New Approach of Juvenile Age Estimation using Measurements of the Ilium and Multivariate Adaptive Regression Splines (MARS) Models for Better Age Prediction.

    PubMed

    Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal

    2017-01-01

    Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.

  12. The effect of a major cigarette price change on smoking behavior in california: a zero-inflated negative binomial model.

    PubMed

    Sheu, Mei-Ling; Hu, Teh-Wei; Keeler, Theodore E; Ong, Michael; Sung, Hai-Yen

    2004-08-01

    The objective of this paper is to determine the price sensitivity of smokers in their consumption of cigarettes, using evidence from a major increase in California cigarette prices due to Proposition 10 and the Tobacco Settlement. The study sample consists of individual survey data from the Behavioral Risk Factor Survey (BRFS) and price data from the Bureau of Labor Statistics between 1996 and 1999. A zero-inflated negative binomial (ZINB) regression model was applied for the statistical analysis. The statistical model showed that price did not have an effect on reducing the estimated prevalence of smoking. However, it indicated that among smokers the price elasticity was -0.46 and statistically significant. Since smoking prevalence is significantly lower than it was a decade ago, price increases are becoming less effective as an inducement for hard-core smokers to quit, although they may respond by decreasing consumption. For those who only smoke occasionally (many of them being young adults), price increases alone may not be an effective inducement to quit smoking. Additional underlying behavioral factors need to be identified so that more effective anti-smoking strategies can be developed.
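
    The structure of a ZINB model, a logistic part for excess zeros and a negative binomial part for counts among smokers, can be made concrete by simulation. The sketch below uses invented coefficients (with the count-part price elasticity set near the reported -0.46) and is not an estimate from the BRFS data.

```python
# Minimal sketch (illustrative parameters): simulating zero-inflated negative binomial
# cigarette counts. A logistic "inflation" part governs excess zeros (non-smokers), and a
# negative binomial part governs consumption among smokers, where price enters with an
# assumed elasticity of about -0.46.
import numpy as np

rng = np.random.default_rng(14)
n = 5000
log_price = rng.normal(np.log(3.0), 0.15, n)               # invented log price per pack

p_zero = 1 / (1 + np.exp(-(1.2 + 0.0 * log_price)))        # price assumed not to affect prevalence
mu = np.exp(3.4 - 0.46 * (log_price - np.log(3.0)))        # mean daily cigarettes among smokers
alpha = 0.8                                                # NB overdispersion
nb = rng.negative_binomial(n=1 / alpha, p=1 / (1 + alpha * mu))
y = np.where(rng.random(n) < p_zero, 0, nb)                # zero-inflate the counts

print("share of zeros:", np.mean(y == 0).round(2))
print("mean cigarettes/day among smokers:", y[y > 0].mean().round(1))
```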

  13. Oscillatory dynamics of investment and capacity utilization

    NASA Astrophysics Data System (ADS)

    Greenblatt, R. E.

    2017-01-01

    Capitalist economic systems display a wide variety of oscillatory phenomena whose underlying causes are often not well understood. In this paper, I consider a very simple model of the reciprocal interaction between investment, capacity utilization, and their time derivatives. The model, which gives rise to periodic oscillations, qualitatively predicts the phase relations between these variables. These predictions are observed to be consistent in a statistical sense with econometric data from the US economy.
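
    A generic linear sketch of such reciprocal coupling is given below: it is not the paper's equations, only the simplest system in which each variable drives the other's time derivative and the two oscillate with a fixed phase shift.

```python
# Minimal sketch (a generic linear system, not the paper's equations): reciprocal coupling
# between deviations of investment I and capacity utilization u produces periodic
# oscillations; I and u oscillate with a quarter-period phase shift in this minimal case.
import numpy as np
from scipy.integrate import solve_ivp

a, b = 1.0, 1.0                                  # assumed coupling coefficients

def rhs(t, y):
    I, u = y
    return [a * u, -b * I]                       # dI/dt driven by u; du/dt pulled down by I

sol = solve_ivp(rhs, (0, 20), [1.0, 0.0], dense_output=True)
t = np.linspace(0, 20, 200)
I, u = sol.sol(t)
print("oscillation period (approx):", round(2 * np.pi / np.sqrt(a * b), 2))
```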

  14. A terrain-based site characterization map of California with implications for the contiguous United States

    USGS Publications Warehouse

    Yong, Alan K.; Hough, Susan E.; Iwahashi, Junko; Braverman, Amy

    2012-01-01

    We present an approach based on geomorphometry to predict material properties and characterize site conditions using the VS30 parameter (time‐averaged shear‐wave velocity to a depth of 30 m). Our framework consists of an automated terrain classification scheme based on taxonomic criteria (slope gradient, local convexity, and surface texture) that systematically identifies 16 terrain types from 1‐km spatial resolution (30 arcsec) Shuttle Radar Topography Mission digital elevation models (SRTM DEMs). Using 853 VS30 values from California, we apply a simulation‐based statistical method to determine the mean VS30 for each terrain type in California. We then compare the VS30 values with models based on individual proxies, such as mapped surface geology and topographic slope, and show that our systematic terrain‐based approach consistently performs better than semiempirical estimates based on individual proxies. To further evaluate our model, we apply our California‐based estimates to terrains of the contiguous United States. Comparisons of our estimates with 325 VS30 measurements outside of California, as well as estimates based on the topographic slope model, indicate our method to be statistically robust and more accurate. Our approach thus provides an objective and robust method for extending estimates of VS30 for regions where in situ measurements are sparse or not readily available.

  15. Specificity and timescales of cortical adaptation as inferences about natural movie statistics.

    PubMed

    Snow, Michoel; Coen-Cagli, Ruben; Schwartz, Odelia

    2016-10-01

    Adaptation is a phenomenological umbrella term under which a variety of temporal contextual effects are grouped. Previous models have shown that some aspects of visual adaptation reflect optimal processing of dynamic visual inputs, suggesting that adaptation should be tuned to the properties of natural visual inputs. However, the link between natural dynamic inputs and adaptation is poorly understood. Here, we extend a previously developed Bayesian modeling framework for spatial contextual effects to the temporal domain. The model learns temporal statistical regularities of natural movies and links these statistics to adaptation in primary visual cortex via divisive normalization, a ubiquitous neural computation. In particular, the model divisively normalizes the present visual input by the past visual inputs only to the degree that these are inferred to be statistically dependent. We show that this flexible form of normalization reproduces classical findings on how brief adaptation affects neuronal selectivity. Furthermore, prior knowledge acquired by the Bayesian model from natural movies can be modified by prolonged exposure to novel visual stimuli. We show that this updating can explain classical results on contrast adaptation. We also simulate the recent finding that adaptation maintains population homeostasis, namely, a balanced level of activity across a population of neurons with different orientation preferences. Consistent with previous disparate observations, our work further clarifies the influence of stimulus-specific and neuronal-specific normalization signals in adaptation.
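
    The core computation, divisive normalization of the present input by past inputs weighted by their inferred dependence, can be sketched in a few lines; the pooling form and the scalar "dependence" weight below are simplifying assumptions, not the paper's full Bayesian inference.

```python
# Minimal sketch (toy numbers): divisive normalization of the present input by past inputs,
# gated by an assumed "dependence" weight in [0, 1] that stands in for the model's inference
# about statistical dependence between past and present stimuli.
import numpy as np

def normalize(x_now, x_past, dependence, sigma=1.0):
    """Divide the present response by a pooled signal that includes the past
    only to the degree the past is inferred to be dependent on the present."""
    pool = np.sqrt(sigma**2 + x_now**2 + dependence * np.sum(np.square(x_past)))
    return x_now / pool

x_past = np.array([2.0, 2.0, 2.0])        # a stream of strong adapting inputs
print("independent past :", normalize(3.0, x_past, dependence=0.0))
print("dependent past   :", normalize(3.0, x_past, dependence=1.0))
```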

  16. Specificity and timescales of cortical adaptation as inferences about natural movie statistics

    PubMed Central

    Snow, Michoel; Coen-Cagli, Ruben; Schwartz, Odelia

    2016-01-01

    Adaptation is a phenomenological umbrella term under which a variety of temporal contextual effects are grouped. Previous models have shown that some aspects of visual adaptation reflect optimal processing of dynamic visual inputs, suggesting that adaptation should be tuned to the properties of natural visual inputs. However, the link between natural dynamic inputs and adaptation is poorly understood. Here, we extend a previously developed Bayesian modeling framework for spatial contextual effects to the temporal domain. The model learns temporal statistical regularities of natural movies and links these statistics to adaptation in primary visual cortex via divisive normalization, a ubiquitous neural computation. In particular, the model divisively normalizes the present visual input by the past visual inputs only to the degree that these are inferred to be statistically dependent. We show that this flexible form of normalization reproduces classical findings on how brief adaptation affects neuronal selectivity. Furthermore, prior knowledge acquired by the Bayesian model from natural movies can be modified by prolonged exposure to novel visual stimuli. We show that this updating can explain classical results on contrast adaptation. We also simulate the recent finding that adaptation maintains population homeostasis, namely, a balanced level of activity across a population of neurons with different orientation preferences. Consistent with previous disparate observations, our work further clarifies the influence of stimulus-specific and neuronal-specific normalization signals in adaptation. PMID:27699416

  17. Logical reasoning versus information processing in the dual-strategy model of reasoning.

    PubMed

    Markovits, Henry; Brisson, Janie; de Chantal, Pier-Luc

    2017-01-01

    One of the major debates concerning the nature of inferential reasoning is between counterexample-based strategies such as mental model theory and statistical strategies underlying probabilistic models. The dual-strategy model, proposed by Verschueren, Schaeken, & d'Ydewalle (2005a, 2005b), which suggests that people might have access to both kinds of strategy, has been supported by several recent studies. These have shown that statistical reasoners make inferences based on using information about premises in order to generate a likelihood estimate of conclusion probability. However, while results concerning counterexample reasoners are consistent with a counterexample detection model, these results could equally be interpreted as indicating a greater sensitivity to logical form. In order to distinguish these 2 interpretations, in Studies 1 and 2, we presented reasoners with Modus ponens (MP) inferences with statistical information about premise strength and in Studies 3 and 4, naturalistic MP inferences with premises having many disabling conditions. Statistical reasoners accepted the MP inference more often than counterexample reasoners in Studies 1 and 2, while the opposite pattern was observed in Studies 3 and 4. Results show that these strategies must be defined in terms of information processing, with no clear relations to "logical" reasoning. These results have additional implications for the underlying debate about the nature of human reasoning. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Range of interaction in an opinion evolution model of ideological self-positioning: Contagion, hesitance and polarization

    NASA Astrophysics Data System (ADS)

    Gimenez, M. Cecilia; Paz García, Ana Pamela; Burgos Paci, Maxi A.; Reinaudi, Luis

    2016-04-01

    The evolution of public opinion using tools and concepts borrowed from Statistical Physics is an emerging area within the field of Sociophysics. In the present paper, a Statistical Physics model was developed to study the evolution of the ideological self-positioning of an ensemble of agents. The model consists of an array of L components, each one of which represents the ideology of an agent. The proposed mechanism is based on the "voter model", in which one agent can adopt the opinion of another one if the difference of their opinions lies within a certain range. The existence of "undecided" agents (i.e. agents with no definite opinion) was implemented in the model. The possibility of radicalization of an agent's opinion upon interaction with another one was also implemented. The results of our simulations are compared to statistical data taken from the Latinobarómetro databank for the cases of Argentina, Chile, Brazil and Uruguay in the last decade. Among other results, the effect of taking into account the undecided agents is the formation of a single peak at the middle of the ideological spectrum (which corresponds to a centrist ideological position), in agreement with the real cases studied.
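
    A minimal simulation in this spirit is sketched below, with a bounded-confidence (voter-like) update, an undecided state, and occasional radicalization; the scale, interaction range, and probabilities are assumptions, not the paper's calibrated values.

```python
# Minimal sketch (parameter values are assumptions): a voter-like update on an ideological
# scale with a confidence bound, an "undecided" state, and occasional radicalization.
import numpy as np

rng = np.random.default_rng(11)
L, steps = 1000, 200_000
eps = 2.0                                        # interaction range on a 1..10 scale
p_radical = 0.01                                 # chance of hardening instead of copying
UNDECIDED = np.nan

opinion = rng.uniform(1, 10, L)
opinion[rng.random(L) < 0.2] = UNDECIDED         # 20% of agents start undecided

for _ in range(steps):
    i, j = rng.integers(0, L, 2)
    if np.isnan(opinion[j]):
        continue                                 # an undecided partner has no opinion to offer
    if np.isnan(opinion[i]) or abs(opinion[i] - opinion[j]) <= eps:
        if rng.random() < p_radical and not np.isnan(opinion[i]):
            # radicalization: move away from the partner, clipped to the scale
            opinion[i] = np.clip(opinion[i] + np.sign(opinion[i] - opinion[j]), 1, 10)
        else:
            opinion[i] = opinion[j]              # adopt the partner's opinion

hist, _ = np.histogram(opinion[~np.isnan(opinion)], bins=np.arange(1, 11.5, 1))
print("final opinion histogram (1..10):", hist)
```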

  19. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA

    PubMed Central

    Salter-Townshend, Michael; McCormick, Tyler H.

    2018-01-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)]. PMID:29721127

  20. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA.

    PubMed

    Salter-Townshend, Michael; McCormick, Tyler H

    2017-09-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090-1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].

  1. Evaluation of annual, global seismicity forecasts, including ensemble models

    NASA Astrophysics Data System (ADS)

    Taroni, Matteo; Zechar, Jeremy; Marzocchi, Warner

    2013-04-01

    In 2009, the Collaboratory for the Study of Earthquake Predictability (CSEP) initiated a prototype global earthquake forecast experiment. Three models participated in this experiment for 2009, 2010 and 2011—each model forecast the number of earthquakes above magnitude 6 in 1x1 degree cells that span the globe. Here we use likelihood-based metrics to evaluate the consistency of the forecasts with the observed seismicity. We compare model performance with statistical tests and a new method based on the peer-to-peer gambling score. The results of the comparisons are used to build ensemble models that are a weighted combination of the individual models. Notably, in these experiments the ensemble model always performs significantly better than the single best-performing model. Our results indicate the following: i) time-varying forecasts, if not updated after each major shock, may not provide significant advantages with respect to time-invariant models in 1-year forecast experiments; ii) the spatial distribution seems to be the most important feature to characterize the different forecasting performances of the models; iii) the interpretation of consistency tests may be misleading because some good models may be rejected while trivial models may pass consistency tests; iv) proper ensemble modeling seems to be a valuable procedure to obtain the best-performing model for practical purposes.
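    A minimal sketch of the kind of likelihood-based scoring and weighted ensembling described above, assuming gridded Poisson rate forecasts; the grid, rates and weighting rule are illustrative stand-ins and not the CSEP procedures.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(2)

n_cells = 180 * 360                      # 1x1 degree global grid
# Three hypothetical annual rate forecasts (expected M>=6 counts per cell).
forecasts = [np.abs(rng.normal(0.001, 0.001, n_cells)) for _ in range(3)]
observed = rng.poisson(forecasts[0])     # synthetic "observed" catalogue

def log_likelihood(rate, obs):
    """Poisson joint log-likelihood of the observed counts given a rate forecast."""
    return poisson.logpmf(obs, np.maximum(rate, 1e-12)).sum()

scores = np.array([log_likelihood(f, observed) for f in forecasts])

# Ensemble: weighted average of the member rates; here the weights come from the
# relative likelihood of each member (one simple choice among many).
weights = np.exp(scores - scores.max())
weights /= weights.sum()
ensemble = sum(w * f for w, f in zip(weights, forecasts))
print("member scores:", scores, "ensemble score:", log_likelihood(ensemble, observed))
```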

  2. Evaluating statistical cloud schemes: What can we gain from ground-based remote sensing?

    NASA Astrophysics Data System (ADS)

    Grützun, V.; Quaas, J.; Morcrette, C. J.; Ament, F.

    2013-09-01

    Statistical cloud schemes with prognostic probability distribution functions have become more important in atmospheric modeling, especially since they are in principle scale adaptive and capture cloud physics in more detail. While in theory the schemes have great potential, their accuracy is still questionable, and the high-resolution three-dimensional observations of water vapor and cloud water that could be used to test them are missing. We explore the potential of ground-based remote sensing such as lidar, microwave, and radar to evaluate prognostic distribution moments using the "perfect model approach": we employ a high-resolution weather model as virtual reality and retrieve full three-dimensional atmospheric quantities as well as virtual ground-based observations. We then use statistics from the virtual observations to validate the modeled 3-D statistics. Since the data are entirely consistent, any discrepancy that occurs is due to the method. Focusing on total water mixing ratio, we find that the mean can be evaluated reasonably well, but whether the variance and skewness are reliable depends strongly on the meteorological conditions. Using a simple schematic description of different synoptic conditions, we show how statistics obtained from point or line measurements can be poor at representing the full three-dimensional distribution of water in the atmosphere. We argue that a careful analysis of measurement data and detailed knowledge of the meteorological situation are necessary to judge whether the data can be used to evaluate the higher moments of the humidity distribution used by a statistical cloud scheme.

  3. Identification of crop cultivars with consistently high lignocellulosic sugar release requires the use of appropriate statistical design and modelling

    PubMed Central

    2013-01-01

    Background In this study, a multi-parent population of barley cultivars was grown in the field for two consecutive years and straw saccharification (sugar release by enzymes) was subsequently analysed in the laboratory to identify the cultivars with the highest consistent sugar yield. This experiment was used to assess the benefit of accounting for both the multi-phase and multi-environment aspects of large-scale phenotyping experiments with field-grown germplasm through sound statistical design and analysis. Results Complementary designs at both the field and laboratory phases of the experiment ensured that non-genetic sources of variation could be separated from the genetic variation of cultivars, which was the main target of the study. The field phase included biological replication and plot randomisation. The laboratory phase employed re-randomisation and technical replication of samples within a batch, with a subset of cultivars chosen as duplicates that were randomly allocated across batches. The resulting data were analysed using a linear mixed model that incorporated field and laboratory variation and a cultivar by trial interaction, ensuring that the cultivar means were estimated more accurately than if the non-genetic variation were ignored. The heritability detected was more than doubled in each year of the trial by accounting for the non-genetic variation in the analysis, clearly showing the benefit of this design and approach. Conclusions The importance of accounting for both field and laboratory variation, as well as the cultivar by trial interaction, by fitting a single statistical model (multi-environment trial, MET, model) was evidenced by the changes in the list of the top 40 cultivars showing the highest sugar yields. Failure to account for this interaction resulted in only eight cultivars appearing consistently in the top 40 in both years, whereas under the MET model the correspondence between the year rankings was much higher, at 25 cultivars. This approach is suited to any multi-phase and multi-environment population-based genetic experiment. PMID:24359577
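    A simplified sketch of fitting such a mixed model in Python with statsmodels, assuming a hypothetical long-format file and hypothetical column names; the published analysis includes additional design terms not shown here.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per laboratory measurement, with the field
# block, laboratory batch, cultivar and trial year recorded for each sample.
df = pd.read_csv("saccharification.csv")   # columns assumed: sugar, cultivar, year, block, batch

# A simplified mixed model: cultivar, year and their interaction as fixed effects, with
# random effects for field block and laboratory batch absorbing non-genetic variation
# (the published analysis is richer than this sketch).
model = smf.mixedlm(
    "sugar ~ C(cultivar) * C(year)",
    data=df,
    groups=df["block"],
    vc_formula={"batch": "0 + C(batch)"},
)
result = model.fit()
print(result.summary())
```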

  4. Ordered phase and non-equilibrium fluctuation in stock market

    NASA Astrophysics Data System (ADS)

    Maskawa, Jun-ichi

    2002-08-01

    We analyze the statistics of daily price changes in the stock market in the framework of a statistical physics model for the collective fluctuation of a stock portfolio. In this model the time series of price changes are coded into sequences of up and down spins, and the Hamiltonian of the system is expressed by spin-spin interactions, as in spin glass models of disordered magnetic systems. Through the analysis of the Dow Jones industrial portfolio, consisting of 30 stock issues, we find a non-equilibrium fluctuation mode at a point slightly below the boundary between the ordered and disordered phases. The remaining 29 modes are still in the disordered phase and are well described by the Gibbs distribution. The variance of the fluctuation follows the theoretical curve and is peculiarly large in the non-equilibrium mode compared with the other modes, which remain in the disordered phase.
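    The following sketch shows only the first step of such an analysis, under stated assumptions: coding synthetic daily price changes into up/down spins and estimating the pairwise spin correlations that would constrain the couplings of a spin-glass Hamiltonian. The inverse Ising inference of the couplings themselves is not shown, and the price data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical daily closing prices for 30 stocks over ~4 years of trading days.
prices = np.cumsum(rng.normal(0, 1, size=(1000, 30)), axis=0) + 100.0

# Code each daily price change as an up/down spin, as in the paper's construction.
spins = np.where(np.diff(prices, axis=0) >= 0, 1, -1)

# Empirical pairwise correlations <s_i s_j> between issues; in the spin-glass picture
# these constrain the exchange couplings J_ij of H = -sum_{i<j} J_ij s_i s_j
# (inferring J_ij itself requires an inverse Ising step not shown here).
corr = spins.T @ spins / spins.shape[0]
print(np.round(corr[:5, :5], 2))
```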

  5. Use of High-Resolution Satellite Observations to Evaluate Cloud and Precipitation Statistics from Cloud-Resolving Model Simulations

    NASA Astrophysics Data System (ADS)

    Zhou, Y.; Tao, W.; Hou, A. Y.; Zeng, X.; Shie, C.

    2007-12-01

    The cloud and precipitation statistics simulated by the 3D Goddard Cumulus Ensemble (GCE) model for different environmental conditions, i.e., the South China Sea Monsoon Experiment (SCSMEX), CRYSTAL-FACE, and KWAJEX, are compared with Tropical Rainfall Measuring Mission (TRMM) TMI and PR rainfall measurements as well as cloud observations from the Clouds and the Earth's Radiant Energy System (CERES) and the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments. It is found that the GCE is capable of simulating the development of major convective systems and reproducing the total surface rainfall amount as compared with rainfall estimated from the soundings. The model shows large discrepancies in the rain spectrum and vertical hydrometeor profiles, and the discrepancy in the precipitation field is also consistent with the cloud and radiation observations. The study focuses on the effects of large-scale forcing and microphysics on the simulated model-observation discrepancies.

  6. Nonequilibrium critical behavior of model statistical systems and methods for the description of its features

    NASA Astrophysics Data System (ADS)

    Prudnikov, V. V.; Prudnikov, P. V.; Mamonova, M. V.

    2017-11-01

    This paper reviews features in critical behavior of far-from-equilibrium macroscopic systems and presents current methods of describing them by referring to some model statistical systems such as the three-dimensional Ising model and the two-dimensional XY model. The paper examines the critical relaxation of homogeneous and structurally disordered systems subjected to abnormally strong fluctuation effects involved in ordering processes in solids at second-order phase transitions. Interest in such systems is due to the aging properties and fluctuation-dissipation theorem violations predicted for and observed in systems slowly evolving from a nonequilibrium initial state. It is shown that these features of nonequilibrium behavior show up in the magnetic properties of magnetic superstructures consisting of alternating nanoscale-thick magnetic and nonmagnetic layers and can be observed not only near the film’s critical ferromagnetic ordering temperature Tc, but also over the wide temperature range T ⩽ Tc.

  7. Modeling spiking behavior of neurons with time-dependent Poisson processes.

    PubMed

    Shinomoto, S; Tsubo, Y

    2001-10-01

    Three kinds of interval statistics, as represented by the coefficient of variation, the skewness coefficient, and the correlation coefficient of consecutive intervals, are evaluated for three kinds of time-dependent Poisson processes: pulse regulated, sinusoidally regulated, and doubly stochastic. Among these three processes, the sinusoidally regulated and doubly stochastic Poisson processes, in the case when the spike rate varies slowly compared with the mean interval between spikes, are found to be consistent with the three statistical coefficients exhibited by data recorded from neurons in the prefrontal cortex of monkeys.
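    A small sketch of the three interval statistics named above, applied to a synthetic, slowly rate-modulated Poisson spike train generated by thinning; the rates and modulation are hypothetical.

```python
import numpy as np
from scipy.stats import skew

def interval_statistics(spike_times):
    """Coefficient of variation, skewness coefficient and lag-1 serial correlation
    of consecutive inter-spike intervals."""
    isi = np.diff(np.sort(spike_times))
    cv = isi.std() / isi.mean()
    sk = skew(isi)
    rho = np.corrcoef(isi[:-1], isi[1:])[0, 1]
    return cv, sk, rho

# Example: a sinusoidally rate-modulated (inhomogeneous) Poisson train obtained by
# thinning a homogeneous one; rate parameters are hypothetical.
rng = np.random.default_rng(4)
t = np.cumsum(rng.exponential(1.0 / 20.0, size=20000))     # 20 Hz homogeneous train
rate = 10.0 + 8.0 * np.sin(2 * np.pi * 0.5 * t)            # slowly varying target rate
spikes = t[rng.random(t.size) < rate / 20.0]               # thinning step
print(interval_statistics(spikes))
```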

  8. Overcoming bias in estimating the volume-outcome relationship.

    PubMed

    Tsai, Alexander C; Votruba, Mark; Bridges, John F P; Cebul, Randall D

    2006-02-01

    To examine the effect of hospital volume on 30-day mortality for patients with congestive heart failure (CHF) using administrative and clinical data in conventional regression and instrumental variables (IV) estimation models. The primary data consisted of longitudinal information on comorbid conditions, vital signs, clinical status, and laboratory test results for 21,555 Medicare-insured patients aged 65 years and older hospitalized for CHF in northeast Ohio in 1991-1997. The patient was the primary unit of analysis. We fit a linear probability model to the data to assess the effects of hospital volume on patient mortality within 30 days of admission. Both administrative and clinical data elements were included for risk adjustment. Linear distances between patients and hospitals were used to construct the instrument, which was then used to assess the endogeneity of hospital volume. When only administrative data elements were included in the risk adjustment model, the estimated volume-outcome effect was statistically significant (p=.029) but small in magnitude. The estimate was markedly attenuated in magnitude and statistical significance when clinical data were added to the model as risk adjusters (p=.39). IV estimation shifted the estimate in a direction consistent with selective referral, but we were unable to reject the consistency of the linear probability estimates. Use of only administrative data for volume-outcomes research may generate spurious findings. The IV analysis further suggests that conventional estimates of the volume-outcome relationship may be contaminated by selective referral effects. Taken together, our results suggest that efforts to concentrate hospital-based CHF care in high-volume hospitals may not reduce mortality among elderly patients.
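    The sketch below illustrates only the mechanics of a two-stage least squares (instrumental variables) estimate with a distance-based instrument on synthetic data; the variable names and numbers are hypothetical, and the actual study used far richer clinical risk adjustment.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, z_instr, exog):
    """Basic 2SLS: regress the endogenous regressor (hospital volume) on the instrument
    (distance) plus exogenous risk adjusters, then use the fitted values in the outcome
    (30-day mortality) equation. Returns the second-stage coefficients; index 1 is the
    estimated volume effect."""
    X1 = np.column_stack([np.ones_like(z_instr), z_instr, exog])
    x_hat = X1 @ np.linalg.lstsq(X1, x_endog, rcond=None)[0]     # first stage
    X2 = np.column_stack([np.ones_like(x_hat), x_hat, exog])
    return np.linalg.lstsq(X2, y, rcond=None)[0]                 # second stage

# Synthetic illustration (all numbers hypothetical).
rng = np.random.default_rng(5)
n = 5000
distance = rng.exponential(5.0, n)                 # instrument: distance to hospital
severity = rng.normal(size=n)                      # observed clinical risk adjuster
volume = 200 - 8 * distance + 30 * severity + rng.normal(0, 20, n)
mortality = (0.10 - 0.0001 * volume + 0.05 * severity + rng.normal(0, 0.1, n) > 0.1).astype(float)
print(two_stage_least_squares(mortality, volume, distance, severity[:, None]))
```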

  9. Statistical Mechanics of the US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward D.; Broedersz, Chase P.; Bialek, William

    2015-07-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The maximum entropy model consistent with the observed pairwise correlations among justices' votes, an Ising spin glass, agrees quantitatively with the data. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering the intuition that ideologically opposite justices negatively influence one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, organizing the voting patterns in a relatively simple "energy landscape." Besides unanimity, other energy minima in this landscape, or maxima in probability, correspond to prototypical voting states, such as the ideological split or a tightly correlated, conservative core. The model correctly predicts the correlation of justices with the majority and gives us a measure of their influence on the majority decision. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context.
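    A minimal sketch of evaluating a pairwise maximum entropy (Ising) model over nine-justice voting patterns, assuming hypothetical fields and couplings; fitting those parameters to the observed vote correlations is a separate inverse problem not shown here.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(6)
n = 9                                               # nine justices
h = rng.normal(0, 0.1, n)                           # local fields (hypothetical values)
J = np.triu(rng.normal(0, 0.2, (n, n)), 1)          # pairwise couplings (hypothetical values)

def energy(s):
    """Ising energy of a voting pattern s, with votes coded as +/-1."""
    return -s @ h - s @ J @ s

# Enumerate all 2^9 voting patterns and normalise to obtain the maxent distribution.
states = np.array(list(product([-1, 1], repeat=n)))
E = np.array([energy(s) for s in states])
p = np.exp(-E)
p /= p.sum()

# The most probable patterns play the role of minima in the "energy landscape".
for idx in np.argsort(p)[::-1][:3]:
    print(states[idx], p[idx])
```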

  10. Limited-information goodness-of-fit testing of diagnostic classification item response models.

    PubMed

    Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen

    2016-11-01

    Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G² suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ statistic to diagnostic classification models. Through a series of simulation studies, we found that M₂ is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M₂ was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M₂, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic X²LD for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The X²LD statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M₂ and X²LD statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144). © 2016 The British Psychological Society.

  11. Statistical-dynamical modeling of the cloud-to-ground lightning activity in Portugal

    NASA Astrophysics Data System (ADS)

    Sousa, J. F.; Fragoso, M.; Mendes, S.; Corte-Real, J.; Santos, J. A.

    2013-10-01

    The present study employs a dataset of cloud-to-ground discharges over Portugal, collected by the Portuguese lightning detection network in the period 2003-2009, to identify dynamically coherent lightning regimes in Portugal and to implement a statistical-dynamical modeling of the daily discharges over the country. For this purpose, the high-resolution MERRA reanalysis is used. Three lightning regimes are identified for Portugal: WREG, WREM and SREG. WREG is a typical cold-core cut-off low. WREM is connected to strong frontal systems driven by remote low pressure systems at higher latitudes over the North Atlantic. SREG is a combination of an inverted trough and a mid-tropospheric cold core near Portugal. The statistical-dynamical modeling is based on logistic regressions (statistical component) developed for each regime separately (dynamical component). It is shown that the strength of the lightning activity (either strong or weak) for each regime is consistently modeled by a set of suitable dynamical predictors (65-70% efficiency). The difference of the equivalent potential temperature in the 700-500 hPa layer is the best predictor for all three regimes, while the best 4-layer lifted index is important for all regimes but with much weaker significance. Six other predictors are more suitable for a specific regime. To validate the modeling approach, a regional-scale climate model simulation is carried out for a very intense lightning episode.
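    A sketch of the statistical component only, assuming one regime with two hypothetical dynamical predictors and synthetic daily labels; the real model uses the predictors and data described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Hypothetical daily predictors for one regime: the equivalent potential temperature
# difference between 700 and 500 hPa and a best 4-layer lifted index, plus a binary
# strong/weak lightning label.
n_days = 400
dtheta_e = rng.normal(0.0, 5.0, n_days)
lifted_index = rng.normal(0.0, 3.0, n_days)
strong = (0.4 * dtheta_e - 0.2 * lifted_index + rng.normal(0, 2, n_days) > 0).astype(int)

X = np.column_stack([dtheta_e, lifted_index])
clf = LogisticRegression().fit(X, strong)
print("in-sample efficiency:", clf.score(X, strong))   # the paper reports 65-70%
```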

  12. The interpretation of simultaneous soft X-ray spectroscopic and imaging observations of an active region. [in solar corona

    NASA Technical Reports Server (NTRS)

    Davis, J. M.; Gerassimenko, M.; Krieger, A. S.; Vaiana, G. S.

    1975-01-01

    Simultaneous soft X-ray spectroscopic and broad-band imaging observations of an active region have been analyzed together to determine the parameters which describe the coronal plasma. From the spectroscopic data, models of temperature-emission measure-elemental abundance have been constructed which provide acceptable statistical fits. By folding these possible models through the imaging analysis, models which are not self-consistent can be rejected. In this way, only the oxygen, neon, and iron abundances of Pottasch (1967), combined with either an isothermal or exponential temperature-emission-measure model, are consistent with both sets of data. Contour maps of electron temperature and density for the active region have been constructed from the imaging data. The implications of the analysis for the determination of coronal abundances and for future satellite experiments are discussed.

  13. Exploring Explanations of Subglacial Bedform Sizes Using Statistical Models.

    PubMed

    Hillier, John K; Kougioumtzoglou, Ioannis A; Stokes, Chris R; Smith, Michael J; Clark, Chris D; Spagnolo, Matteo S

    2016-01-01

    Sediments beneath modern ice sheets exert a key control on their flow, but are largely inaccessible except through geophysics or boreholes. In contrast, palaeo-ice sheet beds are accessible, and typically characterised by numerous bedforms. However, the interaction between bedforms and ice flow is poorly constrained and it is not clear how bedform sizes might reflect ice flow conditions. To better understand this link we present a first exploration of a variety of statistical models to explain the size distribution of some common subglacial bedforms (i.e., drumlins, ribbed moraine, MSGL). By considering a range of models, constructed to reflect key aspects of the physical processes, it is possible to infer that the size distributions are most effectively explained when the dynamics of ice-water-sediment interaction associated with bedform growth is fundamentally random. A 'stochastic instability' (SI) model, which integrates random bedform growth and shrinking through time with exponential growth, is preferred and is consistent with other observations of palaeo-bedforms and geophysical surveys of active ice sheets. Furthermore, we give a proof-of-concept demonstration that our statistical approach can bridge the gap between geomorphological observations and physical models, directly linking measurable size-frequency parameters to properties of ice sheet flow (e.g., ice velocity). Moreover, statistically developing existing models as proposed allows quantitative predictions to be made about sizes, making the models testable; a first illustration of this is given for a hypothesised repeat geophysical survey of bedforms under active ice. Thus, we further demonstrate the potential of size-frequency distributions of subglacial bedforms to assist the elucidation of subglacial processes and better constrain ice sheet models.
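    A minimal sketch in the spirit of the stochastic instability idea: each bedform grows exponentially on average but is randomly perturbed (growing or shrinking) at every step, which produces a strongly skewed size distribution. All parameters are hypothetical and this is not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(8)

n_bedforms, n_steps = 5000, 500
growth_rate = 0.002       # mean exponential growth per step (hypothetical)
noise = 0.05              # amplitude of the random grow/shrink term (hypothetical)

# Each bedform's log-height drifts upward on average but is perturbed by a random
# term at every step ("stochastic instability" in spirit).
log_h = np.zeros(n_bedforms)
for _ in range(n_steps):
    log_h += growth_rate + noise * rng.standard_normal(n_bedforms)

heights = np.exp(log_h)
skewness = ((heights - heights.mean()) ** 3).mean() / heights.std() ** 3
print("mean height:", heights.mean(), "size skewness:", skewness)
```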

  14. Sampling methods to the statistical control of the production of blood components.

    PubMed

    Pereira, Paulo; Seghatchian, Jerard; Caldeira, Beatriz; Santos, Paula; Castro, Rosa; Fernandes, Teresa; Xavier, Sandra; de Sousa, Gracinda; de Almeida E Sousa, João Paulo

    2017-12-01

    The control of blood component specifications is a requirement generalized in Europe by the European Commission Directives and in the US by the AABB standards. The use of a statistical process control methodology is recommended in the related literature, including the EDQM guideline. The reliability of this control depends on the sampling; however, a correct sampling methodology does not seem to be systematically applied. Commonly, sampling is intended only to comply with the 1% specification for the produced blood components. From a purely statistical viewpoint, however, this practice arguably does not amount to a consistent sampling technique. This could be a severe limitation in detecting abnormal patterns and in assuring that the production has a non-significant probability of producing nonconforming components. This article discusses what is happening in blood establishments. Three statistical methodologies are proposed: simple random sampling, sampling based on the proportion of a finite population, and sampling based on the inspection level. The empirical results demonstrate that these models are practicable in blood establishments, contributing to the robustness of sampling and of the related statistical process control decisions. Copyright © 2017 Elsevier Ltd. All rights reserved.
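    As an example in the direction of the second proposed methodology, the sketch below computes a sample size for estimating a nonconforming proportion in a finite production lot; the lot size, expected rate and margin of error are hypothetical.

```python
import math

def sample_size_finite_population(N, p, margin, z=1.96):
    """Sample size for estimating a proportion p in a finite population of N units
    with a given margin of error at ~95% confidence (z = 1.96)."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2        # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / N))      # finite-population correction

# Hypothetical monthly production of 3000 red cell concentrates, expecting at most
# 1% nonconforming units and accepting a 1.5% margin of error.
print(sample_size_finite_population(N=3000, p=0.01, margin=0.015))
```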

  15. Statistical analysis plan for the family-led rehabilitation after stroke in India (ATTEND) trial: A multicenter randomized controlled trial of a new model of stroke rehabilitation compared to usual care.

    PubMed

    Billot, Laurent; Lindley, Richard I; Harvey, Lisa A; Maulik, Pallab K; Hackett, Maree L; Murthy, Gudlavalleti Vs; Anderson, Craig S; Shamanna, Bindiganavale R; Jan, Stephen; Walker, Marion; Forster, Anne; Langhorne, Peter; Verma, Shweta J; Felix, Cynthia; Alim, Mohammed; Gandhi, Dorcas Bc; Pandian, Jeyaraj Durai

    2017-02-01

    Background In low- and middle-income countries, few patients receive organized rehabilitation after stroke, yet the burden of chronic diseases such as stroke is increasing in these countries. Affordable models of effective rehabilitation could have a major impact. The ATTEND trial is evaluating a family-led, caregiver-delivered rehabilitation program after stroke. Objective To publish the detailed statistical analysis plan for the ATTEND trial prior to trial unblinding. Methods Based upon the published registration and protocol, the blinded steering committee and management team, led by the trial statistician, have developed a statistical analysis plan. The plan has been informed by the chosen outcome measures, the data collection forms and knowledge of key baseline data. Results The resulting statistical analysis plan is consistent with best practice and will allow open and transparent reporting. Conclusions Publication of the trial statistical analysis plan reduces potential bias in trial reporting and clearly outlines pre-specified analyses. Clinical Trial Registrations India CTRI/2013/04/003557; Australian New Zealand Clinical Trials Registry ACTRN1261000078752; Universal Trial Number U1111-1138-6707.

  16. An outline of graphical Markov models in dentistry.

    PubMed

    Helfenstein, U; Steiner, M; Menghini, G

    1999-12-01

    In the usual multiple regression model there is one response variable and one block of several explanatory variables. In contrast, in reality there may be a block of several possibly interacting response variables one would like to explain. In addition, the explanatory variables may split into a sequence of several blocks, each block containing several interacting variables. The variables in the second block are explained by those in the first block; the variables in the third block by those in the first and the second block etc. During recent years methods have been developed allowing analysis of problems where the data set has the above complex structure. The models involved are called graphical models or graphical Markov models. The main result of an analysis is a picture, a conditional independence graph with precise statistical meaning, consisting of circles representing variables and lines or arrows representing significant conditional associations. The absence of a line between two circles signifies that the corresponding two variables are independent conditional on the presence of other variables in the model. An example from epidemiology is presented in order to demonstrate application and use of the models. The data set in the example has a complex structure consisting of successive blocks: the variable in the first block is year of investigation; the variables in the second block are age and gender; the variables in the third block are indices of calculus, gingivitis and mutans streptococci and the final response variables in the fourth block are different indices of caries. Since the statistical methods may not be easily accessible to dentists, this article presents them in an introductory form. Graphical models may be of great value to dentists in allowing analysis and visualisation of complex structured multivariate data sets consisting of a sequence of blocks of interacting variables and, in particular, several possibly interacting responses in the final block.

  17. Style consistent classification of isogenous patterns.

    PubMed

    Sarkar, Prateek; Nagy, George

    2005-01-01

    In many applications of pattern recognition, patterns appear together in groups (fields) that have a common origin. For example, a printed word is usually a field of character patterns printed in the same font. A common origin induces consistency of style in features measured on patterns. The features of patterns co-occurring in a field are statistically dependent because they share the same, albeit unknown, style. Style constrained classifiers achieve higher classification accuracy by modeling such dependence among patterns in a field. Effects of style consistency on the distributions of field-features (concatenation of pattern features) can be modeled by hierarchical mixtures. Each field derives from a mixture of styles, while, within a field, a pattern derives from a class-style conditional mixture of Gaussians. Based on this model, an optimal style constrained classifier processes entire fields of patterns rendered in a consistent but unknown style. In a laboratory experiment, style constrained classification reduced errors on fields of printed digits by nearly 25 percent over singlet classifiers. Longer fields favor our classification method because they furnish more information about the underlying style.

  18. A Comparison of Imputation Methods for Bayesian Factor Analysis Models

    ERIC Educational Resources Information Center

    Merkle, Edgar C.

    2011-01-01

    Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amiable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…

  19. Explanation of Two Anomalous Results in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Fritz, Matthew S.; Taylor, Aaron B.; MacKinnon, David P.

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special…

  20. Data processing of qualitative results from an interlaboratory comparison for the detection of “Flavescence dorée” phytoplasma: How the use of statistics can improve the reliability of the method validation process in plant pathology

    PubMed Central

    Renaudin, Isabelle; Poliakoff, Françoise

    2017-01-01

    A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of “Flavescence dorée” (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study in which each laboratory analyzed the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. First, the standard statistical approach consisted of analyzing samples known to be positive and samples known to be negative and reporting the proportions of false-positive and false-negative results to calculate diagnostic specificity and sensitivity, respectively. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based on the one hand on the probability of detection model and on the other hand on Bayes’ theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular the introduction of the new statistical approaches, gives overall information on the performance and limitations of the different methods, and is particularly useful for selecting the most appropriate detection scheme with regard to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6, developed respectively by Hren (2007), Pelletier (2009), and using patented oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods. PMID:28384335

  1. Data processing of qualitative results from an interlaboratory comparison for the detection of "Flavescence dorée" phytoplasma: How the use of statistics can improve the reliability of the method validation process in plant pathology.

    PubMed

    Chabirand, Aude; Loiseau, Marianne; Renaudin, Isabelle; Poliakoff, Françoise

    2017-01-01

    A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of "Flavescence dorée" (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study in which each laboratory analyzed the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. First, the standard statistical approach consisted of analyzing samples known to be positive and samples known to be negative and reporting the proportions of false-positive and false-negative results to calculate diagnostic specificity and sensitivity, respectively. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based on the one hand on the probability of detection model and on the other hand on Bayes' theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular the introduction of the new statistical approaches, gives overall information on the performance and limitations of the different methods, and is particularly useful for selecting the most appropriate detection scheme with regard to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6, developed respectively by Hren (2007), Pelletier (2009), and using patented oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods.
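    A small sketch of the standard approach plus the Bayes'-theorem step described above: sensitivity and specificity estimated from a reference panel, and the positive predictive value at a given prevalence. The counts and prevalence below are hypothetical.

```python
def diagnostic_performance(tp, fn, tn, fp, prevalence):
    """Sensitivity, specificity and the Bayes-theorem positive predictive value
    for a qualitative test applied at a given pathogen prevalence."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    return sensitivity, specificity, ppv

# Hypothetical interlaboratory counts for one PCR protocol on the reference panel.
print(diagnostic_performance(tp=58, fn=2, tn=39, fp=1, prevalence=0.05))
```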

  2. SEDA: A software package for the Statistical Earthquake Data Analysis

    NASA Astrophysics Data System (ADS)

    Lombardi, A. M.

    2017-03-01

    In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to interact easily with the application, and a computational core of Fortran codes that guarantees maximum speed. The primary factor driving the development of SEDA is to guarantee research reproducibility, a growing movement among scientists that is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs; less care has been taken over the graphical appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a set of consistent tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The peculiarities of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package.

  3. SEDA: A software package for the Statistical Earthquake Data Analysis

    PubMed Central

    Lombardi, A. M.

    2017-01-01

    In this paper, the first version of the software SEDA (SEDAv1.0), designed to help seismologists statistically analyze earthquake data, is presented. The package consists of a user-friendly Matlab-based interface, which allows the user to interact easily with the application, and a computational core of Fortran codes that guarantees maximum speed. The primary factor driving the development of SEDA is to guarantee research reproducibility, a growing movement among scientists that is highly recommended by the most important scientific journals. SEDAv1.0 is mainly devoted to producing accurate and fast outputs; less care has been taken over the graphical appeal, which will be improved in the future. The main part of SEDAv1.0 is devoted to ETAS modeling. SEDAv1.0 contains a set of consistent tools on ETAS, allowing the estimation of parameters, the testing of the model on data, the simulation of catalogs, the identification of sequences and the calculation of forecasts. The peculiarities of the routines inside SEDAv1.0 are discussed in this paper. More specific details on the software are presented in the manual accompanying the program package. PMID:28290482

  4. Generating action descriptions from statistically integrated representations of human motions and sentences.

    PubMed

    Takano, Wataru; Kusajima, Ikuo; Nakamura, Yoshihiko

    2016-08-01

    It is desirable for robots to be able to linguistically understand human actions during human-robot interactions. Previous research has developed frameworks for encoding human full-body motion into model parameters and for classifying motion into specific categories. For full understanding, the motion categories need to be connected to natural language so that robots can interpret human motions as linguistic expressions. This paper proposes a novel framework for integrating observation of human motion with that of natural language. The framework consists of two models: the first statistically learns the relations between motions and their relevant words, and the second statistically learns sentence structures as word n-grams. Integrating these two models allows robots to generate sentences from human motions by searching for words relevant to the motion using the first model and then arranging these words in an appropriate order using the second model, yielding the sentences most likely to be generated from the motion. The proposed framework was tested on human full-body motion measured by an optical motion capture system; descriptive sentences were manually attached to the motions, and the validity of the system was demonstrated. Copyright © 2016 Elsevier Ltd. All rights reserved.
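    A toy sketch of the integration step, assuming a tiny hypothetical vocabulary: candidate sentences are scored by combining word relevance to the motion (standing in for the first model) with bigram fluency (standing in for the second model). This is only illustrative of the idea, not the authors' implementation.

```python
import numpy as np

# Hypothetical probabilities: p_word_given_motion stands in for the motion-to-word
# model, bigram for the word n-gram language model.
p_word_given_motion = {"person": 0.4, "walks": 0.35, "forward": 0.15, "runs": 0.05, "a": 0.05}
bigram = {("<s>", "a"): 0.9, ("a", "person"): 0.9, ("person", "walks"): 0.6,
          ("person", "runs"): 0.3, ("walks", "forward"): 0.5, ("runs", "forward"): 0.5,
          ("forward", "</s>"): 0.8, ("walks", "</s>"): 0.4, ("runs", "</s>"): 0.4}

def score(sentence, lam=0.5):
    """Log-score of a candidate sentence: relevance of its words to the motion
    combined with the fluency given by the bigram model."""
    words = ["<s>"] + sentence + ["</s>"]
    lm = sum(np.log(bigram.get((a, b), 1e-6)) for a, b in zip(words, words[1:]))
    rel = sum(np.log(p_word_given_motion.get(w, 1e-6)) for w in sentence)
    return lam * rel + (1 - lam) * lm

candidates = [["a", "person", "walks", "forward"], ["a", "person", "runs", "forward"]]
print(max(candidates, key=score))
```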

  5. Heavy flavor decay of Zγ at CDF

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Timothy M. Harrington-Taber

    2013-01-01

    Diboson production is an important and frequently measured process of the Standard Model. This analysis considers the previously neglected $p\bar{p} \to Z\gamma \to b\bar{b}$ channel, as measured at the Collider Detector at Fermilab. Using the entire Tevatron Run II dataset, the measured result is consistent with Standard Model predictions, but the statistical error associated with this method of measurement limits the strength of this conclusion.

  6. Governance and Regional Variation of Homicide Rates: Evidence From Cross-National Data.

    PubMed

    Cao, Liqun; Zhang, Yan

    2017-01-01

    Criminological theories of cross-national studies of homicide have underestimated the effects of quality governance of liberal democracy and region. Data sets from several sources are combined and a comprehensive model of homicide is proposed. Results of the spatial regression model, which controls for the effect of spatial autocorrelation, show that quality governance, human development, economic inequality, and ethnic heterogeneity are statistically significant in predicting homicide. In addition, regions of Latin America and non-Muslim Sub-Saharan Africa have significantly higher rates of homicides ceteris paribus while the effects of East Asian countries and Islamic societies are not statistically significant. These findings are consistent with the expectation of the new modernization and regional theories. © The Author(s) 2015.

  7. A Statistical Model of the Fluctuations in the Geomagnetic Field from Paleosecular Variation to Reversal

    PubMed

    Camps; Prevot

    1996-08-09

    The statistical characteristics of the local magnetic field of Earth during paleosecular variation, excursions, and reversals are described on the basis of a database that gathers the cleaned mean direction and average remanent intensity of 2741 lava flows that have erupted over the last 20 million years. A model consisting of a normally distributed axial dipole component plus an independent isotropic set of vectors with a Maxwellian distribution that simulates secular variation fits the range of geomagnetic fluctuations, in terms of both direction and intensity. This result suggests that the magnitude of secular variation vectors is independent of the magnitude of Earth's axial dipole moment and that the amplitude of secular variation is unchanged during reversals.
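    A quick numerical sketch of this kind of statistical field model: a normally distributed axial dipole plus an isotropic set of Gaussian secular-variation vectors (whose magnitude is then Maxwellian). The numerical values are hypothetical and in arbitrary units, not the fitted parameters of the paper.

```python
import numpy as np

rng = np.random.default_rng(12)

n = 100000
# Axial dipole component: normally distributed magnitude along z (arbitrary units).
dipole = np.zeros((n, 3))
dipole[:, 2] = rng.normal(loc=1.0, scale=0.3, size=n)     # hypothetical mean and spread

# Secular variation: isotropic Gaussian vectors, so their magnitude is Maxwellian.
sv = rng.normal(scale=0.35, size=(n, 3))                  # hypothetical width

field = dipole + sv
intensity = np.linalg.norm(field, axis=1)
inclination = np.degrees(np.arctan2(field[:, 2], np.hypot(field[:, 0], field[:, 1])))
print(intensity.mean(), np.percentile(inclination, [5, 50, 95]))
```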

  8. Many-body localization in a long range XXZ model with random-field

    NASA Astrophysics Data System (ADS)

    Li, Bo

    2016-12-01

    Many-body localization (MBL) in a long-range XXZ model with a random field is investigated. Using exact diagonalization, the MBL phase diagram is obtained for different tuning parameters and interaction ranges. The finite-size results provide strong evidence that the threshold interaction exponent is α = 2. The tuning parameter Δ can efficiently shift the MBL edge in high-energy-density states, so the system can be driven from the thermal phase to the MBL phase by changing Δ. The energy-level statistics are consistent with the MBL phase diagram, although they cannot correctly detect the thermal phase in the extreme long-range case.
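    The level statistics mentioned above are commonly summarized by the adjacent-gap ratio; a sketch follows, checked here on synthetic GOE and Poisson spectra rather than on the XXZ model itself.

```python
import numpy as np

def mean_gap_ratio(energies):
    """Mean adjacent-gap ratio <r> of a spectrum; values near 0.53 indicate GOE
    (thermal) level statistics and values near 0.39 indicate Poisson (localized)
    statistics."""
    e = np.sort(energies)
    gaps = np.diff(e)
    r = np.minimum(gaps[1:], gaps[:-1]) / np.maximum(gaps[1:], gaps[:-1])
    return r.mean()

# Quick check on two synthetic spectra: GOE eigenvalues vs. independent levels.
rng = np.random.default_rng(9)
A = rng.standard_normal((1000, 1000))
goe = np.linalg.eigvalsh((A + A.T) / 2)
poisson_levels = np.cumsum(rng.exponential(1.0, 1000))
print(mean_gap_ratio(goe), mean_gap_ratio(poisson_levels))
```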

  9. Distinguishing synchronous and time-varying synergies using point process interval statistics: motor primitives in frog and rat

    PubMed Central

    Hart, Corey B.; Giszter, Simon F.

    2013-01-01

    We present and apply a method that uses point process statistics to discriminate the forms of synergies in motor pattern data, prior to explicit synergy extraction. The method uses electromyogram (EMG) pulse peak timing or onset timing; peak timing is preferable in complex patterns where pulse onsets may overlap. An interval statistic derived from the point processes of EMG peak timings distinguishes time-varying synergies from synchronous synergies (SS). Model data show that the statistic is robust under most conditions. Its application to both frog hindlimb EMG and rat locomotion hindlimb EMG shows that data from these preparations are clearly most consistent with synchronous synergy models (p < 0.001). Additional direct tests of pulse and interval relations in the frog data further bolster the support for synchronous synergy mechanisms. Our method and analyses support separated control of rhythm and pattern of motor primitives, with the low-level execution primitives comprising pulsed SS in both frog and rat, and in both episodic and rhythmic behaviors. PMID:23675341

  10. Generalised Central Limit Theorems for Growth Rate Distribution of Complex Systems

    NASA Astrophysics Data System (ADS)

    Takayasu, Misako; Watanabe, Hayafumi; Takayasu, Hideki

    2014-04-01

    We introduce a solvable model of randomly growing systems consisting of many independent subunits. Scaling relations and growth rate distributions in the limit of infinite subunits are analysed theoretically. Various types of scaling properties and distributions reported for growth rates of complex systems in a variety of fields can be derived from this basic physical model. Statistical data of growth rates for about 1 million business firms are analysed as a real-world example of randomly growing systems. Not only are the scaling relations consistent with the theoretical solution, but the entire functional form of the growth rate distribution is fitted with a theoretical distribution that has a power-law tail.

  11. Pressure calculation in hybrid particle-field simulations

    NASA Astrophysics Data System (ADS)

    Milano, Giuseppe; Kawakatsu, Toshihiro

    2010-12-01

    In the framework of a recently developed scheme for hybrid particle-field simulation techniques, in which self-consistent field (SCF) theory and particle models (molecular dynamics) are combined [J. Chem. Phys. 130, 214106 (2009)], we developed a general formulation for the calculation of the instantaneous pressure and stress tensor. The expressions are derived from the statistical mechanical definition of the pressure, starting from the expression for the free energy functional in SCF theory. An implementation of the derived formulation suitable for hybrid particle-field molecular dynamics-self-consistent field simulations is described. A series of test simulations on model systems is reported, comparing the calculated pressure with that obtained from standard molecular dynamics simulations based on pair potentials.

  12. Numerical modelling of instantaneous plate tectonics

    NASA Technical Reports Server (NTRS)

    Minster, J. B.; Haines, E.; Jordan, T. H.; Molnar, P.

    1974-01-01

    Assuming lithospheric plates to be rigid, 68 spreading rates, 62 fracture zone trends, and 106 earthquake slip vectors are systematically inverted to obtain a self-consistent model of instantaneous relative motions for eleven major plates. The inverse problem is linearized and solved iteratively by a maximum-likelihood procedure. Because the uncertainties in the data are small, Gaussian statistics are shown to be adequate. The use of a linear theory permits (1) the calculation of the uncertainties in the various angular velocity vectors caused by uncertainties in the data, and (2) quantitative examination of the distribution of information within the data set. The existence of a self-consistent model satisfying all the data is strong justification of the rigid plate assumption. Slow movement between North and South America is shown to be resolvable.

  13. Chern-Simons Term: Theory and Applications.

    NASA Astrophysics Data System (ADS)

    Gupta, Kumar Sankar

    1992-01-01

    We investigate the quantization and applications of Chern-Simons theories to several systems of interest. Elementary canonical methods are employed for the quantization of abelian and nonabelian Chern-Simons actions using ideas from gauge theories and quantum gravity. When the spatial slice is a disc, it yields quantum states at the edge of the disc carrying a representation of the Kac-Moody algebra. We next include sources in this model and their quantum states are shown to be those of a conformal family. Vertex operators for both abelian and nonabelian sources are constructed. The regularized abelian Wilson line is proved to be a vertex operator. The spin-statistics theorem is established for Chern-Simons dynamics using purely geometrical techniques. The Chern-Simons action is associated with exotic spin and statistics in 2 + 1 dimensions. We study several systems in which the Chern-Simons action affects the spin and statistics. The first class of systems we study consists of G/H models. The solitons of these models are shown to obey anyonic statistics in the presence of a Chern-Simons term. The second system deals with the effect of the Chern-Simons term in a model for high temperature superconductivity. The coefficient of the Chern-Simons term is shown to be quantized, one of its possible values giving fermionic statistics to the solitons of this model. Finally, we study a system of spinning particles interacting with 2 + 1 gravity, the latter being described by an ISO(2,1) Chern-Simons term. An effective action for the particles is obtained by integrating out the gauge fields. Next we construct operators which exchange the particles. They are shown to satisfy the braid relations. There are ambiguities in the quantization of this system which can be exploited to give anyonic statistics to the particles. We also point out that at the level of the first quantized theory, the usual spin-statistics relation need not apply to these particles.

  14. Providing peak river flow statistics and forecasting in the Niger River basin

    NASA Astrophysics Data System (ADS)

    Andersson, Jafet C. M.; Ali, Abdou; Arheimer, Berit; Gustafsson, David; Minoungou, Bernard

    2017-08-01

    Flooding is a growing concern in West Africa. Improved quantification of discharge extremes and associated uncertainties is needed to improve infrastructure design, and operational forecasting is needed to provide timely warnings. In this study, we use discharge observations, a hydrological model (Niger-HYPE) and extreme value analysis to estimate peak river flow statistics (e.g. the discharge magnitude with a 100-year return period) across the Niger River basin. To test the model's capacity to predict peak flows, we compared 30-year maximum discharge and peak flow statistics derived from the model vs. derived from nine observation stations. The results indicate that the model simulates peak discharge reasonably well (on average + 20%). However, the peak flow statistics have a large uncertainty range, which ought to be considered in infrastructure design. We then applied the methodology to derive basin-wide maps of peak flow statistics and their associated uncertainty. The results indicate that the method is applicable across the hydrologically active part of the river basin, and that the uncertainty varies substantially depending on location. Subsequently, we used the most recent bias-corrected climate projections to analyze potential changes in peak flow statistics in a changed climate. The results are generally ambiguous, with consistent changes only in very few areas. To test the forecasting capacity, we ran Niger-HYPE with a combination of meteorological data sets for the 2008 high-flow season and compared with observations. The results indicate reasonable forecasting capacity (on average 17% deviation), but additional years should also be evaluated. We finish by presenting a strategy and pilot project which will develop an operational flood monitoring and forecasting system based on in-situ data, earth observations, modelling, and extreme value statistics. In this way we aim to build capacity to ultimately improve resilience toward floods, protecting lives and infrastructure in the region.
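    A sketch of the extreme value step, assuming a hypothetical series of annual maximum discharges at one station: fit a GEV distribution, read off the 100-year return level, and bootstrap its uncertainty range.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(10)

# Hypothetical series of 30 annual maximum discharges (m3/s) at one gauging station.
annual_maxima = genextreme.rvs(c=-0.1, loc=1500, scale=400, size=30, random_state=rng)

# Fit a GEV distribution and estimate the 100-year return level (the discharge
# exceeded with probability 1/100 in any given year).
c, loc, scale = genextreme.fit(annual_maxima)
q100 = genextreme.ppf(1 - 1.0 / 100.0, c, loc=loc, scale=scale)

# A simple parametric bootstrap gives the uncertainty range emphasised in the paper.
boot = [genextreme.ppf(0.99, *genextreme.fit(
            genextreme.rvs(c, loc=loc, scale=scale, size=annual_maxima.size,
                           random_state=rng)))
        for _ in range(200)]
print(q100, np.percentile(boot, [5, 95]))
```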

  15. Effects of electric field methods on modeling the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics

    DOE PAGES

    Yu, Yiqun; Jordanova, Vania Koleva; Ridley, Aaron J.; ...

    2017-05-10

    Here, we report a self-consistent electric field coupling between the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics represented in a kinetic ring current model. This implementation in the model features another self-consistency in addition to its already existing self-consistent magnetic field coupling with plasma. The model is therefore named as Ring current-Atmosphere interaction Model with Self-Consistent magnetic (B) and electric (E) fields, or RAM-SCB-E. With this new model, we explore, by comparing with previously employed empirical Weimer potential, the impact of using self-consistent electric fields on the modeling of storm time global electric potential distribution, plasma sheet particle injection, and the subauroral polarization streams (SAPS) which heavily rely on the coupled interplay between the inner magnetosphere and midlatitude ionosphere. We find the following phenomena in the self-consistent model: (1) The spatially localized enhancement of electric field is produced within 2.5 < L < 4 during geomagnetic active time in the dusk-premidnight sector, with a similar dynamic penetration as found in statistical observations. (2) The electric potential contours show more substantial skewing toward the postmidnight than the Weimer potential, suggesting the resistance on the particles from directly injecting toward the low-L region. (3) The proton flux indeed indicates that the plasma sheet inner boundary at the dusk-premidnight sector is located further away from the Earth than in the Weimer potential, and a “tongue” of low-energy protons extends eastward toward the dawn, leading to the Harang reversal. (4) SAPS are reproduced in the subauroral region, and their magnitude and latitudinal width are in reasonable agreement with data.

  16. Effects of electric field methods on modeling the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Yiqun; Jordanova, Vania Koleva; Ridley, Aaron J.

    Here, we report a self-consistent electric field coupling between the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics represented in a kinetic ring current model. This implementation in the model features another self-consistency in addition to its already existing self-consistent magnetic field coupling with plasma. The model is therefore named as Ring current-Atmosphere interaction Model with Self-Consistent magnetic (B) and electric (E) fields, or RAM-SCB-E. With this new model, we explore, by comparing with previously employed empirical Weimer potential, the impact of using self-consistent electric fields on the modeling of storm time global electric potential distribution, plasma sheet particle injection, and the subauroral polarization streams (SAPS) which heavily rely on the coupled interplay between the inner magnetosphere and midlatitude ionosphere. We find the following phenomena in the self-consistent model: (1) The spatially localized enhancement of electric field is produced within 2.5 < L < 4 during geomagnetic active time in the dusk-premidnight sector, with a similar dynamic penetration as found in statistical observations. (2) The electric potential contours show more substantial skewing toward the postmidnight than the Weimer potential, suggesting the resistance on the particles from directly injecting toward the low-L region. (3) The proton flux indeed indicates that the plasma sheet inner boundary at the dusk-premidnight sector is located further away from the Earth than in the Weimer potential, and a “tongue” of low-energy protons extends eastward toward the dawn, leading to the Harang reversal. (4) SAPS are reproduced in the subauroral region, and their magnitude and latitudinal width are in reasonable agreement with data.

  17. Effects of electric field methods on modeling the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics

    NASA Astrophysics Data System (ADS)

    Yu, Yiqun; Jordanova, Vania K.; Ridley, Aaron J.; Toth, Gabor; Heelis, Roderick

    2017-05-01

    We report a self-consistent electric field coupling between the midlatitude ionospheric electrodynamics and inner magnetosphere dynamics represented in a kinetic ring current model. This implementation in the model features another self-consistency in addition to its already existing self-consistent magnetic field coupling with plasma. The model is therefore named as Ring current-Atmosphere interaction Model with Self-Consistent magnetic (B) and electric (E) fields, or RAM-SCB-E. With this new model, we explore, by comparing with previously employed empirical Weimer potential, the impact of using self-consistent electric fields on the modeling of storm time global electric potential distribution, plasma sheet particle injection, and the subauroral polarization streams (SAPS) which heavily rely on the coupled interplay between the inner magnetosphere and midlatitude ionosphere. We find the following phenomena in the self-consistent model: (1) The spatially localized enhancement of electric field is produced within 2.5 < L < 4 during geomagnetic active time in the dusk-premidnight sector, with a similar dynamic penetration as found in statistical observations. (2) The electric potential contours show more substantial skewing toward the postmidnight than the Weimer potential, suggesting the resistance on the particles from directly injecting toward the low-L region. (3) The proton flux indeed indicates that the plasma sheet inner boundary at the dusk-premidnight sector is located further away from the Earth than in the Weimer potential, and a "tongue" of low-energy protons extends eastward toward the dawn, leading to the Harang reversal. (4) SAPS are reproduced in the subauroral region, and their magnitude and latitudinal width are in reasonable agreement with data.

  18. Statistical steady states in turbulent droplet condensation

    NASA Astrophysics Data System (ADS)

    Bec, Jeremie; Krstulovic, Giorgio; Siewert, Christoph

    2017-11-01

    We investigate the general problem of turbulent condensation. Using direct numerical simulations we show that the fluctuations of the supersaturation field offer different conditions for the growth of droplets which evolve in time due to turbulent transport and mixing. This leads us to propose a Lagrangian stochastic model consisting of a set of integro-differential equations for the joint evolution of the squared radius and the supersaturation along droplet trajectories. The model has two parameters fixed by the total amount of water and the thermodynamic properties, as well as the Lagrangian integral timescale of the turbulent supersaturation. The model reproduces very well the droplet size distributions obtained from direct numerical simulations and their time evolution. A noticeable result is that, after a stage where the squared radius simply diffuses, the system converges exponentially fast to a statistical steady state independent of the initial conditions. The main mechanism involved in this convergence is a loss of memory induced by a significant number of droplets undergoing a complete evaporation before growing again. The statistical steady state is characterised by an exponential tail in the droplet mass distribution.
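    A minimal numerical sketch in the spirit of such a Lagrangian stochastic model is given below (Python). It is illustrative only: the Ornstein-Uhlenbeck supersaturation, the simple growth law dR² ∝ s dt, and every parameter value are assumptions, not the authors' integro-differential system.

    ```python
    import numpy as np

    # Illustrative sketch (not the authors' exact model): the supersaturation s(t)
    # seen by each droplet follows an Ornstein-Uhlenbeck process with integral
    # timescale tau_L, and the squared radius obeys dR^2 = A * s * dt, floored at
    # zero (complete evaporation); evaporated droplets can grow again when s > 0.
    rng = np.random.default_rng(0)

    n_drops, n_steps, dt = 2000, 10000, 1e-3
    tau_L, sigma_s, A = 1.0, 0.5, 1.0        # assumed parameters, illustration only

    s = np.zeros(n_drops)                    # supersaturation along each trajectory
    R2 = np.full(n_drops, 0.1)               # initial squared radius (arbitrary units)

    for _ in range(n_steps):
        # Euler-Maruyama step for the OU supersaturation
        s += -s * dt / tau_L + sigma_s * np.sqrt(2 * dt / tau_L) * rng.standard_normal(n_drops)
        # growth / evaporation of the squared radius, floored at zero
        R2 = np.maximum(R2 + A * s * dt, 0.0)

    # After a transient the R^2 distribution should forget the initial condition;
    # inspect e.g. its mean and upper quantiles.
    print(R2.mean(), np.quantile(R2, [0.5, 0.9, 0.99]))
    ```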

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lewis, John R.; Brooks, Dusty Marie

    In pressurized water reactors, the prevention, detection, and repair of cracks within dissimilar metal welds is essential to ensure proper plant functionality and safety. Weld residual stresses, which are difficult to model and cannot be directly measured, contribute to the formation and growth of cracks due to primary water stress corrosion cracking. Additionally, the uncertainty in weld residual stress measurements and modeling predictions is not well understood, further complicating the prediction of crack evolution. The purpose of this document is to develop methodology to quantify the uncertainty associated with weld residual stress that can be applied to modeling predictions and experimental measurements. Ultimately, the results can be used to assess the current state of uncertainty and to build confidence in both modeling and experimental procedures. The methodology consists of statistically modeling the variation in the weld residual stress profiles using functional data analysis techniques. Uncertainty is quantified using statistical bounds (e.g. confidence and tolerance bounds) constructed with a semi-parametric bootstrap procedure. Such bounds describe the range in which quantities of interest, such as means, are expected to lie as evidenced by the data. The methodology is extended to provide direct comparisons between experimental measurements and modeling predictions by constructing statistical confidence bounds for the average difference between the two quantities. The statistical bounds on the average difference can be used to assess the level of agreement between measurements and predictions. The methodology is applied to experimental measurements of residual stress obtained using two strain relief measurement methods and predictions from seven finite element models developed by different organizations during a round robin study.
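    The bootstrap-bound idea can be sketched as follows; this simplified version resamples whole measured profiles non-parametrically (the report's semi-parametric, functional-data version is more involved), and all data and dimensions are invented for illustration.

    ```python
    import numpy as np

    # Illustrative sketch: pointwise bootstrap confidence bounds for the mean
    # residual-stress profile, using non-parametric resampling of whole profiles.
    rng = np.random.default_rng(1)

    n_profiles, n_depths = 12, 50                 # e.g. 12 measured profiles on a depth grid
    depth = np.linspace(0.0, 1.0, n_depths)       # normalized through-wall depth (assumed)
    profiles = 300 * np.cos(2 * np.pi * depth) + 40 * rng.standard_normal((n_profiles, n_depths))

    n_boot = 2000
    boot_means = np.empty((n_boot, n_depths))
    for b in range(n_boot):
        idx = rng.integers(0, n_profiles, n_profiles)   # resample whole profiles with replacement
        boot_means[b] = profiles[idx].mean(axis=0)

    lower, upper = np.percentile(boot_means, [2.5, 97.5], axis=0)  # 95% pointwise bounds on the mean
    print(lower[:5], upper[:5])
    ```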

  20. Multivariate model of female black bear habitat use for a Geographic Information System

    USGS Publications Warehouse

    Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.

    1993-01-01

    Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
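    A compact sketch of the Mahalanobis-distance habitat statistic follows; the habitat variables, their values, and the grid size are simulated stand-ins, not the Ozark data.

    ```python
    import numpy as np

    # Sketch of the Mahalanobis-distance habitat statistic: the "ideal" is the mean
    # vector of habitat variables at animal (bear) locations; every grid cell is then
    # scored by its squared Mahalanobis distance to that ideal. All data are simulated.
    rng = np.random.default_rng(2)

    # habitat variables at radio locations (e.g. elevation, slope, distance to roads)
    use = rng.normal(loc=[800.0, 12.0, 1500.0], scale=[120.0, 4.0, 400.0], size=(300, 3))
    mu = use.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(use, rowvar=False))

    # the same variables for every cell of a (here tiny) GIS grid, flattened to rows
    cells = rng.normal(loc=[900.0, 10.0, 1000.0], scale=[200.0, 6.0, 800.0], size=(60 * 60, 3))

    diff = cells - mu
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)   # squared Mahalanobis distance per cell
    habitat_potential_map = d2.reshape(60, 60)           # small values = closer to the "ideal"
    print(habitat_potential_map.min(), habitat_potential_map.max())
    ```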

  1. Statistical analysis of cascading failures in power grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chertkov, Michael; Pfitzner, Rene; Turitsyn, Konstantin

    2010-12-01

    We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for the automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete-time and sequential resolution of individual failures. The model, in its simplest realization based on the Direct Current (DC) description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between the average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.

  2. Sequential Markov chain Monte Carlo filter with simultaneous model selection for electrocardiogram signal modeling.

    PubMed

    Edla, Shwetha; Kovvali, Narayan; Papandreou-Suppappola, Antonia

    2012-01-01

    Constructing statistical models of electrocardiogram (ECG) signals, whose parameters can be used for automated disease classification, is of great importance in precluding manual annotation and providing prompt diagnosis of cardiac diseases. ECG signals consist of several segments with different morphologies (namely the P wave, QRS complex and the T wave) in a single heart beat, which can vary across individuals and diseases. Also, existing statistical ECG models exhibit a reliance upon obtaining a priori information from the ECG data by using preprocessing algorithms to initialize the filter parameters, or to define the user-specified model parameters. In this paper, we propose an ECG modeling technique using the sequential Markov chain Monte Carlo (SMCMC) filter that can perform simultaneous model selection, by adaptively choosing from different representations depending upon the nature of the data. Our results demonstrate the ability of the algorithm to track various types of ECG morphologies, including intermittently occurring ECG beats. In addition, we use the estimated model parameters as the feature set to classify between ECG signals with normal sinus rhythm and four different types of arrhythmia.

  3. Capturing spatial and temporal patterns of widespread, extreme flooding across Europe

    NASA Astrophysics Data System (ADS)

    Busby, Kathryn; Raven, Emma; Liu, Ye

    2013-04-01

    Statistical characterisation of physical hazards is an integral part of probabilistic catastrophe models used by the reinsurance industry to estimate losses from large scale events. Extreme flood events are not restricted by country boundaries, which poses an issue for reinsurance companies as their exposures often extend beyond them. We discuss challenges and solutions that allow us to appropriately capture the spatial and temporal dependence of extreme hydrological events on a continental scale, which in turn enables us to generate an industry-standard stochastic event set for estimating financial losses for widespread flooding. By presenting our event set methodology, we focus on explaining how extreme value theory (EVT) and dependence modelling are used to account for short, inconsistent hydrological data from different countries, and how to make appropriate statistical decisions that best characterise the nature of flooding across Europe. The consistency of input data is of vital importance when identifying historical flood patterns. Collating data from numerous sources inherently causes inconsistencies, and we demonstrate our robust approach to assessing the data and refining it to compile a single consistent dataset. This dataset is then extrapolated using a parameterised EVT distribution to estimate extremes. Our method then captures the dependence of flood events across countries using an advanced multivariate extreme value model. Throughout, important statistical decisions are explored including: (1) distribution choice; (2) the threshold to apply for extracting extreme data points; (3) a regional analysis; (4) the definition of a flood event, which is often linked with the reinsurance industry's hours clause; and (5) handling of missing values. Finally, having modelled the historical patterns of flooding across Europe, we sample from this model to generate our stochastic event set comprising thousands of events over thousands of years. We then briefly illustrate how this is applied within a probabilistic model to estimate catastrophic loss curves used by the reinsurance industry.
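    The peaks-over-threshold step of such an EVT analysis might look like the following sketch, in which a generalized Pareto distribution is fitted to exceedances of a synthetic flow series; the threshold choice and the absence of declustering are simplifications.

    ```python
    import numpy as np
    from scipy import stats

    # Sketch of the peaks-over-threshold step: exceedances of daily river flow over a
    # high threshold are fitted with a generalized Pareto distribution (GPD).
    rng = np.random.default_rng(3)
    flow = rng.gamma(shape=2.0, scale=50.0, size=30 * 365)   # ~30 years of synthetic daily flow

    threshold = np.quantile(flow, 0.98)           # threshold choice: a key statistical decision
    excess = flow[flow > threshold] - threshold

    shape, loc, scale = stats.genpareto.fit(excess, floc=0.0)

    # Return level for a T-year return period, using the mean exceedance rate per year
    lam = excess.size / 30.0
    T = 100.0
    return_level = threshold + stats.genpareto.ppf(1 - 1.0 / (lam * T), shape, loc=0.0, scale=scale)
    print(shape, scale, return_level)
    ```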

  4. Comparison of HSPF and PRMS model simulated flows using different temporal and spatial scales in the Black Hills, South Dakota

    USGS Publications Warehouse

    Chalise, D. R.; Haj, Adel E.; Fontaine, T.A.

    2018-01-01

    The hydrological simulation program Fortran (HSPF) [Hydrological Simulation Program Fortran version 12.2 (Computer software). USEPA, Washington, DC] and the precipitation runoff modeling system (PRMS) [Precipitation Runoff Modeling System version 4.0 (Computer software). USGS, Reston, VA] models are semidistributed, deterministic hydrological tools for simulating the impacts of precipitation, land use, and climate on basin hydrology and streamflow. Both models have been applied independently to many watersheds across the United States. This paper reports the statistical results assessing various temporal (daily, monthly, and annual) and spatial (small versus large watershed) scale biases in HSPF and PRMS simulations using two watersheds in the Black Hills, South Dakota. The Nash-Sutcliffe efficiency (NSE), Pearson correlation coefficient (r), and coefficient of determination (R²) statistics for the daily, monthly, and annual flows were used to evaluate the models' performance. Results from the HSPF models showed that the HSPF consistently simulated the annual flows for both large and small basins better than the monthly and daily flows, and the simulated flows for the small watershed better than flows for the large watershed. In comparison, the PRMS model results show that the PRMS simulated the monthly flows for both the large and small watersheds better than the daily and annual flows, and the range of statistical error in the PRMS models was greater than that in the HSPF models. Moreover, it can be concluded that the statistical error in the HSPF and PRMS daily, monthly, and annual flow estimates for watersheds in the Black Hills was influenced by both temporal and spatial scale variability.
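    The three evaluation statistics are standard; a short sketch of how they can be computed for a pair of observed and simulated flow series is given below (the flow values are hypothetical).

    ```python
    import numpy as np

    def nse(obs, sim):
        """Nash-Sutcliffe efficiency: 1 is perfect, 0 means no better than the mean of obs."""
        obs, sim = np.asarray(obs, float), np.asarray(sim, float)
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

    def pearson_r(obs, sim):
        """Pearson correlation coefficient r between observed and simulated flows."""
        return np.corrcoef(obs, sim)[0, 1]

    # Hypothetical daily flows; R^2 is taken here as the square of Pearson's r,
    # which is how it is commonly reported alongside NSE in hydrologic studies.
    obs = np.array([10.0, 12.0, 30.0, 55.0, 22.0, 14.0, 11.0])
    sim = np.array([11.0, 13.0, 26.0, 60.0, 25.0, 12.0, 10.0])
    print(nse(obs, sim), pearson_r(obs, sim), pearson_r(obs, sim) ** 2)
    ```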

  5. Comparing the strength of behavioural plasticity and consistency across situations: animal personalities in the hermit crab Pagurus bernhardus

    PubMed Central

    Briffa, Mark; Rundle, Simon D; Fryer, Adam

    2008-01-01

    Many phenotypic traits show plasticity but behaviour is often considered the ‘most plastic’ aspect of phenotype as it is likely to show the quickest response to temporal changes in conditions or ‘situation’. However, it has also been noted that constraints on sensory acuity, cognitive structure and physiological capacities place limits on behavioural plasticity. Such limits to plasticity may generate consistent differences in behaviour between individuals from the same population. It has recently been suggested that these consistent differences in individual behaviour may be adaptive and the term ‘animal personalities’ has been used to describe them. In many cases, however, a degree of both behavioural plasticity and relative consistency is probable. To understand the possible functions of animal personalities, it is necessary to determine the relative strength of each tendency and this may be achieved by comparison of statistical effect sizes for tests of difference and concordance. Here, we describe a new statistical framework for making such comparisons and investigate cross-situational plasticity and consistency in the duration of startle responses in the European hermit crab Pagurus bernhardus, in the field and the laboratory. The effect sizes of tests for behavioural consistency were greater than for tests of behavioural plasticity, indicating for the first time the presence of animal personalities in a crustacean model. PMID:18331983

  6. A Method of Relating General Circulation Model Simulated Climate to the Observed Local Climate. Part I: Seasonal Statistics.

    NASA Astrophysics Data System (ADS)

    Karl, Thomas R.; Wang, Wei-Chyung; Schlesinger, Michael E.; Knight, Richard W.; Portman, David

    1990-10-01

    Important surface observations such as the daily maximum and minimum temperature, daily precipitation, and cloud ceilings often have localized characteristics that are difficult to reproduce with the current resolution and the physical parameterizations in state-of-the-art General Circulation climate Models (GCMs). Many of the difficulties can be partially attributed to mismatches in scale, local topography, regional geography and boundary conditions between models and surface-based observations. Here, we present a method, called climatological projection by model statistics (CPMS), to relate GCM grid-point free-atmosphere statistics, the predictors, to these important local surface observations. The method can be viewed as a generalization of the model output statistics (MOS) and perfect prog (PP) procedures used in numerical weather prediction (NWP) models. It consists of the application of three statistical methods: 1) principal component analysis (PCA), 2) canonical correlation, and 3) inflated regression analysis. The PCA reduces the redundancy of the predictors. The canonical correlation is used to develop simultaneous relationships between linear combinations of the predictors, the canonical variables, and the surface-based observations. Finally, inflated regression is used to relate the important canonical variables to each of the surface-based observed variables. We demonstrate that even an early version of the Oregon State University two-level atmospheric GCM (with prescribed sea surface temperature) produces free-atmosphere statistics that can, when standardized using the model's internal means and variances (the MOS-like version of CPMS), closely approximate the observed local climate. When the model data are standardized by the observed free-atmosphere means and variances (the PP version of CPMS), however, the model does not reproduce the observed surface climate as well. Our results indicate that in the MOS-like version of CPMS the differences between the output of a ten-year GCM control run and the surface-based observations are often smaller than the differences between the observations of two ten-year periods. Such positive results suggest that GCMs may already contain important climatological information that can be used to infer the local climate.
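    A rough sketch of the three CPMS steps on synthetic data is shown below; ordinary least squares stands in for the inflated regression of step 3, and all data shapes and values are assumptions.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cross_decomposition import CCA
    from sklearn.linear_model import LinearRegression

    # Sketch of the three CPMS steps with synthetic data: (1) PCA to reduce the
    # redundancy of free-atmosphere predictors, (2) canonical correlation between the
    # retained components and the surface observations, (3) regression of each surface
    # variable on the leading canonical variables. Ordinary regression stands in for
    # the "inflated" regression that restores variance lost by least squares.
    rng = np.random.default_rng(4)
    n = 3650                                      # e.g. ten years of daily data
    predictors = rng.standard_normal((n, 20))     # grid-point free-atmosphere statistics
    surface = predictors[:, :3] @ rng.standard_normal((3, 2)) + 0.5 * rng.standard_normal((n, 2))

    pcs = PCA(n_components=5).fit_transform(predictors)      # step 1
    cca = CCA(n_components=2).fit(pcs, surface)               # step 2
    u, _ = cca.transform(pcs, surface)                        # canonical variables

    reg = LinearRegression().fit(u, surface)                  # step 3 (ordinary, not inflated)
    print(reg.score(u, surface))
    ```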

  7. The different varieties of the Suyama-Yamaguchi consistency relation and its violation as a signal of statistical inhomogeneity

    NASA Astrophysics Data System (ADS)

    Rodríguez, Yeinzon; Beltrán Almeida, Juan P.; Valenzuela-Toledo, César A.

    2013-04-01

    We present the different consistency relations that can be seen as variations of the well-known Suyama-Yamaguchi (SY) consistency relation τNL ≥ ((6/5)fNL)², the latter involving the levels of non-gaussianity fNL and τNL in the primordial curvature perturbation ζ. It has been (implicitly) claimed that the following variation, τNL(k1,k3) ≥ (6/5)² fNL(k1) fNL(k3), which we call “the fourth variety”, in the collapsed (for τNL) and squeezed (for fNL) limits is always satisfied independently of any physics; however, the proof depends sensitively on the assumption of scale invariance (expressing in this way the fourth variety of the SY consistency relation as τNL ≥ ((6/5)fNL)²), which only applies for cosmological models involving Lorentz-invariant scalar fields (at least at tree level), leaving room for a strong violation of this variety of the consistency relation when non-trivial degrees of freedom, for instance vector fields, are in charge of the generation of the primordial curvature perturbation. With this as motivation, we explicitly state, in the first part of this work, under which conditions the SY consistency relation has been claimed to hold in its different varieties (implicitly) presented in the literature since its inception back in 2008; as a result, we show for the first time that the variety τNL(k1,k1) ≥ ((6/5)fNL(k1))², which we call “the fifth variety”, is always satisfied even when there is strong scale dependence and high levels of statistical anisotropy, as long as statistical homogeneity holds: thus, an observed violation of this specific variety would prevent the comparison between theory and observation, shaking in this way the foundations of cosmology as a science. In the second part, we turn to the existence of non-trivial degrees of freedom, concretely vector fields, for which the levels of non-gaussianity have been calculated for very few models; among them, and by making use of the δN formalism at tree level, we study a class of models that includes the vector curvaton scenario, vector inflation, and the hybrid inflation with coupled vector and scalar “waterfall” fields where ζ is generated at the end of inflation, finding that the fourth variety of the SY consistency relation is indeed strongly violated for some specific wavevector configurations while the fifth variety continues to be well satisfied. Finally, as a byproduct of our investigation, we draw attention to a quite recently demonstrated variety of the SY consistency relation, τNL^iso ≥ ((6/5)fNL^iso)², in scenarios where scalar and vector fields contribute to the generation of the primordial curvature perturbation; this variety of the SY consistency relation is satisfied although the isotropic pieces of the non-gaussianity parameters receive contributions from the vector fields. We discuss further implications for observational cosmology.

  8. Applications of the DOE/NASA wind turbine engineering information system

    NASA Technical Reports Server (NTRS)

    Neustadter, H. E.; Spera, D. A.

    1981-01-01

    A statistical analysis of data obtained from the Technology and Engineering Information Systems was made. The systems analyzed consist of the following elements: (1) sensors which measure critical parameters (e.g., wind speed and direction, output power, blade loads and component vibrations); (2) remote multiplexing units (RMUs) on each wind turbine which frequency-modulate, multiplex and transmit sensor outputs; (3) on-site instrumentation to record, process and display the sensor output; and (4) statistical analysis of data. Two examples of the capabilities of these systems are presented. The first illustrates the standardized format for application of statistical analysis to each directly measured parameter. The second shows the use of a model to estimate the variability of the rotor thrust loading, which is a derived parameter.

  9. Statistical auditing and randomness test of lotto k/N-type games

    NASA Astrophysics Data System (ADS)

    Coronel-Brizio, H. F.; Hernández-Montoya, A. R.; Rapallo, F.; Scalas, E.

    2008-11-01

    One of the most popular lottery games worldwide is the so-called “lotto k/N”. It considers N numbers 1,2,…,N from which k are drawn randomly, without replacement. A player selects k or more numbers and the first prize is shared amongst those players whose selected numbers match all of the k randomly drawn. Exact rules may vary in different countries. In this paper, mean values and covariances for the random variables representing the numbers drawn from this kind of game are presented, with the aim of using them to audit statistically the consistency of a given sample of historical results with theoretical values coming from a hypergeometric statistical model. The method can be adapted to test pseudorandom number generators.
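    A simple Monte Carlo audit in this spirit can be sketched as follows; the "historical" draws are themselves simulated here, so the test should not reject the hypergeometric-type model.

    ```python
    import numpy as np

    # Sketch of a simple audit for a lotto k/N game: the mean of all numbers appearing
    # in the historical draws is compared with its Monte Carlo distribution under the
    # theoretical model (k of N drawn uniformly without replacement).
    rng = np.random.default_rng(5)
    N, k, n_draws = 49, 6, 1000

    def simulate_draws(n):
        """n draws of k distinct numbers from 1..N, as an (n, k) array."""
        return rng.random((n, N)).argsort(axis=1)[:, :k] + 1

    historical = simulate_draws(n_draws)
    observed_mean = historical.mean()             # model expectation is (N + 1) / 2 = 25

    sim_means = np.array([simulate_draws(n_draws).mean() for _ in range(2000)])
    # two-sided Monte Carlo p-value for the departure of the observed mean from (N + 1) / 2
    p = np.mean(np.abs(sim_means - (N + 1) / 2) >= np.abs(observed_mean - (N + 1) / 2))
    print(observed_mean, p)
    ```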

  10. Influences of credibility of testimony and strength of statistical evidence on children’s and adolescents’ reasoning

    PubMed Central

    Kail, Robert V.

    2013-01-01

    According to dual-process models that include analytic and heuristic modes of processing, analytic processing is often expected to become more common with development. Consistent with this view, on reasoning problems, adolescents are more likely than children to select alternatives that are backed by statistical evidence. It is shown here that this pattern depends on the quality of the statistical evidence and the quality of the testimonial that is the typical alternative to statistical evidence. In Experiment 1, 9- and 13-year-olds (N = 64) were presented with scenarios in which solid statistical evidence was contrasted with casual or expert testimonial evidence. When testimony was casual, children relied on it but adolescents did not; when testimony was expert, both children and adolescents relied on it. In Experiment 2, 9- and 13-year-olds (N = 83) were presented with scenarios in which casual testimonial evidence was contrasted with weak or strong statistical evidence. When statistical evidence was weak, children and adolescents relied on both testimonial and statistical evidence; when statistical evidence was strong, most children and adolescents relied on it. Results are discussed in terms of their implications for dual-process accounts of cognitive development. PMID:23735681

  11. Eulerian and Lagrangian Parameterization of the Oceanic Mixed Layer using Large Eddy Simulation and MPAS-Ocean

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Van Roekel, Luke

    We have conducted a suite of Large Eddy Simulations (LES) to form the basis of a multi-model comparison. The results have led to proposed model improvements. We have verified that Eulerian-Lagrangian effective diffusivity estimates of mesoscale mixing are consistent with traditional particle statistics metrics. LES and Lagrangian particles will be utilized to better represent the movement of water into and out of the mixed layer.

  12. Evaluation of NMME temperature and precipitation bias and forecast skill for South Asia

    NASA Astrophysics Data System (ADS)

    Cash, Benjamin A.; Manganello, Julia V.; Kinter, James L.

    2017-08-01

    Systematic error and forecast skill for temperature and precipitation in two regions of Southern Asia are investigated using hindcasts initialized May 1 from the North American Multi-Model Ensemble. We focus on two contiguous but geographically and dynamically diverse regions: the Extended Indian Monsoon Rainfall (70-100E, 10-30 N) and the nearby mountainous area of Pakistan and Afghanistan (60-75E, 23-39 N). Forecast skill is assessed using the Sign test framework, a rigorous statistical method that can be applied to non-Gaussian variables such as precipitation and to different ensemble sizes without introducing bias. We find that models show significant systematic error in both precipitation and temperature for both regions. The multi-model ensemble mean (MMEM) consistently yields the lowest systematic error and the highest forecast skill for both regions and variables. However, we also find that the MMEM consistently provides a statistically significant increase in skill over climatology only in the first month of the forecast. While the MMEM tends to provide higher overall skill than climatology later in the forecast, the differences are not significant at the 95% level. We also find that MMEMs constructed with a relatively small number of ensemble members per model can equal or outperform MMEMs constructed with more members in skill. This suggests some ensemble members either provide no contribution to overall skill or even detract from it.
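    A generic sign-test sketch (not the specific framework used in the paper) for comparing forecast and climatology errors might look like this; the error series are synthetic.

    ```python
    import numpy as np
    from scipy.stats import binomtest

    # Sketch of a sign-test skill comparison: count the years in which the forecast
    # beats a climatology baseline (smaller absolute error) and test whether that
    # count is consistent with a fair coin.
    rng = np.random.default_rng(6)
    n_years = 30
    err_forecast = np.abs(rng.normal(0.0, 0.8, n_years))   # hypothetical absolute errors
    err_climo = np.abs(rng.normal(0.0, 1.0, n_years))

    wins = int(np.sum(err_forecast < err_climo))            # years the forecast "wins"
    result = binomtest(wins, n=n_years, p=0.5, alternative='greater')
    print(wins, result.pvalue)                              # small p-value => skill over climatology
    ```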

  13. Statistics of polymer extensions in turbulent channel flow.

    PubMed

    Bagheri, Faranggis; Mitra, Dhrubaditya; Perlekar, Prasad; Brandt, Luca

    2012-11-01

    We present direct numerical simulations of turbulent channel flow with passive Lagrangian polymers. To understand the polymer behavior we investigate the behavior of infinitesimal line elements and calculate the probability distribution function (PDF) of finite-time Lyapunov exponents and, from them, the corresponding Cramer's function for the channel flow. We study the statistics of polymer elongation for both the Oldroyd-B model (for Weissenberg number Wi < 1) and the FENE model. We use the location of the minima of the Cramer's function to define the Weissenberg number precisely such that we observe the coil-stretch transition at Wi ≈ 1. We find agreement with earlier analytical predictions for the PDF of polymer extensions made by Balkovsky, Fouxon, and Lebedev [Phys. Rev. Lett. 84, 4765 (2000)] for linear polymers (Oldroyd-B model) with Wi < 1 and by Chertkov [Phys. Rev. Lett. 84, 4761 (2000)] for the nonlinear FENE-P model of polymers. For Wi > 1 (FENE model) the polymers are significantly more stretched near the wall than at the center of the channel, where the flow is closer to homogeneous isotropic turbulence. Furthermore, near the wall the polymers show a strong tendency to orient along the streamwise direction of the flow, but near the center line the statistics of orientation of the polymers is consistent with analogous results obtained recently in homogeneous and isotropic flows.

  14. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts.

    PubMed

    Preisser, John S; Long, D Leann; Stamm, John W

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. © 2017 S. Karger AG, Basel.

  15. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts

    PubMed Central

    Preisser, John S.; Long, D. Leann; Stamm, John W.

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two datasets, one consisting of fictional dmft counts in two groups and the other on DMFS among schoolchildren from a randomized clinical trial (RCT) comparing three toothpaste formulations to prevent incident dental caries, are analysed with negative binomial hurdle (NBH), zero-inflated negative binomial (ZINB), and marginalized zero-inflated negative binomial (MZINB) models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the RCT were similar despite their distinctive interpretations. Choice of statistical model class should match the study’s purpose, while accounting for the broad decline in children’s caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. PMID:28291962

  16. Order statistics applied to the most massive and most distant galaxy clusters

    NASA Astrophysics Data System (ADS)

    Waizmann, J.-C.; Ettori, S.; Bartelmann, M.

    2013-06-01

    In this work, we present an analytic framework for calculating the individual and joint distributions of the nth most massive or nth highest redshift galaxy cluster for a given survey characteristic, allowing us to formulate Λ cold dark matter (ΛCDM) exclusion criteria. We show that the cumulative distribution functions steepen with increasing order, giving them a higher constraining power with respect to the extreme value statistics. Additionally, we find that the order statistics in mass (being dominated by clusters at lower redshifts) is sensitive to the matter density and the normalization of the matter fluctuations, whereas the order statistics in redshift is particularly sensitive to the geometric evolution of the Universe. For a fixed cosmology, both order statistics are efficient probes of the functional shape of the mass function at the high-mass end. To allow a quick assessment of both order statistics, we provide fits as a function of the survey area that allow percentile estimation with an accuracy better than 2 per cent. Furthermore, we discuss the joint distributions in the two-dimensional case and find that for the combination of the largest and the second largest observation, the two are most likely to be realized with similar values, with a broadly peaked distribution. When combining the largest observation with higher orders, a larger gap between the observations becomes more likely, and when combining higher orders in general, the joint probability density function peaks more strongly. Having introduced the theory, we apply the order statistical analysis to the South Pole Telescope (SPT) massive cluster sample and the Meta-Catalogue of X-ray detected Clusters of galaxies (MCXC) and find that the 10 most massive clusters in the sample are consistent with ΛCDM and the Tinker mass function. For the order statistics in redshift, we find a discrepancy between the data and the theoretical distributions, which could in principle indicate a deviation from the standard cosmology. However, we attribute this deviation to the uncertainty in the modelling of the SPT survey selection function. In turn, by assuming the ΛCDM reference cosmology, order statistics can also be utilized for consistency checks of the completeness of the observed sample and of the modelling of the survey selection function.
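    Under the common assumption that cluster counts above a mass threshold are Poisson distributed (which need not be the paper's exact formalism), the distribution of the nth most massive object follows directly, as in this sketch.

    ```python
    import numpy as np
    from scipy.stats import poisson

    # If the number of clusters above mass m in the survey area is Poisson with
    # expectation lam(m), the nth most massive cluster exceeds m exactly when at
    # least n clusters lie above m, so P(M_(n) > m) = 1 - CDF_Poisson(n - 1; lam(m)).
    def prob_nth_exceeds(lam_of_m, n):
        """P(nth most massive cluster exceeds m), given the expected count lam(m) above m."""
        return 1.0 - poisson.cdf(n - 1, lam_of_m)

    # toy expected-count curve: lam(m) falling steeply with mass (arbitrary units)
    masses = np.linspace(1.0, 3.0, 5)
    lam = 50.0 * np.exp(-3.0 * (masses - 1.0))

    for n in (1, 2, 10):
        print(n, [round(prob_nth_exceeds(l, n), 3) for l in lam])
    ```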

  17. Artificial neural network models for prediction of cardiovascular autonomic dysfunction in general Chinese population

    PubMed Central

    2013-01-01

    Background The present study aimed to develop an artificial neural network (ANN) based prediction model for cardiovascular autonomic (CA) dysfunction in the general population. Methods We analyzed a previous dataset based on a population sample consisting of 2,092 individuals aged 30–80 years. The prediction models were derived from an exploratory set using ANN analysis. Performances of these prediction models were evaluated in the validation set. Results Univariate analysis indicated that 14 risk factors showed statistically significant associations with CA dysfunction (P < 0.05). The mean area under the receiver-operating curve was 0.762 (95% CI 0.732–0.793) for the prediction model developed using ANN analysis. The mean sensitivity, specificity, and positive and negative predictive values of the prediction models were 0.751, 0.665, 0.330 and 0.924, respectively. All HL statistics were less than 15.0. Conclusion ANN is an effective tool for developing prediction models with high value for predicting CA dysfunction among the general population. PMID:23902963
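    A generic sketch of such an ANN risk-prediction workflow on synthetic data is shown below; the variables, network size, and split are assumptions, not the study's.

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.metrics import roc_auc_score

    # Generic ANN risk-prediction sketch on synthetic data: fit a small multilayer
    # perceptron on an exploratory set and report the ROC AUC on a held-out set.
    rng = np.random.default_rng(7)
    X = rng.standard_normal((2092, 14))                        # 14 candidate risk factors
    logit = X[:, :4] @ np.array([0.8, -0.6, 0.5, 0.4]) - 1.0
    y = (rng.random(2092) < 1 / (1 + np.exp(-logit))).astype(int)

    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0))
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_va, model.predict_proba(X_va)[:, 1])
    print(f"validation AUC = {auc:.3f}")
    ```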

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, Kandler; Shi, Ying; Santhanagopalan, Shriram

    Predictive models of Li-ion battery lifetime must consider a multiplicity of electrochemical, thermal, and mechanical degradation modes experienced by batteries in application environments. To complicate matters, Li-ion batteries can experience different degradation trajectories that depend on storage and cycling history of the application environment. Rates of degradation are controlled by factors such as temperature history, electrochemical operating window, and charge/discharge rate. We present a generalized battery life prognostic model framework for battery systems design and control. The model framework consists of trial functions that are statistically regressed to Li-ion cell life datasets wherein the cells have been aged under different levels of stress. Degradation mechanisms and rate laws dependent on temperature, storage, and cycling condition are regressed to the data, with multiple model hypotheses evaluated and the best model down-selected based on statistics. The resulting life prognostic model, implemented in state variable form, is extensible to arbitrary real-world scenarios. The model is applicable in real-time control algorithms to maximize battery life and performance. We discuss efforts to reduce lifetime prediction error and accommodate its inevitable impact in controller design.
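    One way such a trial-function regression can be sketched is with a simple calendar-fade form fitted by nonlinear least squares; the functional form, parameters, and data below are invented for illustration and are not taken from the report.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    # Illustrative trial-function regression: relative capacity fade modelled as
    # q(t, T) = b * exp(-Ea/R * (1/T - 1/Tref)) * sqrt(t), a common calendar-fade
    # form with an Arrhenius temperature dependence. Data are synthetic.
    R_GAS, T_REF = 8.314, 298.15

    def fade(X, b, Ea):
        t, T = X                                  # time in days, temperature in kelvin
        return b * np.exp(-Ea / R_GAS * (1.0 / T - 1.0 / T_REF)) * np.sqrt(t)

    rng = np.random.default_rng(8)
    t = np.tile(np.arange(1, 201, dtype=float), 3)            # 200 days at 3 temperatures
    T = np.repeat([298.15, 318.15, 333.15], 200)
    y = fade((t, T), 0.004, 40000.0) + 0.002 * rng.standard_normal(t.size)

    popt, pcov = curve_fit(fade, (t, T), y, p0=[0.01, 30000.0])
    print(popt, np.sqrt(np.diag(pcov)))           # fitted b, Ea and their standard errors
    ```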

  19. 3Drefine: an interactive web server for efficient protein structure refinement

    PubMed Central

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-01-01

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, provided example submission and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. PMID:27131371

  20. Are Gender Differences in Perceived and Demonstrated Technology Literacy Significant? It Depends on the Model

    ERIC Educational Resources Information Center

    Hohlfeld, Tina N.; Ritzhaupt, Albert D.; Barron, Ann E.

    2013-01-01

    This paper examines gender differences related to Information and Communication Technology (ICT) literacy using two valid and internally consistent measures with eighth grade students (N = 1,513) from Florida public schools. The results of t test statistical analyses, which examined only gender differences in demonstrated and perceived ICT skills,…

  1. Discriminative Random Field Models for Subsurface Contamination Uncertainty Quantification

    NASA Astrophysics Data System (ADS)

    Arshadi, M.; Abriola, L. M.; Miller, E. L.; De Paolis Kaluza, C.

    2017-12-01

    Application of flow and transport simulators for prediction of the release, entrapment, and persistence of dense non-aqueous phase liquids (DNAPLs) and associated contaminant plumes is a computationally intensive process that requires specification of a large number of material properties and hydrologic/chemical parameters. Given its computational burden, this direct simulation approach is particularly ill-suited for quantifying both the expected performance and uncertainty associated with candidate remediation strategies under real field conditions. Prediction uncertainties primarily arise from limited information about contaminant mass distributions, as well as the spatial distribution of subsurface hydrologic properties. Application of direct simulation to quantify uncertainty would, thus, typically require simulating multiphase flow and transport for a large number of permeability and release scenarios to collect statistics associated with remedial effectiveness, a computationally prohibitive process. The primary objective of this work is to develop and demonstrate a methodology that employs measured field data to produce equi-probable stochastic representations of a subsurface source zone that capture the spatial distribution and uncertainty associated with key features that control remediation performance (i.e., permeability and contamination mass). Here we employ probabilistic models known as discriminative random fields (DRFs) to synthesize stochastic realizations of initial mass distributions consistent with known, and typically limited, site characterization data. Using a limited number of full scale simulations as training data, a statistical model is developed for predicting the distribution of contaminant mass (e.g., DNAPL saturation and aqueous concentration) across a heterogeneous domain. Monte-Carlo sampling methods are then employed, in conjunction with the trained statistical model, to generate realizations conditioned on measured borehole data. Performance of the statistical model is illustrated through comparisons of generated realizations with the ‘true’ numerical simulations. Finally, we demonstrate how these realizations can be used to determine statistically optimal locations for further interrogation of the subsurface.

  2. Statistical Downscaling of General Circulation Model Outputs to Precipitation Accounting for Non-Stationarities in Predictor-Predictand Relationships

    PubMed Central

    Sachindra, D. A.; Perera, B. J. C.

    2016-01-01

    This paper presents a novel approach to incorporate the non-stationarities characterised in the GCM outputs, into the Predictor-Predictand Relationships (PPRs) in statistical downscaling models. In this approach, a series of 42 PPRs based on multi-linear regression (MLR) technique were determined for each calendar month using a 20-year moving window moved at a 1-year time step on the predictor data obtained from the NCEP/NCAR reanalysis data archive and observations of precipitation at 3 stations located in Victoria, Australia, for the period 1950–2010. Then the relationships between the constants and coefficients in the PPRs and the statistics of reanalysis data of predictors were determined for the period 1950–2010, for each calendar month. Thereafter, using these relationships with the statistics of the past data of HadCM3 GCM pertaining to the predictors, new PPRs were derived for the periods 1950–69, 1970–89 and 1990–99 for each station. This process yielded a non-stationary downscaling model consisting of a PPR per calendar month for each of the above three periods for each station. The non-stationarities in the climate are characterised by the long-term changes in the statistics of the climate variables and above process enabled relating the non-stationarities in the climate to the PPRs. These new PPRs were then used with the past data of HadCM3, to reproduce the observed precipitation. It was found that the non-stationary MLR based downscaling model was able to produce more accurate simulations of observed precipitation more often than conventional stationary downscaling models developed with MLR and Genetic Programming (GP). PMID:27997609

  3. Statistical Downscaling of General Circulation Model Outputs to Precipitation Accounting for Non-Stationarities in Predictor-Predictand Relationships.

    PubMed

    Sachindra, D A; Perera, B J C

    2016-01-01

    This paper presents a novel approach to incorporate the non-stationarities characterised in the GCM outputs, into the Predictor-Predictand Relationships (PPRs) in statistical downscaling models. In this approach, a series of 42 PPRs based on multi-linear regression (MLR) technique were determined for each calendar month using a 20-year moving window moved at a 1-year time step on the predictor data obtained from the NCEP/NCAR reanalysis data archive and observations of precipitation at 3 stations located in Victoria, Australia, for the period 1950-2010. Then the relationships between the constants and coefficients in the PPRs and the statistics of reanalysis data of predictors were determined for the period 1950-2010, for each calendar month. Thereafter, using these relationships with the statistics of the past data of HadCM3 GCM pertaining to the predictors, new PPRs were derived for the periods 1950-69, 1970-89 and 1990-99 for each station. This process yielded a non-stationary downscaling model consisting of a PPR per calendar month for each of the above three periods for each station. The non-stationarities in the climate are characterised by the long-term changes in the statistics of the climate variables and above process enabled relating the non-stationarities in the climate to the PPRs. These new PPRs were then used with the past data of HadCM3, to reproduce the observed precipitation. It was found that the non-stationary MLR based downscaling model was able to produce more accurate simulations of observed precipitation more often than conventional stationary downscaling models developed with MLR and Genetic Programming (GP).
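    The moving-window regression idea described in the two records above can be sketched as follows; the predictors, the 20-year window, and the trend in the coefficients are all synthetic (61 years with a 20-year window stepped by 1 year gives the 42 windows mentioned in the abstracts).

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Sketch of the moving-window idea: for each 20-year window (stepped by 1 year) a
    # multi-linear regression links large-scale predictors to station precipitation,
    # so the regression constants and coefficients themselves become a time series
    # that can later be related to predictor statistics. All data are simulated.
    rng = np.random.default_rng(9)
    years = np.arange(1950, 2011)
    n_pred = 4
    predictors = rng.standard_normal((years.size, n_pred))
    true_coef = np.array([1.0, 0.5, -0.3, 0.2]) * (1 + 0.01 * (years - 1950))[:, None]
    precip = np.sum(true_coef * predictors, axis=1) + 0.3 * rng.standard_normal(years.size)

    window = 20
    coef_series = []
    for start in range(years.size - window + 1):
        sl = slice(start, start + window)
        reg = LinearRegression().fit(predictors[sl], precip[sl])
        coef_series.append(np.r_[reg.intercept_, reg.coef_])

    coef_series = np.array(coef_series)           # one PPR (constant + coefficients) per window
    print(coef_series.shape, coef_series[:3, :2])
    ```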

  4. Direct computational approach to lattice supersymmetric quantum mechanics

    NASA Astrophysics Data System (ADS)

    Kadoh, Daisuke; Nakayama, Katsumasa

    2018-07-01

    We study the lattice supersymmetric models numerically using the transfer matrix approach. This method consists only of deterministic processes and has no statistical uncertainties. We improve it by performing a scale transformation of variables such that the Witten index is correctly reproduced from the lattice model, and the other prescriptions are shown in detail. Compared to the previous Monte Carlo results, we can estimate the effective masses, the SUSY Ward identity and the cut-off dependence of the results with high precision. Such information is useful in improving lattice formulations of supersymmetric models.

  5. Numerical solution for weight reduction model due to health campaigns in Spain

    NASA Astrophysics Data System (ADS)

    Mohammed, Maha A.; Noor, Noor Fadiya Mohd; Siri, Zailan; Ibrahim, Adriana Irawati Nur

    2015-10-01

    A transition model between three subpopulations based on the Body Mass Index of the Valencia community in Spain is considered. It is assumed that population nutritional habits and public health strategies on weight reduction remain unchanged until 2030. The system of ordinary differential equations is solved using a higher-order Runge-Kutta method. The numerical results obtained are compared with the predicted values of subpopulation proportions based on statistical estimation in 2013, 2015 and 2030. The relative approximate error is calculated. The consistency of the Runge-Kutta method in solving the model is discussed.
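    A sketch of such a three-subpopulation transition model solved with a higher-order Runge-Kutta scheme is given below; the transition rates and initial proportions are invented, not the study's values.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    # Sketch of a three-subpopulation transition model (normal weight N, overweight O,
    # obese B as population fractions) integrated with a high-order Runge-Kutta
    # method. The transition rates below are illustrative assumptions only.
    def rhs(t, y, a_no=0.02, a_ob=0.015, r_on=0.008, r_bo=0.005):
        N, O, B = y
        dN = -a_no * N + r_on * O
        dO = a_no * N - (a_ob + r_on) * O + r_bo * B
        dB = a_ob * O - r_bo * B
        return [dN, dO, dB]

    sol = solve_ivp(rhs, t_span=(2013, 2030), y0=[0.45, 0.38, 0.17],
                    method="DOP853", dense_output=True)   # 8th-order Runge-Kutta method
    for year in (2013, 2015, 2030):
        N, O, B = sol.sol(year)
        print(year, round(N, 3), round(O, 3), round(B, 3))
    ```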

  6. Association of a four-locus gene model including IL13, IL4, FCER1B, and ADRB2 with the Asthma Predictive Index and atopy in Chinese Han children.

    PubMed

    Bai, S; Hua, L; Wang, X; Liu, Q; Bao, Y

    2018-05-11

    Asthma is a complex and heterogeneous disease. We found gene-gene interactions among IL13 rs20541, IL4 rs2243250, ADRB2 rs1042713, and FCER1B rs569108 in asthmatic children of Chinese Han nationality. This four-locus set constituted an optimal statistical interaction model. Objective: This study examined associations of the four-gene model consisting of IL13, IL4, FCER1B, and ADRB2 with the Asthma Predictive Index (API) and atopy in Chinese Han children. Four single-nucleotide polymorphisms (SNPs) in the four genes were genotyped in 385 preschool children with wheezing symptoms using matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Student's t test and χ² tests were used for this analysis. Significant correlations were found between the four-locus gene model and the stringent and loose API (both P < 0.0001). Additionally, a high-risk asthma genotype was a risk factor for a positive API (stringent API: OR = 4.08; loose API: OR = 2.36). We also found a statistically significant association of the four-locus gene model with atopy (P < 0.01, OR = 2.09). Our results indicated that the four-locus gene model consisting of IL13 rs20541, IL4 rs2243250, ADRB2 rs1042713 and FCER1B rs569108 was associated with the API and atopy. These findings provide evidence of a gene model for determining a high risk of developing asthma and atopy in Chinese Han children.

  7. Navigation analysis for Viking 1979, option B

    NASA Technical Reports Server (NTRS)

    Mitchell, P. H.

    1971-01-01

    A parametric study performed for 48 trans-Mars reference missions in support of the Viking program is reported. The launch dates cover several months in the year 1979, and each launch date has multiple arrival dates in 1980. A plot of launch versus arrival dates with case numbers designated for reference purposes is included. The analysis consists of the computation of statistical covariance matrices based on certain assumptions about the ground-based tracking systems. The error model statistics are listed in tables. Tracking systems were assumed at three sites: Goldstone, California; Canberra, Australia; and Madrid, Spain. The tracking data consisted of range and Doppler measurements taken during the tracking intervals starting at E-30(d) and ending at E-10(d) for the control data and ending at E-18(h) for the knowledge data. The control and knowledge covariance matrices were delivered to the Planetary Mission Analysis Branch for inputs into a delta V dispersion analysis.

  8. Mortality of aircraft maintenance workers exposed to trichloroethylene and other hydrocarbons and chemicals: extended follow up

    PubMed Central

    Radican, Larry; Blair, Aaron; Stewart, Patricia; Wartenberg, Daniel

    2009-01-01

    Objective To extend follow-up of 14,455 workers from 1990 to 2000, and evaluate mortality risk from exposure to trichloroethylene (TCE) and other chemicals. Methods Multivariable Cox models were used to estimate relative risk for exposed vs. unexposed workers based on previously developed exposure surrogates. Results Among TCE exposed workers, there was no statistically significant increased risk of all-cause mortality (RR=1.04) or death from all cancers (RR=1.03). Exposure-response gradients for TCE were relatively flat and did not materially change since 1990. Statistically significant excesses were found for several chemical exposure subgroups and causes, and were generally consistent with the previous follow up. Conclusions Patterns of mortality have not changed substantially since 1990. While positive associations with several cancers were observed, and are consistent with the published literature, interpretation is limited due to the small numbers of events for specific exposures. PMID:19001957
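    A generic multivariable Cox fit can be sketched with the lifelines package (an assumed choice) and its bundled Rossi recidivism dataset as a stand-in for the cohort and exposure surrogates, which are not reproduced here.

    ```python
    from lifelines import CoxPHFitter
    from lifelines.datasets import load_rossi

    # Generic multivariable Cox proportional-hazards sketch: the exponentiated
    # coefficients (hazard ratios) play the role of the relative risks reported for
    # exposed vs. unexposed workers.
    df = load_rossi()                      # columns: week (follow-up time), arrest (event), covariates
    cph = CoxPHFitter()
    cph.fit(df, duration_col="week", event_col="arrest")
    cph.print_summary()                    # coefficients, hazard ratios, 95% CIs, p-values
    print(cph.hazard_ratios_)              # exp(coef): relative risk associated with each covariate
    ```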

  9. A Finite-Volume "Shaving" Method for Interfacing NASA/DAO's Physical Space Statistical Analysis System to the Finite-Volume GCM with a Lagrangian Control-Volume Vertical Coordinate

    NASA Technical Reports Server (NTRS)

    Lin, Shian-Jiann; DaSilva, Arlindo; Atlas, Robert (Technical Monitor)

    2001-01-01

    Toward the development of a finite-volume Data Assimilation System (fvDAS), a consistent finite-volume methodology is developed for interfacing the NASA/DAO's Physical Space Statistical Analysis System (PSAS) to the joint NASA/NCAR finite volume CCM3 (fvCCM3). To take advantage of the Lagrangian control-volume vertical coordinate of the fvCCM3, a novel "shaving" method is applied to the lowest few model layers to reflect the surface pressure changes as implied by the final analysis. Analysis increments (from PSAS) to the upper air variables are then consistently put onto the Lagrangian layers as adjustments to the volume-mean quantities during the analysis cycle. This approach is demonstrated to be superior to the conventional method of using independently computed "tendency terms" for surface pressure and upper air prognostic variables.

  10. On an additive partial correlation operator and nonparametric estimation of graphical models.

    PubMed

    Lee, Kuang-Yao; Li, Bing; Zhao, Hongyu

    2016-09-01

    We introduce an additive partial correlation operator as an extension of partial correlation to the nonlinear setting, and use it to develop a new estimator for nonparametric graphical models. Our graphical models are based on additive conditional independence, a statistical relation that captures the spirit of conditional independence without having to resort to high-dimensional kernels for its estimation. The additive partial correlation operator completely characterizes additive conditional independence, and has the additional advantage of putting marginal variation on appropriate scales when evaluating interdependence, which leads to more accurate statistical inference. We establish the consistency of the proposed estimator. Through simulation experiments and analysis of the DREAM4 Challenge dataset, we demonstrate that our method performs better than existing methods in cases where the Gaussian or copula Gaussian assumption does not hold, and that a more appropriate scaling for our method further enhances its performance.

  11. On an additive partial correlation operator and nonparametric estimation of graphical models

    PubMed Central

    Li, Bing; Zhao, Hongyu

    2016-01-01

    We introduce an additive partial correlation operator as an extension of partial correlation to the nonlinear setting, and use it to develop a new estimator for nonparametric graphical models. Our graphical models are based on additive conditional independence, a statistical relation that captures the spirit of conditional independence without having to resort to high-dimensional kernels for its estimation. The additive partial correlation operator completely characterizes additive conditional independence, and has the additional advantage of putting marginal variation on appropriate scales when evaluating interdependence, which leads to more accurate statistical inference. We establish the consistency of the proposed estimator. Through simulation experiments and analysis of the DREAM4 Challenge dataset, we demonstrate that our method performs better than existing methods in cases where the Gaussian or copula Gaussian assumption does not hold, and that a more appropriate scaling for our method further enhances its performance. PMID:29422689

  12. An information hidden model holding cover distributions

    NASA Astrophysics Data System (ADS)

    Fu, Min; Cai, Chao; Dai, Zuxu

    2018-03-01

    The goal of steganography is to embed secret data into a cover so that no one apart from the sender and the intended recipients can find the secret data. Usually, the way the cover is changed is decided by a hiding function, and no existing model could be used to find an optimal function that greatly reduces the distortion suffered by the cover. This paper considers the cover carrying the secret message as a random Markov chain and, taking advantage of the deterministic relation between the initial distribution and the transfer matrix of the Markov chain, uses the transfer matrix as a constraint to decrease the statistical distortion suffered by the cover in the process of information hiding. Furthermore, a hiding function is designed, and the transfer matrix is presented as a mapping from the original cover to the stego cover. Experiment results show that the new model preserves consistent statistical characterizations of the original and stego covers.

  13. Multiple Statistical Models Based Analysis of Causative Factors and Loess Landslides in Tianshui City, China

    NASA Astrophysics Data System (ADS)

    Su, Xing; Meng, Xingmin; Ye, Weilin; Wu, Weijiang; Liu, Xingrong; Wei, Wanhong

    2018-03-01

    Tianshui City is one of the mountainous cities in Gansu Province, China, that are threatened by severe geo-hazards. Statistical probability models have been widely used in analyzing and evaluating geo-hazards such as landslides. In this research, three approaches (the Certainty Factor Method, the Weight of Evidence Method and the Information Quantity Method) were adopted to quantitatively analyze the relationship between the causative factors and the landslides. The source data used in this study include the SRTM DEM and local geological maps at a scale of 1:200,000. Twelve causative factors (i.e., altitude, slope, aspect, curvature, plan curvature, profile curvature, roughness, relief amplitude, distance to rivers, distance to faults, distance to roads, and stratum lithology) were selected for correlation analysis after thorough investigation of the geological conditions and historical landslides. The results indicate that the outcomes of the three models are fairly consistent.

  14. Analyzing Single-Molecule Protein Transportation Experiments via Hierarchical Hidden Markov Models

    PubMed Central

    Chen, Yang; Shen, Kuang

    2017-01-01

    To maintain proper cellular functions, over 50% of proteins encoded in the genome need to be transported to cellular membranes. The molecular mechanism behind such a process, often referred to as protein targeting, is not well understood. Single-molecule experiments are designed to unveil the detailed mechanisms and reveal the functions of different molecular machineries involved in the process. The experimental data consist of hundreds of stochastic time traces from the fluorescence recordings of the experimental system. We introduce a Bayesian hierarchical model on top of hidden Markov models (HMMs) to analyze these data and use the statistical results to answer the biological questions. In addition to resolving the biological puzzles and delineating the regulating roles of different molecular complexes, our statistical results enable us to propose a more detailed mechanism for the late stages of the protein targeting process. PMID:28943680
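    A much-simplified, non-hierarchical illustration of the HMM layer is sketched below with the hmmlearn package (an assumed choice); the paper's Bayesian hierarchy over many traces is not reproduced.

    ```python
    import numpy as np
    from hmmlearn import hmm

    # Fit a Gaussian HMM to one synthetic fluorescence-like trace and recover the
    # hidden state path; states, levels, and parameters are assumptions.
    rng = np.random.default_rng(10)

    # synthetic 3-state trace: piecewise-constant fluorescence levels plus noise
    states_true = np.repeat([0, 1, 2, 1, 0], 200)
    levels = np.array([0.1, 0.5, 0.9])
    trace = levels[states_true] + 0.05 * rng.standard_normal(states_true.size)

    model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=200, random_state=0)
    model.fit(trace.reshape(-1, 1))                 # hmmlearn expects (n_samples, n_features)
    decoded = model.predict(trace.reshape(-1, 1))   # Viterbi state path
    print(np.round(model.means_.ravel(), 2))        # recovered levels (label order arbitrary)
    ```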

  15. Developing and validating a measure of community capacity: Why volunteers make the best neighbours.

    PubMed

    Lovell, Sarah A; Gray, Andrew R; Boucher, Sara E

    2015-05-01

    Social support and community connectedness are key determinants of both mental and physical wellbeing. While social capital has been used to indicate the instrumental value of these social relationships, its broad and often competing definitions have hindered practical applications of the concept. Within the health promotion field, the related concept of community capacity, the ability of a group to identify and act on problems, has gained prominence (Labonte and Laverack, 2001). The goal of this study was to develop and validate a scale measuring community capacity, including exploring its associations with socio-demographic and civic behaviour variables among the residents of four small (populations 1500-2000) high-deprivation towns in southern New Zealand. The full (41-item) scale was found to have strong internal consistency (Cronbach's alpha = 0.89), but a process of reducing the scale resulted in a shorter 26-item instrument with similar internal consistency (alpha = 0.88). Subscales of the reduced instrument displayed at least marginally acceptable levels of internal consistency (0.62-0.77). Using linear regression models, differences in community capacity scores were found for selected criteria, namely time spent living in the location, local voting, and volunteering behaviour, although the first of these was no longer statistically significant in an adjusted model with potential confounders including age, sex, ethnicity, education, marital status, employment, household income, and religious beliefs. This provides support for the scale's concurrent validity. Differences were present between the four towns in unadjusted models and remained statistically significant in adjusted models (including the variables mentioned above), suggesting, crucially, that even when such factors are accounted for, perceptions of one's community may still depend on place. Copyright © 2014. Published by Elsevier Ltd.
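
    The sketch below computes the internal-consistency statistic quoted above (Cronbach's alpha) from an item-response matrix; the simulated 26-item responses are a stand-in for the survey data, not the study's measurements.

      import numpy as np

      def cronbach_alpha(items):
          """items: (n_respondents, n_items) array of item scores."""
          items = np.asarray(items, dtype=float)
          k = items.shape[1]
          item_vars = items.var(axis=0, ddof=1)
          total_var = items.sum(axis=1).var(ddof=1)
          return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

      rng = np.random.default_rng(3)
      latent = rng.normal(size=(200, 1))                          # shared "community capacity" factor
      responses = latent + rng.normal(scale=1.5, size=(200, 26))  # 26-item short form, noisy items
      print(round(cronbach_alpha(responses), 2))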

  16. Exploring Explanations of Subglacial Bedform Sizes Using Statistical Models

    PubMed Central

    Kougioumtzoglou, Ioannis A.; Stokes, Chris R.; Smith, Michael J.; Clark, Chris D.; Spagnolo, Matteo S.

    2016-01-01

    Sediments beneath modern ice sheets exert a key control on their flow, but are largely inaccessible except through geophysics or boreholes. In contrast, palaeo-ice sheet beds are accessible, and typically characterised by numerous bedforms. However, the interaction between bedforms and ice flow is poorly constrained and it is not clear how bedform sizes might reflect ice flow conditions. To better understand this link we present a first exploration of a variety of statistical models to explain the size distribution of some common subglacial bedforms (i.e., drumlins, ribbed moraine, MSGL). By considering a range of models, constructed to reflect key aspects of the physical processes, it is possible to infer that the size distributions are most effectively explained when the dynamics of ice-water-sediment interaction associated with bedform growth is fundamentally random. A ‘stochastic instability’ (SI) model, which integrates random bedform growth and shrinking through time with exponential growth, is preferred and is consistent with other observations of palaeo-bedforms and geophysical surveys of active ice sheets. Furthermore, we give a proof-of-concept demonstration that our statistical approach can bridge the gap between geomorphological observations and physical models, directly linking measurable size-frequency parameters to properties of ice sheet flow (e.g., ice velocity). Moreover, statistically developing existing models as proposed allows quantitative predictions to be made about sizes, making the models testable; a first illustration of this is given for a hypothesised repeat geophysical survey of bedforms under active ice. Thus, we further demonstrate the potential of size-frequency distributions of subglacial bedforms to assist the elucidation of subglacial processes and better constrain ice sheet models. PMID:27458921

  17. New Developments in the Embedded Statistical Coupling Method: Atomistic/Continuum Crack Propagation

    NASA Technical Reports Server (NTRS)

    Saether, E.; Yamakov, V.; Glaessgen, E.

    2008-01-01

    A concurrent multiscale modeling methodology that embeds a molecular dynamics (MD) region within a finite element (FEM) domain has been enhanced. The concurrent MD-FEM coupling methodology uses statistical averaging of the deformation of the atomistic MD domain to provide interface displacement boundary conditions to the surrounding continuum FEM region, which, in turn, generates interface reaction forces that are applied as piecewise constant traction boundary conditions to the MD domain. The enhancement is based on the addition of molecular dynamics-based cohesive zone model (CZM) elements near the MD-FEM interface. The CZM elements are a continuum interpretation of the traction-displacement relationships taken from MD simulations using Cohesive Zone Volume Elements (CZVE). The addition of CZM elements to the concurrent MD-FEM analysis provides a consistent set of atomistically-based cohesive properties within the finite element region near the growing crack. Another set of CZVEs is then used to extract revised CZM relationships from the enhanced embedded statistical coupling method (ESCM) simulation of an edge crack under uniaxial loading.

  18. A study of two statistical methods as applied to shuttle solid rocket booster expenditures

    NASA Technical Reports Server (NTRS)

    Perlmutter, M.; Huang, Y.; Graves, M.

    1974-01-01

    The state probability technique and the Monte Carlo technique are applied to finding shuttle solid rocket booster expenditure statistics. For a given attrition rate per launch, the probable number of boosters needed for a given mission of 440 launches is calculated. Several cases are considered, including the elimination of the booster after a maximum of 20 consecutive launches. Also considered is the case where the booster is composed of replaceable components with independent attrition rates. A simple cost analysis is carried out to indicate the number of boosters to build initially, depending on booster costs. Two statistical methods were applied in the analysis: (1) state probability method which consists of defining an appropriate state space for the outcome of the random trials, and (2) model simulation method or the Monte Carlo technique. It was found that the model simulation method was easier to formulate while the state probability method required less computing time and was more accurate.
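
    A minimal Monte Carlo sketch in the spirit of the second method described above: estimate how many boosters a 440-launch program consumes given a per-launch attrition rate and retirement after 20 uses. The 2% attrition rate and the single-booster bookkeeping are assumptions, not values from the study.

      import numpy as np

      def simulate_program(n_launches=440, attrition=0.02, max_uses=20, rng=None):
          """Count boosters built over one simulated program."""
          rng = rng or np.random.default_rng()
          boosters_built, uses = 1, 0
          for launch in range(n_launches):
              uses += 1
              retired = (rng.random() < attrition) or (uses >= max_uses)
              if retired and launch < n_launches - 1:
                  boosters_built += 1                # a fresh booster is needed for the next launch
                  uses = 0
          return boosters_built

      rng = np.random.default_rng(4)
      runs = [simulate_program(rng=rng) for _ in range(5_000)]
      print("mean boosters:", np.mean(runs), " 95th percentile:", np.percentile(runs, 95))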

  19. A cortical integrate-and-fire neural network model for blind decoding of visual prosthetic stimulation.

    PubMed

    Eiber, Calvin D; Morley, John W; Lovell, Nigel H; Suaning, Gregg J

    2014-01-01

    We present a computational model of the optic pathway which has been adapted to simulate cortical responses to visual-prosthetic stimulation. This model reproduces the statistically observed distributions of spikes for cortical recordings of sham and maximum-intensity stimuli, while simultaneously generating cellular receptive fields consistent with those observed using traditional visual neuroscience methods. By inverting this model to generate candidate phosphenes that could produce the responses observed for novel stimulation strategies, we hope to aid the development of such strategies in vivo before they are deployed in clinical settings.
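
    As a hedged, generic illustration (a textbook leaky integrate-and-fire unit, not the specific cortical model described above), the sketch below converts a hypothetical stimulation current pulse into spike times; all parameter values are assumptions.

      import numpy as np

      def lif_spike_times(current, dt=1e-4, tau=0.02, R=1e7, v_thresh=0.02, v_reset=0.0):
          """Euler integration of tau*dv/dt = -v + R*I; returns spike times in seconds."""
          v, spikes = 0.0, []
          for k, I in enumerate(current):
              v += dt / tau * (-v + R * I)
              if v >= v_thresh:
                  spikes.append(k * dt)
                  v = v_reset
          return spikes

      t = np.arange(0.0, 0.5, 1e-4)
      stim = np.where((t > 0.1) & (t < 0.3), 3e-9, 0.0)   # 3 nA pulse as a stand-in stimulus
      print(len(lif_spike_times(stim)), "spikes")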

  20. Heterogeneous path ensembles for conformational transitions in semi-atomistic models of adenylate kinase

    PubMed Central

    Bhatt, Divesh; Zuckerman, Daniel M.

    2010-01-01

    We performed “weighted ensemble” path-sampling simulations of adenylate kinase, using several semi-atomistic protein models. The models have an all-atom backbone with various levels of residue interactions. The primary result is that full statistically rigorous path sampling required only a few weeks of single-processor computing time with these models, indicating the addition of further chemical detail should be readily feasible. Our semi-atomistic path ensembles are consistent with previous biophysical findings: the presence of two distinct pathways, identification of intermediates, and symmetry of forward and reverse pathways. PMID:21660120

  1. Classification of 'healthier' and 'less healthy' supermarket foods by two Australasian nutrient profiling models.

    PubMed

    Eyles, Helen; Gorton, Delvina; Ni Mhurchu, Cliona

    2010-09-10

    To determine whether a modified version of the Heart Foundation Tick (MHFT) nutrient profiling model appropriately classifies supermarket foods to endorse its use for identifying 'healthier' products eligible for promotion in a supermarket intervention trial. Top-selling products (n=550) were selected from an existing supermarket nutrient composition database. Percentage of products classified as 'healthier' by the MHFT and a modified comparator model (Food Standards Australia New Zealand; MFSANZ) were calculated. Percentage agreement, consistency (kappa statistic), and average nutrient values were assessed overall, and across seven food groups. The MHFT model categorised 16% fewer products as 'healthier' than the MFSANZ model. Agreement and consistency between models were 72% and kappa=0.46 (P=0.00), respectively. For both models, 'healthier' products were on average lower in energy, protein, saturated fat, sugar, and sodium than their 'less healthy' counterparts. The MHFT nutrient profiling model categorised regularly purchased supermarket foods similarly to the MFSANZ model, and both appear to distinguish appropriately between 'healthier' and 'less healthy' options. Therefore, both models have the potential to appropriately identify 'healthier' foods for promotion and positively influence food choices.
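
    The sketch below computes the two agreement measures quoted above (percentage agreement and Cohen's kappa) for two binary 'healthier'/'less healthy' classifications; the simulated labels stand in for the 550-product database.

      import numpy as np

      def agreement_and_kappa(a, b):
          a, b = np.asarray(a), np.asarray(b)
          p_o = np.mean(a == b)                                         # observed agreement
          p_e = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())   # chance agreement
          return p_o, (p_o - p_e) / (1 - p_e)

      rng = np.random.default_rng(5)
      truth = rng.random(550) < 0.5
      model_a = truth ^ (rng.random(550) < 0.15)            # two imperfect classifiers
      model_b = truth ^ (rng.random(550) < 0.20)
      print(agreement_and_kappa(model_a, model_b))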

  2. Statistical distribution of wind speeds and directions globally observed by NSCAT

    NASA Astrophysics Data System (ADS)

    Ebuchi, Naoto

    1999-05-01

    In order to validate wind vectors derived from the NASA scatterometer (NSCAT), statistical distributions of wind speeds and directions over the global oceans are investigated by comparison with European Centre for Medium-Range Weather Forecasts (ECMWF) wind data. Histograms of wind speeds and directions are calculated from the preliminary and reprocessed NSCAT data products for a period of 8 weeks. For wind speeds in the preliminary data products, comparison with ECMWF winds reveals an excess of low winds. A hump at the lower wind speed side of the peak in the wind speed histogram is discernible, and its shape varies with incidence angle. Incompleteness of the prelaunch geophysical model function, SASS 2, tentatively used to retrieve wind vectors for the preliminary data products, is considered to cause the skew of the wind speed distribution. In contrast, histograms of wind speeds from the reprocessed data products show consistent features over the whole range of incidence angles. The frequency distribution of wind directions relative to the spacecraft flight direction is calculated to assess the self-consistency of the wind directions. It is found that wind vectors of the preliminary data products exhibit a systematic directional preference relative to the antenna beams. This artificial directivity is also considered to be caused by imperfections in the geophysical model function. The directional distributions of the reprocessed wind vectors show less directivity and consistent features, except for very low wind cases.

  3. Graph theory applied to noise and vibration control in statistical energy analysis models.

    PubMed

    Guasch, Oriol; Cortés, Lluís

    2009-06-01

    A fundamental aspect of noise and vibration control in statistical energy analysis (SEA) models consists in first identifying and then reducing the energy flow paths between subsystems. In this work, it is proposed to make use of some results from graph theory to address both issues. On the one hand, linear and path algebras applied to adjacency matrices of SEA graphs are used to determine the existence of any order paths between subsystems, counting and labeling them, finding extremal paths, or determining the power flow contributions from groups of paths. On the other hand, a strategy is presented that makes use of graph cut algorithms to reduce the energy flow from a source subsystem to a receiver one, modifying as few internal and coupling loss factors as possible.
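
    A minimal sketch of the adjacency-matrix idea mentioned above: powers of the adjacency matrix of an SEA graph count the directed walks of each order between subsystems (walks may revisit subsystems, unlike simple paths). The 4-subsystem coupling graph is a made-up example.

      import numpy as np

      A = np.array([[0, 1, 1, 0],                    # 1 = energy can flow between subsystems
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]])

      source, receiver = 0, 3
      for order in range(1, 5):
          n_walks = np.linalg.matrix_power(A, order)[source, receiver]
          print(f"walks of order {order} from subsystem {source} to {receiver}: {n_walks}")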

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    More, R.M.

    A new statistical model (the quantum-statistical model (QSM)) was recently introduced by Kalitkin and Kuzmina for the calculation of thermodynamic properties of compressed matter. This paper examines the QSM and gives (i) a numerical QSM calculation of pressure and energy for aluminum and comparison to existing augmented-plane-wave data; (ii) display of separate kinetic, exchange, and quantum pressure terms; (iii) a study of electron density at the nucleus; (iv) a study of the effects of the Kirzhnitz-Weizsacker parameter controlling the gradient terms; (v) an analytic expansion for very high densities; and (vi) rigorous pressure theorems including a general version of the virial theorem which applies to an arbitrary microscopic volume. It is concluded that the QSM represents the most accurate and consistent theory of the Thomas-Fermi type.

  5. MSSM-inspired multifield inflation

    NASA Astrophysics Data System (ADS)

    Dubinin, M. N.; Petrova, E. Yu.; Pozdeeva, E. O.; Sumin, M. V.; Vernov, S. Yu.

    2017-12-01

    Although experiments at the LHC have so far discovered, with a high degree of statistical significance, only a single Standard Model-like Higgs boson, extended Higgs sectors with multiple scalar fields that are not excluded by combined fits of the data remain theoretically preferable for internally consistent, realistic models of particle physics. We analyze the inflationary scenarios that could be induced by the two-Higgs-doublet potential of the Minimal Supersymmetric Standard Model (MSSM), where five scalar fields have non-minimal couplings to gravity. Observables following from such MSSM-inspired multifield inflation are calculated and a number of consistent inflationary scenarios are constructed. Cosmological evolution with different initial conditions for the multifield system leads to consequences fully compatible with observational data on the spectral index and the tensor-to-scalar ratio. It is demonstrated that the strong coupling approximation is precise enough to describe such inflationary scenarios.

  6. Composite load spectra for select space propulsion structural components

    NASA Technical Reports Server (NTRS)

    Newell, J. F.; Kurth, R. E.; Ho, H.

    1991-01-01

    The objective of this program is to develop generic load models, with multiple levels of progressive sophistication, to simulate the composite (combined) load spectra that are induced in space propulsion system components representative of Space Shuttle Main Engines (SSME), such as transfer ducts, turbine blades, liquid oxygen posts, and system ducting. The first approach will consist of using state-of-the-art probabilistic methods to describe the individual loading conditions and combinations of these loading conditions to synthesize the composite load spectra simulation. The second approach will consist of developing coupled models for composite load spectra simulation, which combine deterministic models for dynamic, acoustic, high-pressure, high-rotational-speed, and other load sources using statistically varying coefficients. These coefficients will then be determined using advanced probabilistic simulation methods, with and without strategically selected experimental data.

  7. Exploring the calibration of a wind forecast ensemble for energy applications

    NASA Astrophysics Data System (ADS)

    Heppelmann, Tobias; Ben Bouallegue, Zied; Theis, Susanne

    2015-04-01

    In the German research project EWeLiNE, Deutscher Wetterdienst (DWD) and the Fraunhofer Institute for Wind Energy and Energy System Technology (IWES) are collaborating with three German Transmission System Operators (TSOs) in order to provide the TSOs with improved probabilistic power forecasts. Probabilistic power forecasts are derived from probabilistic weather forecasts, themselves derived from ensemble prediction systems (EPS). Since the raw ensemble wind forecasts considered here suffer from underdispersiveness and bias, calibration methods are developed for the correction of the model bias and the ensemble spread bias. The overall aim is to improve the ensemble forecasts such that the uncertainty of the possible weather development is depicted by the ensemble spread from the first forecast hours. Additionally, the ensemble members after calibration should remain physically consistent scenarios. We focus on probabilistic hourly wind forecasts with a horizon of 21 h delivered by the convection-permitting high-resolution ensemble system COSMO-DE-EPS, which became operational at DWD in 2012. The ensemble consists of 20 members driven by four different global models. The model area includes the whole of Germany and parts of Central Europe with a horizontal resolution of 2.8 km and a vertical resolution of 50 model levels. For verification we use wind mast measurements at around 100 m height, which corresponds to the hub height of the wind energy plants belonging to wind farms within the model area. Calibration of the ensemble forecasts can be performed by different statistical methods applied to the raw ensemble output. Here, we explore local bivariate Ensemble Model Output Statistics at individual sites and quantile regression with different predictors. Applying different methods, we already show an improvement of ensemble wind forecasts from COSMO-DE-EPS for energy applications. In addition, an ensemble copula coupling approach transfers the time-dependencies of the raw ensemble to the calibrated ensemble. The calibrated wind forecasts are evaluated first with univariate probabilistic scores and additionally with diagnostics of wind ramps in order to assess the time-consistency of the calibrated ensemble members.
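
    As a hedged illustration of the ensemble model output statistics (EMOS) step, the sketch below fits a Gaussian predictive distribution whose mean and spread are affine in the raw ensemble mean and spread, by maximum likelihood at a single site; the simulated ensemble and "mast" observations are stand-ins for COSMO-DE-EPS output and measurements, and this parameterization is one common choice, not necessarily the project's.

      import numpy as np
      from scipy.optimize import minimize
      from scipy.stats import norm

      rng = np.random.default_rng(6)
      n, m = 500, 20
      obs = 8 + 3 * rng.standard_normal(n)                           # "observed" 100 m wind (m/s)
      ens = obs[:, None] + 1.5 + 2.0 * rng.standard_normal((n, m))   # biased raw ensemble
      ens_mean, ens_sd = ens.mean(axis=1), ens.std(axis=1)

      def neg_loglik(params):
          a, b, c, d = params
          mu = a + b * ens_mean
          sigma = np.sqrt(c ** 2 + (d * ens_sd) ** 2)                # keeps the spread positive
          return -norm.logpdf(obs, mu, sigma).sum()

      fit = minimize(neg_loglik, x0=[0.0, 1.0, 1.0, 1.0], method="Nelder-Mead")
      print("EMOS coefficients (a, b, c, d):", np.round(fit.x, 2))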

  8. Why the Long Face? The Mechanics of Mandibular Symphysis Proportions in Crocodiles

    PubMed Central

    Walmsley, Christopher W.; Smits, Peter D.; Quayle, Michelle R.; McCurry, Matthew R.; Richards, Heather S.; Oldfield, Christopher C.; Wroe, Stephen; Clausen, Phillip D.; McHenry, Colin R.

    2013-01-01

    Background Crocodilians exhibit a spectrum of rostral shape from long snouted (longirostrine), through to short snouted (brevirostrine) morphologies. The proportional length of the mandibular symphysis correlates consistently with rostral shape, forming as much as 50% of the mandible’s length in longirostrine forms, but 10% in brevirostrine crocodilians. Here we analyse the structural consequences of an elongate mandibular symphysis in relation to feeding behaviours. Methods/Principal Findings Simple beam and high resolution Finite Element (FE) models of seven species of crocodile were analysed under loads simulating biting, shaking and twisting. Using beam theory, we statistically compared multiple hypotheses of which morphological variables should control the biomechanical response. Brevi- and mesorostrine morphologies were found to consistently outperform longirostrine types when subject to equivalent biting, shaking and twisting loads. The best predictors of performance for biting and twisting loads in FE models were overall length and symphyseal length respectively; for shaking loads symphyseal length and a multivariate measurement of shape (PC1– which is strongly but not exclusively correlated with symphyseal length) were equally good predictors. Linear measurements were better predictors than multivariate measurements of shape in biting and twisting loads. For both biting and shaking loads but not for twisting, simple beam models agree with best performance predictors in FE models. Conclusions/Significance Combining beam and FE modelling allows a priori hypotheses about the importance of morphological traits on biomechanics to be statistically tested. Short mandibular symphyses perform well under loads used for feeding upon large prey, but elongate symphyses incur high strains under equivalent loads, underlining the structural constraints to prey size in the longirostrine morphotype. The biomechanics of the crocodilian mandible are largely consistent with beam theory and can be predicted from simple morphological measurements, suggesting that crocodilians are a useful model for investigating the palaeobiomechanics of other aquatic tetrapods. PMID:23342027

  9. On the Convenience of Using the Complete Linearization Method in Modelling the BLR of AGN

    NASA Astrophysics Data System (ADS)

    Patriarchi, P.; Perinotto, M.

    The Complete Linearization Method (Mihalas, 1978) consists in the determination of the radiation field (at a set of frequency points), atomic level populations, temperature, electron density, etc., by solving the system of radiative transfer, thermal equilibrium, and statistical equilibrium equations simultaneously and self-consistently. Since the system is not linear, it must be solved by iteration after linearization, using a perturbative method, starting from an initial guess solution. Of course the Complete Linearization Method is more time consuming than the previous one. But how great can this disadvantage be in the age of supercomputers? It is possible to approximately evaluate the CPU time needed to run a model by computing the number of multiplications necessary to solve the system.
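
    In that spirit, a back-of-the-envelope sketch of the operation count is given below, assuming one Gaussian elimination of roughly n^3/3 multiplications per depth point and per iteration; the system size, number of depth points, and iteration count are all assumptions, not values from the paper.

      # Rough multiplication count for an iterative complete-linearization run.
      n_unknowns = 200        # radiation field at frequency points + level populations + T, n_e
      n_depths = 70           # depth points in the model
      n_iterations = 15       # linearization iterations to convergence
      mults = (n_unknowns ** 3 / 3) * n_depths * n_iterations
      print(f"~{mults:.2e} multiplications")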

  10. Operational seasonal and interannual predictions of ocean conditions

    NASA Technical Reports Server (NTRS)

    Leetmaa, Ants

    1992-01-01

    Dr. Leetmaa described current work at the U.S. National Meteorological Center (NMC) on coupled systems leading to a seasonal prediction system. He described the way in which ocean thermal data are quality controlled and used in a four-dimensional data assimilation system. This consists of a statistical interpolation scheme, a primitive equation ocean general circulation model, and the atmospheric fluxes that are required to force this. This whole process generates dynamically consistent thermohaline and velocity fields for the ocean. Currently, routine weekly analyses are performed for the Atlantic and Pacific oceans. These analyses are used for ocean climate diagnostics and as initial conditions for coupled forecast models. Specific examples of output products were shown for both the Pacific and the Atlantic Ocean.

  11. Receptor arrays optimized for natural odor statistics.

    PubMed

    Zwicker, David; Murugan, Arvind; Brenner, Michael P

    2016-05-17

    Natural odors typically consist of many molecules at different concentrations. It is unclear how the numerous odorant molecules and their possible mixtures are discriminated by relatively few olfactory receptors. Using an information theoretic model, we show that a receptor array is optimal for this task if it achieves two possibly conflicting goals: (i) Each receptor should respond to half of all odors and (ii) the response of different receptors should be uncorrelated when averaged over odors presented with natural statistics. We use these design principles to predict statistics of the affinities between receptors and odorant molecules for a broad class of odor statistics. We also show that optimal receptor arrays can be tuned to either resolve concentrations well or distinguish mixtures reliably. Finally, we use our results to predict properties of experimentally measured receptor arrays. Our work can thus be used to better understand natural olfaction, and it also suggests ways to improve artificial sensor arrays.
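
    The sketch below checks the two design principles stated above for a random binary receptor-odor sensitivity matrix: (i) each receptor responds to about half of the odors and (ii) receptor responses are nearly uncorrelated across odors. The array size and the Bernoulli response model are assumptions used only for illustration.

      import numpy as np

      rng = np.random.default_rng(7)
      n_receptors, n_odors = 30, 5_000
      responses = rng.random((n_receptors, n_odors)) < 0.5   # True = receptor activated by the odor

      activation_fraction = responses.mean(axis=1)           # goal (i): close to 0.5 for each receptor
      corr = np.corrcoef(responses.astype(float))            # goal (ii): off-diagonal terms close to 0
      off_diag = corr[~np.eye(n_receptors, dtype=bool)]
      print(activation_fraction[:5].round(2), "mean |corr| =", np.abs(off_diag).mean().round(3))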

  12. A model and variance reduction method for computing statistical outputs of stochastic elliptic partial differential equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vidal-Codina, F., E-mail: fvidal@mit.edu; Nguyen, N.C., E-mail: cuongng@mit.edu; Giles, M.B., E-mail: mike.giles@maths.ox.ac.uk

    We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.
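
    A minimal sketch of the multilevel idea, reduced to two levels: many cheap low-fidelity samples plus a few paired high-fidelity corrections yield an estimator with the high-fidelity mean at a fraction of the cost. The two stand-in functions below are assumptions and do not represent the HDG or reduced basis solvers.

      import numpy as np

      rng = np.random.default_rng(8)

      def low_fidelity(z):        # cheap surrogate (stand-in for a reduced basis approximation)
          return np.sin(z) + 0.1 * z

      def high_fidelity(z):       # expensive model (stand-in for the high-fidelity HDG output)
          return np.sin(z) + 0.1 * z + 0.02 * z ** 2

      z_coarse = rng.standard_normal(100_000)        # many cheap samples
      z_fine = rng.standard_normal(500)              # few expensive samples, paired with the surrogate

      estimate = low_fidelity(z_coarse).mean() + (high_fidelity(z_fine) - low_fidelity(z_fine)).mean()
      reference = high_fidelity(rng.standard_normal(2_000_000)).mean()
      print(round(estimate, 4), round(reference, 4))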

  13. Light propagation in Swiss-cheese models of random close-packed Szekeres structures: Effects of anisotropy and comparisons with perturbative results

    NASA Astrophysics Data System (ADS)

    Koksbang, S. M.

    2017-03-01

    Light propagation in two Swiss-cheese models based on anisotropic Szekeres structures is studied and compared with light propagation in Swiss-cheese models based on the Szekeres models' underlying Lemaitre-Tolman-Bondi models. The study shows that the anisotropy of the Szekeres models has only a small effect on quantities such as redshift-distance relations, projected shear and expansion rate along individual light rays. The average angular diameter distance to the last scattering surface is computed for each model. Contrary to earlier studies, the results obtained here are (mostly) in agreement with perturbative results. In particular, a small negative shift, δ_DA := (D_A - D_A,bg)/D_A,bg, in the angular diameter distance is obtained upon line-of-sight averaging in three of the four models. The results are, however, not statistically significant. In the fourth model, there is a small positive shift which has an especially small statistical significance. The line-of-sight averaged inverse magnification at z = 1100 is consistent with 1 to a high level of confidence for all models, indicating that the area of the surface corresponding to z = 1100 is close to that of the background.

  14. Analysis of the statistical thermodynamic model for nonlinear binary protein adsorption equilibria.

    PubMed

    Zhou, Xiao-Peng; Su, Xue-Li; Sun, Yan

    2007-01-01

    The statistical thermodynamic (ST) model was used to study nonlinear binary protein adsorption equilibria on an anion exchanger. Single-component and binary protein adsorption isotherms of bovine hemoglobin (Hb) and bovine serum albumin (BSA) on DEAE Spherodex M were determined by batch adsorption experiments in 10 mM Tris-HCl buffer containing a specific NaCl concentration (0.05, 0.10, and 0.15 M) at pH 7.40. The ST model was found to depict the effect of ionic strength on the single-component equilibria well, with model parameters depending on ionic strength. Moreover, the ST model gave acceptable fitting to the binary adsorption data with the fitted single-component model parameters, leading to the estimation of the binary ST model parameter. The effects of ionic strength on the model parameters are reasonably interpreted by the electrostatic and thermodynamic theories. The effective charge of protein in adsorption phase can be separately calculated from the two categories of the model parameters, and the values obtained from the two methods are consistent. The results demonstrate the utility of the ST model for describing nonlinear binary protein adsorption equilibria.

  15. Which Type of Risk Information to Use for Whom? Moderating Role of Outcome-Relevant Involvement in the Effects of Statistical and Exemplified Risk Information on Risk Perceptions.

    PubMed

    So, Jiyeon; Jeong, Se-Hoon; Hwang, Yoori

    2017-04-01

    The extant empirical research examining the effectiveness of statistical and exemplar-based health information is largely inconsistent. Under the premise that the inconsistency may be due to an unacknowledged moderator (O'Keefe, 2002), this study examined a moderating role of outcome-relevant involvement (Johnson & Eagly, 1989) in the effects of statistical and exemplified risk information on risk perception. Consistent with predictions based on elaboration likelihood model (Petty & Cacioppo, 1984), findings from an experiment (N = 237) concerning alcohol consumption risks showed that statistical risk information predicted risk perceptions of individuals with high, rather than low, involvement, while exemplified risk information predicted risk perceptions of those with low, rather than high, involvement. Moreover, statistical risk information contributed to negative attitude toward drinking via increased risk perception only for highly involved individuals, while exemplified risk information influenced the attitude through the same mechanism only for individuals with low involvement. Theoretical and practical implications for health risk communication are discussed.

  16. A self-consistency approach to improve microwave rainfall rate estimation from space

    NASA Technical Reports Server (NTRS)

    Kummerow, Christian; Mack, Robert A.; Hakkarinen, Ida M.

    1989-01-01

    A multichannel statistical approach is used to retrieve rainfall rates from the brightness temperature T(B) observed by passive microwave radiometers flown on a high-altitude NASA aircraft. T(B) statistics are based upon data generated by a cloud radiative model. This model simulates variabilities in the underlying geophysical parameters of interest, and computes their associated T(B) in each of the available channels. By further imposing the requirement that the observed T(B) agree with the T(B) values corresponding to the retrieved parameters through the cloud radiative transfer model, the results can be made to agree quite well with coincident radar-derived rainfall rates. Some information regarding the cloud vertical structure is also obtained by such an added requirement. The applicability of this technique to satellite retrievals is also investigated. Data which might be observed by satellite-borne radiometers, including the effects of nonuniformly filled footprints, are simulated by the cloud radiative model for this purpose.

  17. The MSFC Solar Activity Future Estimation (MSAFE) Model

    NASA Technical Reports Server (NTRS)

    Suggs, Ron

    2017-01-01

    The Natural Environments Branch of the Engineering Directorate at Marshall Space Flight Center (MSFC) provides solar cycle forecasts for NASA space flight programs and the aerospace community. These forecasts provide future statistical estimates of sunspot number, solar radio 10.7 cm flux (F10.7), and the geomagnetic planetary index, Ap, for input to various space environment models. For example, many thermosphere density computer models used in spacecraft operations, orbital lifetime analysis, and the planning of future spacecraft missions require F10.7 and Ap as inputs. The solar forecast is updated each month by executing MSAFE using historical and the latest month's observed solar indices to provide estimates for the balance of the current solar cycle. The forecasted solar indices are 13-month smoothed values, consisting of a best estimate stated as the 50th percentile along with approximate +/- 2 sigma values stated as the 95th and 5th percentiles. This presentation will give an overview of the MSAFE model and the forecast for the current solar cycle.
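
    The sketch below applies a 13-month smoothing of the kind referred to above, using the common convention of a centered 13-month running mean with half weight on the first and last months; whether MSAFE uses exactly this convention is an assumption, and the monthly index series is synthetic.

      import numpy as np

      def smooth_13_month(monthly):
          """Centered 13-month smoothing with half-weight end months."""
          w = np.ones(13); w[0] = w[-1] = 0.5; w /= w.sum()
          smoothed = np.full(len(monthly), np.nan)
          for i in range(6, len(monthly) - 6):
              smoothed[i] = np.dot(w, monthly[i - 6:i + 7])
          return smoothed

      rng = np.random.default_rng(9)
      months = np.arange(132)                                       # an 11-year synthetic cycle
      ssn = 80 * np.maximum(np.sin(2 * np.pi * months / 132), 0) + rng.normal(0, 10, 132)
      print("smoothed maximum:", np.nanmax(smooth_13_month(ssn)).round(1))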

  18. Feature maps driven no-reference image quality prediction of authentically distorted images

    NASA Astrophysics Data System (ADS)

    Ghadiyaram, Deepti; Bovik, Alan C.

    2015-03-01

    Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.

  19. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE PAGES

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent; ...

    2018-03-02

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal close-packed titanium. A detailed EBSD based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twin modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  20. Comparison of response surface methodology and artificial neural network to enhance the release of reducing sugars from non-edible seed cake by autoclave assisted HCl hydrolysis.

    PubMed

    Shet, Vinayaka B; Palan, Anusha M; Rao, Shama U; Varun, C; Aishwarya, Uday; Raja, Selvaraj; Goveas, Louella Concepta; Vaman Rao, C; Ujwal, P

    2018-02-01

    In the current investigation, statistical approaches were adopted to hydrolyse non-edible seed cake (NESC) of Pongamia and optimize the hydrolysis process by response surface methodology (RSM). Through the RSM approach, the optimized conditions were found to be 1.17% v/v HCl concentration at 54.12 min for hydrolysis. Under optimized conditions, the release of reducing sugars was found to be 53.03 g/L. The RSM data were used to train an artificial neural network (ANN), and the predictive ability of both models was compared by calculating various statistical parameters. A three-layered ANN model consisting of a 2:12:1 topology was developed; the response of the ANN model indicates that it is more precise than the RSM model. The fit of the models was expressed with the regression coefficient R^2, which was found to be 0.975 and 0.888, respectively, for the ANN and RSM models. This further demonstrated that the performance of ANN was better than that of RSM.
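
    A hedged sketch of the comparison described above: a quadratic response-surface fit versus a small neural network on the same two factors (acid concentration and hydrolysis time). The simulated data, network settings, and use of scikit-learn are assumptions, not the study's NESC measurements or software.

      import numpy as np
      from sklearn.preprocessing import PolynomialFeatures, StandardScaler
      from sklearn.linear_model import LinearRegression
      from sklearn.neural_network import MLPRegressor
      from sklearn.pipeline import make_pipeline
      from sklearn.metrics import r2_score

      rng = np.random.default_rng(10)
      X = rng.uniform([0.5, 20.0], [2.0, 90.0], size=(60, 2))        # HCl (%v/v), time (min)
      y = 55 - 8 * (X[:, 0] - 1.2) ** 2 - 0.005 * (X[:, 1] - 55) ** 2 + rng.normal(0, 1, 60)

      rsm = make_pipeline(PolynomialFeatures(2), LinearRegression()).fit(X, y)
      ann = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(12,), solver="lbfgs",
                                       max_iter=5000, random_state=0)).fit(X, y)
      print("RSM R^2:", round(r2_score(y, rsm.predict(X)), 3),
            " ANN R^2:", round(r2_score(y, ann.predict(X)), 3))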

  1. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal close-packed titanium. A detailed EBSD based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twin modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  2. Parameter inference in small world network disease models with approximate Bayesian Computational methods

    NASA Astrophysics Data System (ADS)

    Walker, David M.; Allingham, David; Lee, Heung Wing Joseph; Small, Michael

    2010-02-01

    Small world network models have been effective in capturing the variable behaviour of reported case data of the SARS coronavirus outbreak in Hong Kong during 2003. Simulations of these models have previously been realized using informed “guesses” of the proposed model parameters and tested for consistency with the reported data by surrogate analysis. In this paper we attempt to provide statistically rigorous parameter distributions using Approximate Bayesian Computation sampling methods. We find that such sampling schemes are a useful framework for fitting parameters of stochastic small world network models where simulation of the system is straightforward but expressing a likelihood is cumbersome.
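
    A minimal sketch of the general rejection-sampling form of Approximate Bayesian Computation (not the small-world SARS model itself): keep prior draws whose simulated summary statistic falls close to the observed one. The stand-in simulator, summary statistic, prior, and tolerance are all assumptions.

      import numpy as np

      rng = np.random.default_rng(11)

      def simulate(p, n=200):
          """Stand-in stochastic simulator: daily case counts from a noisy growth model."""
          return rng.poisson(5 * np.exp(p * np.arange(n) / n))

      observed = simulate(1.3)
      obs_summary = observed.sum()

      accepted = []
      for _ in range(20_000):
          p = rng.uniform(0.0, 3.0)                   # draw the growth parameter from its prior
          if abs(simulate(p).sum() - obs_summary) < 0.05 * obs_summary:
              accepted.append(p)

      print("posterior mean ~", round(np.mean(accepted), 2), "from", len(accepted), "accepted draws")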

  3. Quantifying geological uncertainty for flow and transport modeling in multi-modal heterogeneous formations

    NASA Astrophysics Data System (ADS)

    Feyen, Luc; Caers, Jef

    2006-06-01

    In this work, we address the problem of characterizing the heterogeneity and uncertainty of hydraulic properties for complex geological settings. Hereby, we distinguish between two scales of heterogeneity, namely the hydrofacies structure and the intrafacies variability of the hydraulic properties. We employ multiple-point geostatistics to characterize the hydrofacies architecture. The multiple-point statistics are borrowed from a training image that is designed to reflect the prior geological conceptualization. The intrafacies variability of the hydraulic properties is represented using conventional two-point correlation methods, more precisely, spatial covariance models under a multi-Gaussian spatial law. We address the different levels and sources of uncertainty in characterizing the subsurface heterogeneity, and explore their effect on groundwater flow and transport predictions. Typically, uncertainty is assessed by way of many images, termed realizations, of a fixed statistical model. However, in many cases, sampling from a fixed stochastic model does not adequately represent the space of uncertainty. It neglects the uncertainty related to the selection of the stochastic model and the estimation of its input parameters. We acknowledge the uncertainty inherent in the definition of the prior conceptual model of aquifer architecture and in the estimation of global statistics, anisotropy, and correlation scales. Spatial bootstrap is used to assess the uncertainty of the unknown statistical parameters. As an illustrative example, we employ a synthetic field that represents a fluvial setting consisting of an interconnected network of channel sands embedded within finer-grained floodplain material. For this highly non-stationary setting we quantify the groundwater flow and transport model prediction uncertainty for various levels of hydrogeological uncertainty. Results indicate the importance of accurately describing the facies geometry, especially for transport predictions.

  4. Predicting survival of Escherichia coli O157:H7 in dry fermented sausage using artificial neural networks.

    PubMed

    Palanichamy, A; Jayas, D S; Holley, R A

    2008-01-01

    The Canadian Food Inspection Agency required the meat industry to ensure Escherichia coli O157:H7 does not survive (experiences a ≥ 5 log CFU/g reduction) in dry fermented sausage (salami) during processing after a series of foodborne illness outbreaks resulting from this pathogenic bacterium occurred. The industry is in need of an effective technique like predictive modeling for estimating bacterial viability, because traditional microbiological enumeration is a time-consuming and laborious method. The accuracy and speed of artificial neural networks (ANNs), an approach developed from predictive microbiology, make them an attractive alternative for this purpose, especially for on-line processing in industry. Data from a study of the interactive effects of different levels of pH, water activity, and allyl isothiocyanate concentration at various times during sausage manufacture in reducing numbers of E. coli O157:H7 were collected. Data were used to develop predictive models using a general regression neural network (GRNN), a form of ANN, and a statistical linear polynomial regression technique. Both models were compared for their predictive error, using various statistical indices. GRNN predictions for training and test data sets had less serious errors when compared with the statistical model predictions. GRNN models were better and slightly better for training and test sets, respectively, than was the statistical model. Also, GRNN accurately predicted the level of allyl isothiocyanate required to ensure a 5-log reduction when an appropriate production set was created by interpolation. Because they are simple to generate, fast, and accurate, ANN models may be of value for industrial use in dry fermented sausage manufacture to reduce the hazard associated with E. coli O157:H7 in fresh beef and permit production of consistently safe products from this raw material.
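
    As a hedged illustration, the sketch below implements a general regression neural network as Nadaraya-Watson kernel regression with a Gaussian kernel, which is how a GRNN makes predictions; the simulated pH, water activity, and allyl isothiocyanate inputs and the log-reduction response are stand-ins, not the study's measurements.

      import numpy as np

      def grnn_predict(X_train, y_train, X_query, sigma=0.3):
          """GRNN prediction: kernel-weighted average of training responses."""
          d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
          w = np.exp(-d2 / (2 * sigma ** 2))
          return (w * y_train).sum(1) / w.sum(1)

      rng = np.random.default_rng(12)
      X = rng.uniform([4.5, 0.90, 0.0], [5.5, 0.97, 30.0], size=(80, 3))      # pH, aw, AIT (ppm)
      y = 2 + 0.08 * X[:, 2] + 5 * (0.97 - X[:, 1]) + rng.normal(0, 0.3, 80)  # log CFU/g reduction

      Xs = (X - X.mean(0)) / X.std(0)               # standardize features before kernel distances
      print(grnn_predict(Xs, y, Xs[:5]).round(2))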

  5. Testing a Nursing-Specific Model of Electronic Patient Record documentation with regard to information completeness, comprehensiveness and consistency.

    PubMed

    von Krogh, Gunn; Nåden, Dagfinn; Aasland, Olaf Gjerløw

    2012-10-01

    To present the results from the test site application of the documentation model KPO (quality assurance, problem solving and caring) designed to impact the quality of nursing information in electronic patient record (EPR). The KPO model was developed by means of consensus group and clinical testing. Four documentation arenas and eight content categories, nursing terminologies and a decision-support system were designed to impact the completeness, comprehensiveness and consistency of nursing information. The testing was performed in a pre-test/post-test time series design, three times at a one-year interval. Content analysis of nursing documentation was accomplished through the identification, interpretation and coding of information units. Data from the pre-test and post-test 2 were subjected to statistical analyses. To estimate the differences, paired t-tests were used. At post-test 2, the information is found to be more complete, comprehensive and consistent than at pre-test. The findings indicate that documentation arenas combining work flow and content categories deduced from theories on nursing practice can influence the quality of nursing information. The KPO model can be used as guide when shifting from paper-based to electronic-based nursing documentation with the aim of obtaining complete, comprehensive and consistent nursing information. © 2012 Blackwell Publishing Ltd.

  6. Integrated cosmological probes: concordance quantified

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicola, Andrina; Amara, Adam; Refregier, Alexandre, E-mail: andrina.nicola@phys.ethz.ch, E-mail: adam.amara@phys.ethz.ch, E-mail: alexandre.refregier@phys.ethz.ch

    2017-10-01

    Assessing the consistency of parameter constraints derived from different cosmological probes is an important way to test the validity of the underlying cosmological model. In an earlier work [1], we computed constraints on cosmological parameters for ΛCDM from an integrated analysis of CMB temperature anisotropies and CMB lensing from Planck, galaxy clustering and weak lensing from SDSS, weak lensing from DES SV as well as Type Ia supernovae and Hubble parameter measurements. In this work, we extend this analysis and quantify the concordance between the derived constraints and those derived by the Planck Collaboration as well as WMAP9, SPT and ACT. As a measure for consistency, we use the Surprise statistic [2], which is based on the relative entropy. In the framework of a flat ΛCDM cosmological model, we find all data sets to be consistent with one another at a level of less than 1σ. We highlight that the relative entropy is sensitive to inconsistencies in the models that are used in different parts of the analysis. In particular, inconsistent assumptions for the neutrino mass break its invariance on the parameter choice. When consistent model assumptions are used, the data sets considered in this work all agree with each other and ΛCDM, without evidence for tensions.
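
    The sketch below evaluates the relative entropy (Kullback-Leibler divergence) between two Gaussian parameter posteriors, the quantity on which the Surprise statistic cited above is built; the two-parameter means and covariances are made-up examples, not the actual constraints.

      import numpy as np

      def kl_gaussian(mu1, cov1, mu2, cov2):
          """D_KL(N1 || N2) in nats for multivariate normal distributions."""
          k = len(mu1)
          inv2 = np.linalg.inv(cov2)
          dm = mu2 - mu1
          return 0.5 * (np.trace(inv2 @ cov1) + dm @ inv2 @ dm - k
                        + np.log(np.linalg.det(cov2) / np.linalg.det(cov1)))

      mu_a, cov_a = np.array([0.31, 0.68]), np.diag([0.01, 0.02]) ** 2   # hypothetical posterior A
      mu_b, cov_b = np.array([0.30, 0.70]), np.diag([0.02, 0.03]) ** 2   # hypothetical posterior B
      print(round(kl_gaussian(mu_a, cov_a, mu_b, cov_b), 3), "nats")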

  7. Benchmarking the mesoscale variability in global ocean eddy-permitting numerical systems

    NASA Astrophysics Data System (ADS)

    Cipollone, Andrea; Masina, Simona; Storto, Andrea; Iovino, Doroteaciro

    2017-10-01

    The role of data assimilation procedures in representing ocean mesoscale variability is assessed by applying eddy statistics to a state-of-the-art global ocean reanalysis (C-GLORS), a free global ocean simulation (performed with the NEMO system) and an observation-based dataset (ARMOR3D) used as an independent benchmark. Numerical results are computed on a 1/4° horizontal grid (ORCA025) and share the same resolution as the ARMOR3D dataset. This "eddy-permitting" resolution is sufficient to allow ocean eddies to form. Further to assessing the eddy statistics from three different datasets, a global three-dimensional eddy detection system is implemented in order to bypass the need for region-dependent threshold definitions, typical of commonly adopted eddy detection algorithms. It thus provides full three-dimensional eddy statistics by segmenting vertical profiles from local rotational velocities. This criterion is crucial for discerning real eddies from the transient surface noise that inevitably affects any two-dimensional algorithm. Data assimilation enhances and corrects mesoscale variability on a wide range of features that cannot be well reproduced otherwise. The free simulation fairly reproduces eddies emerging from western boundary currents and deep baroclinic instabilities, while underestimating the shallower vortexes that populate the full basin. The ocean reanalysis recovers most of the missing turbulence seen in satellite products that is not generated by the model itself, and consistently projects surface variability deep into the water column. The comparison with the statistically reconstructed vertical profiles from ARMOR3D shows that ocean data assimilation is able to embed variability into the model dynamics, constraining eddies with in situ and altimetry observations and generating them consistently with the local environment.

  8. Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

    PubMed

    Yang, Xiaowei; Nie, Kun

    2008-03-15

    Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.

  9. High resolution tempo-spatial ozone prediction with SVM and LSTM

    NASA Astrophysics Data System (ADS)

    Gao, D.; Zhang, Y.; Qu, Z.; Sadighi, K.; Coffey, E.; LIU, Q.; Hannigan, M.; Henze, D. K.; Dick, R.; Shang, L.; Lv, Q.

    2017-12-01

    To investigate and predict the exposure of ozone and other pollutants in urban areas, we utilize data from various infrastructures, including EPA, NOAA, and RIITS from the government of Los Angeles, and construct statistical models to predict ozone concentration in the Los Angeles area at finer spatial and temporal granularity. Our work involves cyber data such as traffic, road, and population data as features for prediction. Two statistical models, Support Vector Machine (SVM) and Long Short-Term Memory (LSTM, a deep learning method), are used for prediction. Our experiments show that kernelized SVM achieves better prediction performance when taking traffic counts, road density, and population density as features, with a prediction RMSE of 7.99 ppb for all-time ozone and 6.92 ppb for peak-value ozone. With simulated NOx from a Chemical Transport Model (CTM) as features, SVM generates even better prediction performance, with a prediction RMSE of 6.69 ppb. We also build an LSTM, which has shown great advantages at dealing with temporal sequences, to predict ozone concentration by treating ozone concentration as spatio-temporal sequences. Trained on ozone concentration measurements from the 13 EPA stations in the LA area, the model achieves 4.45 ppb RMSE. Besides, we build a variant of this model which adds spatial dynamics into the model in the form of a transition matrix that reveals new knowledge on pollutant transport. The forgetting gate of the trained LSTM is consistent with the delay effect of ozone concentration, and the trained transition matrix shows spatial consistency with the common direction of winds in the LA area.
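
    A minimal sketch of the first of the two models named above: a kernelized support vector regression on the traffic, road-density, and population-density features mentioned in the text. The simulated data, the scikit-learn implementation, and the hyperparameters are assumptions, not the Los Angeles dataset or the study's exact setup.

      import numpy as np
      from sklearn.svm import SVR
      from sklearn.preprocessing import StandardScaler
      from sklearn.pipeline import make_pipeline
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import mean_squared_error

      rng = np.random.default_rng(13)
      n = 2000
      X = np.column_stack([rng.uniform(0, 5000, n),       # traffic counts
                           rng.uniform(0, 20, n),         # road density
                           rng.uniform(0, 10_000, n)])    # population density
      ozone = 40 - 0.002 * X[:, 0] + 0.5 * X[:, 1] - 0.001 * X[:, 2] + rng.normal(0, 5, n)

      X_tr, X_te, y_tr, y_te = train_test_split(X, ozone, random_state=0)
      model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0)).fit(X_tr, y_tr)
      rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
      print("RMSE (ppb):", round(rmse, 2))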

  10. Summary of hydrologic modeling for the Delaware River Basin using the Water Availability Tool for Environmental Resources (WATER)

    USGS Publications Warehouse

    Williamson, Tanja N.; Lant, Jeremiah G.; Claggett, Peter; Nystrom, Elizabeth A.; Milly, Paul C.D.; Nelson, Hugh L.; Hoffman, Scott A.; Colarullo, Susan J.; Fischer, Jeffrey M.

    2015-11-18

    The Water Availability Tool for Environmental Resources (WATER) is a decision support system for the nontidal part of the Delaware River Basin that provides a consistent and objective method of simulating streamflow under historical, forecasted, and managed conditions. In order to quantify the uncertainty associated with these simulations, however, streamflow and the associated hydroclimatic variables of potential evapotranspiration, actual evapotranspiration, and snow accumulation and snowmelt must be simulated and compared to long-term, daily observations from sites. This report details model development and optimization, statistical evaluation of simulations for 57 basins ranging from 2 to 930 km2 and 11.0 to 99.5 percent forested cover, and how this statistical evaluation of daily streamflow relates to simulating environmental changes and management decisions that are best examined at monthly time steps normalized over multiple decades. The decision support system provides a database of historical spatial and climatic data for simulating streamflow for 2001–11, in addition to land-cover and general circulation model forecasts that focus on 2030 and 2060. WATER integrates geospatial sampling of landscape characteristics, including topographic and soil properties, with a regionally calibrated hillslope-hydrology model, an impervious-surface model, and hydroclimatic models that were parameterized by using three hydrologic response units: forested, agricultural, and developed land cover. This integration enables the regional hydrologic modeling approach used in WATER without requiring site-specific optimization or those stationary conditions inferred when using a statistical model.

  11. Climate Projections from the NARCliM Project: Bayesian Model Averaging of Maximum Temperature Projections

    NASA Astrophysics Data System (ADS)

    Olson, R.; Evans, J. P.; Fan, Y.

    2015-12-01

    NARCliM (NSW/ACT Regional Climate Modelling Project) is a regional climate project for Australia and the surrounding region. It dynamically downscales 4 General Circulation Models (GCMs) using three Regional Climate Models (RCMs) to provide climate projections for the CORDEX-AustralAsia region at 50 km resolution, and for south-east Australia at 10 km resolution. The project differs from previous work in the level of sophistication of model selection. Specifically, the selection process for GCMs included (i) conducting a literature review to evaluate model performance, (ii) analysing model independence, and (iii) selecting models that span future temperature and precipitation change space. RCMs for downscaling the GCMs were chosen based on their performance for several precipitation events over South-East Australia, and on model independence. Bayesian Model Averaging (BMA) provides a statistically consistent framework for weighting the models based on their likelihood given the available observations. These weights are used to provide probability distribution functions (pdfs) for model projections. We develop a BMA framework for constructing probabilistic climate projections for spatially-averaged variables from the NARCliM project. The first step in the procedure is smoothing model output in order to exclude the influence of internal climate variability. Our statistical model for model-observation residuals is a homoskedastic iid process. Comparison of the RCMs with Australian Water Availability Project (AWAP) observations is used to determine model weights through Monte Carlo integration. Posterior pdfs of the statistical parameters of the model-data residuals are obtained using Markov Chain Monte Carlo. The uncertainty in the properties of the model-data residuals is fully accounted for when constructing the projections. We present the preliminary results of the BMA analysis for yearly maximum temperature for New South Wales state planning regions for the period 2060-2079.
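
    A hedged sketch of the weighting step only: each model's weight is proportional to its likelihood given the observations, here with a Gaussian iid residual model whose standard deviation is fixed for simplicity (the project samples the residual parameters with MCMC). The model outputs, observations, and residual scale below are synthetic assumptions.

      import numpy as np

      rng = np.random.default_rng(14)
      obs = 30 + rng.normal(0, 0.5, 20)                        # observed smoothed yearly Tmax (degC)
      models = {"RCM1": obs + rng.normal(0.2, 0.6, 20),        # stand-ins for ensemble members
                "RCM2": obs + rng.normal(1.5, 0.6, 20),
                "RCM3": obs + rng.normal(-0.4, 0.6, 20)}

      sigma = 0.8                                              # assumed residual standard deviation
      log_liks = {name: -0.5 * np.sum(((obs - out) / sigma) ** 2) for name, out in models.items()}
      shift = max(log_liks.values())
      weights = {name: np.exp(ll - shift) for name, ll in log_liks.items()}
      total = sum(weights.values())
      print({name: round(w / total, 3) for name, w in weights.items()})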

  12. Model based estimates of long-term persistence of inactivated hepatitis A vaccine-induced antibodies in adults.

    PubMed

    Hens, Niel; Habteab Ghebretinsae, Aklilu; Hardt, Karin; Van Damme, Pierre; Van Herck, Koen

    2014-03-14

    In this paper, we review the results of existing statistical models of the long-term persistence of hepatitis A vaccine-induced antibodies in light of recently available immunogenicity data from 2 clinical trials (up to 17 years of follow-up). Healthy adult volunteers monitored annually for 17 years after the administration of the first vaccine dose in 2 double-blind, randomized clinical trials were included in this analysis. Vaccination in these studies was administered according to a 2-dose vaccination schedule: 0, 12 months in study A and 0, 6 months in study B (NCT00289757/NCT00291876). Antibodies were measured using an in-house ELISA during the first 11 years of follow-up; a commercially available ELISA was then used up to Year 17 of follow-up. Long-term antibody persistence from studies A and B was estimated using statistical models for longitudinal data. Data from studies A and B were modeled separately. A total of 173 participants in study A and 108 participants in study B were included in the analysis. A linear mixed model with 2 changepoints allowed all available results to be accounted for. Predictions based on this model indicated that 98% (95%CI: 94-100%) of participants in study A and 97% (95%CI: 94-100%) of participants in study B will remain seropositive 25 years after receiving the first vaccine dose. Other models using part of the data provided consistent results: ≥95% of the participants were projected to remain seropositive for ≥25 years. This analysis, based on both previously used and newly selected model structures, was consistent with earlier estimates of seropositivity rates ≥95% for at least 25 years. Copyright © 2014 Elsevier Ltd. All rights reserved.
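
    A minimal sketch of the kind of piecewise-linear ("2 changepoint") decay curve the abstract describes, reduced to a fixed-effects fit of the mean log-titre; the changepoint locations, toy data and parameter values are invented, and the per-subject random-effects structure of the actual linear mixed model is omitted for brevity.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_changepoint_line(t, b0, b1, b2, b3, t1=1.0, t2=5.0):
    """Piecewise-linear mean log-titre with changepoints at t1 and t2 years:
    slope b1 before t1, b2 between t1 and t2, b3 after t2."""
    return (b0 + b1 * np.minimum(t, t1)
            + b2 * np.clip(t - t1, 0.0, t2 - t1)
            + b3 * np.maximum(t - t2, 0.0))

# toy data: fast early decay, then progressively slower long-term decay
rng = np.random.default_rng(1)
t = np.tile(np.arange(0.5, 17.5, 1.0), 20)            # yearly visits, 20 subjects
y = two_changepoint_line(t, 3.5, -1.2, -0.15, -0.03) + rng.normal(0, 0.2, t.size)

popt, _ = curve_fit(two_changepoint_line, t, y, p0=[3, -1, -0.1, -0.05])
print(popt)   # recovers the intercept and the three segment slopes
```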

  13. Fast mean and variance computation of the diffuse sound transmission through finite-sized thick and layered wall and floor systems

    NASA Astrophysics Data System (ADS)

    Decraene, Carolina; Dijckmans, Arne; Reynders, Edwin P. B.

    2018-05-01

    A method is developed for computing the mean and variance of the diffuse field sound transmission loss of finite-sized layered wall and floor systems that consist of solid, fluid and/or poroelastic layers. This is achieved by coupling a transfer matrix model of the wall or floor to statistical energy analysis subsystem models of the adjacent room volumes. The modal behavior of the wall is approximately accounted for by projecting the wall displacement onto a set of sinusoidal lateral basis functions. This hybrid modal transfer matrix-statistical energy analysis method is validated on multiple wall systems: a thin steel plate, a polymethyl methacrylate panel, a thick brick wall, a sandwich panel, a double-leaf wall with poro-elastic material in the cavity, and a double glazing. The predictions are compared with experimental data and with results obtained using alternative prediction methods such as the transfer matrix method with spatial windowing, the hybrid wave based-transfer matrix method, and the hybrid finite element-statistical energy analysis method. These comparisons confirm the prediction accuracy of the proposed method and its computational efficiency relative to the conventional hybrid finite element-statistical energy analysis method.

  14. EQS Goes R: Simulations for SEM Using the Package REQS

    ERIC Educational Resources Information Center

    Mair, Patrick; Wu, Eric; Bentler, Peter M.

    2010-01-01

    The REQS package is an interface between the R environment of statistical computing and the EQS software for structural equation modeling. The package consists of 3 main functions that read EQS script files and import the results into R, call EQS script files from R, and run EQS script files from R and import the results after EQS computations.…

  15. Bose condensation of nuclei in heavy ion collisions

    NASA Technical Reports Server (NTRS)

    Tripathi, Ram K.; Townsend, Lawrence W.

    1994-01-01

    Using a fully self-consistent quantum statistical model, we demonstrate the possibility of Bose condensation of nuclei in heavy ion collisions. The most favorable conditions of high densities and low temperatures are usually associated with astrophysical processes and may be difficult to achieve in heavy ion collisions. Nonetheless, some suggestions for the possible experimental verification of the existence of this phenomenon are made.

  16. Teachers' and Mothers' Assessment of Social Skills of Students with Mental Retardation

    ERIC Educational Resources Information Center

    Cifci Tekinarslan, Ilknur; Sazak Pinar, Elif; Sucuoglu, Bulbin

    2012-01-01

    The purpose of this study is to compare the assessment results of social skills of students with mental retardation by their teachers and mothers through a relational model using descriptive statistics. The research group in this study consisted of mothers and teachers of 562 children with mental retardation aged between 6 and 12 who enrolled in…

  17. Statistical methods for efficient design of community surveys of response to noise: Random coefficients regression models

    NASA Technical Reports Server (NTRS)

    Tomberlin, T. J.

    1985-01-01

    Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed here provide a basis for the associated sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or, in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
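
    The variance-components idea behind such a random coefficients regression can be sketched as follows: simulate residents nested in compact study areas with an area-specific noise-annoyance slope, then split the variance of the per-area slope estimates into a between-area component and the average within-area sampling variance. The design sizes, coefficients and noise levels below are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n_areas, n_resp = 30, 40                 # compact study areas, residents per area
b_mean, b_sd, e_sd = 0.8, 0.15, 1.0      # mean slope, slope SD across areas, residual SD

slopes_hat, slope_vars = [], []
for _ in range(n_areas):
    b_j = rng.normal(b_mean, b_sd)                      # area-specific random slope
    noise = rng.uniform(50, 80, n_resp)                 # exposure in dB
    annoy = 2.0 + b_j * noise + rng.normal(0, e_sd, n_resp)
    X = np.column_stack([np.ones(n_resp), noise])
    beta, res, *_ = np.linalg.lstsq(X, annoy, rcond=None)
    sigma2 = res[0] / (n_resp - 2)                      # residual variance per area
    slopes_hat.append(beta[1])
    slope_vars.append((sigma2 * np.linalg.inv(X.T @ X))[1, 1])

slopes_hat = np.array(slopes_hat)
# method-of-moments split of the slope variance into its two stages
between_area_var = slopes_hat.var(ddof=1) - np.mean(slope_vars)
print(slopes_hat.mean(), between_area_var, b_sd ** 2)
```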

  18. Developing Statistical Physics Course Handout on Distribution Function Materials Based on Science, Technology, Engineering, and Mathematics

    NASA Astrophysics Data System (ADS)

    Riandry, M. A.; Ismet, I.; Akhsan, H.

    2017-09-01

    This study aims to produce a valid and practical statistical physics course handout on distribution function materials based on STEM. The Rowntree development model is used to produce this handout. The model consists of three stages: planning, development and evaluation. In this study, the evaluation stage used Tessmer formative evaluation, which consists of 5 stages: self-evaluation, expert review, one-to-one evaluation, small group evaluation and field test. However, the handout is tested only on validity and practicality aspects, so the field test stage was not implemented. The data collection technique used walkthroughs and questionnaires. Subjects of this study were 6th- and 8th-semester students (academic year 2016/2017) of the Physics Education Study Program of Sriwijaya University. The average result of the expert review is 87.31% (very valid category), the average result of the one-to-one evaluation is 89.42%, and the result of the small group evaluation is 85.92%. Across the one-to-one and small group evaluation stages, the average student response to the handout is 87.67% (very practical category). Based on the results of the study, it can be concluded that the handout is valid and practical.

  19. Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments

    NASA Astrophysics Data System (ADS)

    Fisher, W. P., Jr.; Elbaum, B.; Coulter, A.

    2010-07-01

    Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.

  20. MTS dye based colorimetric CTLL-2 cell proliferation assay for product release and stability monitoring of interleukin-15: assay qualification, standardization and statistical analysis.

    PubMed

    Soman, Gopalan; Yang, Xiaoyi; Jiang, Hengguang; Giardina, Steve; Vyas, Vinay; Mitra, George; Yovandich, Jason; Creekmore, Stephen P; Waldmann, Thomas A; Quiñones, Octavio; Alvord, W Gregory

    2009-08-31

    A colorimetric cell proliferation assay using soluble tetrazolium salt [(CellTiter 96(R) Aqueous One Solution) cell proliferation reagent, containing the (3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium, inner salt) and an electron coupling reagent phenazine ethosulfate], was optimized and qualified for quantitative determination of IL-15 dependent CTLL-2 cell proliferation activity. An in-house recombinant Human (rHu)IL-15 reference lot was standardized (IU/mg) against an international reference standard. Specificity of the assay for IL-15 was documented by illustrating the ability of neutralizing anti-IL-15 antibodies to block the product-specific CTLL-2 cell proliferation and the lack of blocking effect with anti-IL-2 antibodies. Under the defined assay conditions, the linear dose-response concentration range was between 0.04 and 0.17 ng/ml of the rHuIL-15 produced in-house and 0.5-3.0 IU/ml for the international standard. Statistical analysis of the data was performed with the use of scripts written in the R Statistical Language and Environment utilizing a four-parameter logistic regression fit analysis procedure. The overall variation in the ED(50) values for the in-house reference standard from 55 independent estimates performed over a period of 1 year was 12.3% of the average. Excellent intra-plate and within-day/inter-plate consistency was observed for all four parameter estimates in the model. Different preparations of rHuIL-15 showed excellent intra-plate consistency in the parameter estimates corresponding to the lower and upper asymptotes as well as to the 'slope' factor at the mid-point. The ED(50) values showed statistically significant differences for different lots and for control versus stressed samples. Three R scripts improve data analysis capabilities, allowing one to describe assay variations, to draw inferences between data sets from formal statistical tests, and to set up improved assay acceptance criteria based on comparability and consistency in the four parameters of the model. The assay is precise, accurate and robust and can be fully validated. Applications of the assay were established including process development support, release of the rHuIL-15 product for pre-clinical and clinical studies, and monitoring of storage stability.
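
    As a rough illustration of the four-parameter logistic (4PL) fit that the abstract's R scripts perform, here is a minimal Python sketch using scipy; the absorbance values, concentrations and parameter estimates are simulated, and the 4PL parameterization shown is one common form, not necessarily the exact one used in the assay scripts.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, bottom, top, ed50, slope):
    """Four-parameter logistic dose-response curve."""
    return bottom + (top - bottom) / (1.0 + (x / ed50) ** slope)

# toy absorbance readings spanning the stated linear range (~0.04-0.17 ng/mL)
rng = np.random.default_rng(3)
conc = np.geomspace(0.01, 1.0, 12)                      # ng/mL rHuIL-15 (invented)
od = four_pl(conc, 0.1, 1.8, 0.09, -2.5) + rng.normal(0, 0.03, conc.size)

popt, pcov = curve_fit(four_pl, conc, od, p0=[0.1, 2.0, 0.1, -2.0])
bottom, top, ed50, slope = popt
print(f"ED50 ~ {ed50:.3f} ng/mL, slope ~ {slope:.2f}")
# comparing ED50 across plates/days is what the consistency analysis tracks
```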

  1. A hybrid model for predicting carbon monoxide from vehicular exhausts in urban environments

    NASA Astrophysics Data System (ADS)

    Gokhale, Sharad; Khare, Mukesh

    Several deterministic air quality models evaluate and predict the frequently occurring pollutant concentrations well but, in general, are incapable of predicting the 'extreme' concentrations. In contrast, statistical distribution models overcome this limitation of the deterministic models and predict the 'extreme' concentrations. However, environmental damages are caused both by extremes and by the sustained average concentration of pollutants. Hence, a model should predict not only the 'extreme' ranges but also the 'middle' ranges of pollutant concentrations, i.e. the entire range. Hybrid modelling is one technique that estimates/predicts the 'entire range' of the distribution of pollutant concentrations by combining deterministic models with suitable statistical distribution models (Jakeman et al., 1988). In the present paper, a hybrid model has been developed to predict the carbon monoxide (CO) concentration distributions at one of the traffic intersections, Income Tax Office (ITO), in Delhi, where the traffic is heterogeneous in nature and the meteorology is 'tropical'. The model combines the general finite line source model (GFLSM) as its deterministic component and the log-logistic distribution (LLD) model as its statistical component. The hybrid (GFLSM-LLD) model is then applied at the ITO intersection. The results show that the hybrid model predictions match the observed CO concentration data within the 5-99 percentile range. The model is further validated at a different street location, the Sirifort roadway. The validation results show that the model predicts CO concentrations fairly well (d = 0.91) in the 10-95 percentile range. A regulatory compliance analysis is also developed to estimate the probability that hourly CO concentrations exceed the National Ambient Air Quality Standards (NAAQS) of India. The traffic at the intersection consists of light vehicles, heavy vehicles, three-wheelers (auto rickshaws) and two-wheelers (scooters, motorcycles, etc.).
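
    A hedged sketch of the statistical half of such a hybrid scheme: fitting a log-logistic distribution to hourly CO values and reading off an exceedance probability. scipy's fisk distribution is the log-logistic; the concentrations, fitted parameters and the 4 ppm threshold are invented stand-ins, not the paper's GFLSM output or India's NAAQS value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# stand-in for hourly CO concentrations (ppm); in the hybrid model these would
# come from the GFLSM deterministic component plus observations
co_hourly = stats.fisk.rvs(c=3.0, scale=2.5, size=8760, random_state=rng)

# fit the log-logistic (Fisk) distribution with the location fixed at zero
c, loc, scale = stats.fisk.fit(co_hourly, floc=0)

# probability of exceeding an hourly threshold (illustrative value only)
standard = 4.0
p_exceed = stats.fisk.sf(standard, c, loc=loc, scale=scale)
print(f"P(CO > {standard} ppm) ~ {p_exceed:.3f}")
```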

  2. Comparison of individual-based model output to data using a model of walleye pollock early life history in the Gulf of Alaska

    NASA Astrophysics Data System (ADS)

    Hinckley, Sarah; Parada, Carolina; Horne, John K.; Mazur, Michael; Woillez, Mathieu

    2016-10-01

    Biophysical individual-based models (IBMs) have been used to study aspects of early life history of marine fishes such as recruitment, connectivity of spawning and nursery areas, and marine reserve design. However, there is no consistent approach to validating the spatial outputs of these models. In this study, we aim to address this gap. We document additions to an existing individual-based biophysical model for Alaska walleye pollock (Gadus chalcogrammus), some simulations made with this model and methods that were used to describe and compare spatial output of the model versus field data derived from ichthyoplankton surveys in the Gulf of Alaska. We used visual methods (e.g. distributional centroids with directional ellipses), several indices (such as a Normalized Difference Index (NDI) and an Overlap Coefficient (OC)), and several statistical methods: the Syrjala method, the Getis-Ord Gi* statistic, and a geostatistical method for comparing spatial indices. We assess the utility of these different methods in analyzing spatial output and comparing model output to data, and give recommendations for their appropriate use. Visual methods are useful for initial comparisons of model and data distributions. Metrics such as the NDI and OC give useful measures of co-location and overlap, but care must be taken in discretizing the fields into bins. The Getis-Ord Gi* statistic is useful to determine the patchiness of the fields. The Syrjala method is an easily implemented statistical measure of the difference between the fields, but does not give information on the details of the distributions. Finally, the geostatistical comparison of spatial indices gives good information of details of the distributions and whether they differ significantly between the model and the data. We conclude that each technique gives quite different information about the model-data distribution comparison, and that some are easy to apply and some more complex. We also give recommendations for a multistep process to validate spatial output from IBMs.

  3. Regional variability in the accuracy of statistical reproductions of historical time series of daily streamflow at ungaged locations

    NASA Astrophysics Data System (ADS)

    Farmer, W. H.; Archfield, S. A.; Over, T. M.; Kiang, J. E.

    2015-12-01

    In the United States and across the globe, the majority of stream reaches and rivers are substantially impacted by water use or remain ungaged. The result is large gaps in the availability of natural streamflow records from which to infer hydrologic understanding and inform water resources management. From basin-specific to continent-wide scales, many efforts have been undertaken to develop methods to estimate streamflow at ungaged locations. This work applies and contrasts several statistical models of daily streamflow to more than 1,700 reference-quality streamgages across the conterminous United States using a cross-validation methodology. The variability of streamflow simulation performance across the country exhibits a pattern familiar from other continental-scale modeling efforts performed for the United States. For portions of the West Coast and the dense, relatively homogeneous and humid regions of the eastern United States, models produce reliable estimates of daily streamflow using many different prediction methods. Model performance for the middle portion of the United States, marked by more heterogeneous and arid conditions, and with larger contributing areas and sparser networks of streamgages, is consistently poor. A discussion of the difficulty of statistical interpolation and regionalization in these regions raises additional questions of data availability and quality, hydrologic process representation and dominance, and intrinsic variability.

  4. Three dimensional measurement of minimum joint space width in the knee from stereo radiographs using statistical shape models.

    PubMed

    van IJsseldijk, E A; Valstar, E R; Stoel, B C; Nelissen, R G H H; Baka, N; Van't Klooster, R; Kaptein, B L

    2016-08-01

    An important measure for the diagnosis and monitoring of knee osteoarthritis is the minimum joint space width (mJSW). This requires accurate alignment of the x-ray beam with the tibial plateau, which may not be accomplished in practice. We investigate the feasibility of a new mJSW measurement method from stereo radiographs using 3D statistical shape models (SSM) and evaluate its sensitivity to changes in the mJSW and its robustness to variations in patient positioning and bone geometry. A validation study was performed using five cadaver specimens. The actual mJSW was varied and images were acquired with variation in the cadaver positioning. For comparison purposes, the mJSW was also assessed from plain radiographs. To study the influence of SSM model accuracy, the 3D mJSW measurement was repeated with models from the actual bones, obtained from CT scans. The SSM-based measurement method was more robust (i.e., it produced consistent output under varying measurement circumstances) than the conventional 2D method, showing that the 3D reconstruction indeed reduces the influence of patient positioning. However, the SSM-based method showed comparable sensitivity to changes in the mJSW with respect to the conventional method. The CT-based measurement was more accurate than the SSM-based measurement (smallest detectable differences 0.55 mm versus 0.82 mm, respectively). The proposed measurement method is not a substitute for the conventional 2D measurement due to limitations in the SSM model accuracy. However, further improvements in the model accuracy and the optimisation technique are possible. Combined with the promising options for applications using quantitative information on bone morphology, SSM-based 3D reconstructions of natural knees are attractive for further development. Cite this article: E. A. van IJsseldijk, E. R. Valstar, B. C. Stoel, R. G. H. H. Nelissen, N. Baka, R. van't Klooster, B. L. Kaptein. Three dimensional measurement of minimum joint space width in the knee from stereo radiographs using statistical shape models. Bone Joint Res 2016;320-327. DOI: 10.1302/2046-3758.58.2000626. © 2016 van IJsseldijk et al.

  5. Process-informed extreme value statistics- Why and how?

    NASA Astrophysics Data System (ADS)

    Schumann, Andreas; Fischer, Svenja

    2017-04-01

    In many parts of the world, annual maximum series (AMS) of runoff consist of flood peaks that differ in their genesis. There are several reasons why these differences should be considered: Often, multivariate flood characteristics (volumes, shapes) are of interest, and these characteristics depend on the flood types. For regionalization, the main impacts on the flood regime have to be specified; if this regime depends on different flood types, type-specific hydro-meteorological and/or watershed characteristics are relevant. The ratios between event types often change over the range of observations. If a majority of events belonging to a certain flood type dominates the extrapolation of a probability distribution function (pdf), this is problematic when that more frequent type is not representative of the extraordinarily large extremes that determine the right tail of the pdf. To account for differences in flood origin, several problems have to be solved. The events have to be separated into different groups according to their genesis, which can be difficult for events long in the past for which, e.g., precipitation data are not available. Another problem concerns the flood type-specific statistics: if block maxima are used, the sample of floods belonging to a certain type is often incomplete because larger events of other types overlay the smaller ones. Some practically usable statistical tools to solve these and other problems are presented in a case study. Seasonal models were developed that distinguish between winter and summer floods, but also between events with long and short timescales. The pdfs of the two groups of summer floods are combined via a new mixing model. The application to German watersheds demonstrates the advantages of the new model, which gives specific influence to flood types.
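
    A simplified sketch of seasonal flood statistics of this general kind (not the paper's specific mixing model): if the winter and summer annual maxima are treated as independent, the annual-maximum CDF is the product of the seasonal CDFs. The distribution choice (Gumbel), sample values and evaluation threshold are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# toy seasonal annual-maximum samples (m^3/s); the event-type separation
# itself (winter vs. summer, long vs. short timescale) is assumed already done
winter_ams = stats.gumbel_r.rvs(loc=120, scale=35, size=60, random_state=rng)
summer_ams = stats.gumbel_r.rvs(loc=90, scale=60, size=60, random_state=rng)

w_loc, w_scale = stats.gumbel_r.fit(winter_ams)
s_loc, s_scale = stats.gumbel_r.fit(summer_ams)

def annual_max_cdf(x):
    """Under independence of the seasonal maxima, the annual maximum is the
    larger of the two, so its CDF is the product of the seasonal CDFs."""
    return (stats.gumbel_r.cdf(x, w_loc, w_scale)
            * stats.gumbel_r.cdf(x, s_loc, s_scale))

x_design = 400.0
print("P(annual max > %.0f m^3/s) = %.4f" % (x_design, 1 - annual_max_cdf(x_design)))
```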

  6. Climate Considerations Of The Electricity Supply Systems In Industries

    NASA Astrophysics Data System (ADS)

    Asset, Khabdullin; Zauresh, Khabdullina

    2014-12-01

    The study focuses on the analysis of climate considerations of electricity supply systems in the pellet industry. The developed analysis model consists of two modules: a statistical module for evaluating active power losses and a module for evaluating climate aspects. The statistical data module is presented as a universal mathematical model of electrical systems and components of industrial load. It forms a basis for detailed accounting of power losses across the voltage levels. On the basis of the universal model, a set of programs is designed to perform the calculation and experimental research. It helps to obtain the statistical characteristics of the power losses and loads of the electricity supply systems and to define the nature of changes in these characteristics. Within the module, several methods and algorithms for calculating parameters of equivalent circuits of low- and high-voltage ADC and SD with a massive smooth rotor with laminated poles are developed. The climate aspects module includes an analysis of the experimental data of the power supply system in pellet production. It allows identification of GHG emission reduction parameters: operation hours, type of electrical motors, load factor values, and deviation from the standard voltage value.

  7. A Bayesian Approach to Evaluating Consistency between Climate Model Output and Observations

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Cressie, N.; Teixeira, J.

    2010-12-01

    As in other scientific and engineering problems that involve physical modeling of complex systems, climate models can be evaluated and diagnosed by comparing their output to observations of similar quantities. Though the global remote sensing data record is relatively short by climate research standards, these data offer opportunities to evaluate model predictions in new ways. For example, remote sensing data are spatially and temporally dense enough to provide distributional information that goes beyond simple moments to allow quantification of temporal and spatial dependence structures. In this talk, we propose a new method for exploiting these rich data sets using a Bayesian paradigm. For a collection of climate models, we calculate the posterior probability that each member best represents the physical system it seeks to reproduce. The posterior probability is based on the likelihood that a chosen summary statistic, computed from observations, would be obtained when the model's output is considered as a realization from a stochastic process. By exploring how posterior probabilities change with different statistics, we may paint a more quantitative and complete picture of the strengths and weaknesses of the models relative to the observations. We demonstrate our method using model output from the CMIP archive, and observations from NASA's Atmospheric Infrared Sounder.
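
    A hedged sketch of the posterior-probability calculation described above, under equal prior model probabilities and with the sampling distribution of the chosen summary statistic approximated by a kernel density estimate over many model realizations; the statistic, data and "models" are invented placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def posterior_model_probs(stat_obs, stat_samples_per_model, prior=None):
    """Posterior probability of each model given an observed summary statistic.

    stat_samples_per_model : list of 1-D arrays, the statistic computed on many
        realizations (or resampled blocks) of each model's output.
    The likelihood of the observed value is estimated with a Gaussian KDE.
    """
    n = len(stat_samples_per_model)
    prior = np.full(n, 1.0 / n) if prior is None else np.asarray(prior)
    like = np.array([stats.gaussian_kde(s)(stat_obs)[0]
                     for s in stat_samples_per_model])
    post = prior * like
    return post / post.sum()

# toy example: the "statistic" is a regional-mean lag-1 autocorrelation
stat_obs = 0.62
model_stats = [rng.normal(0.60, 0.05, 300),   # model A: close, tight
               rng.normal(0.45, 0.05, 300),   # model B: biased
               rng.normal(0.62, 0.15, 300)]   # model C: right mean, noisier
print(posterior_model_probs(stat_obs, model_stats))
```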

  8. Estimating inverse probability weights using super learner when weight-model specification is unknown in a marginal structural Cox model context.

    PubMed

    Karim, Mohammad Ehsanul; Platt, Robert W

    2017-06-15

    Correct specification of the inverse probability weighting (IPW) model is necessary for consistent inference from a marginal structural Cox model (MSCM). In practical applications, researchers are typically unaware of the true specification of the weight model. Nonetheless, IPWs are commonly estimated using parametric models, such as the main-effects logistic regression model. In practice, assumptions underlying such models may not hold and data-adaptive statistical learning methods may provide an alternative. Many candidate statistical learning approaches are available in the literature. However, the optimal approach for a given dataset is impossible to predict. Super learner (SL) has been proposed as a tool for selecting an optimal learner from a set of candidates using cross-validation. In this study, we evaluate the usefulness of an SL in estimating IPW in four different MSCM simulation scenarios, in which we varied the true weight-model specification (linear and/or additive). Our simulations show that, in the presence of weight model misspecification, with a rich and diverse set of candidate algorithms, SL can generally offer a better alternative to the commonly used statistical learning approaches in terms of MSE as well as the coverage probabilities of the estimated effect in an MSCM. The findings from the simulation studies guided the application of the MSCM in a multiple sclerosis cohort from British Columbia, Canada (1995-2008), to estimate the impact of beta-interferon treatment in delaying disability progression. Copyright © 2017 John Wiley & Sons, Ltd.
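
    For contrast with the Super Learner approach, here is a minimal sketch of the commonly used parametric alternative the abstract mentions: stabilized inverse probability weights from a main-effects logistic regression, in a simplified point-treatment setting rather than the full time-varying MSCM. The data, coefficients and variable names are simulated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# toy point-treatment data; the MSCM setting would repeat this per interval
n = 2000
x = rng.normal(size=(n, 3))                          # confounders
p_true = 1 / (1 + np.exp(-(0.4 * x[:, 0] - 0.6 * x[:, 1] ** 2)))
a = rng.binomial(1, p_true)                          # treatment indicator

# main-effects logistic regression for the propensity score (possibly
# misspecified, which is exactly where the paper swaps in Super Learner)
ps = LogisticRegression(max_iter=1000).fit(x, a).predict_proba(x)[:, 1]

# stabilized inverse probability weights
p_marg = a.mean()
w = np.where(a == 1, p_marg / ps, (1 - p_marg) / (1 - ps))
print(w.mean(), w.max())    # mean near 1; very large weights flag poor overlap
```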

  9. A statistical model describing combined irreversible electroporation and electroporation-induced blood-brain barrier disruption.

    PubMed

    Sharabi, Shirley; Kos, Bor; Last, David; Guez, David; Daniels, Dianne; Harnof, Sagi; Mardor, Yael; Miklavcic, Damijan

    2016-03-01

    Electroporation-based therapies such as electrochemotherapy (ECT) and irreversible electroporation (IRE) are emerging as promising tools for treatment of tumors. When applied to the brain, electroporation can also induce transient blood-brain-barrier (BBB) disruption in volumes extending beyond IRE, thus enabling efficient drug penetration. The main objective of this study was to develop a statistical model predicting cell death and BBB disruption induced by electroporation. This model can be used for individual treatment planning. Cell death and BBB disruption models were developed based on the Peleg-Fermi model in combination with numerical models of the electric field. The model calculates the electric field thresholds for cell kill and BBB disruption and describes the dependence on the number of treatment pulses. The model was validated using in vivo experimental data consisting of MRI scans of rat brains acquired after electroporation treatments. Linear regression analysis confirmed that the model described the IRE and BBB disruption volumes as a function of the number of treatment pulses (r(2) = 0.79; p < 0.008, r(2) = 0.91; p < 0.001). The results showed a strong plateau effect as the pulse number increased. The ratio between complete cell death and no cell death thresholds was relatively narrow (between 0.88 and 0.91) even for small numbers of pulses and depended weakly on the number of pulses. For BBB disruption, the ratio increased with the number of pulses. BBB disruption radii were on average 67% ± 11% larger than IRE volumes. The statistical model can be used to describe the dependence of treatment effects on the number of pulses independent of the experimental setup.
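
    A hedged sketch of a Peleg-Fermi-type survival curve with pulse-number-dependent parameters, as commonly parameterized in the electroporation literature; the constants below are illustrative only and are not the values fitted to the MRI data in the study.

```python
import numpy as np

def peleg_fermi_survival(E, n_pulses, Ec0, k1, A0, k2):
    """Peleg-Fermi cell survival as a function of the local electric field E
    (V/cm) and the number of pulses. Ec(n) and A(n) decay exponentially with
    pulse number (one common parameterization; constants are illustrative)."""
    Ec = Ec0 * np.exp(-k1 * n_pulses)   # critical field
    A = A0 * np.exp(-k2 * n_pulses)     # width of the transition
    return 1.0 / (1.0 + np.exp((E - Ec) / A))

E = np.linspace(0, 1500, 7)             # field values, e.g. from a FEM field map
for n in (10, 50, 90):
    surv = peleg_fermi_survival(E, n, Ec0=800, k1=0.005, A0=150, k2=0.005)
    print(n, np.round(surv, 3))         # diminishing change at high n (plateau)
```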

  10. Matching asteroid population characteristics with a model constructed from the YORP-induced rotational fission hypothesis

    NASA Astrophysics Data System (ADS)

    Jacobson, Seth A.; Marzari, Francesco; Rossi, Alessandro; Scheeres, Daniel J.

    2016-10-01

    From the results of a comprehensive asteroid population evolution model, we conclude that the YORP-induced rotational fission hypothesis is consistent with the observed population statistics of small asteroids in the main belt including binaries and contact binaries. These conclusions rest on the asteroid rotation model of Marzari et al. ([2011] Icarus, 214, 622-631), which incorporates both the YORP effect and collisional evolution. This work adds to that model the rotational fission hypothesis, described in detail within, and the binary evolution model of Jacobson et al. ([2011a] Icarus, 214, 161-178) and Jacobson et al. ([2011b] The Astrophysical Journal Letters, 736, L19). Our complete asteroid population evolution model is highly constrained by these and other previous works, and therefore it has only two significant free parameters: the ratio of low to high mass ratio binaries formed after rotational fission events and the mean strength of the binary YORP (BYORP) effect. We successfully reproduce characteristic statistics of the small asteroid population: the binary fraction, the fast binary fraction, steady-state mass ratio fraction and the contact binary fraction. We find that in order for the model to best match observations, rotational fission produces high mass ratio (>0.2) binary components with four to eight times the frequency of low mass ratio (<0.2) components, where the mass ratio is the mass of the secondary component divided by the mass of the primary component. This is consistent with the post-rotational fission binary system mass ratio being drawn from either a flat or a positive and shallow distribution, since the high mass ratio bin is four times the size of the low mass ratio bin; this is in contrast to the observed steady-state binary mass ratio, which has a negative and steep distribution. This can be understood in the context of the BYORP-tidal equilibrium hypothesis, which predicts that low mass ratio binaries survive for a significantly longer period of time than high mass ratio systems. We also find that the mean of the log-normal BYORP coefficient distribution μB ≳ 10^-2, which is consistent with estimates from shape modeling (McMahon and Scheeres, 2012a).

  11. Influences of credibility of testimony and strength of statistical evidence on children's and adolescents' reasoning.

    PubMed

    Kail, Robert V

    2013-11-01

    According to dual-process models that include analytic and heuristic modes of processing, analytic processing is often expected to become more common with development. Consistent with this view, on reasoning problems, adolescents are more likely than children to select alternatives that are backed by statistical evidence. It is shown here that this pattern depends on the quality of the statistical evidence and the quality of the testimonial that is the typical alternative to statistical evidence. In Experiment 1, 9- and 13-year-olds (N=64) were presented with scenarios in which solid statistical evidence was contrasted with casual or expert testimonial evidence. When testimony was casual, children relied on it but adolescents did not; when testimony was expert, both children and adolescents relied on it. In Experiment 2, 9- and 13-year-olds (N=83) were presented with scenarios in which casual testimonial evidence was contrasted with weak or strong statistical evidence. When statistical evidence was weak, children and adolescents relied on both testimonial and statistical evidence; when statistical evidence was strong, most children and adolescents relied on it. Results are discussed in terms of their implications for dual-process accounts of cognitive development. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. Simulation and analysis of scalable non-Gaussian statistically anisotropic random functions

    NASA Astrophysics Data System (ADS)

    Riva, Monica; Panzeri, Marco; Guadagnini, Alberto; Neuman, Shlomo P.

    2015-12-01

    Many earth and environmental (as well as other) variables, Y, and their spatial or temporal increments, ΔY, exhibit non-Gaussian statistical scaling. Previously we were able to capture some key aspects of such scaling by treating Y or ΔY as standard sub-Gaussian random functions. We were however unable to reconcile two seemingly contradictory observations, namely that whereas sample frequency distributions of Y (or its logarithm) exhibit relatively mild non-Gaussian peaks and tails, those of ΔY display peaks that grow sharper and tails that become heavier with decreasing separation distance or lag. Recently we overcame this difficulty by developing a new generalized sub-Gaussian model which captures both behaviors in a unified and consistent manner, exploring it on synthetically generated random functions in one dimension (Riva et al., 2015). Here we extend our generalized sub-Gaussian model to multiple dimensions, present an algorithm to generate corresponding random realizations of statistically isotropic or anisotropic sub-Gaussian functions and illustrate it in two dimensions. We demonstrate the accuracy of our algorithm by comparing ensemble statistics of Y and ΔY (such as, mean, variance, variogram and probability density function) with those of Monte Carlo generated realizations. We end by exploring the feasibility of estimating all relevant parameters of our model by analyzing jointly spatial moments of Y and ΔY obtained from a single realization of Y.

  13. Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.

    PubMed

    Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R

    2012-08-01

    Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein-level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening as the proportion of missing values increases. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials. ctekwe@stat.tamu.edu.
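
    To illustrate the censoring idea, here is a minimal sketch of a left-censored log-normal (Tobit/AFT-type) likelihood fit for log peak intensities in two groups, written with scipy rather than the R code the authors provide; the detection limit, group means and sample sizes are invented.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(11)

# toy log peak intensities for one protein in two groups, left-censored at a
# detection limit; censored values are recorded as the limit itself
limit = 1.0
log_a = rng.normal(1.4, 0.8, 40)
log_b = rng.normal(2.0, 0.8, 40)

def neg_loglik(params, y, group, limit):
    """Log-normal AFT-style model on the log scale: mean = b0 + b1*group,
    with left-censoring handled via the normal CDF."""
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)
    mu = b0 + b1 * group
    censored = y <= limit
    ll = np.where(censored,
                  stats.norm.logcdf((limit - mu) / sigma),
                  stats.norm.logpdf((y - mu) / sigma) - log_sigma)
    return -ll.sum()

y = np.concatenate([np.maximum(log_a, limit), np.maximum(log_b, limit)])
group = np.concatenate([np.zeros(40), np.ones(40)])
res = optimize.minimize(neg_loglik, x0=[1.0, 0.0, 0.0],
                        args=(y, group, limit), method="Nelder-Mead")
print(res.x)   # b1 estimates the group difference despite the censoring
```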

  14. Trends and fluctuations in the severity of interstate wars

    PubMed Central

    Clauset, Aaron

    2018-01-01

    Since 1945, there have been relatively few large interstate wars, especially compared to the preceding 30 years, which included both World Wars. This pattern, sometimes called the long peace, is highly controversial. Does it represent an enduring trend caused by a genuine change in the underlying conflict-generating processes? Or is it consistent with a highly variable but otherwise stable system of conflict? Using the empirical distributions of interstate war sizes and onset times from 1823 to 2003, we parameterize stationary models of conflict generation that can distinguish trends from statistical fluctuations in the statistics of war. These models indicate that both the long peace and the period of great violence that preceded it are not statistically uncommon patterns in realistic but stationary conflict time series. This fact does not detract from the importance of the long peace or the proposed mechanisms that explain it. However, the models indicate that the postwar pattern of peace would need to endure at least another 100 to 140 years to become a statistically significant trend. This fact places an implicit upper bound on the magnitude of any change in the true likelihood of a large war after the end of the Second World War. The historical patterns of war thus seem to imply that the long peace may be substantially more fragile than proponents believe, despite recent efforts to identify mechanisms that reduce the likelihood of interstate wars. PMID:29507877

  15. Ergodicity of a singly-thermostated harmonic oscillator

    NASA Astrophysics Data System (ADS)

    Hoover, William Graham; Sprott, Julien Clinton; Hoover, Carol Griswold

    2016-03-01

    Although Nosé's thermostated mechanics is formally consistent with Gibbs' canonical ensemble, the thermostated Nosé-Hoover (harmonic) oscillator, with its mean kinetic temperature controlled, is far from ergodic. Much of its phase space is occupied by regular conservative tori. Oscillator ergodicity has previously been achieved by controlling two oscillator moments with two thermostat variables. Here we use computerized searches in conjunction with visualization to find singly-thermostated motion equations for the oscillator which are consistent with Gibbs' canonical distribution. Such models are the simplest able to bridge the gap between Gibbs' statistical ensembles and Newtonian single-particle dynamics.
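
    For orientation, the classic Nosé-Hoover thermostated oscillator that the abstract identifies as non-ergodic can be integrated in a few lines; the paper's singly-thermostated ergodic variants modify these equations, so the sketch below shows only the baseline system, with illustrative temperature and thermostat-mass values.

```python
import numpy as np
from scipy.integrate import solve_ivp

def nose_hoover(t, y, T=1.0, Q=1.0):
    """Standard Nose-Hoover thermostated harmonic oscillator.
    (This is the classic, non-ergodic version discussed in the abstract.)"""
    q, p, zeta = y
    return [p, -q - zeta * p, (p * p - T) / Q]

sol = solve_ivp(nose_hoover, (0.0, 1000.0), [1.0, 0.0, 0.0], max_step=0.01)
q, p = sol.y[0], sol.y[1]

# for a truly ergodic, canonically distributed trajectory <p^2> -> T = 1;
# the regular tori of the standard model show up as deviations from this
print(np.mean(p ** 2))
```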

  16. Estimating preferential flow in karstic aquifers using statistical mixed models.

    PubMed

    Anaya, Angel A; Padilla, Ingrid; Macchiavelli, Raul; Vesper, Dorothy J; Meeker, John D; Alshawabkeh, Akram N

    2014-01-01

    Karst aquifers are highly productive groundwater systems often associated with conduit flow. These systems can be highly vulnerable to contamination, resulting in a high potential for contaminant exposure to humans and ecosystems. This work develops statistical models to spatially characterize flow and transport patterns in karstified limestone and determines the effect of aquifer flow rates on these patterns. A laboratory-scale Geo-HydroBed model is used to simulate flow and transport processes in a karstic limestone unit. The model consists of stainless steel tanks containing a karstified limestone block collected from a karst aquifer formation in northern Puerto Rico. Experimental work involves making a series of flow and tracer injections, while monitoring hydraulic and tracer response spatially and temporally. Statistical mixed models (SMMs) are applied to hydraulic data to determine likely pathways of preferential flow in the limestone units. The models indicate a highly heterogeneous system with dominant, flow-dependent preferential flow regions. Results indicate that regions of preferential flow tend to expand at higher groundwater flow rates, suggesting a greater volume of the system being flushed by flowing water at higher rates. The spatial and temporal distribution of tracer concentrations indicates the presence of conduit-like and diffuse flow transport in the system, supporting the notion that both transport mechanisms act in combination in the limestone unit. The temporal response of tracer concentrations at different locations in the model coincides with, and confirms, the preferential flow distribution generated with the SMMs used in the study. © 2013, National Ground Water Association.

  17. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis.

    PubMed

    Gonzalez, Oscar; MacKinnon, David P

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to an outcome. However, current methods do not allow researchers to study the relationships of general and specific aspects of a construct to an outcome simultaneously. This study proposes a bifactor measurement model for the mediating construct as a way to parse variance and represent the general aspect and specific facets of a construct simultaneously. Monte Carlo simulation results are presented to help determine the properties of mediated effect estimation when the mediator has a bifactor structure and a specific facet of a construct is the true mediator. This study also investigates the conditions under which researchers can detect the mediated effect when the multidimensionality of the mediator is ignored and the mediator is treated as unidimensional. Simulation results indicated that the mediation model with a bifactor mediator measurement model yielded unbiased estimates and adequate power to detect the mediated effect with a sample size greater than 500 and medium a- and b-paths. Also, results indicate that parameter bias and detection of the mediated effect in both the data-generating model and the misspecified model vary as a function of the amount of facet variance represented in the mediation model. This study contributes to the largely unexplored area of measurement issues in statistical mediation analysis.

  18. Free-space optical communication through a forest canopy.

    PubMed

    Edwards, Clinton L; Davis, Christopher C

    2006-01-01

    We model the effects of the leaves of mature broadleaf (deciduous) trees on air-to-ground free-space optical communication systems operating through the leaf canopy. The concept of leaf area index (LAI) is reviewed and related to a probabilistic model of foliage consisting of obscuring leaves randomly distributed throughout a treetop layer. Individual leaves are opaque. The expected fractional unobscured area statistic is derived as well as the variance around the expected value. Monte Carlo simulation results confirm the predictions of this probabilistic model. To verify the predictions of the statistical model experimentally, a passive optical technique has been used to make measurements of observed sky illumination in a mature broadleaf environment. The results of the measurements, as a function of zenith angle, provide strong evidence for the applicability of the model, and a single parameter fit to the data reinforces a natural connection to LAI. Specific simulations of signal-to-noise ratio degradation as a function of zenith angle in a specific ground-to-unmanned aerial vehicle communication situation have demonstrated the effect of obscuration on performance.
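
    A toy Monte Carlo check of a random-foliage obscuration model of this general kind, under the simplifying assumption that leaf interceptions along the slant path are Poisson-distributed with mean LAI/cos(zenith); the resulting unobscured fraction is compared with the corresponding analytic expression. The function name and parameter values are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(8)

def unobscured_fraction(lai, zenith_deg, n_rays=100_000):
    """Monte Carlo estimate of the expected unobscured fraction for rays
    traversing a layer of randomly placed opaque leaves, treating the number
    of leaf interceptions along the slant path as Poisson with mean
    LAI / cos(zenith). The analytic result is then exp(-LAI / cos(zenith))."""
    cos_z = np.cos(np.radians(zenith_deg))
    hits = rng.poisson(lai / cos_z, n_rays)
    return np.mean(hits == 0)

for zen in (0, 30, 60):
    mc = unobscured_fraction(lai=1.5, zenith_deg=zen)
    analytic = np.exp(-1.5 / np.cos(np.radians(zen)))
    print(zen, round(mc, 4), round(analytic, 4))
```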

  19. VARIABLE SELECTION FOR REGRESSION MODELS WITH MISSING DATA

    PubMed Central

    Garcia, Ramon I.; Ibrahim, Joseph G.; Zhu, Hongtu

    2009-01-01

    We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation penalty (SCAD) and adaptive LASSO and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. Particularly, we propose to use a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial is presented to illustrate the proposed methodology. PMID:20336190

  20. SU-E-T-503: IMRT Optimization Using Monte Carlo Dose Engine: The Effect of Statistical Uncertainty.

    PubMed

    Tian, Z; Jia, X; Graves, Y; Uribe-Sanchez, A; Jiang, S

    2012-06-01

    With the development of ultra-fast GPU-based Monte Carlo (MC) dose engines, it becomes clinically realistic to compute the dose-deposition coefficients (DDC) for IMRT optimization using MC simulation. However, it is still time-consuming if we want to compute the DDC with small statistical uncertainty. This work studies the effects of the statistical error in the DDC matrix on IMRT optimization. The MC-computed DDC matrices are simulated here by adding statistical uncertainties at a desired level to the ones generated with a finite-size pencil beam algorithm. A statistical uncertainty model for MC dose calculation is employed. We adopt a penalty-based quadratic optimization model and gradient descent method to optimize the fluence map and then recalculate the corresponding actual dose distribution using the noise-free DDC matrix. The impacts of DDC noise are assessed in terms of the deviation of the resulting dose distributions. We have also used a stochastic perturbation theory to theoretically estimate the statistical errors of dose distributions on a simplified optimization model. A head-and-neck case is used to investigate the perturbation to the IMRT plan due to MC's statistical uncertainty. The relative errors of the final dose distributions of the optimized IMRT are found to be much smaller than those in the DDC matrix, which is consistent with our theoretical estimation. When the history number is decreased from 10^8 to 10^6, the dose-volume-histograms are still very similar to the error-free DVHs while the error in the DDC matrix is about 3.8%. The results illustrate that the statistical errors in the DDC matrix have a relatively small effect on IMRT optimization in the dose domain. This indicates we can use a relatively small number of histories to obtain the DDC matrix with MC simulation within a reasonable amount of time, without considerably compromising the accuracy of the optimized treatment plan. This work is supported by Varian Medical Systems through a Master Research Agreement. © 2012 American Association of Physicists in Medicine.
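
    A minimal sketch of the study's overall recipe under stand-in data: optimize a fluence map against a noisy DDC matrix with a penalty-based quadratic objective and projected gradient descent, then recompute the dose with the noise-free DDC. Matrix sizes, noise level, step size and iteration count are invented.

```python
import numpy as np

rng = np.random.default_rng(9)

n_vox, n_beamlets = 400, 60
D_true = rng.random((n_vox, n_beamlets)) * 0.1          # "noise-free" DDC
d_presc = np.full(n_vox, 2.0)                           # prescribed dose

def optimize_fluence(D, d, iters=5000, lr=5e-3):
    """Projected gradient descent on the quadratic penalty 0.5*||D f - d||^2
    with non-negative fluence f (a minimal stand-in for the paper's
    penalty-based quadratic model)."""
    f = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ f - d)
        f = np.maximum(f - lr * grad, 0.0)
    return f

for rel_noise in (0.0, 0.04):                           # ~4% MC-like noise in the DDC
    D_noisy = D_true * (1 + rel_noise * rng.standard_normal(D_true.shape))
    f = optimize_fluence(D_noisy, d_presc)
    dose = D_true @ f                                   # re-evaluate with noise-free DDC
    print(rel_noise, np.sqrt(np.mean((dose - d_presc) ** 2)))
# the two RMS values stay close, mirroring the paper's finding that DDC noise
# has a relatively small effect in the dose domain
```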

  1. Knowledge Extraction from Atomically Resolved Images.

    PubMed

    Vlcek, Lukas; Maksov, Artem; Pan, Minghu; Vasudevan, Rama K; Kalinin, Sergei V

    2017-10-24

    Tremendous strides in experimental capabilities of scanning transmission electron microscopy and scanning tunneling microscopy (STM) over the past 30 years made atomically resolved imaging routine. However, consistent integration and use of atomically resolved data with generative models is unavailable, so information on local thermodynamics and other microscopic driving forces encoded in the observed atomic configurations remains hidden. Here, we present a framework based on statistical distance minimization to consistently utilize the information available from atomic configurations obtained from an atomically resolved image and extract meaningful physical interaction parameters. We illustrate the applicability of the framework on an STM image of an FeSexTe1-x superconductor, with the segregation of the chalcogen atoms investigated using a nonideal interacting solid solution model. This universal method makes full use of the microscopic degrees of freedom sampled in an atomically resolved image and can be extended via Bayesian inference toward unbiased model selection with uncertainty quantification.

  2. Retrograde spins of near-Earth asteroids from the Yarkovsky effect.

    PubMed

    La Spina, A; Paolicchi, P; Kryszczyńska, A; Pravec, P

    2004-03-25

    Dynamical resonances in the asteroid belt are the gateway for the production of near-Earth asteroids (NEAs). Generating the observed number of NEAs, however, requires the injection of many asteroids into those resonant regions. Collisional processes have long been claimed as a possible source, but difficulties with that idea have led to the suggestion that orbital drift arising from the Yarkovsky effect dominates the injection process. (The Yarkovsky effect is a force arising from differential heating: the 'afternoon' side of an asteroid is warmer than the 'morning' side.) The two models predict different rotational properties of NEAs: the usual collisional theories are consistent with a nearly isotropic distribution of rotation vectors, whereas the 'Yarkovsky model' predicts an excess of retrograde rotations. Here we report that the spin vectors of NEAs show a strong and statistically significant excess of retrograde rotations, quantitatively consistent with the theoretical expectations of the Yarkovsky model.

  3. Interpretation of scrape-off layer profile evolution and first-wall ion flux statistics on JET using a stochastic framework based on filamentary motion

    NASA Astrophysics Data System (ADS)

    Walkden, N. R.; Wynn, A.; Militello, F.; Lipschultz, B.; Matthews, G.; Guillemaut, C.; Harrison, J.; Moulton, D.; Contributors, JET

    2017-08-01

    This paper presents the use of a novel modelling technique based around intermittent transport due to filament motion, to interpret experimental profile and fluctuation data in the scrape-off layer (SOL) of JET during the onset and evolution of a density profile shoulder. A baseline case is established, prior to shoulder formation, and the stochastic model is shown to be capable of simultaneously matching the time averaged profile measurement as well as the PDF shape and autocorrelation function from the ion-saturation current time series at the outer wall. Aspects of the stochastic model are then varied with the aim of producing a profile shoulder with statistical measurements consistent with experiment. This is achieved through a strong localised reduction in the density sink acting on the filaments within the model. The required reduction of the density sink occurs over a highly localised region with the timescale of the density sink increased by a factor of 25. This alone is found to be insufficient to model the expansion and flattening of the shoulder region as the density increases, which requires additional changes within the stochastic model. An example is found which includes both a reduction in the density sink and filament acceleration and provides a consistent match to the experimental data as the shoulder expands, though the uniqueness of this solution can not be guaranteed. Within the context of the stochastic model, this implies that the localised reduction in the density sink can trigger shoulder formation, but additional physics is required to explain the subsequent evolution of the profile.

  4. Transfer Entropy as a Log-Likelihood Ratio

    NASA Astrophysics Data System (ADS)

    Barnett, Lionel; Bossomaier, Terry

    2012-09-01

    Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology, and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic χ2 distribution is established for the transfer entropy estimator. The result generalizes the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.

  5. Transfer entropy as a log-likelihood ratio.

    PubMed

    Barnett, Lionel; Bossomaier, Terry

    2012-09-28

    Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology, and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic χ2 distribution is established for the transfer entropy estimator. The result generalizes the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.
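
    A minimal numerical illustration of the Gaussian special case noted in the two records above: for jointly Gaussian autoregressive processes, transfer entropy reduces to half the log ratio of restricted to full residual variances, i.e. a scaled log-likelihood ratio (the Granger form). The coupled series and coefficients below are invented.

```python
import numpy as np

rng = np.random.default_rng(10)

# toy coupled AR(1) processes: x drives y
n = 5000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

def ols_resid_var(X, z):
    """Residual variance of an ordinary least-squares regression of z on X."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    r = z - X @ beta
    return np.mean(r ** 2)

target, own_past, src_past = y[1:], y[:-1], x[:-1]
ones = np.ones_like(own_past)
var_restricted = ols_resid_var(np.column_stack([ones, own_past]), target)
var_full = ols_resid_var(np.column_stack([ones, own_past, src_past]), target)

# Gaussian case: transfer entropy x -> y equals half the log variance ratio,
# which is (up to scaling) the log-likelihood ratio of full vs. restricted model
te_xy = 0.5 * np.log(var_restricted / var_full)
print(te_xy)   # > 0, reflecting the x -> y coupling
```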

  6. Fission fragment angular distributions in the reactions {sup 16}O+{sup 188}Os and {sup 28}Si+{sup 176}Yb

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tripathi, R.; Sudarshan, K.; Sharma, S. K.

    2009-06-15

    Fission fragment angular distributions have been measured in the reactions {sup 16}O+{sup 188}Os and {sup 28}Si+{sup 176}Yb to investigate the contribution from noncompound nucleus fission. Parameters for statistical model calculations were fixed using fission cross section data in the {sup 16}O+{sup 188}Os reaction. Experimental anisotropies were in reasonable agreement with those calculated using the statistical saddle point model for both reactions. The present results are also consistent with those of mass distribution studies in the fission of {sup 202}Po, formed in the reactions with varying entrance channel mass asymmetry. However, the present studies do not show a large fusion hindrance as reported in the pre-actinide region based on the measurement of evaporation residue cross sections.

  7. Statistical analysis of experimental multifragmentation events in 64Zn+112Sn at 40 MeV/nucleon

    NASA Astrophysics Data System (ADS)

    Lin, W.; Zheng, H.; Ren, P.; Liu, X.; Huang, M.; Wada, R.; Chen, Z.; Wang, J.; Xiao, G. Q.; Qu, G.

    2018-04-01

    A statistical multifragmentation model (SMM) is applied to experimentally observed multifragmentation events in an intermediate-energy heavy-ion reaction. Using the temperature and symmetry energy extracted with the isobaric yield ratio (IYR) method based on the modified Fisher model (MFM), SMM is applied to the reaction 64Zn+112Sn at 40 MeV/nucleon. The experimental isotope and mass distributions of the primary reconstructed fragments are compared with the SMM results without an afterburner and are well reproduced. The temperature T and symmetry energy coefficient asym extracted from the SMM-simulated events using the IYR method are also consistent with those from the experiment. These results strongly suggest that in the multifragmentation process there is a freeze-out volume in which thermal and chemical equilibrium is established before or at the time of intermediate-mass fragment emission.

  8. Personality assessment and model comparison with behavioral data: A statistical framework and empirical demonstration with bonobos (Pan paniscus).

    PubMed

    Martin, Jordan S; Suarez, Scott A

    2017-08-01

    Interest in quantifying consistent among-individual variation in primate behavior, also known as personality, has grown rapidly in recent decades. Although behavioral coding is the most frequently utilized method for assessing primate personality, limitations in current statistical practice prevent researchers from utilizing the full potential of their coding datasets. These limitations include the use of extensive data aggregation, not modeling biologically relevant sources of individual variance during repeatability estimation, not partitioning between-individual (co)variance prior to modeling personality structure, the misuse of principal component analysis, and an over-reliance upon exploratory statistical techniques to compare personality models across populations, species, and data collection methods. In this paper, we propose a statistical framework for primate personality research designed to address these limitations. Our framework synthesizes recently developed mixed-effects modeling approaches for quantifying behavioral variation with an information-theoretic model selection paradigm for confirmatory personality research. After detailing a multi-step analytic procedure for personality assessment and model comparison, we employ this framework to evaluate seven models of personality structure in zoo-housed bonobos (Pan paniscus). We find that differences between sexes, ages, zoos, time of observation, and social group composition contributed to significant behavioral variance. Independently of these factors, however, personality nonetheless accounted for a moderate to high proportion of variance in average behavior across observational periods. A personality structure derived from past rating research receives the strongest support relative to our model set. This model suggests that personality variation across the measured behavioral traits is best described by two correlated but distinct dimensions reflecting individual differences in affiliation and sociability (Agreeableness) as well as activity level, social play, and neophilia toward non-threatening stimuli (Openness). These results underscore the utility of our framework for quantifying personality in primates and facilitating greater integration between the behavioral ecological and comparative psychological approaches to personality research. © 2017 Wiley Periodicals, Inc.

  9. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    NASA Technical Reports Server (NTRS)

    Parsons, Vickie S.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  10. Continuous distribution of emission states from single CdSe/ZnS quantum dots.

    PubMed

    Zhang, Kai; Chang, Hauyee; Fu, Aihua; Alivisatos, A Paul; Yang, Haw

    2006-04-01

    The photoluminescence dynamics of colloidal CdSe/ZnS/streptavidin quantum dots were studied using time-resolved single-molecule spectroscopy. Statistical tests of the photon-counting data suggested that the simple "on/off" discrete state model is inconsistent with experimental results. Instead, a continuous emission state distribution model was found to be more appropriate. Autocorrelation analysis of lifetime and intensity fluctuations showed a nonlinear correlation between them. These results were consistent with the model that charged quantum dots were also emissive, and that time-dependent charge migration gave rise to the observed photoluminescence dynamics.

  11. Quantifying discrimination of Framingham risk functions with different survival C statistics.

    PubMed

    Pencina, Michael J; D'Agostino, Ralph B; Song, Linye

    2012-07-10

    Cardiovascular risk prediction functions offer an important diagnostic tool for clinicians and patients themselves. They are usually constructed with the use of parametric or semi-parametric survival regression models. It is essential to be able to evaluate the performance of these models, preferably with summaries that offer natural and intuitive interpretations. The concept of discrimination, popular in the logistic regression context, has been extended to survival analysis. However, the extension is not unique. In this paper, we define discrimination in survival analysis as the model's ability to separate those with longer event-free survival from those with shorter event-free survival within some time horizon of interest. This definition remains consistent with that used in logistic regression, in the sense that it assesses how well the model-based predictions match the observed data. Practical and conceptual examples and numerical simulations are employed to examine four C statistics proposed in the literature to evaluate the performance of survival models. We observe that they differ in the numerical values and aspects of discrimination that they capture. We conclude that the index proposed by Harrell is the most appropriate to capture discrimination described by the above definition. We suggest researchers report which C statistic they are using, provide a rationale for their selection, and be aware that comparing different indices across studies may not be meaningful. Copyright © 2012 John Wiley & Sons, Ltd.
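
    For concreteness, a minimal Python sketch of Harrell's concordance index on toy right-censored data (the variable names and data are illustrative assumptions, not taken from the paper):

      import numpy as np

      def harrell_c(time, event, risk_score):
          """Fraction of usable pairs in which the higher-risk subject fails first."""
          n = len(time)
          concordant, tied, usable = 0.0, 0.0, 0
          for i in range(n):
              for j in range(n):
                  # a pair is usable only if subject i is observed to fail before j's time
                  if event[i] == 1 and time[i] < time[j]:
                      usable += 1
                      if risk_score[i] > risk_score[j]:
                          concordant += 1
                      elif risk_score[i] == risk_score[j]:
                          tied += 0.5
          return (concordant + tied) / usable

      # toy example: a higher score should mean shorter event-free survival
      time = np.array([5.0, 8.0, 3.0, 12.0, 7.0])
      event = np.array([1, 0, 1, 1, 0])            # 1 = event observed, 0 = censored
      score = np.array([0.9, 0.3, 1.2, 0.1, 0.5])
      print("Harrell C =", round(harrell_c(time, event, score), 3))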

  12. Breaking the Vainshtein screening in clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Salzano, Vincenzo; Mota, David F.; Capozziello, Salvatore; Donahue, Megan

    2017-02-01

    In this work we will test an alternative model of gravity belonging to the large family of Galileon models. It is characterized by an intrinsic breaking of the Vainshtein mechanism inside large astrophysical objects, thus having possibly detectable observational signatures. We will compare theoretical predictions from this model with the observed total mass profiles for a sample of clusters of galaxies. The profiles are derived using two complementary tools: x-ray hot intracluster gas dynamics, and strong and weak gravitational lensing. We find that a dependence on the internal dynamical state of each cluster is possible; for those clusters which are very close to being relaxed, and thus less perturbed by possible local astrophysical processes, the Galileon model gives quite a good fit to both x-ray and lensing observations. Both masses and concentrations for the dark matter halos are consistent with earlier results found in numerical simulations and in the literature, and no compelling statistical evidence for a deviation from general relativity is detectable from the present observational state. Indeed, the characteristic Galileon parameter ϒ is always consistent with zero, and only an upper limit (≲0.086 at 1σ, ≲0.16 at 2σ, and ≲0.23 at 3σ) can be established. Some interesting distinctive deviations might be operative, but the statistical validity of the results is far from strong, and better data would be needed in order to either confirm or reject a potential tension with general relativity.

  13. QSPR using MOLGEN-QSPR: the challenge of fluoroalkane boiling points.

    PubMed

    Rücker, Christoph; Meringer, Markus; Kerber, Adalbert

    2005-01-01

    By means of the new software MOLGEN-QSPR, a multilinear regression model for the boiling points of lower fluoroalkanes is established. The model is based exclusively on simple descriptors derived directly from molecular structure and nevertheless describes a broader set of data more precisely than previous attempts that used either more demanding (quantum chemical) descriptors or more demanding (nonlinear) statistical methods such as neural networks. The model's internal consistency was confirmed by leave-one-out cross-validation. The model was used to predict all unknown boiling points of fluorobutanes, and the quality of predictions was estimated by means of comparison with boiling point predictions for fluoropentanes.
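
    The internal-consistency check mentioned above, leave-one-out cross-validation of a multilinear regression, can be sketched as follows in Python; the descriptors and "boiling points" here are synthetic stand-ins, not the MOLGEN-QSPR descriptors or data.

      import numpy as np

      rng = np.random.default_rng(1)
      X = rng.normal(size=(40, 4))                 # 40 molecules, 4 structural descriptors (synthetic)
      beta_true = np.array([3.0, -1.5, 0.8, 2.2])
      y = X @ beta_true + 50 + rng.normal(scale=1.0, size=40)   # synthetic "boiling points"

      def loo_q2(X, y):
          """Leave-one-out cross-validated Q^2 for a multilinear regression with intercept."""
          n = len(y)
          Xd = np.column_stack([np.ones(n), X])
          press = 0.0
          for i in range(n):
              mask = np.arange(n) != i
              coef, *_ = np.linalg.lstsq(Xd[mask], y[mask], rcond=None)
              press += (y[i] - Xd[i] @ coef) ** 2
          return 1.0 - press / np.sum((y - y.mean()) ** 2)

      print("LOO Q^2 =", round(loo_q2(X, y), 3))   # close to 1 indicates internal consistency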

  14. A model for characterizing residential ground current and magnetic field fluctuations.

    PubMed

    Mader, D L; Peralta, S B; Sherar, M D

    1994-01-01

    The current through the residential grounding circuit is an important source for magnetic fields; field variations near the grounding circuit accurately track fluctuations in this ground current. In this paper, a model is presented which permits calculation of the range of these fluctuations. A discrete network model is used to simulate a local distribution system for a single street, and a statistical model to simulate unbalanced currents in the system. Simulations of three-house and ten-house networks show that random appliance operation leads to ground current fluctuations which can be quite large, on the order of 600%. This is consistent with measured fluctuations in an actual house.
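
    A toy Monte Carlo sketch in Python of the mechanism described above (random appliance switching producing a fluctuating unbalanced ground current); all parameters are illustrative assumptions, not the paper's discrete network model.

      import numpy as np

      rng = np.random.default_rng(2)
      n_houses, n_appliances, n_samples = 3, 10, 10000
      appliance_current = rng.uniform(0.5, 8.0, size=(n_houses, n_appliances))   # amps per appliance (assumed)

      on = rng.random((n_samples, n_houses, n_appliances)) < 0.3   # random on/off states
      house_current = (on * appliance_current).sum(axis=2)         # per-house load current
      ground_current = 0.2 * house_current.sum(axis=1)             # assumed fraction returning via ground

      mean = ground_current.mean()
      print(f"mean ground current ~ {mean:.1f} A")
      print(f"fluctuation range ~ {100 * (ground_current.max() - ground_current.min()) / mean:.0f}% of mean")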

  15. Can climate variability information constrain a hydrological model for an ungauged Costa Rican catchment?

    NASA Astrophysics Data System (ADS)

    Quesada-Montano, Beatriz; Westerberg, Ida K.; Fuentes-Andino, Diana; Hidalgo-Leon, Hugo; Halldin, Sven

    2017-04-01

    Long-term hydrological data are key to understanding catchment behaviour and for decision making within water management and planning. Given the lack of observed data in many regions worldwide, hydrological models are an alternative for reproducing historical streamflow series. Additional types of information - to locally observed discharge - can be used to constrain model parameter uncertainty for ungauged catchments. Climate variability exerts a strong influence on streamflow variability on long and short time scales, in particular in the Central-American region. We therefore explored the use of climate variability knowledge to constrain the simulated discharge uncertainty of a conceptual hydrological model applied to a Costa Rican catchment, assumed to be ungauged. To reduce model uncertainty we first rejected parameter relationships that disagreed with our understanding of the system. We then assessed how well climate-based constraints applied at long-term, inter-annual and intra-annual time scales could constrain model uncertainty. Finally, we compared the climate-based constraints to a constraint on low-flow statistics based on information obtained from global maps. We evaluated our method in terms of the ability of the model to reproduce the observed hydrograph and the active catchment processes in terms of two efficiency measures, a statistical consistency measure, a spread measure and 17 hydrological signatures. We found that climate variability knowledge was useful for reducing model uncertainty, in particular, unrealistic representation of deep groundwater processes. The constraints based on global maps of low-flow statistics provided more constraining information than those based on climate variability, but the latter rejected slow rainfall-runoff representations that the low flow statistics did not reject. The use of such knowledge, together with information on low-flow statistics and constraints on parameter relationships showed to be useful to constrain model uncertainty for an - assumed to be - ungauged basin. This shows that our method is promising for reconstructing long-term flow data for ungauged catchments on the Pacific side of Central America, and that similar methods can be developed for ungauged basins in other regions where climate variability exerts a strong control on streamflow variability.

  16. The self-consistency model of subjective confidence.

    PubMed

    Koriat, Asher

    2012-01-01

    How do people monitor the correctness of their answers? A self-consistency model is proposed for the process underlying confidence judgments and their accuracy. In answering a 2-alternative question, participants are assumed to retrieve a sample of representations of the question and base their confidence on the consistency with which the chosen answer is supported across representations. Confidence is modeled by analogy to the calculation of statistical level of confidence (SLC) in testing hypotheses about a population and represents the participant's assessment of the likelihood that a new sample will yield the same choice. Assuming that participants draw representations from a commonly shared item-specific population of representations, predictions were derived regarding the function relating confidence to inter-participant consensus and intra-participant consistency for the more preferred (majority) and the less preferred (minority) choices. The predicted pattern was confirmed for several different tasks. The confidence-accuracy relationship was shown to be a by-product of the consistency-correctness relationship: it is positive because the answers that are consistently chosen are generally correct, but negative when the wrong answers tend to be favored. The overconfidence bias stems from the reliability-validity discrepancy: confidence monitors reliability (or self-consistency), but its accuracy is evaluated in calibration studies against correctness. Simulation and empirical results suggest that response speed is a frugal cue for self-consistency, and its validity depends on the validity of self-consistency in predicting performance. Another mnemonic cue, accessibility (the overall amount of information that comes to mind), makes an added, independent contribution. Self-consistency and accessibility may correspond to the 2 parameters that affect SLC: sample variance and sample size.
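
    A minimal simulation sketch of the self-consistency idea in Python (assumed parameters, not the paper's experiments): each choice is based on a small sample of internal representations, and confidence tracks the consistency of that sample rather than correctness directly.

      import numpy as np

      rng = np.random.default_rng(3)
      p_correct_rep = 0.65      # probability a single representation favors the correct answer (assumed)
      n_reps, n_trials = 7, 20000

      reps = rng.random((n_trials, n_reps)) < p_correct_rep    # True = supports the correct answer
      votes = reps.sum(axis=1)
      choice_correct = votes > n_reps / 2                      # majority choice
      consistency = np.maximum(votes, n_reps - votes) / n_reps # support for the chosen answer

      print("accuracy:", choice_correct.mean().round(3))
      print("mean self-consistency (confidence) when correct:",
            consistency[choice_correct].mean().round(3))
      print("mean self-consistency (confidence) when wrong:",
            consistency[~choice_correct].mean().round(3))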

  17. Dynamical-statistical seasonal prediction for western North Pacific typhoons based on APCC multi-models

    NASA Astrophysics Data System (ADS)

    Kim, Ok-Yeon; Kim, Hye-Mi; Lee, Myong-In; Min, Young-Mi

    2017-01-01

    This study aims at predicting the seasonal number of typhoons (TY) over the western North Pacific with an Asia-Pacific Climate Center (APCC) multi-model ensemble (MME)-based dynamical-statistical hybrid model. The hybrid model uses the statistical relationship between the number of TY during the typhoon season (July-October) and the large-scale key predictors forecasted by the APCC MME for the same season. The cross-validation result from the MME hybrid model demonstrates high prediction skill, with a correlation of 0.67 between the hindcasts and observations for 1982-2008. The cross-validation from the hybrid model with the individual models participating in the MME indicates that there is no single model which consistently outperforms the other models in predicting typhoon number. Although the forecast skill of the MME is not always the highest compared to that of each individual model, the MME presents rather higher averaged correlations and a smaller variance of correlations. Given a large set of ensemble members from the multi-models, a relative operating characteristic score reveals an 82% (above-normal) and 78% (below-normal) improvement for the probabilistic prediction of the number of TY. This implies that there is an 82% (78%) probability that the forecasts can successfully discriminate above-normal (below-normal) years from other years. The hybrid model's forecasts for the past 7 years (2002-2008) are more skillful than the forecasts from the Tropical Storm Risk consortium. Using a large set of ensemble members from the multi-models, the APCC MME could provide useful deterministic and probabilistic seasonal typhoon forecasts to end-users, in particular the residents of tropical cyclone-prone areas in the Asia-Pacific region.
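
    The cross-validation step can be illustrated with a short Python sketch (synthetic predictors and typhoon counts, not the APCC hindcasts): each year is predicted by a regression trained on all other years, and skill is the correlation between hindcasts and observations.

      import numpy as np

      rng = np.random.default_rng(4)
      years = 27                                   # e.g. 1982-2008
      predictors = rng.normal(size=(years, 2))     # two large-scale indices (synthetic)
      obs = 26 + 4 * predictors[:, 0] - 3 * predictors[:, 1] + rng.normal(scale=3, size=years)

      hindcast = np.empty(years)
      X = np.column_stack([np.ones(years), predictors])
      for i in range(years):                       # leave-one-year-out cross-validation
          mask = np.arange(years) != i
          coef, *_ = np.linalg.lstsq(X[mask], obs[mask], rcond=None)
          hindcast[i] = X[i] @ coef

      r = np.corrcoef(hindcast, obs)[0, 1]
      print("cross-validated correlation:", round(r, 2))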

  18. Development and evaluation of the Internalized Racism in Asian Americans Scale (IRAAS).

    PubMed

    Choi, Andrew Young; Israel, Tania; Maeda, Hotaka

    2017-01-01

    This article presents the development and psychometric evaluation of the Internalized Racism in Asian Americans Scale (IRAAS), which was designed to measure the degree to which Asian Americans internalized hostile attitudes and negative messages targeted toward their racial identity. Items were developed on the basis of prior literature, vetted through expert feedback and cognitive interviews, and administered to 655 Asian American participants through Amazon Mechanical Turk. Exploratory factor analysis with a random subsample (n = 324) yielded a psychometrically robust preliminary measurement model consisting of 3 factors: Self-Negativity, Weakness Stereotypes, and Appearance Bias. Confirmatory factor analysis with a separate subsample (n = 331) indicated that the proposed correlated factors model was strongly consistent with the observed data. Factor determinacies were high and demonstrated that the specified items adequately measured their intended factors. Bifactor modeling further indicated that this multidimensionality could be univocally represented for the purpose of measurement, including the use of a mean total score representing a single continuum of internalized racism on which individuals vary. The IRAAS statistically predicted depressive symptoms, and demonstrated statistically significant correlations in theoretically expected directions with four dimensions of collective self-esteem. These results provide initial validity evidence supporting the use of the IRAAS to measure aspects of internalized racism in this population. Limitations and research implications are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  19. 3Drefine: an interactive web server for efficient protein structure refinement.

    PubMed

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-07-08

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet, which is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Bayesian models for cost-effectiveness analysis in the presence of structural zero costs

    PubMed Central

    Baio, Gianluca

    2014-01-01

    Bayesian modelling for cost-effectiveness data has received much attention in both the health economics and the statistical literature, in recent years. Cost-effectiveness data are characterised by a relatively complex structure of relationships linking a suitable measure of clinical benefit (e.g. quality-adjusted life years) and the associated costs. Simplifying assumptions, such as (bivariate) normality of the underlying distributions, are usually not granted, particularly for the cost variable, which is characterised by markedly skewed distributions. In addition, individual-level data sets are often characterised by the presence of structural zeros in the cost variable. Hurdle models can be used to account for the presence of excess zeros in a distribution and have been applied in the context of cost data. We extend their application to cost-effectiveness data, defining a full Bayesian specification, which consists of a model for the individual probability of null costs, a marginal model for the costs and a conditional model for the measure of effectiveness (given the observed costs). We presented the model using a working example to describe its main features. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:24343868
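
    A simplified sketch of the hurdle structure described above, written in Python with illustrative distributional choices (Bernoulli for null costs, Gamma for positive costs, Normal for effectiveness given cost); the paper's full Bayesian specification with priors and posterior sampling is not reproduced here.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      n = 500
      p_zero = 0.25                                             # probability of a structural zero cost (assumed)
      zero = rng.random(n) < p_zero
      cost = np.where(zero, 0.0, rng.gamma(shape=2.0, scale=400.0, size=n))
      eff = rng.normal(loc=0.7 - 0.0001 * cost, scale=0.1)      # effectiveness given observed cost

      def hurdle_loglik(params, cost, eff):
          """Log-likelihood of the three-part hurdle specification (illustrative)."""
          p0, shape, scale, b0, b1, sigma = params
          pos = cost > 0
          ll = np.log(p0) * np.sum(~pos)                                        # null-cost component
          ll += np.sum(np.log1p(-p0) + stats.gamma.logpdf(cost[pos], shape, scale=scale))
          ll += np.sum(stats.norm.logpdf(eff, loc=b0 + b1 * cost, scale=sigma)) # conditional effectiveness
          return ll

      print(hurdle_loglik([0.25, 2.0, 400.0, 0.7, -0.0001, 0.1], cost, eff))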

  1. Bayesian models for cost-effectiveness analysis in the presence of structural zero costs.

    PubMed

    Baio, Gianluca

    2014-05-20

    Bayesian modelling for cost-effectiveness data has received much attention in both the health economics and the statistical literature, in recent years. Cost-effectiveness data are characterised by a relatively complex structure of relationships linking a suitable measure of clinical benefit (e.g. quality-adjusted life years) and the associated costs. Simplifying assumptions, such as (bivariate) normality of the underlying distributions, are usually not granted, particularly for the cost variable, which is characterised by markedly skewed distributions. In addition, individual-level data sets are often characterised by the presence of structural zeros in the cost variable. Hurdle models can be used to account for the presence of excess zeros in a distribution and have been applied in the context of cost data. We extend their application to cost-effectiveness data, defining a full Bayesian specification, which consists of a model for the individual probability of null costs, a marginal model for the costs and a conditional model for the measure of effectiveness (given the observed costs). We presented the model using a working example to describe its main features. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.

  2. Wave optics simulation of statistically rough surface scatter

    NASA Astrophysics Data System (ADS)

    Lanari, Ann M.; Butler, Samuel D.; Marciniak, Michael; Spencer, Mark F.

    2017-09-01

    The bidirectional reflectance distribution function (BRDF) describes optical scatter from surfaces by relating the incident irradiance to the exiting radiance over the entire hemisphere. Laboratory verification of BRDF models and experimentally populated BRDF databases are hampered by the sparsity of monochromatic sources and by the limited ability to statistically control surface features. Numerical methods are able to control surface features and have wavelength agility, and, via Fourier methods of wave propagation, may be used to fill the knowledge gap. Monte-Carlo techniques, adapted from turbulence simulations, generate Gaussian distributed and correlated surfaces with an area of 1 cm2, RMS surface height of 2.5 μm, and correlation length of 100 μm. The surface is centered inside a Kirchhoff absorbing boundary with an area of 16 cm2 to prevent wrap-around aliasing in the far field. These surfaces are uniformly illuminated at normal incidence with a unit-amplitude plane wave varying in wavelength from 3 μm to 5 μm. The resultant scatter is propagated to a detector in the far field utilizing multi-step Fresnel convolution and observed at angles from -2 μrad to 2 μrad. The far-field scatter is compared to both a physical wave optics BRDF model (Modified Beckmann Kirchhoff) and two microfacet BRDF models (Priest, and Cook-Torrance). Modified Beckmann Kirchhoff, which accounts for diffraction, is consistent with simulated scatter at multiple wavelengths for RMS surface heights greater than λ/2. The microfacet models, which assume geometric optics, are less consistent across wavelengths. Both model types overpredict the far-field scatter width for RMS surface heights less than λ/2.
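
    One common way to realise such Gaussian-correlated surfaces is spectral (FFT) filtering of white noise; the short Python sketch below uses the surface parameters quoted above but is only an illustrative construction, not the simulation code of the study.

      import numpy as np

      rng = np.random.default_rng(6)
      N, L = 1024, 1.0e-2                  # samples per side, 1 cm physical extent
      sigma_h, l_c = 2.5e-6, 100e-6        # RMS height 2.5 um, correlation length 100 um
      dx = L / N

      noise = rng.normal(size=(N, N))
      fx = np.fft.fftfreq(N, d=dx)
      FX, FY = np.meshgrid(fx, fx, indexing="ij")
      # a Gaussian correlation function corresponds to a Gaussian power spectral density
      psd = np.exp(-(np.pi * l_c) ** 2 * (FX ** 2 + FY ** 2))
      surface = np.fft.ifft2(np.fft.fft2(noise) * np.sqrt(psd)).real
      surface *= sigma_h / surface.std()   # rescale to the target RMS height (absorbs normalisation)

      print("RMS height (m):", surface.std())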

  3. Non-Immunogenic Structurally and Biologically Intact Tissue Matrix Grafts for the Immediate Repair of Ballistic-Induced Vascular and Nerve Tissue Injury in Combat Casualty Care

    DTIC Science & Technology

    2005-07-01

    Suitability as an access graft is addressed using statistical methods. Graft consistency can be defined statistically as the variance associated with the sample of grafts tested. Concentration was measured using a refractometer (Brix % method); the equilibration data (Graph 1 of the report) suggest an equilibration scheme of 40% v/v.

  4. Statistical assessment of bi-exponential diffusion weighted imaging signal characteristics induced by intravoxel incoherent motion in malignant breast tumors

    PubMed Central

    Wong, Oi Lei; Lo, Gladys G.; Chan, Helen H. L.; Wong, Ting Ting; Cheung, Polly S. Y.

    2016-01-01

    Background: The purpose of this study is to statistically assess whether the bi-exponential intravoxel incoherent motion (IVIM) model better characterizes the diffusion weighted imaging (DWI) signal of malignant breast tumors than the mono-exponential Gaussian diffusion model. Methods: 3 T DWI data of 29 malignant breast tumors were retrospectively included. Linear least-square mono-exponential fitting and segmented least-square bi-exponential fitting were used for apparent diffusion coefficient (ADC) and IVIM parameter quantification, respectively. The F-test and the Akaike Information Criterion (AIC) were used to statistically assess the preference for the mono-exponential or bi-exponential model using region-of-interest (ROI)-averaged and voxel-wise analyses. Results: For the ROI-averaged analysis, 15 tumors were significantly better fitted by the bi-exponential function and 14 tumors exhibited mono-exponential behavior. The calculated ADC, D (true diffusion coefficient) and f (pseudo-diffusion fraction) showed no significant differences between mono-exponential and bi-exponential preferable tumors. Voxel-wise analysis revealed that 27 tumors contained more voxels exhibiting mono-exponential DWI decay while only 2 tumors presented more bi-exponential decay voxels. ADC was consistently and significantly larger than D for both ROI-averaged and voxel-wise analyses. Conclusions: Although the presence of an IVIM effect in malignant breast tumors could be suggested, statistical assessment shows that bi-exponential fitting does not necessarily better represent the DWI signal decay in breast cancer under a clinically typical acquisition protocol and signal-to-noise ratio (SNR). Our study indicates the importance of statistically examining the breast cancer DWI signal characteristics in practice. PMID:27709078
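
    An illustrative Python sketch of the model comparison described above (synthetic signal and assumed b-values; a direct nonlinear fit is used here rather than the segmented fitting of the study):

      import numpy as np
      from scipy.optimize import curve_fit

      b = np.array([0, 50, 100, 200, 400, 600, 800, 1000.0])   # s/mm^2 (assumed)
      f_true, D_true, Dstar_true = 0.10, 1.0e-3, 1.0e-2
      rng = np.random.default_rng(7)
      S = (1 - f_true) * np.exp(-b * D_true) + f_true * np.exp(-b * Dstar_true)
      S *= 1 + rng.normal(scale=0.01, size=b.size)              # mild noise

      def mono(b, adc):
          return np.exp(-b * adc)

      def ivim(b, f, D, Dstar):
          return (1 - f) * np.exp(-b * D) + f * np.exp(-b * Dstar)

      p_mono, _ = curve_fit(mono, b, S, p0=[1e-3])
      p_ivim, _ = curve_fit(ivim, b, S, p0=[0.1, 1e-3, 1e-2],
                            bounds=([0, 0, 0], [1, 0.01, 0.1]))

      def aic(y, yhat, k):
          rss = np.sum((y - yhat) ** 2)
          return y.size * np.log(rss / y.size) + 2 * k

      print("AIC mono-exponential:", aic(S, mono(b, *p_mono), 1))
      print("AIC bi-exponential  :", aic(S, ivim(b, *p_ivim), 3))   # lower is preferred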

  5. Statistical wave climate projections for coastal impact assessments

    NASA Astrophysics Data System (ADS)

    Camus, P.; Losada, I. J.; Izaguirre, C.; Espejo, A.; Menéndez, M.; Pérez, J.

    2017-09-01

    Global multimodel wave climate projections are obtained at 1.0° × 1.0° scale from 30 Coupled Model Intercomparison Project Phase 5 (CMIP5) global circulation model (GCM) realizations. A semi-supervised weather-typing approach based on a characterization of the ocean wave generation areas and the historical wave information from the recent GOW2 database are used to train the statistical model. This framework is also applied to obtain high-resolution projections of coastal wave climate and of coastal impacts such as port operability and coastal flooding. Regional projections are estimated using the collection of weather types at a spacing of 1.0°. This assumption is feasible because the predictor is defined based on the wave generation area and the classification is guided by the local wave climate. The assessment of future changes in coastal impacts is based on direct downscaling of indicators defined by empirical formulations (total water level for coastal flooding and number of hours per year with overtopping for port operability). Global multimodel projections of the significant wave height and peak period are consistent with changes obtained in previous studies. Statistical confidence in the expected changes is obtained thanks to the large number of GCMs used to construct the ensemble. The proposed methodology proves flexible for projecting wave climate at different spatial scales. Regional changes in additional variables, such as wave direction, or in other statistics can be estimated from the future empirical distribution, with extreme values restricted to high percentiles (i.e., 95th, 99th percentiles). The statistical framework can also be applied to evaluate regional coastal impacts, integrating changes in storminess and sea level rise.

  6. Estimation of seismic quality factor: Artificial neural networks and current approaches

    NASA Astrophysics Data System (ADS)

    Yıldırım, Eray; Saatçılar, Ruhi; Ergintav, Semih

    2017-01-01

    The aims of this study are to estimate soil attenuation using alternatives to traditional methods, to compare the results of these methods, and to examine soil properties using the estimated results. The performance of all methods, namely amplitude decay, spectral ratio, Wiener filter, and artificial neural network (ANN), is examined on field and synthetic data with and without noise. High-resolution seismic reflection field data from Yeniköy (Arnavutköy, İstanbul) were used as field data, and 424 estimations of Q values were made for each method (1,696 total). Statistical tests on synthetic and field data show that the Q-value estimates from the ANN, Wiener filter, and spectral ratio methods are quite close to one another, whereas the amplitude decay method shows a higher estimation error. According to previous geological and geophysical studies in this area, the soil is water-saturated, quite weak, and consists of clay and sandy units; because of current and past landslides in the study area and its vicinity, researchers have reported heterogeneity in the soil. Under the same physical conditions, the Q value calculated from the field data can be expected to lie between 7.9 and 13.6. ANN models with various structures, training algorithms, inputs, and numbers of neurons are investigated. A total of 480 ANN models were generated, consisting of 60 models for noise-free synthetic data, 360 models for synthetic data with different noise contents, and 60 models applied to the data collected in the field. The models were tested to determine the most appropriate structure and training algorithm. In the final ANN, the input vectors consisted of the differences in width, energy, and distance of the seismic traces, and the output was the Q value. The success rates of the ANN method on both noise-free and noisy synthetic data were higher than those of the other three methods, and according to the statistical tests on Q values estimated from field data, the ANN method also gave the most suitable results. The Q value can be estimated practically and quickly by processing the traces with the recommended ANN model. Consequently, the ANN method could be used for estimating Q values from seismic data.
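
    As a pointer to how one of the reference methods works, the spectral-ratio estimate of Q follows from ln[A2(f)/A1(f)] = -pi f dt / Q + const for two arrivals separated by travel time dt; the Python sketch below uses synthetic numbers, not the Yeniköy data.

      import numpy as np

      rng = np.random.default_rng(8)
      f = np.linspace(5.0, 80.0, 60)          # Hz, assumed usable bandwidth
      Q_true, dt = 12.0, 0.08                 # dimensionless Q, seconds between arrivals (assumed)
      log_ratio = -np.pi * f * dt / Q_true + 0.3 + rng.normal(scale=0.05, size=f.size)

      slope, intercept = np.polyfit(f, log_ratio, 1)   # linear fit of the log spectral ratio
      Q_est = -np.pi * dt / slope
      print("estimated Q:", round(Q_est, 1))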

  7. The WiggleZ Dark Energy Survey: testing the cosmological model with baryon acoustic oscillations at z= 0.6

    NASA Astrophysics Data System (ADS)

    Blake, Chris; Davis, Tamara; Poole, Gregory B.; Parkinson, David; Brough, Sarah; Colless, Matthew; Contreras, Carlos; Couch, Warrick; Croom, Scott; Drinkwater, Michael J.; Forster, Karl; Gilbank, David; Gladders, Mike; Glazebrook, Karl; Jelliffe, Ben; Jurek, Russell J.; Li, I.-Hui; Madore, Barry; Martin, D. Christopher; Pimbblet, Kevin; Pracy, Michael; Sharp, Rob; Wisnioski, Emily; Woods, David; Wyder, Ted K.; Yee, H. K. C.

    2011-08-01

    We measure the imprint of baryon acoustic oscillations (BAOs) in the galaxy clustering pattern at the highest redshift achieved to date, z= 0.6, using the distribution of N= 132 509 emission-line galaxies in the WiggleZ Dark Energy Survey. We quantify BAOs using three statistics: the galaxy correlation function, power spectrum and the band-filtered estimator introduced by Xu et al. The results are mutually consistent, corresponding to a 4.0 per cent measurement of the cosmic distance-redshift relation at z= 0.6 [in terms of the acoustic parameter 'A(z)' introduced by Eisenstein et al., we find A(z= 0.6) = 0.452 ± 0.018]. Both BAOs and power spectrum shape information contribute towards these constraints. The statistical significance of the detection of the acoustic peak in the correlation function, relative to a wiggle-free model, is 3.2σ. The ratios of our distance measurements to those obtained using BAOs in the distribution of luminous red galaxies at redshifts z= 0.2 and 0.35 are consistent with a flat Λ cold dark matter model that also provides a good fit to the pattern of observed fluctuations in the cosmic microwave background radiation. The addition of the current WiggleZ data results in a ≈30 per cent improvement in the measurement accuracy of a constant equation of state, w, using BAO data alone. Based solely on geometric BAO distance ratios, accelerating expansion (w < -1/3) is required with a probability of 99.8 per cent, providing a consistency check of conclusions based on supernovae observations. Further improvements in cosmological constraints will result when the WiggleZ survey data set is complete.

  8. Thermal Dissociation and Roaming Isomerization of Nitromethane: Experiment and Theory.

    PubMed

    Annesley, Christopher J; Randazzo, John B; Klippenstein, Stephen J; Harding, Lawrence B; Jasper, Ahren W; Georgievskii, Yuri; Ruscic, Branko; Tranter, Robert S

    2015-07-16

    The thermal decomposition of nitromethane provides a classic example of the competition between roaming mediated isomerization and simple bond fission. A recent theoretical analysis suggests that as the pressure is increased from 2 to 200 Torr the product distribution undergoes a sharp transition from roaming dominated to bond-fission dominated. Laser schlieren densitometry is used to explore the variation in the effect of roaming on the density gradients for CH3NO2 decomposition in a shock tube for pressures of 30, 60, and 120 Torr at temperatures ranging from 1200 to 1860 K. A complementary theoretical analysis provides a novel exploration of the effects of roaming on the thermal decomposition kinetics. The analysis focuses on the roaming dynamics in a reduced dimensional space consisting of the rigid-body motions of the CH3 and NO2 radicals. A high-level reduced-dimensionality potential energy surface is developed from fits to large-scale multireference ab initio calculations. Rigid body trajectory simulations coupled with master equation kinetics calculations provide high-level a priori predictions for the thermal branching between roaming and dissociation. A statistical model provides a qualitative/semiquantitative interpretation of the results. Modeling efforts explore the relation between the predicted roaming branching and the observed gradients. Overall, the experiments are found to be fairly consistent with the theoretically proposed branching ratio, but they are also consistent with a no-roaming scenario and the underlying reasons are discussed. The theoretical predictions are also compared with prior theoretical predictions, with a related statistical model, and with the extant experimental data for the decomposition of CH3NO2, and for the reaction of CH3 with NO2.

  9. Statistical Evidence Suggests that Inattention Drives Hyperactivity/Impulsivity in Attention Deficit-Hyperactivity Disorder

    PubMed Central

    Sokolova, Elena; Groot, Perry; Claassen, Tom; van Hulzen, Kimm J.; Glennon, Jeffrey C.; Franke, Barbara

    2016-01-01

    Background: Numerous factor analytic studies consistently support a distinction between two symptom domains of attention-deficit/hyperactivity disorder (ADHD), inattention and hyperactivity/impulsivity. Both dimensions show high internal consistency and moderate to strong correlations with each other. However, it is not clear what drives this strong correlation. The aim of this paper is to address this issue. Method: We applied a sophisticated approach for causal discovery on three independent data sets of scores of the two ADHD dimensions in NeuroIMAGE (total N = 675), ADHD-200 (N = 245), and IMpACT (N = 164), assessed by different raters and instruments, and further used information on gender or a genetic risk haplotype. Results: In all data sets we found strong statistical evidence for the same pattern: the clear dependence between hyperactivity/impulsivity symptom level and an established genetic factor (either gender or risk haplotype) vanishes when one conditions upon inattention symptom level. Under reasonable assumptions, e.g., that phenotypes do not cause genotypes, a causal model that is consistent with this pattern contains a causal path from inattention to hyperactivity/impulsivity. Conclusions: The robust dependency cancellation observed in three different data sets suggests that inattention is a driving factor for hyperactivity/impulsivity. This causal hypothesis can be further validated in intervention studies. Our model suggests that interventions that affect inattention will also have an effect on the level of hyperactivity/impulsivity. On the other hand, interventions that affect hyperactivity/impulsivity would not change the level of inattention. This causal model may explain earlier findings on heritable factors causing ADHD reported in the study of twins with learning difficulties. PMID:27768717
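
    The dependency-cancellation pattern can be illustrated with a small Python sketch on synthetic data (a genetic factor G affecting hyperactivity/impulsivity H only through inattention I); this mirrors the reported pattern but is not the causal-discovery procedure used in the study.

      import numpy as np

      rng = np.random.default_rng(9)
      n = 5000
      G = rng.integers(0, 2, n).astype(float)          # e.g. gender or risk haplotype (synthetic)
      I = 0.8 * G + rng.normal(size=n)                 # inattention
      H = 0.9 * I + rng.normal(size=n)                 # hyperactivity/impulsivity (depends on G only via I)

      def partial_corr(x, y, z):
          """Correlation of x and y after linearly regressing z out of both."""
          rx = x - np.polyval(np.polyfit(z, x, 1), z)
          ry = y - np.polyval(np.polyfit(z, y, 1), z)
          return np.corrcoef(rx, ry)[0, 1]

      print("corr(G, H)              :", round(np.corrcoef(G, H)[0, 1], 3))
      print("corr(G, H | I) vanishes :", round(partial_corr(G, H, I), 3))
      print("corr(G, I | H) persists :", round(partial_corr(G, I, H), 3))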

  10. Modeling urbanization patterns at a global scale with generative adversarial networks

    NASA Astrophysics Data System (ADS)

    Albert, A. T.; Strano, E.; Gonzalez, M.

    2017-12-01

    Current demographic projections show that, in the next 30 years, global population growth will mostly take place in developing countries. Coupled with a decrease in density, such population growth could potentially double the land occupied by settlements by 2050. The lack of reliable and globally consistent socio-demographic data, coupled with the limited predictive performance of traditional spatially explicit urban models, calls for developing better predictive methods, calibrated using a globally consistent dataset. Thus, richer models of the spatial interplay between urban built-up land, population distribution and energy use are central to the discussion around the expansion and development of cities, and their impact on the environment in the context of a changing climate. In this talk we discuss methods for, and present an analysis of, urban form, defined as the spatial distribution of macroeconomic quantities that characterize a city, using modern machine learning methods and best-available remote-sensing data for the world's largest 25,000 cities. We first show that these cities may be described by a small set of patterns in radial building density, nighttime luminosity, and population density, which highlight, to first order, differences in development and land use across the world. We observe significant, spatially dependent variance around these typical patterns, which would be difficult to model using traditional statistical methods. We take a first step in addressing this challenge by developing CityGAN, a conditional generative adversarial network model for simulating realistic urban forms. To guide learning and measure the quality of the simulated synthetic cities, we develop a specialized loss function for GAN optimization that incorporates standard spatial statistics used by urban analysis experts. Our framework is a stark departure from both the standard physics-based approaches in the literature (which view urban forms as fractals with scale-free behavior) and the traditional statistical learning approaches (whereby values of individual pixels are modeled as functions of locally defined, hand-engineered features). This is a first-of-its-kind analysis of urban forms using data at a planetary scale.

  11. Complete integrability of information processing by biochemical reactions

    PubMed Central

    Agliari, Elena; Barra, Adriano; Dello Schiavo, Lorenzo; Moro, Antonio

    2016-01-01

    Statistical mechanics provides an effective framework to investigate information processing in biochemical reactions. Within such framework far-reaching analogies are established among (anti-) cooperative collective behaviors in chemical kinetics, (anti-)ferromagnetic spin models in statistical mechanics and operational amplifiers/flip-flops in cybernetics. The underlying modeling – based on spin systems – has been proved to be accurate for a wide class of systems matching classical (e.g. Michaelis–Menten, Hill, Adair) scenarios in the infinite-size approximation. However, the current research in biochemical information processing has been focusing on systems involving a relatively small number of units, where this approximation is no longer valid. Here we show that the whole statistical mechanical description of reaction kinetics can be re-formulated via a mechanical analogy – based on completely integrable hydrodynamic-type systems of PDEs – which provides explicit finite-size solutions, matching recently investigated phenomena (e.g. noise-induced cooperativity, stochastic bi-stability, quorum sensing). The resulting picture, successfully tested against a broad spectrum of data, constitutes a neat rationale for a numerically effective and theoretically consistent description of collective behaviors in biochemical reactions. PMID:27812018

  12. Experimental Study of Quantum Graphs With and Without Time-Reversal Invariance

    NASA Astrophysics Data System (ADS)

    Anlage, Steven Mark; Fu, Ziyuan; Koch, Trystan; Antonsen, Thomas; Ott, Edward

    An experimental setup consisting of a microwave network is used to simulate quantum graphs. The random coupling model (RCM) is applied to describe the universal statistical properties of the system with and without time-reversal invariance. The networks, which are large compared to the wavelength, are constructed from coaxial cables connected by T junctions; by making nodes with circulators, time-reversal invariance for microwave propagation in the networks can be broken. The results of an experimental study of microwave networks with and without time-reversal invariance are presented in both the frequency domain and the time domain. With the measured S-parameter data of two-port networks, the impedance statistics and the nearest-neighbor spacing statistics are examined. Moreover, time-reversal-mirror experiments on the networks demonstrate that the reconstruction quality can be used to quantify the degree of time-reversal invariance for wave propagation. Numerical models of the networks are also presented to verify the time-domain experiments. We acknowledge support under contract AFOSR COE Grant FA9550-15-1-0171 and the ONR Grant N000141512134.

  13. Root Cause Analysis of Quality Defects Using HPLC-MS Fingerprint Knowledgebase for Batch-to-batch Quality Control of Herbal Drugs.

    PubMed

    Yan, Binjun; Fang, Zhonghua; Shen, Lijuan; Qu, Haibin

    2015-01-01

    The batch-to-batch quality consistency of herbal drugs has always been an important issue. The aim of this work was to propose a methodology for batch-to-batch quality control based on HPLC-MS fingerprints and a process knowledgebase. The extraction process of Compound E-jiao Oral Liquid was taken as a case study. After establishing the HPLC-MS fingerprint analysis method, the fingerprints of the extract solutions produced under normal and abnormal operation conditions were obtained. Multivariate statistical models were built for fault detection, and a discriminant analysis model was built using the probabilistic discriminant partial-least-squares method for fault diagnosis. Based on multivariate statistical analysis, process knowledge was acquired and the cause-effect relationship between process deviations and quality defects was revealed. The quality defects were detected successfully by multivariate statistical control charts, and the types of process deviations were diagnosed correctly by discriminant analysis. This work has demonstrated the benefits of combining HPLC-MS fingerprints, process knowledge and multivariate analysis for the quality control of herbal drugs. Copyright © 2015 John Wiley & Sons, Ltd.
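
    A minimal Python sketch of one common multivariate control chart, Hotelling's T-squared on fingerprint features of reference batches; the data layout and control limit here are illustrative assumptions rather than the models built in the study.

      import numpy as np
      from scipy.stats import f as f_dist

      rng = np.random.default_rng(10)
      n, p = 20, 4                                    # 20 normal reference batches, 4 fingerprint features
      ref = rng.normal(loc=100, scale=5, size=(n, p))

      mean = ref.mean(axis=0)
      S_inv = np.linalg.inv(np.cov(ref, rowvar=False))

      def t2(x):
          d = x - mean
          return float(d @ S_inv @ d)

      # upper control limit for a future observation at significance alpha
      alpha = 0.01
      ucl = (p * (n - 1) * (n + 1)) / (n * (n - p)) * f_dist.ppf(1 - alpha, p, n - p)

      new_batch = rng.normal(loc=100, scale=5, size=p)
      faulty = new_batch + np.array([0, 25, 0, 0])    # simulated process deviation
      print("T2 normal batch:", round(t2(new_batch), 2), " UCL:", round(ucl, 2))
      print("T2 faulty batch:", round(t2(faulty), 2))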

  14. Brownian motion or Lévy walk? Stepping towards an extended statistical mechanics for animal locomotion.

    PubMed

    Gautestad, Arild O

    2012-09-07

    Animals moving under the influence of spatio-temporal scaling and long-term memory generate a kind of space-use pattern that has proved difficult to model within a coherent theoretical framework. An extended kind of statistical mechanics is needed, accounting for both the effects of spatial memory and scale-free space use, and put into a context of ecological conditions. Simulations illustrating the distinction between scale-specific and scale-free locomotion are presented. The results show how observational scale (time lag between relocations of an individual) may critically influence the interpretation of the underlying process. In this respect, a novel protocol is proposed as a method to distinguish between some main movement classes. For example, the 'power law in disguise' paradox (arising from a composite Brownian motion consisting of a superposition of independent movement processes at different scales) may be resolved by shifting the focus from pattern analysis at one particular temporal resolution towards a more process-oriented approach involving several scales of observation. A more explicit consideration of system complexity within a statistical mechanical framework, supplementing the more traditional mechanistic modelling approach, is advocated.

  15. Higher-Order Statistical Correlations and Mutual Information Among Particles in a Quantum Well

    NASA Astrophysics Data System (ADS)

    Yépez, V. S.; Sagar, R. P.; Laguna, H. G.

    2017-12-01

    The influence of wave function symmetry on statistical correlation is studied for the case of three non-interacting spin-free quantum particles in a unidimensional box, in position and in momentum space. Higher-order statistical correlations occurring among the three particles in this quantum system are quantified via higher-order mutual information and compared to the correlation between pairs of variables in this model, and to the correlation in the two-particle system. The results for the higher-order mutual information show that there are states where the symmetric wave functions are more correlated than the antisymmetric ones with the same quantum numbers. This holds in position as well as in momentum space. This behavior is opposite to that observed for the correlation between pairs of variables in this model, and in the two-particle system, where the antisymmetric wave functions are in general more correlated. These results are also consistent with those observed in a system of three uncoupled oscillators. The use of higher-order mutual information as a correlation measure is monitored and examined by considering a superposition of states or systems with two Slater determinants.

  16. Complete integrability of information processing by biochemical reactions

    NASA Astrophysics Data System (ADS)

    Agliari, Elena; Barra, Adriano; Dello Schiavo, Lorenzo; Moro, Antonio

    2016-11-01

    Statistical mechanics provides an effective framework to investigate information processing in biochemical reactions. Within such framework far-reaching analogies are established among (anti-) cooperative collective behaviors in chemical kinetics, (anti-)ferromagnetic spin models in statistical mechanics and operational amplifiers/flip-flops in cybernetics. The underlying modeling - based on spin systems - has been proved to be accurate for a wide class of systems matching classical (e.g. Michaelis-Menten, Hill, Adair) scenarios in the infinite-size approximation. However, the current research in biochemical information processing has been focusing on systems involving a relatively small number of units, where this approximation is no longer valid. Here we show that the whole statistical mechanical description of reaction kinetics can be re-formulated via a mechanical analogy - based on completely integrable hydrodynamic-type systems of PDEs - which provides explicit finite-size solutions, matching recently investigated phenomena (e.g. noise-induced cooperativity, stochastic bi-stability, quorum sensing). The resulting picture, successfully tested against a broad spectrum of data, constitutes a neat rationale for a numerically effective and theoretically consistent description of collective behaviors in biochemical reactions.

  17. Complete integrability of information processing by biochemical reactions.

    PubMed

    Agliari, Elena; Barra, Adriano; Dello Schiavo, Lorenzo; Moro, Antonio

    2016-11-04

    Statistical mechanics provides an effective framework to investigate information processing in biochemical reactions. Within such framework far-reaching analogies are established among (anti-) cooperative collective behaviors in chemical kinetics, (anti-)ferromagnetic spin models in statistical mechanics and operational amplifiers/flip-flops in cybernetics. The underlying modeling - based on spin systems - has been proved to be accurate for a wide class of systems matching classical (e.g. Michaelis-Menten, Hill, Adair) scenarios in the infinite-size approximation. However, the current research in biochemical information processing has been focusing on systems involving a relatively small number of units, where this approximation is no longer valid. Here we show that the whole statistical mechanical description of reaction kinetics can be re-formulated via a mechanical analogy - based on completely integrable hydrodynamic-type systems of PDEs - which provides explicit finite-size solutions, matching recently investigated phenomena (e.g. noise-induced cooperativity, stochastic bi-stability, quorum sensing). The resulting picture, successfully tested against a broad spectrum of data, constitutes a neat rationale for a numerically effective and theoretically consistent description of collective behaviors in biochemical reactions.

  18. Flow throughout the Earth's core inverted from geomagnetic observations and numerical dynamo models

    NASA Astrophysics Data System (ADS)

    Aubert, Julien

    2013-02-01

    This paper introduces inverse geodynamo modelling, a framework imaging flow throughout the Earth's core from observations of the geomagnetic field and its secular variation. The necessary prior information is provided by statistics from 3-D and self-consistent numerical simulations of the geodynamo. The core method is a linear estimation (or Kalman filtering) procedure, combined with standard frozen-flux core surface flow inversions in order to handle the non-linearity of the problem. The inversion scheme is successfully validated using synthetic test experiments. A set of four numerical dynamo models of increasing physical complexity and similarity to the geomagnetic field is then used to invert for flows at single epochs within the period 1970-2010, using data from the geomagnetic field models CM4 and gufm-sat-Q3. The resulting core surface flows generally provide satisfactory fits to the secular variation within the level of modelled errors, and robustly reproduce the most commonly observed patterns while additionally presenting a high degree of equatorial symmetry. The corresponding deep flows present a robust, highly columnar structure once rotational constraints are enforced to a high level in the prior models, with patterns strikingly similar to the results of quasi-geostrophic inversions. In particular, the presence of a persistent planetary scale, eccentric westward columnar gyre circling around the inner core is confirmed. The strength of the approach is to uniquely determine the trade-off between fit to the data and complexity of the solution by clearly connecting it to first principle physics; statistical deviations observed between the inverted flows and the standard model behaviour can then be used to quantitatively assess the shortcomings of the physical modelling. Such deviations include the (i) westwards and (ii) hemispherical character of the eccentric gyre. A prior model with angular momentum conservation of the core-mantle inner-core system, and gravitational coupling of reasonable strength between the mantle and the inner core, is shown to produce enough westward drift to resolve statistical deviation (i). Deviation (ii) is resolved by a prior with an hemispherical buoyancy release at the inner-core boundary, with excess buoyancy below Asia. This latter result suggests that the recently proposed inner-core translational instability presently transports the solid inner-core material westwards, opposite to the seismologically inferred long-term trend but consistently with the eccentricity of the geomagnetic dipole in recent times.

  19. Jet Noise Diagnostics Supporting Statistical Noise Prediction Methods

    NASA Technical Reports Server (NTRS)

    Bridges, James E.

    2006-01-01

    The primary focus of my presentation is the development of the jet noise prediction code JeNo, with most examples coming from the experimental work that drove the theoretical development and validation. JeNo is a statistical jet noise prediction code, based upon the Lilley acoustic analogy. Our approach uses time-averaged 2-D or 3-D mean and turbulent statistics of the flow as input. The output is source distributions and spectral directivity. NASA has been investing in development of statistical jet noise prediction tools because these seem to fit the middle ground that allows enough flexibility and fidelity for jet noise source diagnostics while having reasonable computational requirements. These tools rely on Reynolds-averaged Navier-Stokes (RANS) computational fluid dynamics (CFD) solutions as input for computing far-field spectral directivity using an acoustic analogy. There are many ways acoustic analogies can be created, each with a series of assumptions and models, many often taken unknowingly. And the resulting prediction can be easily reverse-engineered by altering the models contained within. However, only an approach which is mathematically sound, with assumptions validated and modeled quantities checked against direct measurement, will give consistently correct answers. Many quantities are modeled in acoustic analogies precisely because they have been impossible to measure or calculate, making this requirement a difficult task. The NASA team has spent considerable effort identifying all the assumptions and models used to take the Navier-Stokes equations to the point of a statistical calculation via an acoustic analogy very similar to that proposed by Lilley. Assumptions have been identified and experiments have been developed to test these assumptions. In some cases this has resulted in assumptions being changed. Beginning with the CFD used as input to the acoustic analogy, models for turbulence closure used in RANS CFD codes have been explored and compared against measurements of mean and rms velocity statistics over a range of jet speeds and temperatures. Models for flow parameters used in the acoustic analogy, most notably the space-time correlations of velocity, have been compared against direct measurements, and modified to better fit the observed data. These measurements have been extremely challenging for hot, high-speed jets, and represent a sizeable investment in instrumentation development. As an intermediate check that the analysis is predicting the physics intended, phased arrays have been employed to measure source distributions for a wide range of jet cases. And finally, careful far-field spectral directivity measurements have been taken for final validation of the prediction code. Examples of each of these experimental efforts will be presented. The main result of these efforts is a noise prediction code, named JeNo, which is in mid-development. JeNo is able to consistently predict spectral directivity, including aft-angle directivity, for subsonic cold jets of most geometries. Current development on JeNo is focused on extending its capability to hot jets, requiring inclusion of a previously neglected second source associated with thermal fluctuations. A secondary result of the intensive experimentation is the archiving of various flow statistics applicable to other acoustic analogies and to development of time-resolved prediction methods. These will be of lasting value as we look ahead at future challenges to the aeroacoustic experimentalist.

  20. QMRA for Drinking Water: 1. Revisiting the Mathematical Structure of Single-Hit Dose-Response Models.

    PubMed

    Nilsen, Vegard; Wyller, John

    2016-01-01

    Dose-response models are essential to quantitative microbial risk assessment (QMRA), providing a link between levels of human exposure to pathogens and the probability of negative health outcomes. In drinking water studies, the class of semi-mechanistic models known as single-hit models, such as the exponential and the exact beta-Poisson, has seen widespread use. In this work, an attempt is made to carefully develop the general mathematical single-hit framework while explicitly accounting for variation in (1) host susceptibility and (2) pathogen infectivity. This allows a precise interpretation of the so-called single-hit probability and precise identification of a set of statistical independence assumptions that are sufficient to arrive at single-hit models. Further analysis of the model framework is facilitated by formulating the single-hit models compactly using probability generating and moment generating functions. Among the more practically relevant conclusions drawn are: (1) for any dose distribution, variation in host susceptibility always reduces the single-hit risk compared to a constant host susceptibility (assuming equal mean susceptibilities), (2) the model-consistent representation of complete host immunity is formally demonstrated to be a simple scaling of the response, (3) the model-consistent expression for the total risk from repeated exposures deviates (gives lower risk) from the conventional expression used in applications, and (4) a model-consistent expression for the mean per-exposure dose that produces the correct total risk from repeated exposures is developed. © 2016 Society for Risk Analysis.
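
    As a rough illustration of the single-hit family discussed above, the sketch below evaluates the exponential model and the widely used approximation to the beta-Poisson model; the parameter values (r, alpha, beta) and dose grid are hypothetical, and the exact beta-Poisson (which involves the Kummer confluent hypergeometric function) is not implemented here.

        import numpy as np

        def exponential_response(dose, r):
            """Single-hit exponential model: each ingested organism independently causes infection with probability r."""
            return 1.0 - np.exp(-r * dose)

        def beta_poisson_approx(dose, alpha, beta):
            """Common approximation to the exact beta-Poisson single-hit model."""
            return 1.0 - (1.0 + dose / beta) ** (-alpha)

        doses = np.logspace(0, 6, 7)          # hypothetical mean doses (organisms)
        print(exponential_response(doses, r=0.005))
        print(beta_poisson_approx(doses, alpha=0.15, beta=50.0))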

  1. A multi-criteria approach to identify favorable areas for goat production systems in Veracruz, México.

    PubMed

    Ramírez-Rivera, Emmanuel de Jesús; Lopez-Collado, Jose; Díaz-Rivera, Pablo; Ortega-Jiménez, Eusebio; Torres-Hernández, Glafiro; Jacinto-Padilla, Jazmín; Herman-Lara, Erasmo

    2017-04-01

    This research identifies favorable areas for goat production systems in the state of Veracruz, Mexico. Using the analytic hierarchy process, layers of biophysical and soil information were combined to generate a model of favorability. Model validation was performed by calculating the area under the curve, the true skill statistic, and a qualitative comparison with census records. The results showed the existence of regions with high (4494.3 km²) and moderate (2985.8 km²) favorability; these areas correspond to 6.25 and 4.15% of the state territory, respectively, and are located in the regions of Sierra de Huayacocotla, Perote, and Orizaba. These regions are mountainous, with predominantly temperate-wet or cold climates, and their vegetation includes montane mesophilic forests, pine, fir, and desert scrub. The reliability of the distribution model was supported by the area under the curve value (0.96), the true skill statistic (0.86), and consistency with census records.
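
    For readers unfamiliar with the validation metric, the following minimal sketch computes the true skill statistic (sensitivity + specificity - 1) from binary presence/absence predictions; the labels are hypothetical and unrelated to the study's census data.

        import numpy as np

        def true_skill_statistic(y_true, y_pred):
            """TSS = sensitivity + specificity - 1 for binary presence/absence predictions."""
            y_true = np.asarray(y_true, dtype=bool)
            y_pred = np.asarray(y_pred, dtype=bool)
            tp = np.sum(y_true & y_pred)
            tn = np.sum(~y_true & ~y_pred)
            fp = np.sum(~y_true & y_pred)
            fn = np.sum(y_true & ~y_pred)
            sensitivity = tp / (tp + fn)
            specificity = tn / (tn + fp)
            return sensitivity + specificity - 1.0

        # hypothetical validation labels (observed, predicted)
        print(true_skill_statistic([1, 1, 0, 0, 1, 0], [1, 1, 0, 1, 1, 0]))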

  2. The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    PubMed Central

    Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth

    2006-01-01

    Background: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results: The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per-microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion: The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497

  3. Discrete time rescaling theorem: determining goodness of fit for discrete time statistical models of neural spiking.

    PubMed

    Haslinger, Robert; Pipa, Gordon; Brown, Emery

    2010-10-01

    One approach for understanding the encoding of information by spike trains is to fit statistical models and then test their goodness of fit. The time-rescaling theorem provides a goodness-of-fit test consistent with the point process nature of spike trains. The interspike intervals (ISIs) are rescaled (as a function of the model's spike probability) to be independent and exponentially distributed if the model is accurate. A Kolmogorov-Smirnov (KS) test between the rescaled ISIs and the exponential distribution is then used to check goodness of fit. This rescaling relies on assumptions of continuously defined time and instantaneous events. However, spikes have finite width, and statistical models of spike trains almost always discretize time into bins. Here we demonstrate that finite temporal resolution of discrete time models prevents their rescaled ISIs from being exponentially distributed. Poor goodness of fit may be erroneously indicated even if the model is exactly correct. We present two adaptations of the time-rescaling theorem to discrete time models. In the first we propose that instead of assuming the rescaled times to be exponential, the reference distribution be estimated through direct simulation by the fitted model. In the second, we prove a discrete time version of the time-rescaling theorem that analytically corrects for the effects of finite resolution. This allows us to define a rescaled time that is exponentially distributed, even at arbitrary temporal discretizations. We demonstrate the efficacy of both techniques by fitting generalized linear models to both simulated spike trains and spike trains recorded experimentally in monkey V1 cortex. Both techniques give nearly identical results, reducing the false-positive rate of the KS test and greatly increasing the reliability of model evaluation based on the time-rescaling theorem.
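
    A minimal sketch of the continuous-time rescaling test described above, assuming a conditional intensity sampled on a time grid: integrated intensities between spikes are transformed to nominally uniform variates and compared against U(0,1) with a KS test. The homogeneous 20 Hz example is hypothetical, and the paper's discrete-time corrections are not implemented here.

        import numpy as np
        from scipy.stats import kstest

        def rescaled_isi_ks(spike_times, intensity, t_grid):
            """Time-rescaling goodness of fit: integrate the model intensity between spikes,
            map to z = 1 - exp(-tau), and compare against U(0,1) with a KS test."""
            cum = np.concatenate(([0.0], np.cumsum(intensity[:-1] * np.diff(t_grid))))
            Lambda = np.interp(spike_times, t_grid, cum)      # integrated intensity at spike times
            tau = np.diff(Lambda)                             # rescaled interspike intervals
            z = 1.0 - np.exp(-tau)                            # uniform if the model is correct
            return kstest(z, 'uniform')

        # hypothetical homogeneous-Poisson check: constant 20 Hz rate over 10 s
        t = np.linspace(0.0, 10.0, 10001)
        rate = np.full_like(t, 20.0)
        spikes = np.sort(np.random.uniform(0.0, 10.0, 200))
        print(rescaled_isi_ks(spikes, rate, t))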

  4. On size and geometry effects on the brittle fracture of ferritic and tempered martensitic steels

    NASA Astrophysics Data System (ADS)

    Odette, G. R.; Chao, B. L.; Lucas, G. E.

    1992-09-01

    A finite element computation of nonsingular crack tip fields was combined with a weakest-link statistics model of cleavage fracture. Model predictions for three-point bend specimens with various widths and crack depth to width ratios are qualitatively consistent with a number of trends observed in a 12 Cr martensitic stainless steel. The toughness “benefits” of small sizes and shallow cracks are primarily reflected in strain limits rather than net section stress capacities, which is significant for fusion structures subject to large secondary stresses.
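
    The abstract does not give the specific weakest-link expression used; as a generic illustration, the sketch below evaluates the standard two-parameter Weibull weakest-link failure probability, with the Weibull modulus, reference stress and volumes chosen as hypothetical values, to show how smaller stressed volumes lower the predicted cleavage probability.

        import numpy as np

        def weakest_link_failure_prob(sigma, volume, sigma0=2000.0, m=4.0, v0=1.0):
            """Standard two-parameter Weibull weakest-link form:
            P_f = 1 - exp(-(V/V0) * (sigma/sigma0)^m)."""
            return 1.0 - np.exp(-(volume / v0) * (sigma / sigma0) ** m)

        # hypothetical stresses (MPa) and stressed volumes (mm^3): smaller volumes give lower P_f
        for v in (0.1, 1.0, 10.0):
            print(v, weakest_link_failure_prob(np.array([1500.0, 2000.0, 2500.0]), v))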

  5. SPIPS: Spectro-Photo-Interferometry of Pulsating Stars

    NASA Astrophysics Data System (ADS)

    Mérand, Antoine

    2017-10-01

    SPIPS (Spectro-Photo-Interferometry of Pulsating Stars) combines radial velocimetry, interferometry, and photometry to estimate physical parameters of pulsating stars, including the presence of infrared excess, color excess, Teff, and the distance/p-factor ratio. The global model-based parallax-of-pulsation method is implemented in Python. Derived parameters have a high level of confidence: statistical precision is improved (compared to other methods) by the large number of data taken into account, accuracy is improved by using consistent physical modeling, and the reliability of the derived parameters is strengthened by redundancy in the data.

  6. Using statistical models to explore ensemble uncertainty in climate impact studies: the example of air pollution in Europe

    NASA Astrophysics Data System (ADS)

    Lemaire, Vincent E. P.; Colette, Augustin; Menut, Laurent

    2016-03-01

    Because of its sensitivity to unfavorable weather patterns, air pollution is sensitive to climate change, so that, in the future, a climate penalty could jeopardize the expected efficiency of air pollution mitigation measures. A common method to assess the impact of climate on air quality consists in implementing chemistry-transport models forced by climate projections. However, the computing cost of such methods requires optimizing ensemble exploration techniques. By using a training data set from a deterministic projection of climate and air quality over Europe, we identified the main meteorological drivers of air quality for eight regions in Europe and developed statistical models that could be used to predict air pollutant concentrations. The evolution of the key climate variables driving either particulate or gaseous pollution allows selecting the members of the EuroCordex ensemble of regional climate projections that should be used in priority for future air quality projections (CanESM2/RCA4, CNRM-CM5-LR/RCA4, CSIRO-Mk3-6-0/RCA4 and MPI-ESM-LR/CCLM, following the EuroCordex terminology). After having tested the validity of the statistical model in predictive mode, we can provide ranges of uncertainty attributed to the spread of the regional climate projection ensemble by the end of the century (2071-2100) under the RCP8.5 scenario. In the three regions where the statistical model of the impact of climate change on PM2.5 offers satisfactory performance, we find a climate benefit (a decrease of PM2.5 concentrations under future climate) of -1.08 (±0.21), -1.03 (±0.32) and -0.83 (±0.14) µg m-3 for Eastern Europe, Mid-Europe and Northern Italy, respectively. In the British-Irish Isles, Scandinavia, France, the Iberian Peninsula and the Mediterranean, the statistical model is not considered skillful enough to draw any conclusion for PM2.5. In Eastern Europe, France, the Iberian Peninsula, Mid-Europe and Northern Italy, the statistical model of the impact of climate change on ozone was considered satisfactory, and it confirms the climate penalty bearing upon ozone of 10.51 (±3.06), 11.70 (±3.63), 11.53 (±1.55), 9.86 (±4.41) and 4.82 (±1.79) µg m-3, respectively. In the British-Irish Isles, Scandinavia and the Mediterranean, the skill of the statistical model was not considered robust enough to draw any conclusion for ozone pollution.

  7. New scaling model for variables and increments with heavy-tailed distributions

    NASA Astrophysics Data System (ADS)

    Riva, Monica; Neuman, Shlomo P.; Guadagnini, Alberto

    2015-06-01

    Many hydrological (as well as diverse earth, environmental, ecological, biological, physical, social, financial and other) variables, Y, exhibit frequency distributions that are difficult to reconcile with those of their spatial or temporal increments, ΔY. Whereas distributions of Y (or its logarithm) are at times slightly asymmetric with relatively mild peaks and tails, those of ΔY tend to be symmetric with peaks that grow sharper, and tails that become heavier, as the separation distance (lag) between pairs of Y values decreases. No statistical model known to us captures these behaviors of Y and ΔY in a unified and consistent manner. We propose a new, generalized sub-Gaussian model that does so. We derive analytical expressions for probability distribution functions (pdfs) of Y and ΔY as well as corresponding lead statistical moments. In our model the peak and tails of the ΔY pdf scale with lag in line with observed behavior. The model allows one to estimate, accurately and efficiently, all relevant parameters by analyzing jointly sample moments of Y and ΔY. We illustrate key features of our new model and method of inference on synthetically generated samples and neutron porosity data from a deep borehole.
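
    A toy construction of a sub-Gaussian variable in the spirit of the model described above: Y is built as the product of a Gaussian variate G and a positive random subordinator U (here lognormal, an assumed choice; the paper's specific subordinator and spatial correlation structure are not reproduced). The sketch only illustrates the heavier-than-Gaussian tails of Y and its increments, not their lag dependence.

        import numpy as np

        rng = np.random.default_rng(0)

        def gsg_samples(n, sigma_u=0.5):
            """Illustrative sub-Gaussian construction: Y = U * G, with G standard Gaussian
            and U a positive subordinator (lognormal is an assumed choice here)."""
            g = rng.standard_normal(n)
            u = rng.lognormal(mean=0.0, sigma=sigma_u, size=n)
            return u * g

        y = gsg_samples(100000)
        dy = np.diff(y)                      # "increments" of the synthetic series
        print(np.mean(y), np.var(y))
        # positive excess kurtosis signals heavier-than-Gaussian tails of the increments
        print(np.mean((dy - dy.mean()) ** 4) / np.var(dy) ** 2 - 3.0)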

  8. Statistical Mechanics of US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward; Broedersz, Chase; Bialek, William; Biophysics Theory Group Team

    2014-03-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The least structured, or maximum entropy, model that is consistent with the observed pairwise correlations among justices' votes is equivalent to an Ising spin glass. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering some of our intuition that justices on opposite sides of the ideological spectrum should have a negative influence on one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, and this agrees quantitatively with the data. The model shows that voting patterns are organized in a relatively simple ``energy landscape,'' correctly predicts the extent to which each justice is correlated with the majority, and gives us a measure of the influence that justices exert on one another. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context. Funded by National Science Foundation Grants PHY-0957573 and CCF-0939370, WM Keck Foundation, Lewis-Sigler Fellowship, Burroughs Wellcome Fund, and Winston Foundation.
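
    A minimal sketch of a pairwise maximum entropy (Ising) model of vote patterns of the kind described above: the energy is E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, and for nine justices the full distribution over 2^9 patterns can be enumerated exactly. The fields and couplings below are random placeholders, not values fitted to Supreme Court data.

        import numpy as np
        from itertools import product

        def ising_logprob(s, h, J):
            """Unnormalized log-probability of a vote pattern s (entries +/-1) under a
            pairwise maximum entropy (Ising) model; J is symmetric with zero diagonal."""
            return h @ s + 0.5 * s @ J @ s

        def pattern_distribution(h, J):
            """Exact distribution over all 2^N vote patterns (feasible for N = 9 justices)."""
            n = len(h)
            states = np.array(list(product([-1, 1], repeat=n)))
            logw = np.array([ising_logprob(s, h, J) for s in states])
            w = np.exp(logw - logw.max())
            return states, w / w.sum()

        # hypothetical fields and couplings for a 9-member court
        rng = np.random.default_rng(1)
        h = rng.normal(0.0, 0.1, 9)
        J = rng.normal(0.0, 0.2, (9, 9)); J = (J + J.T) / 2; np.fill_diagonal(J, 0.0)
        states, p = pattern_distribution(h, J)
        unanimous = np.all(states == states[:, :1], axis=1)
        print("P(unanimous) =", p[unanimous].sum())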

  9. Modeling Stochastic Kinetics of Molecular Machines at Multiple Levels: From Molecules to Modules

    PubMed Central

    Chowdhury, Debashish

    2013-01-01

    A molecular machine is either a single macromolecule or a macromolecular complex. In spite of the striking superficial similarities between these natural nanomachines and their man-made macroscopic counterparts, there are crucial differences. Molecular machines in a living cell operate stochastically in an isothermal environment far from thermodynamic equilibrium. In this mini-review we present a catalog of the molecular machines and an inventory of the essential toolbox for theoretically modeling these machines. The tool kits include (1) nonequilibrium statistical-physics techniques for modeling machines and machine-driven processes; and (2) statistical-inference methods for reverse engineering a functional machine from the empirical data. The cell is often likened to a microfactory in which the machineries are organized in modular fashion; each module consists of strongly coupled multiple machines, but different modules interact weakly with each other. This microfactory has its own automated supply chain and delivery system. Buoyed by the success achieved in modeling individual molecular machines, we advocate integration of these models in the near future to develop models of functional modules. A system-level description of the cell from the perspective of molecular machinery (the mechanome) is likely to emerge from further integrations that we envisage here. PMID:23746505

  10. Statistical modeling of 4D respiratory lung motion using diffeomorphic image registration.

    PubMed

    Ehrhardt, Jan; Werner, René; Schmidt-Richberg, Alexander; Handels, Heinz

    2011-02-01

    Modeling of respiratory motion has become increasingly important in various applications of medical imaging (e.g., radiation therapy of lung cancer). Current modeling approaches are usually confined to intra-patient registration of 3D image data representing the individual patient's anatomy at different breathing phases. We propose an approach to generate a mean motion model of the lung based on thoracic 4D computed tomography (CT) data of different patients to extend the motion modeling capabilities. Our modeling process consists of three steps: an intra-subject registration to generate subject-specific motion models, the generation of an average shape and intensity atlas of the lung as anatomical reference frame, and the registration of the subject-specific motion models to the atlas in order to build a statistical 4D mean motion model (4D-MMM). Furthermore, we present methods to adapt the 4D mean motion model to a patient-specific lung geometry. In all steps, a symmetric diffeomorphic nonlinear intensity-based registration method was employed. The Log-Euclidean framework was used to compute statistics on the diffeomorphic transformations. The presented methods are then used to build a mean motion model of respiratory lung motion using thoracic 4D CT data sets of 17 patients. We evaluate the model by applying it for estimating respiratory motion of ten lung cancer patients. The prediction is evaluated with respect to landmark and tumor motion, and the quantitative analysis results in a mean target registration error (TRE) of 3.3 ±1.6 mm if lung dynamics are not impaired by large lung tumors or other lung disorders (e.g., emphysema). With regard to lung tumor motion, we show that prediction accuracy is independent of tumor size and tumor motion amplitude in the considered data set. However, tumors adhering to non-lung structures degrade local lung dynamics significantly and the model-based prediction accuracy is lower in these cases. The statistical respiratory motion model is capable of providing valuable prior knowledge in many fields of applications. We present two examples of possible applications in radiation therapy and image guided diagnosis.
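
    The Log-Euclidean idea mentioned above, reduced to a toy setting: transformations are averaged by taking matrix logarithms, averaging in the linear log-domain, and exponentiating back. The example uses small 2-D affine matrices as stand-ins; the paper applies the framework to diffeomorphic deformation fields, which this sketch does not attempt.

        import numpy as np
        from scipy.linalg import logm, expm

        def log_euclidean_mean(transforms):
            """Log-Euclidean averaging: map each transform to its matrix logarithm,
            average in the (linear) log-domain, and map back with the matrix exponential."""
            logs = [logm(t) for t in transforms]
            return expm(np.mean(logs, axis=0)).real

        # hypothetical 2-D affine transforms (small rotations/scalings in homogeneous coordinates)
        def rot_scale(theta, s):
            c, d = np.cos(theta), np.sin(theta)
            return np.array([[s * c, -s * d, 0.0], [s * d, s * c, 0.0], [0.0, 0.0, 1.0]])

        samples = [rot_scale(0.05, 1.02), rot_scale(-0.02, 0.98), rot_scale(0.08, 1.01)]
        print(log_euclidean_mean(samples))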

  11. Characterisation of seasonal flood types according to timescales in mixed probability distributions

    NASA Astrophysics Data System (ADS)

    Fischer, Svenja; Schumann, Andreas; Schulte, Markus

    2016-08-01

    When flood statistics are based on annual maximum series (AMS), the sample often contains flood peaks which differ in their genesis. If the ratios among event types change over the range of observations, the extrapolation of a probability distribution function (pdf) can be dominated by a majority of events that belong to a certain flood type. If this type is not typical for extraordinarily large extremes, such an extrapolation of the pdf is misleading. To avoid this breach of the assumption of homogeneity, seasonal models were developed that differentiate between winter and summer floods. We show that a distinction between summer and winter floods is not always sufficient if seasonal series include events with different geneses. Here, we differentiate floods by their timescales into groups of long and short events. A statistical method for such a distinction of events is presented. To demonstrate its applicability, timescales for winter and summer floods in a German river basin were estimated. It is shown that summer floods can be separated into two main groups, but in our study region the sample of winter floods consists of at least three different flood types. The pdfs of the two groups of summer floods are combined via a new mixing model. This model accounts for the fact that information about parallel events based only on their maximum values is incomplete, because some of the realisations are overlaid. A statistical method resulting in an amendment of the statistical parameters is proposed. The application in a German case study demonstrates the advantages of the new model, with specific emphasis on flood types.
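
    As a baseline for the kind of mixed model discussed above, the sketch below combines two flood-type populations under an independence assumption, so the annual-maximum CDF is the product of the type-specific CDFs; the Gumbel parameters are hypothetical, and the paper's amendment for overlaid parallel events is not implemented.

        import numpy as np
        from scipy.stats import gumbel_r

        # Under independence of flood types, the annual maximum stays below x only if every
        # type-specific maximum stays below x, so F_AMS(x) = F_type1(x) * F_type2(x).
        def annual_max_cdf(x, loc1, scale1, loc2, scale2):
            return gumbel_r.cdf(x, loc1, scale1) * gumbel_r.cdf(x, loc2, scale2)

        x = np.linspace(50.0, 1000.0, 5)
        # hypothetical Gumbel parameters for short (flashy) and long (volume-dominated) events
        print(annual_max_cdf(x, loc1=200.0, scale1=60.0, loc2=300.0, scale2=120.0))
        print(1.0 - annual_max_cdf(500.0, 200.0, 60.0, 300.0, 120.0))   # annual exceedance probability of 500 m^3/s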

  12. Dark-ages reionization and galaxy formation simulation V: morphology and statistical signatures of reionization

    NASA Astrophysics Data System (ADS)

    Geil, Paul M.; Mutch, Simon J.; Poole, Gregory B.; Angel, Paul W.; Duffy, Alan R.; Mesinger, Andrei; Wyithe, J. Stuart B.

    2016-10-01

    We use the Dark-ages, Reionization And Galaxy formation Observables from Numerical Simulations (DRAGONS) framework to investigate the effect of galaxy formation physics on the morphology and statistics of ionized hydrogen (H II) regions during the Epoch of Reionization (EoR). DRAGONS self-consistently couples a semi-analytic galaxy formation model with the inhomogeneous ionizing UV background, and can therefore be used to study the dependence of the morphology and statistics of reionization on feedback phenomena of the ionizing source galaxy population. Changes in galaxy formation physics modify the sizes of H II regions and the amplitude and shape of 21-cm power spectra. Of the galaxy physics investigated, we find that supernova feedback plays the most important role in reionization, with H II regions up to ≈20 per cent smaller and a fractional difference in the amplitude of power spectra of up to ≈17 per cent at fixed ionized fraction in the absence of this feedback. We compare our galaxy formation-based reionization models with past calculations that assume constant stellar-to-halo mass ratios and find that, with the correct choice of minimum halo mass, such models can mimic the predicted reionization morphology. Reionization morphology at fixed neutral fraction is therefore not uniquely determined by the details of galaxy formation, but is sensitive to the mass of the haloes hosting the bulk of the ionizing sources. Simple EoR parametrizations are therefore accurate predictors of reionization statistics. However, a complete understanding of reionization using future 21-cm observations will require interpretation with realistic galaxy formation models, in combination with other observations.

  13. Baseline models of trace elements in major aquifers of the United States

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace-element concentrations in baseline samples from a survey of aquifers used as potable-water supplies in the United States are summarized using methods appropriate for data with multiple detection limits. The resulting statistical distribution models are used to develop summary statistics and estimate probabilities of exceeding water-quality standards. The models are based on data from the major aquifer studies of the USGS National Water Quality Assessment (NAWQA) Program. These data were produced with a nationally-consistent sampling and analytical framework specifically designed to determine the quality of the most important potable groundwater resources during the years 1991-2001. The analytical data for all elements surveyed contain values that were below several detection limits. Such datasets are referred to as multiply-censored data. To address this issue, a robust semi-parametric statistical method called regression on order statistics (ROS) is employed. Utilizing the 90th-95th percentile as an arbitrary range for the upper limits of expected baseline concentrations, the models show that baseline concentrations of dissolved Ba and Zn are below 500 µg/L. For the same percentile range, dissolved As, Cu and Mo concentrations are below 10 µg/L, and dissolved Ag, Be, Cd, Co, Cr, Ni, Pb, Sb and Se are below 1-5 µg/L. These models are also used to determine the probabilities that potable ground waters exceed drinking water standards. For dissolved Ba, Cr, Cu, Pb, Ni, Mo and Se, the likelihood of exceeding the US Environmental Protection Agency standards at the well-head is less than 1-1.5%. A notable exception is As, which has approximately a 7% chance of exceeding the maximum contaminant level (10 µg/L) at the well head.

  14. Explaining nitrate pollution pressure on the groundwater resource in Kinshasa using a multivariate statistical modelling approach

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Vanclooster, Marnik

    2013-04-01

    Drinking water in Kinshasa, the capital of the Democratic Republic of Congo, is provided by extracting groundwater from the local aquifer, particularly in peripheral areas. The exploited groundwater body is mainly unconfined and located within a continuous detrital aquifer, primarily composed of sedimentary formations. However, the aquifer is subjected to an increasing threat of anthropogenic pollution pressure. Understanding the detailed origin of this pollution pressure is important for sustainable drinking water management in Kinshasa. The present study aims to explain the observed nitrate pollution problem, nitrate being considered a good tracer for other pollution threats. The analysis is made in terms of physical attributes that are readily available, using a statistical modelling approach. For the nitrate data, use was made of a historical groundwater quality assessment study, for which the data were re-analysed. The physical attributes are related to the topography, land use, geology and hydrogeology of the region. Prior to the statistical modelling, intrinsic and specific vulnerability for nitrate pollution was assessed. This vulnerability assessment showed that the alluvium area in the northern part of the region is the most vulnerable area. This area consists of urban land use with poor sanitation. Re-analysis of the nitrate pollution data demonstrated that the spatial variability of nitrate concentrations in the groundwater body is high and coherent with the fragmented land use of the region and with the intrinsic and specific vulnerability maps. For the statistical modelling, use was made of multiple regression and regression tree analysis. The results demonstrated the significant impact of land use variables on the Kinshasa groundwater nitrate pollution and the need for a detailed delineation of groundwater capture zones around the monitoring stations. Key words: Groundwater, Isotopic, Kinshasa, Modelling, Pollution, Physico-chemical.

  15. Statistical Analysis of the Impacts of Regional Transportation on the Air Quality in Beijing

    NASA Astrophysics Data System (ADS)

    Huang, Zhongwen; Zhang, Huiling; Tong, Lei; Xiao, Hang

    2016-04-01

    From October to December 2015, the Beijing-Tianjin-Hebei (BTH) region experienced several severe haze events. In order to assess the effects of regional transportation on the air quality in Beijing, the air monitoring data (PM2.5, SO2, NO2 and CO) from that period published by the Chinese National Environmental Monitoring Center (CNEMC) were collected and analyzed with various statistical models. The cities within the BTH area were clustered into three groups according to their geographical conditions, with the air pollutant concentrations of cities within a group sharing similar variation trends. The Granger causality test results indicate that significant causal relationships exist between the air pollutant data of Beijing and its surrounding cities (Baoding, Chengde, Tianjin and Zhangjiakou) for the reference period. Linear regression models were then constructed to capture the interdependency among the multiple time series; the observed air pollutant concentrations in Beijing were highly consistent with the model-fitted results. More importantly, further analysis suggests that the air pollutants in Beijing were strongly affected by regional transportation, as local sources contributed only 17.88%, 27.12%, 14.63% and 31.36% of the PM2.5, SO2, NO2 and CO concentrations, respectively. The major external source for Beijing was from the southwest (Baoding) direction, accounting for more than 42% of all these air pollutants. Thus, by combining various statistical models, it may be possible not only to quickly predict the air quality of any city on a regional scale, but also to evaluate the local and regional source contributions for a particular city. Key words: regional transportation, air pollution, Granger causality test, statistical models
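
    A minimal sketch of the Granger causality step, assuming statsmodels is available: a synthetic upwind series ("baoding") leads a downwind series ("beijing") by one day, and grangercausalitytests checks whether the second column helps predict the first. The series and coefficients are fabricated for illustration, not taken from the CNEMC data.

        import numpy as np
        from statsmodels.tsa.stattools import grangercausalitytests

        # hypothetical daily PM2.5 series: "baoding" leads "beijing" by one day plus noise
        rng = np.random.default_rng(3)
        baoding = 80.0 + 30.0 * rng.standard_normal(201)
        beijing = 20.0 + 0.6 * baoding[:-1] + 10.0 * rng.standard_normal(200)  # depends on yesterday's upwind value
        data = np.column_stack([beijing, baoding[1:]])   # column 2 is tested as a Granger cause of column 1

        # F-tests for lags 1..3; small p-values indicate baoding helps predict beijing
        results = grangercausalitytests(data, maxlag=3)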

  16. Signatures of criticality arise from random subsampling in simple population models.

    PubMed

    Nonnenmacher, Marcel; Behrens, Christian; Berens, Philipp; Bethge, Matthias; Macke, Jakob H

    2017-10-01

    The rise of large-scale recordings of neuronal activity has fueled the hope to gain new insights into the collective activity of neural ensembles. How can one link the statistics of neural population activity to underlying principles and theories? One attempt to interpret such data builds upon analogies to the behaviour of collective systems in statistical physics. Divergence of the specific heat, a measure of population statistics derived from thermodynamics, has been used to suggest that neural populations are optimized to operate at a "critical point". However, these findings have been challenged by theoretical studies which have shown that common inputs can lead to diverging specific heat. Here, we connect "signatures of criticality", and in particular the divergence of specific heat, back to statistics of neural population activity commonly studied in neural coding: firing rates and pairwise correlations. We show that the specific heat diverges whenever the average correlation strength does not depend on population size. This is necessarily true when data with correlations is randomly subsampled during the analysis process, irrespective of the detailed structure or origin of correlations. We also show how the characteristic shape of specific heat capacity curves depends on firing rates and correlations, using both analytically tractable models and numerical simulations of a canonical feed-forward population model. To analyze these simulations, we develop efficient methods for characterizing large-scale neural population activity with maximum entropy models. We find that, consistent with experimental findings, increases in firing rates and correlation directly lead to more pronounced signatures. Thus, previous reports of thermodynamical criticality in neural populations based on the analysis of specific heat can be explained by average firing rates and correlations, and are not indicative of an optimized coding strategy. We conclude that a reliable interpretation of statistical tests for theories of neural coding is possible only in reference to relevant ground-truth models.

  17. Opinion Formation Models on a Gradient

    PubMed Central

    Gastner, Michael T.; Markou, Nikolitsa; Pruessner, Gunnar; Draief, Moez

    2014-01-01

    Statistical physicists have become interested in models of collective social behavior such as opinion formation, where individuals change their inherently preferred opinion if their friends disagree. Real preferences often depend on regional cultural differences, which we model here as a spatial gradient g in the initial opinion. The gradient does not only add reality to the model. It can also reveal that opinion clusters in two dimensions are typically in the standard (i.e., independent) percolation universality class, thus settling a recent controversy about a non-consensus model. However, using analytical and numerical tools, we also present a model where the width of the transition between opinions follows a different scaling from that of independent percolation, and the cluster size distribution is consistent with first-order percolation. PMID:25474528

  18. The effect of project-based learning on students' statistical literacy levels for data representation

    NASA Astrophysics Data System (ADS)

    Koparan, Timur; Güven, Bülent

    2015-07-01

    The aim of this study is to determine the effect of a project-based learning approach on 8th-grade secondary-school students' statistical literacy levels for data representation. To achieve this goal, a test consisting of 12 open-ended questions was developed in accordance with the views of experts. Seventy 8th-grade secondary-school students, 35 in the experimental group and 35 in the control group, took this test twice, once before and once after the intervention. All the raw scores were converted into linear points using the Winsteps 3.72 modelling program, which performs the Rasch analysis, and t-tests and an ANCOVA analysis were carried out with the linear points. Based on the findings, it was concluded that the project-based learning approach increases students' level of statistical literacy for data representation. Students' levels of statistical literacy before and after the intervention were shown through the obtained person-item maps.

  19. Online Denoising Based on the Second-Order Adaptive Statistics Model.

    PubMed

    Yi, Sheng-Lun; Jin, Xue-Bo; Su, Ting-Li; Tang, Zhen-Yun; Wang, Fa-Fa; Xiang, Na; Kong, Jian-Lei

    2017-07-20

    Online denoising is motivated by real-time applications in industrial processes, where the data must be usable soon after they are collected. Since the noise in practical processes is usually colored, it poses quite a challenge for denoising techniques. In this paper, a novel online denoising method is proposed to process practical measurement data with colored noise; the characteristics of the colored noise are captured in the dynamic model via an adaptive parameter. The proposed method consists of two parts within a closed loop: the first estimates the system state based on the second-order adaptive statistics model, and the other updates the adaptive parameter in the model using the Yule-Walker algorithm. Specifically, the state estimation is implemented via the Kalman filter in a recursive way, so the method operates online. Experimental data from a reinforced concrete structure test were used to verify the effectiveness of the proposed method. The results show that the proposed method not only handles signals with colored noise, but also achieves a tradeoff between efficiency and accuracy.
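
    A small sketch of the Yule-Walker step described above, assuming statsmodels: an AR(2) colored-noise series is generated and its autoregressive coefficients are recovered from the Yule-Walker equations, the kind of estimate that could drive the adaptive parameter update. The AR coefficients are hypothetical, and the Kalman filtering part of the closed loop is not shown.

        import numpy as np
        from statsmodels.regression.linear_model import yule_walker

        # hypothetical colored (AR(2)) measurement noise, as might contaminate online sensor data
        rng = np.random.default_rng(4)
        e = rng.standard_normal(5000)
        x = np.zeros(5000)
        for t in range(2, 5000):
            x[t] = 0.7 * x[t - 1] - 0.2 * x[t - 2] + e[t]

        # Yule-Walker equations give the AR coefficients from sample autocorrelations
        rho, sigma = yule_walker(x, order=2)
        print(rho, sigma)      # expected near [0.7, -0.2] and 1.0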

  20. Preliminary constraints on variable w dark energy cosmologies from the SNLS

    NASA Astrophysics Data System (ADS)

    Carlberg, R. G.; Conley, A.; Howell, D. A.; Neill, J. D.; Perrett, K.; Pritchet, C. J.; Sullivan, M.

    2005-12-01

    The first 71 confirmed Type Ia supernovae from the Supernova Legacy Survey, being conducted with CFHT imaging and Gemini, VLT and Keck spectroscopy, set limits on variable dark energy cosmological models. For a generalized Chaplygin gas, in which the dark energy content is (1-Ω_M)/ρ^a, we find that a is statistically consistent with zero, with a best fit a = -0.2 ± 0.3 (68% confidence); reduction of the systematic errors requires a further refinement of the photometric calibration and of the potential model biases. A variable dark energy equation of state with w = w0 + w1 z shows the expected degeneracy between increasingly positive w0 and negative w1. The existing data rule out the parameters of the Weller & Linder (2002) supergravity-inspired model cosmology, (w0, w1) = (-0.81, 0.31). The full 700 SNe Ia of the completed survey will provide a statistical error limit on w1 of about 0.2 and significant constraints on variable w models. The Canadian NSERC provided funding for the scientific analysis. These results are based on observations obtained at the CFHT, Gemini, VLT and Keck observatories.

  1. Rates of profit as correlated sums of random variables

    NASA Astrophysics Data System (ADS)

    Greenblatt, R. E.

    2013-10-01

    Profit realization is the dominant feature of market-based economic systems, determining their dynamics to a large extent. Rather than attaining an equilibrium, profit rates vary widely across firms, and the variation persists over time. Differing definitions of profit result in differing empirical distributions. To study the statistical properties of profit rates, I used data from a publicly available database for the US Economy for 2009-2010 (Risk Management Association). For each of three profit rate measures, the sample space consists of 771 points. Each point represents aggregate data from a small number of US manufacturing firms of similar size and type (NAICS code of principal product). When comparing the empirical distributions of profit rates, significant ‘heavy tails’ were observed, corresponding principally to a number of firms with larger profit rates than would be expected from simple models. An apparently novel correlated sum of random variables statistical model was used to model the data. In the case of operating and net profit rates, a number of firms show negative profits (losses), ruling out simple gamma or lognormal distributions as complete models for these data.

  2. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment.

    PubMed

    Gierliński, Marek; Cole, Christian; Schofield, Pietà; Schurch, Nicholas J; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J

    2015-11-15

    High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of 'bad' replicates, which can drastically affect the gene read-count distribution. RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. g.j.barton@dundee.ac.uk. © The Author 2015. Published by Oxford University Press.
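
    A quick numerical check of the reported mean-variance relation, under the usual negative binomial parameterization with var = mu + phi*mu^2 and dispersion phi of about 0.01; the means and the 48 simulated "replicates" are illustrative, not the yeast counts themselves.

        import numpy as np

        def nb_sample(mean, dispersion, size, rng):
            """Negative binomial read counts with var = mu + dispersion * mu^2
            (numpy parameterization: n = 1/dispersion, p = n / (n + mu))."""
            n = 1.0 / dispersion
            p = n / (n + mean)
            return rng.negative_binomial(n, p, size)

        rng = np.random.default_rng(5)
        for mu in (10.0, 100.0, 1000.0):
            counts = nb_sample(mu, 0.01, 48, rng)    # 48 "replicates", dispersion ~0.01 as reported
            print(mu, counts.mean(), counts.var(ddof=1), mu + 0.01 * mu ** 2)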

  3. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment

    PubMed Central

    Cole, Christian; Schofield, Pietà; Schurch, Nicholas J.; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J.

    2015-01-01

    Motivation: High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. Results: A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of ‘bad’ replicates, which can drastically affect the gene read-count distribution. Availability and implementation: RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. Contact: g.j.barton@dundee.ac.uk PMID:26206307

  4. Crossing statistic: reconstructing the expansion history of the universe

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shafieloo, Arman, E-mail: arman@ewha.ac.kr

    2012-08-01

    We show that by combining the Crossing Statistic [1,2] and the Smoothing method [3-5] one can reconstruct the expansion history of the universe with very high precision without assuming any prior on cosmological quantities such as the equation of state of dark energy. The method performs very well in reconstructing the expansion history independent of the underlying model, and it works even for non-trivial dark energy models with fast or slow changes in the equation of state. The accuracy of the reconstructed quantities, along with the method's independence from any prior or assumption, gives it advantages over the other non-parametric methods previously proposed in the literature. Applying the method to the Union 2.1 supernovae combined with WiggleZ BAO data, we present the reconstructed results and test the consistency of the two data sets in a model-independent manner. The results show that the latest available supernovae and BAO data are in good agreement with each other and that a spatially flat ΛCDM model is in concordance with the current data.

  5. Numerical solutions of the semiclassical Boltzmann ellipsoidal-statistical kinetic model equation

    PubMed Central

    Yang, Jaw-Yen; Yan, Chin-Yuan; Huang, Juan-Chen; Li, Zhihui

    2014-01-01

    Computations of rarefied gas dynamical flows governed by the semiclassical Boltzmann ellipsoidal-statistical (ES) kinetic model equation using an accurate numerical method are presented. The semiclassical ES model was derived through the maximum entropy principle and conserves not only the mass, momentum and energy, but also contains additional higher-order moments that differ from the standard quantum distributions. A different decoding procedure to obtain the necessary parameters for determining the ES distribution is also devised. The numerical method in phase space combines the discrete-ordinate method in momentum space and the high-resolution shock-capturing method in physical space. Numerical solutions of two-dimensional Riemann problems for two configurations covering various degrees of rarefaction are presented, and various contours of the quantities unique to this new model are illustrated. When the relaxation time becomes very small, the main flow features display behavior similar to that of ideal quantum gas dynamics, and the present solutions are found to be consistent with existing calculations for a classical gas. The effect of a parameter that permits an adjustable Prandtl number in the flow is also studied. PMID:25104904

  6. Discrete Time Rescaling Theorem: Determining Goodness of Fit for Discrete Time Statistical Models of Neural Spiking

    PubMed Central

    Haslinger, Robert; Pipa, Gordon; Brown, Emery

    2010-01-01

    One approach for understanding the encoding of information by spike trains is to fit statistical models and then test their goodness of fit. The time rescaling theorem provides a goodness of fit test consistent with the point process nature of spike trains. The interspike intervals (ISIs) are rescaled (as a function of the model’s spike probability) to be independent and exponentially distributed if the model is accurate. A Kolmogorov Smirnov (KS) test between the rescaled ISIs and the exponential distribution is then used to check goodness of fit. This rescaling relies upon assumptions of continuously defined time and instantaneous events. However spikes have finite width and statistical models of spike trains almost always discretize time into bins. Here we demonstrate that finite temporal resolution of discrete time models prevents their rescaled ISIs from being exponentially distributed. Poor goodness of fit may be erroneously indicated even if the model is exactly correct. We present two adaptations of the time rescaling theorem to discrete time models. In the first we propose that instead of assuming the rescaled times to be exponential, the reference distribution be estimated through direct simulation by the fitted model. In the second, we prove a discrete time version of the time rescaling theorem which analytically corrects for the effects of finite resolution. This allows us to define a rescaled time which is exponentially distributed, even at arbitrary temporal discretizations. We demonstrate the efficacy of both techniques by fitting Generalized Linear Models (GLMs) to both simulated spike trains and spike trains recorded experimentally in monkey V1 cortex. Both techniques give nearly identical results, reducing the false positive rate of the KS test and greatly increasing the reliability of model evaluation based upon the time rescaling theorem. PMID:20608868

  7. A statistical model describing combined irreversible electroporation and electroporation-induced blood-brain barrier disruption

    PubMed Central

    Sharabi, Shirley; Kos, Bor; Last, David; Guez, David; Daniels, Dianne; Harnof, Sagi; Miklavcic, Damijan

    2016-01-01

    Background: Electroporation-based therapies such as electrochemotherapy (ECT) and irreversible electroporation (IRE) are emerging as promising tools for the treatment of tumors. When applied to the brain, electroporation can also induce transient blood-brain barrier (BBB) disruption in volumes extending beyond IRE, thus enabling efficient drug penetration. The main objective of this study was to develop a statistical model predicting cell death and BBB disruption induced by electroporation; this model can be used for individual treatment planning. Material and methods: Cell death and BBB disruption models were developed based on the Peleg-Fermi model in combination with numerical models of the electric field. The model calculates the electric field thresholds for cell kill and BBB disruption and describes their dependence on the number of treatment pulses. The model was validated using in vivo experimental data consisting of MRIs of rat brains following electroporation treatments. Results: Linear regression analysis confirmed that the model described the IRE and BBB disruption volumes as a function of the number of treatment pulses (r2 = 0.79, p < 0.008; r2 = 0.91, p < 0.001). The results showed a strong plateau effect as the pulse number increased. The ratio between the complete cell death and no cell death thresholds was relatively narrow (between 0.88 and 0.91) even for small numbers of pulses and depended weakly on the number of pulses. For BBB disruption, the ratio increased with the number of pulses. BBB disruption radii were on average 67% ± 11% larger than IRE volumes. Conclusions: The statistical model can be used to describe the dependence of treatment effects on the number of pulses independent of the experimental setup. PMID:27069447
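
    A hedged sketch of the Peleg-Fermi form underlying the model above: the survival probability is a Fermi (logistic) function of the local field magnitude, with a critical field and spread that decay with the number of pulses. The decay rates and field values below are assumed placeholders, not the parameters fitted to the rat-brain MRI data.

        import numpy as np

        def peleg_fermi_survival(E, n_pulses, Ec0=1500.0, k1=0.03, A0=300.0, k2=0.03):
            """Peleg-Fermi form: S(E, n) = 1 / (1 + exp((E - Ec(n)) / A(n))), with the critical
            field Ec and spread A decaying with pulse number n (decay rates assumed here)."""
            Ec = Ec0 * np.exp(-k1 * n_pulses)
            A = A0 * np.exp(-k2 * n_pulses)
            return 1.0 / (1.0 + np.exp((E - Ec) / A))

        E = np.array([500.0, 1000.0, 1500.0, 2000.0])    # V/cm, hypothetical field magnitudes
        for n in (10, 50, 90):
            print(n, 1.0 - peleg_fermi_survival(E, n))   # cell-kill probability rises, then plateaus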

  8. Generalized t-statistic for two-group classification.

    PubMed

    Komori, Osamu; Eguchi, Shinto; Copas, John B

    2015-06-01

    In the classic discriminant model of two multivariate normal distributions with equal variance matrices, the linear discriminant function is optimal both in terms of the log likelihood ratio and in terms of maximizing the standardized difference (the t-statistic) between the means of the two distributions. In a typical case-control study, normality may be sensible for the control sample but heterogeneity and uncertainty in diagnosis may suggest that a more flexible model is needed for the cases. We generalize the t-statistic approach by finding the linear function which maximizes a standardized difference but with data from one of the groups (the cases) filtered by a possibly nonlinear function U. We study conditions for consistency of the method and find the function U which is optimal in the sense of asymptotic efficiency. Optimality may also extend to other measures of discriminatory efficiency such as the area under the receiver operating characteristic curve. The optimal function U depends on a scalar probability density function which can be estimated non-parametrically using a standard numerical algorithm. A lasso-like version for variable selection is implemented by adding L1-regularization to the generalized t-statistic. Two microarray data sets in the study of asthma and various cancers are used as motivating examples. © 2014, The International Biometric Society.
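
    For orientation, the sketch below implements only the classic, unfiltered case mentioned in the first sentence: the linear discriminant direction beta proportional to S^{-1}(mu1 - mu0), which also maximizes the two-group t-statistic of the projected data. The generalized, U-filtered statistic and its L1-regularized version are not implemented, and the case-control data are simulated.

        import numpy as np

        def lda_direction_and_t(x_controls, x_cases):
            """Classic two-group linear discriminant: beta = S^{-1} (mean1 - mean0), which also
            maximizes the standardized mean difference (t-statistic) of the projected data."""
            mu0, mu1 = x_controls.mean(axis=0), x_cases.mean(axis=0)
            n0, n1 = len(x_controls), len(x_cases)
            s_pooled = ((n0 - 1) * np.cov(x_controls, rowvar=False)
                        + (n1 - 1) * np.cov(x_cases, rowvar=False)) / (n0 + n1 - 2)
            beta = np.linalg.solve(s_pooled, mu1 - mu0)
            z0, z1 = x_controls @ beta, x_cases @ beta
            t = (z1.mean() - z0.mean()) / np.sqrt(z1.var(ddof=1) / n1 + z0.var(ddof=1) / n0)
            return beta, t

        # hypothetical 5-marker case-control data
        rng = np.random.default_rng(2)
        controls = rng.normal(0.0, 1.0, (60, 5))
        cases = rng.normal(0.3, 1.0, (80, 5))
        print(lda_direction_and_t(controls, cases))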

  9. Testing for nonlinearity in time series: The method of surrogate data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Theiler, J.; Galdrikian, B.; Longtin, A.

    1991-01-01

    We describe a statistical approach for identifying nonlinearity in time series; in particular, we want to avoid claims of chaos when simpler models (such as linearly correlated noise) can explain the data. The method requires a careful statement of the null hypothesis which characterizes a candidate linear process, the generation of an ensemble of "surrogate" data sets which are similar to the original time series but consistent with the null hypothesis, and the computation of a discriminating statistic for the original and for each of the surrogate data sets. The idea is to test the original time series against the null hypothesis by checking whether the discriminating statistic computed for the original time series differs significantly from the statistics computed for each of the surrogate sets. We present algorithms for generating surrogate data under various null hypotheses, and we show the results of numerical experiments on artificial data using correlation dimension, Lyapunov exponent, and forecasting error as discriminating statistics. Finally, we consider a number of experimental time series, including sunspots, electroencephalogram (EEG) signals, and fluid convection, and evaluate the statistical significance of the evidence for nonlinear structure in each case.
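
    A minimal sketch of one standard surrogate-generation scheme consistent with the linearly correlated noise null hypothesis: the Fourier amplitudes of the series are kept while the phases are randomized, and a discriminating statistic is compared between the data and the surrogate ensemble. The test series and the crude statistic used here are placeholders, not the correlation dimension or Lyapunov exponent used in the paper.

        import numpy as np

        def phase_randomized_surrogate(x, rng):
            """Keep the Fourier amplitudes (hence the power spectrum and autocorrelation) of x,
            but randomize the phases, consistent with a linear Gaussian null hypothesis."""
            n = len(x)
            spectrum = np.fft.rfft(x)
            phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
            new_spec = np.abs(spectrum) * np.exp(1j * phases)
            new_spec[0] = spectrum[0]                     # preserve the mean
            if n % 2 == 0:
                new_spec[-1] = np.abs(spectrum[-1])       # Nyquist bin must stay real
            return np.fft.irfft(new_spec, n=n)

        rng = np.random.default_rng(6)
        x = np.sin(np.linspace(0.0, 40.0, 1024)) + 0.5 * rng.standard_normal(1024)
        surrogates = [phase_randomized_surrogate(x, rng) for _ in range(19)]

        # compare a discriminating statistic on the data against the surrogate ensemble
        stat = lambda s: np.mean((s[1:] - s[:-1]) ** 3)   # crude asymmetry-of-increments measure
        print(stat(x), sorted(stat(s) for s in surrogates))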

  10. Asymmetry of projected increases in extreme temperature distributions

    PubMed Central

    Kodra, Evan; Ganguly, Auroop R.

    2014-01-01

    A statistical analysis reveals projections of consistently larger increases in the highest percentiles of summer and winter temperature maxima and minima versus the respective lowest percentiles, resulting in a wider range of temperature extremes in the future. These asymmetric changes in tail distributions of temperature appear robust when explored through 14 CMIP5 climate models and three reanalysis datasets. Asymmetry of projected increases in temperature extremes generalizes widely. Magnitude of the projected asymmetry depends significantly on region, season, land-ocean contrast, and climate model variability as well as whether the extremes of consideration are seasonal minima or maxima events. An assessment of potential physical mechanisms provides support for asymmetric tail increases and hence wider temperature extremes ranges, especially for northern winter extremes. These results offer statistically grounded perspectives on projected changes in the IPCC-recommended extremes indices relevant for impacts and adaptation studies. PMID:25073751

  11. Dynamical evolution of topology of large-scale structure. [in distribution of galaxies

    NASA Technical Reports Server (NTRS)

    Park, Changbom; Gott, J. R., III

    1991-01-01

    The nonlinear effects of statistical biasing and gravitational evolution on the genus are studied. The biased galaxy subset is picked for the first time by actually identifying galaxy-sized peaks above a fixed threshold in the initial conditions, and their subsequent evolution is followed. It is found that in the standard cold dark matter (CDM) model the statistical biasing in the locations of galaxies produces asymmetry in the genus curve and coupling with gravitational evolution gives rise to a shift in the genus curve to the left in moderately nonlinear regimes. Gravitational evolution alone reduces the amplitude of the genus curve due to strong phase correlations in the density field and also produces asymmetry in the curve. Results on the genus of the mass density field for both CDM and hot dark matter models are consistent with previous work by Melott, Weinberg, and Gott (1987).

  12. Evaluation of the 29-km Eta Model. Part 1: Objective Verification at Three Selected Stations

    NASA Technical Reports Server (NTRS)

    Nutter, Paul A.; Manobianco, John; Merceret, Francis J. (Technical Monitor)

    1998-01-01

    This paper describes an objective verification of the National Centers for Environmental Prediction (NCEP) 29-km eta model from May 1996 through January 1998. The evaluation was designed to assess the model's surface and upper-air point forecast accuracy at three selected locations during separate warm (May - August) and cool (October - January) season periods. In order to enhance sample sizes available for statistical calculations, the objective verification includes two consecutive warm and cool season periods. Systematic model deficiencies comprise the larger portion of the total error in most of the surface forecast variables that were evaluated. The error characteristics for both surface and upper-air forecasts vary widely by parameter, season, and station location. At upper levels, a few characteristic biases are identified. Overall however, the upper-level errors are more nonsystematic in nature and could be explained partly by observational measurement uncertainty. With a few exceptions, the upper-air results also indicate that 24-h model error growth is not statistically significant. In February and August 1997, NCEP implemented upgrades to the eta model's physical parameterizations that were designed to change some of the model's error characteristics near the surface. The results shown in this paper indicate that these upgrades led to identifiable and statistically significant changes in forecast accuracy for selected surface parameters. While some of the changes were expected, others were not consistent with the intent of the model updates and further emphasize the need for ongoing sensitivity studies and localized statistical verification efforts. Objective verification of point forecasts is a stringent measure of model performance, but when used alone, is not enough to quantify the overall value that model guidance may add to the forecast process. Therefore, results from a subjective verification of the meso-eta model over the Florida peninsula are discussed in the companion paper by Manobianco and Nutter. Overall verification results presented here and in part two should establish a reasonable benchmark from which model users and developers may pursue the ongoing eta model verification strategies in the future.

  13. A thermomechanical constitutive model for cemented granular materials with quantifiable internal variables. Part I-Theory

    NASA Astrophysics Data System (ADS)

    Tengattini, Alessandro; Das, Arghya; Nguyen, Giang D.; Viggiani, Gioacchino; Hall, Stephen A.; Einav, Itai

    2014-10-01

    This is the first of two papers introducing a novel thermomechanical continuum constitutive model for cemented granular materials. Here, we establish the theoretical foundations of the model, and highlight its novelties. At the limit of no cement, the model is fully consistent with the original Breakage Mechanics model. An essential ingredient of the model is the use of measurable and micro-mechanics based internal variables, describing the evolution of the dominant inelastic processes. This imposes a link between the macroscopic mechanical behavior and the statistically averaged evolution of the microstructure. As a consequence this model requires only a few physically identifiable parameters, including those of the original breakage model and new ones describing the cement: its volume fraction, its critical damage energy and bulk stiffness, and the cohesion.

  14. Weak Lensing Peaks in Simulated Light-Cones: Investigating the Coupling between Dark Matter and Dark Energy

    NASA Astrophysics Data System (ADS)

    Giocoli, Carlo; Moscardini, Lauro; Baldi, Marco; Meneghetti, Massimo; Metcalf, Robert B.

    2018-05-01

    In this paper, we study the statistical properties of weak lensing peaks in light-cones generated from cosmological simulations. In order to assess the prospects of this observable as a cosmological probe, we consider simulations that include interacting Dark Energy (hereafter DE) models with a coupling term between DE and Dark Matter. Cosmological models that produce a larger population of massive clusters have more numerous high signal-to-noise peaks; among models with comparable numbers of clusters, those with more concentrated haloes produce more peaks. The most extreme model under investigation shows a difference in peak counts of about 20% with respect to the reference ΛCDM model. We find that peak statistics can be used to distinguish a coupled DE model from a reference one with the same power spectrum normalisation. The differences in the expansion history and the growth rate of structure formation are reflected in their halo counts, non-linear scale features and, through them, in the properties of the lensing peaks. For a source redshift distribution consistent with the expectations of future space-based wide field surveys, we find that typically seventy percent of the cluster population contributes to weak-lensing peaks with signal-to-noise ratios larger than two, and that the fraction of clusters in peaks approaches one-hundred percent for haloes with redshift z ≤ 0.5. Our analysis demonstrates that peak statistics are an important tool for disentangling DE models by accurately tracing the structure formation processes as a function of cosmic time.
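
    Operationally, peak statistics of the kind used above amount to locating local maxima in a signal-to-noise map and counting those above chosen thresholds. The sketch below illustrates only that counting step on a synthetic noise map; the map, neighbourhood size, and thresholds are illustrative assumptions, not the simulated light-cones of the study.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def peak_counts(snr_map, thresholds, size=3):
    """Count local maxima of an S/N map that exceed each threshold."""
    is_peak = snr_map == maximum_filter(snr_map, size=size)
    peak_vals = snr_map[is_peak]
    return {t: int(np.sum(peak_vals > t)) for t in thresholds}

rng = np.random.default_rng(1)
snr = rng.standard_normal((512, 512))   # stand-in for a smoothed convergence/noise map
print(peak_counts(snr, thresholds=[2.0, 3.0, 5.0]))
```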

  15. A Geostatistical Scaling Approach for the Generation of Non Gaussian Random Variables and Increments

    NASA Astrophysics Data System (ADS)

    Guadagnini, Alberto; Neuman, Shlomo P.; Riva, Monica; Panzeri, Marco

    2016-04-01

    We address manifestations of non-Gaussian statistical scaling displayed by many variables, Y, and their (spatial or temporal) increments. Evidence of such behavior includes symmetry of increment distributions at all separation distances (or lags) with sharp peaks and heavy tails which tend to decay asymptotically as lag increases. Variables reported to exhibit such distributions include quantities of direct relevance to hydrogeological sciences, e.g. porosity, log permeability, electrical resistivity, soil and sediment texture, sediment transport rate, rainfall, measured and simulated turbulent fluid velocity, and others. No model known to us captures all of the documented statistical scaling behaviors in a unique and consistent manner. We recently proposed a generalized sub-Gaussian model (GSG) which reconciles within a unique theoretical framework the probability distributions of a target variable and its increments. We presented an algorithm to generate unconditional random realizations of statistically isotropic or anisotropic GSG functions and illustrated it in two dimensions. In this context, we demonstrated the feasibility of estimating all key parameters of a GSG model underlying a single realization of Y by analyzing jointly spatial moments of Y data and corresponding increments. Here, we extend our GSG model to account for noisy measurements of Y at a discrete set of points in space (or time), present an algorithm to generate conditional realizations of the corresponding isotropic or anisotropic random fields, and explore them on one- and two-dimensional synthetic test cases.

  16. Detailed Spectral Analysis of the 260 ks XMM-Newton Data of 1E 1207.4-5209 and Significance of a 2.1 keV Absorption Feature

    NASA Astrophysics Data System (ADS)

    Mori, Kaya; Chonko, James C.; Hailey, Charles J.

    2005-10-01

    We have reanalyzed the 260 ks XMM-Newton observation of 1E 1207.4-5209. There are several significant improvements over previous work. First, a much broader range of physically plausible spectral models was used. Second, we have used a more rigorous statistical analysis. The standard F-distribution was not employed; rather, the exact finite-statistics F-distribution was determined by Monte Carlo simulations. This approach was motivated by the recent work of Protassov and coworkers and Freeman and coworkers. They demonstrated that the standard F-distribution is not even asymptotically correct when applied to assess the significance of additional absorption features in a spectrum. With our improved analysis we do not find a third and fourth spectral feature in 1E 1207.4-5209 but only the two broad absorption features previously reported. Two additional statistical tests, one line model dependent and the other line model independent, confirmed our modified F-test analysis. For all physically plausible continuum models in which the weak residuals are strong enough to fit, the residuals occur at the instrument Au M edge. As a sanity check we confirmed that the residuals are consistent in strength and position with the instrument Au M residuals observed in 3C 273.
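
    The methodological point borrowed from Protassov and coworkers is that the reference distribution of the F statistic for an added line component should be built by simulating and refitting data under the null (continuum-only) model rather than read from the analytic F-distribution. A schematic illustration on a toy spectrum follows; the constant continuum, Gaussian line shape, noise level, and line energy are all assumptions chosen for brevity, not the models fitted to 1E 1207.4-5209.

```python
import numpy as np

rng = np.random.default_rng(2)
energy = np.linspace(1.0, 3.0, 200)      # toy energy grid (keV)
sigma_noise = 0.05
line_shape = np.exp(-0.5 * ((energy - 2.1) / 0.05) ** 2)   # trial line at 2.1 keV

def fit_chi2(data, with_line):
    """Least-squares fit of a constant continuum, optionally plus a
    fixed-shape line with free amplitude; returns chi^2 of the fit."""
    cols = [np.ones_like(energy)] + ([line_shape] if with_line else [])
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, data, rcond=None)
    resid = data - A @ coef
    return np.sum((resid / sigma_noise) ** 2)

def f_stat(data):
    c0, c1 = fit_chi2(data, False), fit_chi2(data, True)
    dof_alt = energy.size - 2
    return (c0 - c1) / (c1 / dof_alt)

# "Observed" toy spectrum: flat continuum with a weak absorption feature
obs = 1.0 - 0.03 * line_shape + sigma_noise * rng.standard_normal(energy.size)
f_obs = f_stat(obs)

# Null-calibrated reference distribution: simulate continuum-only spectra
f_null = np.array([f_stat(1.0 + sigma_noise * rng.standard_normal(energy.size))
                   for _ in range(2000)])
p_value = np.mean(f_null >= f_obs)   # significance from the simulated F-distribution
print(f_obs, p_value)
```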

  17. Study of angular momentum variation due to entrance channel effect in heavy ion fusion reactions

    NASA Astrophysics Data System (ADS)

    Kumar, Ajay

    2014-05-01

    The properties of hot nuclei may be investigated systematically by detecting the evaporated particles. These emissions reflect the behavior of the nucleus at various stages of the deexcitation cascade. When the nucleus is formed by the collision of a heavy nucleus with a light particle, the statistical model has done a good job of predicting the distribution of evaporated particles when reasonable choices were made for the level densities and yrast lines. Comparison to more specific measurements could, of course, provide a more severe test of the model and enable one to identify the deviations from the statistical model as the signature of other effects not included in the model. Some papers have claimed that experimental evaporation spectra from heavy-ion fusion reactions at higher excitation energies and angular momenta are no longer consistent with the predictions of the standard statistical model. In order to test this claim, we have employed two systems, a mass-symmetric (31P+45Sc) and a mass-asymmetric channel (12C+64Zn), leading to the same compound nucleus 76Kr* at the excitation energy of 75 MeV. Neutron energy spectra of the asymmetric system (12C+64Zn) at different angles are well described by the statistical model predictions using the normal value of the level density parameter a = A/8 MeV-1. However, in the case of the symmetric system (31P+45Sc), the statistical model interpretation of the data requires changing the level density parameter to a = A/10 MeV-1. The delayed evolution of the compound system in the case of the symmetric 31P+45Sc system may lead to the formation of a temperature-equilibrated dinuclear complex, which may be responsible for neutron emission at higher temperature, while protons and alpha particles are evaporated after neutron emission, once the system has sufficiently cooled down; for charged-particle emission, the higher g-values do not contribute to the formation of the compound nucleus in the symmetric entrance channel.

  18. [Establishment of the mathematic model of total quantum statistical moment standard similarity for application to medical theoretical research].

    PubMed

    He, Fu-yuan; Deng, Kai-wen; Huang, Sheng; Liu, Wen-long; Shi, Ji-lian

    2013-09-01

    The paper aims to elucidate and establish a new mathematical model, the total quantum statistical moment standard similarity (TQSMSS), built on the original total quantum statistical moment (TQSM) model, and to illustrate its application to medical theoretical research. The model was constructed by combining the statistical moment principle with the properties of the normal distribution probability density function. It was then validated and illustrated with the pharmacokinetics of three ingredients in Buyanghuanwu decoction, with three data-analytical methods applied to them, and with the analysis of chromatographic fingerprints of extracts obtained by dissolving the Buyanghuanwu-decoction extract in solvents of different solubility parameters. The established model consists of five main parameters: (1) the total quantum statistical moment similarity ST, the area overlapped by the two normal distribution probability density curves obtained by converting the two sets of TQSM parameters; (2) the total variability DT, a confidence limit of the standard normal cumulative probability equal to the absolute difference between the two normal cumulative probabilities integrated up to the intersection of the two curves; (3) the total variable probability 1-ST, the standard normal distribution probability within the interval DT; (4) the total variable probability (1-beta)alpha; and (5) the stable confidence probability beta(1-alpha), the probability of correctly drawing positive and negative conclusions under confidence coefficient alpha. With the model, the TQSMSS similarities of the pharmacokinetics of the three ingredients in Buyanghuanwu decoction, and of the three data-analytical methods applied to them, ranged from 0.3852 to 0.9875, revealing their different pharmacokinetic behaviors; the TQSMSS similarities (ST) of the chromatographic fingerprints of extracts obtained with solvents of different solubility parameters ranged from 0.6842 to 0.9992, showing that the various solvent extracts contain different constituents. The TQSMSS thus characterizes sample similarity and allows the probability of correct positive and negative conclusions to be quantified by a power test, whether or not the samples come from the same population under confidence coefficient alpha, enabling analysis at both macroscopic and microscopic levels as an important similarity-analysis method for medical theoretical research.
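
    The similarity parameter ST described above is the area shared by two normal probability density curves. Assuming each profile has been reduced to a mean and standard deviation derived from its total quantum statistical moments, that overlap can be evaluated numerically as sketched below; the parameter values are placeholders, not results from the Buyanghuanwu-decoction data.

```python
import numpy as np
from scipy.stats import norm

def overlap_similarity(mu1, sd1, mu2, sd2, n=20001):
    """Area under min(pdf1, pdf2): 1.0 for identical normal distributions,
    approaching 0 as the two distributions separate."""
    lo = min(mu1 - 6 * sd1, mu2 - 6 * sd2)
    hi = max(mu1 + 6 * sd1, mu2 + 6 * sd2)
    x = np.linspace(lo, hi, n)
    return np.trapz(np.minimum(norm.pdf(x, mu1, sd1), norm.pdf(x, mu2, sd2)), x)

# Hypothetical TQSM summaries (mean, sd) of two pharmacokinetic profiles
print(overlap_similarity(2.0, 0.8, 2.6, 1.0))
```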

  19. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes

    PubMed Central

    2017-01-01

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926–1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745–2750; Thiessen & Yee 2010 Child Development 81, 1287–1303; Saffran 2002 Journal of Memory and Language 47, 172–196; Misyak & Christiansen 2012 Language Learning 62, 302–331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246–263; Thiessen et al. 2013 Psychological Bulletin 139, 792–814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310–343). This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences'. PMID:27872374

  20. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

    PubMed

    Thiessen, Erik D

    2017-01-05

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310-343). This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  1. Dynamic balance in turbulent reconnection

    NASA Astrophysics Data System (ADS)

    Yokoi, N.; Higashimori, K.; Hoshino, M.

    2012-12-01

    Dynamic balance between the enhancement and suppression of transports due to turbulence in magnetic reconnection is discussed analytically and numerically by considering the interaction of the large-scale field structures with the small-scale turbulence in a consistent manner. Turbulence is expected to play an important role in bridging small and large scales related to magnetic reconnection. The configurations of the mean-field structure are determined by turbulence through the effective transport. At the same time, statistical properties of turbulence are determined by the mean-field structure through the production mechanisms of turbulence. This suggests that turbulence and mean fields should be considered simultaneously in a self-consistent manner. Following the theoretical prediction on the interaction between the mean-fields and turbulence in magnetic reconnection presented by Yokoi and Hoshino (2011), a self-consistent model for the turbulent reconnection is constructed. In the model, the mean-field equations for compressible magnetohydrodynamics are treated with the turbulence effects incorporated through turbulence correlations such as the Reynolds stress and turbulent electromotive force. Transport coefficients appearing in the expressions for these correlations are not adjustable parameters but are determined through the transport equations of the turbulent statistical quantities such as the turbulent MHD energy and the turbulent cross helicity. One of the prominent features of this reconnection model is that turbulence is not prescribed; rather, the generation and sustainment of turbulence by the mean-field inhomogeneities are treated explicitly. The theoretical predictions are confirmed by the numerical simulation of the model equations. These predictions include the quadrupole cross helicity distribution around the reconnection region, enhancement of reconnection rate due to turbulence, localization of the reconnection region through the cross-helicity effect, etc. Some implications for satellite observations of magnetic reconnection are also given. Reference: Yokoi, N. and Hoshino, M. (2011) Physics of Plasmas, 18, 111208.

  2. Gate line edge roughness amplitude and frequency variation effects on intra die MOS device characteristics

    NASA Astrophysics Data System (ADS)

    Hamadeh, Emad; Gunther, Norman G.; Niemann, Darrell; Rahman, Mahmud

    2006-06-01

    Random fluctuations in fabrication process outcomes such as gate line edge roughness (LER) give rise to corresponding fluctuations in scaled down MOS device characteristics. A thermodynamic-variational model is presented to study the effects of LER on threshold voltage and capacitance of sub-50 nm MOS devices. Conceptually, we treat the geometric definition of the MOS devices on a die as consisting of a collection of gates. In turn, each of these gates has an area, A, and a perimeter, P, defined by nominally straight lines subject to random process outcomes producing roughness. We treat roughness as being deviations from straightness consisting of both transverse amplitude and longitudinal wavelength, each having a lognormal distribution. We obtain closed-form expressions for variance of threshold voltage (Vth) and device capacitance (C) at Onset of Strong Inversion (OSI) for a small device. Using our variational model, we characterized the device electrical properties such as σVth and σC in terms of the statistical parameters of the roughness amplitude and spatial frequency, i.e., inverse roughness wavelength. We then verified our model with numerical analysis of Vth roll-off for small devices and of σVth due to dopant fluctuation. Our model was also benchmarked against TCAD simulations of σVth as a function of LER. We then extended our analysis to predict variations in σVth and σC versus average LER spatial frequency and amplitude, and oxide-thickness. Given the intuitive expectation that LER of very short wavelengths must also have small amplitude, we have investigated the case in which the amplitude mean is inversely related to the frequency mean. We compare with the situation in which amplitude and frequency mean are unrelated. Given also that the gate perimeter may consist of a different LER signature for each side, we have extended our analysis to the case when the LER statistical difference between gate sides is moderate, as well as when it is significantly large.

  3. Impact of covariate models on the assessment of the air pollution-mortality association in a single- and multipollutant context.

    PubMed

    Sacks, Jason D; Ito, Kazuhiko; Wilson, William E; Neas, Lucas M

    2012-10-01

    With the advent of multicity studies, uniform statistical approaches have been developed to examine air pollution-mortality associations across cities. To assess the sensitivity of the air pollution-mortality association to different model specifications in a single and multipollutant context, the authors applied various regression models developed in previous multicity time-series studies of air pollution and mortality to data from Philadelphia, Pennsylvania (May 1992-September 1995). Single-pollutant analyses used daily cardiovascular mortality, fine particulate matter (particles with an aerodynamic diameter ≤2.5 µm; PM(2.5)), speciated PM(2.5), and gaseous pollutant data, while multipollutant analyses used source factors identified through principal component analysis. In single-pollutant analyses, risk estimates were relatively consistent across models for most PM(2.5) components and gaseous pollutants. However, risk estimates were inconsistent for ozone in all-year and warm-season analyses. Principal component analysis yielded factors with species associated with traffic, crustal material, residual oil, and coal. Risk estimates for these factors exhibited less sensitivity to alternative regression models compared with single-pollutant models. Factors associated with traffic and crustal material showed consistently positive associations in the warm season, while the coal combustion factor showed consistently positive associations in the cold season. Overall, mortality risk estimates examined using a source-oriented approach yielded more stable and precise risk estimates, compared with single-pollutant analyses.

  4. Inferring Instantaneous, Multivariate and Nonlinear Sensitivities for the Analysis of Feedback Processes in a Dynamical System: Lorenz Model Case Study

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Hansen, James E. (Technical Monitor)

    2001-01-01

    A new approach is presented for the analysis of feedback processes in a nonlinear dynamical system by observing its variations. The new methodology consists of statistical estimates of the sensitivities between all pairs of variables in the system based on a neural network modeling of the dynamical system. The model can then be used to estimate the instantaneous, multivariate and nonlinear sensitivities, which are shown to be essential for the analysis of the feedback processes involved in the dynamical system. The method is described and tested on synthetic data from the low-order Lorenz circulation model where the correct sensitivities can be evaluated analytically.

  5. A social discounting model based on Tsallis’ statistics

    NASA Astrophysics Data System (ADS)

    Takahashi, Taiki

    2010-09-01

    Social decision making (e.g. social discounting and social preferences) has been attracting attention in economics, econophysics, social physics, behavioral psychology, and neuroeconomics. This paper proposes a novel social discounting model based on the deformed algebra developed in Tsallis’ non-extensive thermostatistics. Furthermore, it is suggested that this model can be utilized to quantify the degree of consistency in social discounting in humans and to analyze the relationships between behavioral tendencies in social discounting and other-regarding economic decision making under game-theoretic conditions. Future directions in the application of the model to studies in econophysics, neuroeconomics, and social physics, as well as real-world problems such as the supply of live organ donations, are discussed.
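
    For orientation, the deformed algebra referred to above replaces the ordinary exponential with the q-exponential, exp_q(x) = [1 + (1 - q)x]^(1/(1-q)), which reduces to exp(x) as q approaches 1 and to a hyperbolic form at q = 0. The sketch below shows a discount function built from it, with social distance N as the discounting variable; the functional form and parameter values are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def q_exp(x, q):
    """Tsallis q-exponential; reduces to exp(x) as q -> 1."""
    if np.isclose(q, 1.0):
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

def q_discount(value, k, N, q):
    """Assumed discount form: V(N) = V(0) / exp_q(k * N)."""
    return value / q_exp(k * N, q)

N = np.arange(0, 11)
print(q_discount(100.0, 0.2, N, q=0.0))   # hyperbolic-like discounting
print(q_discount(100.0, 0.2, N, q=1.0))   # exponential discounting
```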

  6. Galactic dual population models of gamma-ray bursts

    NASA Technical Reports Server (NTRS)

    Higdon, J. C.; Lingenfelter, R. E.

    1994-01-01

    We investigate in more detail the properties of two-population models for gamma-ray bursts in the galactic disk and halo. We calculate the gamma-ray burst statistical properties, mean value of (V/V(sub max)), mean value of cos Theta, and mean value of (sin(exp 2) b), as functions of the detection flux threshold for bursts coming from both Galactic disk and massive halo populations. We consider halo models inferred from the observational constraints on the large-scale Galactic structure and we compare the expected values of mean value of (V/V(sub max)), mean value of cos Theta, and mean value of (sin(exp 2) b), with those measured by Burst and Transient Source Experiment (BATSE) and other detectors. We find that the measured values are consistent with solely Galactic populations having a range of halo distributions, mixed with local disk distributions, which can account for as much as approximately 25% of the observed BATSE bursts. M31 does not contribute to these modeled bursts. We also demonstrate, contrary to recent arguments, that the size-frequency distributions of dual population models are quite consistent with the BATSE observations.

  7. Hybrid modeling as a QbD/PAT tool in process development: an industrial E. coli case study.

    PubMed

    von Stosch, Moritz; Hamelink, Jan-Martijn; Oliveira, Rui

    2016-05-01

    Process understanding is emphasized in the process analytical technology initiative and the quality by design paradigm to be essential for manufacturing of biopharmaceutical products with consistently high quality. A typical approach to developing a process understanding is applying a combination of design of experiments with statistical data analysis. Hybrid semi-parametric modeling is investigated as an alternative method to pure statistical data analysis. The hybrid model framework provides flexibility to select model complexity based on available data and knowledge. Here, a parametric dynamic bioreactor model is integrated with a nonparametric artificial neural network that describes biomass and product formation rates as a function of varied fed-batch fermentation conditions for high cell density heterologous protein production with E. coli. Our model can accurately describe biomass growth and product formation across variations in induction temperature, pH and feed rates. The model indicates that while product expression rate is a function of early induction phase conditions, it is negatively impacted as productivity increases. This could correspond with physiological changes due to cytoplasmic product accumulation. Due to the dynamic nature of the model, rational process timing decisions can be made and the impact of temporal variations in process parameters on product formation and process performance can be assessed, which is central for process understanding.

  8. Predicting human skin absorption of chemicals: development of a novel quantitative structure activity relationship.

    PubMed

    Luo, Wen; Medrek, Sarah; Misra, Jatin; Nohynek, Gerhard J

    2007-02-01

    The objective of this study was to construct and validate a quantitative structure-activity relationship model for skin absorption. Such models are valuable tools for screening and prioritization in safety and efficacy evaluation, and risk assessment of drugs and chemicals. A database of 340 chemicals with percutaneous absorption data was assembled. Two models were derived from the training set consisting of 306 chemicals (90/10 random split). In addition to the experimental K(ow) values, over 300 2D and 3D atomic and molecular descriptors were analyzed using MDL's QsarIS computer program. Subsequently, the models were validated using both internal (leave-one-out) and external validation (test set) procedures. Using stepwise regression analysis, three molecular descriptors were determined to have significant statistical correlation with K(p) (R2 = 0.8225): logK(ow), X0 (quantification of both molecular size and the degree of skeletal branching), and SsssCH (count of aromatic carbon groups). In conclusion, two models to estimate skin absorption were developed. When compared to other skin absorption QSAR models in the literature, our model incorporated more chemicals and explored a large number of descriptors. Additionally, our models are reasonably predictive and met both internal and external statistical validation.
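
    The final relationship reported above is an ordinary least-squares fit of log K(p) on three descriptors. A generic sketch of fitting such a three-descriptor linear model and reporting R2 is shown below; the descriptor matrix and response are simulated placeholders, not the 306-chemical training set.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: returns coefficients (intercept first) and R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Hypothetical descriptor matrix: columns stand in for logKow, X0, SsssCH
rng = np.random.default_rng(3)
X = rng.normal(size=(306, 3))
log_kp = -2.7 + 0.7 * X[:, 0] - 0.06 * X[:, 1] - 0.3 * X[:, 2] + 0.3 * rng.standard_normal(306)
beta, r2 = fit_ols(X, log_kp)
print(beta, r2)
```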

  9. Derivation and validation of in-hospital mortality prediction models in ischaemic stroke patients using administrative data.

    PubMed

    Lee, Jason; Morishima, Toshitaka; Kunisawa, Susumu; Sasaki, Noriko; Otsubo, Tetsuya; Ikai, Hiroshi; Imanaka, Yuichi

    2013-01-01

    Stroke and other cerebrovascular diseases are a major cause of death and disability. Predicting in-hospital mortality in ischaemic stroke patients can help to identify high-risk patients and guide treatment approaches. Chart reviews provide important clinical information for mortality prediction, but are laborious and limiting in sample sizes. Administrative data allow for large-scale multi-institutional analyses but lack the necessary clinical information for outcome research. However, administrative claims data in Japan has seen the recent inclusion of patient consciousness and disability information, which may allow more accurate mortality prediction using administrative data alone. The aim of this study was to derive and validate models to predict in-hospital mortality in patients admitted for ischaemic stroke using administrative data. The sample consisted of 21,445 patients from 176 Japanese hospitals, who were randomly divided into derivation and validation subgroups. Multivariable logistic regression models were developed using 7- and 30-day and overall in-hospital mortality as dependent variables. Independent variables included patient age, sex, comorbidities upon admission, Japan Coma Scale (JCS) score, Barthel Index score, modified Rankin Scale (mRS) score, and admissions after hours and on weekends/public holidays. Models were developed in the derivation subgroup, and coefficients from these models were applied to the validation subgroup. Predictive ability was analysed using C-statistics; calibration was evaluated with Hosmer-Lemeshow χ(2) tests. All three models showed predictive abilities similar or surpassing that of chart review-based models. The C-statistics were highest in the 7-day in-hospital mortality prediction model, at 0.906 and 0.901 in the derivation and validation subgroups, respectively. For the 30-day in-hospital mortality prediction models, the C-statistics for the derivation and validation subgroups were 0.893 and 0.872, respectively; in overall in-hospital mortality prediction these values were 0.883 and 0.876. In this study, we have derived and validated in-hospital mortality prediction models for three different time spans using a large population of ischaemic stroke patients in a multi-institutional analysis. The recent inclusion of JCS, Barthel Index, and mRS scores in Japanese administrative data has allowed the prediction of in-hospital mortality with accuracy comparable to that of chart review analyses. The models developed using administrative data had consistently high predictive abilities for all models in both the derivation and validation subgroups. These results have implications in the role of administrative data in future mortality prediction analyses. Copyright © 2013 S. Karger AG, Basel.
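
    The core workflow described above, fitting a multivariable logistic model on a derivation subgroup and scoring discrimination with the C-statistic on a validation subgroup, can be sketched generically as follows. The simulated predictors stand in for the administrative variables (age, sex, comorbidities, JCS, Barthel Index, mRS, admission timing); none of the numbers correspond to the study data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Simulated design matrix standing in for the administrative covariates
rng = np.random.default_rng(4)
X = rng.normal(size=(21445, 8))
logit = -3.0 + X[:, 0] + 0.8 * X[:, 3] + 0.6 * X[:, 5]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))   # simulated in-hospital deaths

# Random split into derivation and validation subgroups
X_der, X_val, y_der, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_der, y_der)
c_statistic = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])  # C-statistic = AUC
print(round(c_statistic, 3))
```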

  10. Estimating extreme river discharges in Europe through a Bayesian network

    NASA Astrophysics Data System (ADS)

    Paprotny, Dominik; Morales-Nápoles, Oswaldo

    2017-06-01

    Large-scale hydrological modelling of flood hazards requires adequate extreme discharge data. In practice, models based on physics are applied alongside those utilizing only statistical analysis. The former require enormous computational power, while the latter are mostly limited in accuracy and spatial coverage. In this paper we introduce an alternative statistical approach based on Bayesian networks (BNs), a graphical model for dependent random variables. We use a non-parametric BN to describe the joint distribution of extreme discharges in European rivers and variables representing the geographical characteristics of their catchments. Annual maxima of daily discharges from more than 1800 river gauges (stations with catchment areas ranging from 1.4 to 807 000 km2) were collected, together with information on terrain, land use and local climate. The (conditional) correlations between the variables are modelled through copulas, with the dependency structure defined in the network. The results show that using this method, mean annual maxima and return periods of discharges could be estimated with an accuracy similar to that of existing studies using physical models for Europe and better than a comparable global statistical model. Performance of the model varies slightly between regions of Europe, but is consistent between different time periods, and remains the same in a split-sample validation. Though discharge prediction under climate change is not the main scope of this paper, the BN was applied to a large domain covering all sizes of rivers in the continent both for present and future climate, as an example. Results show substantial variation in the influence of climate change on river discharges. The model can be used to provide quick estimates of extreme discharges at any location for the purpose of obtaining input information for hydraulic modelling.

  11. Statistical field theory description of inhomogeneous polarizable soft matter

    NASA Astrophysics Data System (ADS)

    Martin, Jonathan M.; Li, Wei; Delaney, Kris T.; Fredrickson, Glenn H.

    2016-10-01

    We present a new molecularly informed statistical field theory model of inhomogeneous polarizable soft matter. The model is based on fluid elements, referred to as beads, that can carry a net monopole of charge at their center of mass and a fixed or induced dipole through a Drude-type distributed charge approach. The beads are thus polarizable and naturally manifest attractive van der Waals interactions. Beyond electrostatic interactions, beads can be given soft repulsions to sustain fluid phases at arbitrary densities. Beads of different types can be mixed or linked into polymers with arbitrary chain models and sequences of charged and uncharged beads. By such an approach, it is possible to construct models suitable for describing a vast range of soft-matter systems including electrolyte and polyelectrolyte solutions, ionic liquids, polymerized ionic liquids, polymer blends, ionomers, and block copolymers, among others. These bead models can be constructed in virtually any ensemble and converted to complex-valued statistical field theories by Hubbard-Stratonovich transforms. One of the fields entering the resulting theories is a fluctuating electrostatic potential; other fields are necessary to decouple non-electrostatic interactions. We elucidate the structure of these field theories, their consistency with macroscopic electrostatic theory in the absence and presence of external electric fields, and the way in which they embed van der Waals interactions and non-uniform dielectric properties. Their suitability as a framework for computational studies of heterogeneous soft matter systems using field-theoretic simulation techniques is discussed.

  12. Statistical field theory description of inhomogeneous polarizable soft matter.

    PubMed

    Martin, Jonathan M; Li, Wei; Delaney, Kris T; Fredrickson, Glenn H

    2016-10-21

    We present a new molecularly informed statistical field theory model of inhomogeneous polarizable soft matter. The model is based on fluid elements, referred to as beads, that can carry a net monopole of charge at their center of mass and a fixed or induced dipole through a Drude-type distributed charge approach. The beads are thus polarizable and naturally manifest attractive van der Waals interactions. Beyond electrostatic interactions, beads can be given soft repulsions to sustain fluid phases at arbitrary densities. Beads of different types can be mixed or linked into polymers with arbitrary chain models and sequences of charged and uncharged beads. By such an approach, it is possible to construct models suitable for describing a vast range of soft-matter systems including electrolyte and polyelectrolyte solutions, ionic liquids, polymerized ionic liquids, polymer blends, ionomers, and block copolymers, among others. These bead models can be constructed in virtually any ensemble and converted to complex-valued statistical field theories by Hubbard-Stratonovich transforms. One of the fields entering the resulting theories is a fluctuating electrostatic potential; other fields are necessary to decouple non-electrostatic interactions. We elucidate the structure of these field theories, their consistency with macroscopic electrostatic theory in the absence and presence of external electric fields, and the way in which they embed van der Waals interactions and non-uniform dielectric properties. Their suitability as a framework for computational studies of heterogeneous soft matter systems using field-theoretic simulation techniques is discussed.

  13. Explorations in statistics: the log transformation.

    PubMed

    Curran-Everett, Douglas

    2018-06-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability (the standard deviation) varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.
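
    A minimal illustration of the checks mentioned above, log-transforming a skewed sample and bootstrapping the sample mean on both scales to see which sampling distribution looks closer to normal, is sketched below; the lognormal sample is an assumed stand-in for experimental observations.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(5)
y = rng.lognormal(mean=1.0, sigma=0.8, size=40)   # skewed "observations"

def bootstrap_means(data, n_boot=5000):
    """Bootstrap distribution of the sample mean."""
    idx = rng.integers(0, len(data), size=(n_boot, len(data)))
    return data[idx].mean(axis=1)

raw_means = bootstrap_means(y)
log_means = bootstrap_means(np.log(y))

# Skewness of each bootstrap distribution: closer to 0 means closer to normal
print(skew(raw_means), skew(log_means))
```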

  14. Analysis of nutrition judgments using the Nutrition Facts Panel.

    PubMed

    González-Vallejo, Claudia; Lavins, Bethany D; Carter, Kristina A

    2016-10-01

    Consumers' judgments and choices of the nutritional value of food products (cereals and snacks) were studied as a function of using information in the Nutrition Facts Panel (NFP, National Labeling and Education Act, 1990). Brunswik's lens model (Brunswik, 1955; Cooksey, 1996; Hammond, 1955; Stewart, 1988) served as the theoretical and analytical tool for examining the judgment process. Lens model analysis was further enriched with the criticality of predictors' technique developed by Azen, Budescu, & Reiser (2001). Judgment accuracy was defined as correspondence between consumers' judgments and the nutritional quality index, NuVal(®), obtained from an expert system. The study also examined several individual level variables (e.g., age, gender, BMI, educational level, health status, health beliefs, etc.) as predictors of lens model indices that measure judgment consistency, judgment accuracy, and knowledge of the environment. Results showed varying levels of consistency and accuracy depending on the food product, but generally the median values of the lens model statistics were moderate. Judgment consistency was higher for more educated individuals; judgment accuracy was predicted from a combination of person level characteristics, and individuals who reported having regular meals had models that were in greater agreement with the expert's model. Lens model methodology is a useful tool for understanding how individuals perceive the nutrition in foods based on the NFP label. Lens model judgment indices were generally low, highlighting that the benefits of the complex NFP label may be more modest than what has been previously assumed. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Quantitative structure-retention relationships of polycyclic aromatic hydrocarbons gas-chromatographic retention indices.

    PubMed

    Drosos, Juan Carlos; Viola-Rhenals, Maricela; Vivas-Reyes, Ricardo

    2010-06-25

    Polycyclic aromatic hydrocarbons (PAHs) are of concern in environmental chemistry and toxicology. In the present work, a QSRR study was performed for 209 previously reported PAHs using quantum-mechanical descriptors and descriptors from other sources, estimated by different approaches. The B3LYP/6-31G* level of theory was used for geometrical optimization and quantum-mechanics-related variables. A good linear relationship between gas-chromatographic retention index and electronic or topologic descriptors was found by stepwise linear regression analysis. The molecular polarizability (alpha) and the second order molecular connectivity Kier and Hall index ((2)chi) showed significant correlation with the retention index, with high squared coefficients of determination (R(2) = 0.950 and 0.962, respectively). A one-variable QSRR model is presented for each descriptor, and both models demonstrate significant predictive capacity, established using the leave-many-out (LMO, excluding 25% of rows) cross-validation coefficient q(2)(CV-LMO25%) (0.947 and 0.960, respectively). Furthermore, the physicochemical interpretation of the selected descriptors allowed detailed explanation of the source of the observed statistical correlation. The model analysis suggests that only one descriptor is sufficient to establish a consistent retention index-structure relationship. Only moderate or non-significant improvement in the quantitative results and statistical validation parameters was observed when more terms were introduced into the predictive equation. The proposed one-parameter QSRR model offers a consistent scheme to predict chromatographic properties of PAH compounds. Copyright 2010 Elsevier B.V. All rights reserved.
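
    Leave-many-out validation of the kind reported above repeatedly holds out a fixed fraction of the compounds (25% here), refits the one-descriptor model on the remainder, and accumulates the prediction errors into q2 = 1 - PRESS/SS. A generic sketch follows; the descriptor and retention-index arrays are hypothetical stand-ins for the 209-PAH data set.

```python
import numpy as np

def q2_lmo(x, y, frac_out=0.25, n_rounds=200, seed=0):
    """Leave-many-out cross-validated q^2 for a one-descriptor linear model."""
    rng = np.random.default_rng(seed)
    n = len(y)
    press, ss = 0.0, 0.0
    for _ in range(n_rounds):
        out = rng.choice(n, size=int(frac_out * n), replace=False)
        keep = np.setdiff1d(np.arange(n), out)
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        pred = slope * x[out] + intercept
        press += np.sum((y[out] - pred) ** 2)          # prediction error sum of squares
        ss += np.sum((y[out] - y[keep].mean()) ** 2)   # reference sum of squares
    return 1.0 - press / ss

# Hypothetical polarizability values vs. retention indices
rng = np.random.default_rng(6)
alpha = rng.uniform(10, 60, size=209)
ri = 50 + 8.0 * alpha + 15.0 * rng.standard_normal(209)
print(q2_lmo(alpha, ri))
```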

  16. Importance of regional variation in conservation planning: A rangewide example of the Greater Sage-Grouse

    USGS Publications Warehouse

    Doherty, Kevin E.; Evans, Jeffrey S.; Coates, Peter S.; Juliusson, Lara; Fedy, Bradley C.

    2016-01-01

    We developed rangewide population and habitat models for Greater Sage-Grouse (Centrocercus urophasianus) that account for regional variation in habitat selection and relative densities of birds for use in conservation planning and risk assessments. We developed a probabilistic model of occupied breeding habitat by statistically linking habitat characteristics within 4 miles of an occupied lek using a nonlinear machine learning technique (Random Forests). Habitat characteristics used were quantified in GIS and represent standard abiotic and biotic variables related to sage-grouse biology. Statistical model fit was high (mean correctly classified = 82.0%, range = 75.4–88.0%) as were cross-validation statistics (mean = 80.9%, range = 75.1–85.8%). We also developed a spatially explicit model to quantify the relative density of breeding birds across each Greater Sage-Grouse management zone. The models demonstrate distinct clustering of relative abundance of sage-grouse populations across all management zones. On average, approximately half of the breeding population is predicted to be within 10% of the occupied range. We also found that 80% of sage-grouse populations were contained in 25–34% of the occupied range within each management zone. Our rangewide population and habitat models account for regional variation in habitat selection and the relative densities of birds, and thus, they can serve as a consistent and common currency to assess how sage-grouse habitat and populations overlap with conservation actions or threats over the entire sage-grouse range. We also quantified differences in functional habitat responses and disturbance thresholds across the Western Association of Fish and Wildlife Agencies (WAFWA) management zones using statistical relationships identified during habitat modeling. Even for a species as specialized as Greater Sage-Grouse, our results show that ecological context matters in both the strength of habitat selection (i.e., functional response curves) and response to disturbance.
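
    The occupancy model above classifies lek-centered samples from GIS covariates with Random Forests and reports the percentage correctly classified under cross-validation. A generic scikit-learn sketch of that step is given below; the covariates and labels are simulated placeholders for the abiotic and biotic variables used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Simulated stand-ins for habitat covariates (e.g., sagebrush cover, ruggedness, ...)
rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 10))
occupied = (X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.standard_normal(2000)) > 0.5

rf = RandomForestClassifier(n_estimators=500, random_state=0)
scores = cross_val_score(rf, X, occupied, cv=5, scoring="accuracy")
print("mean correctly classified: %.1f%%" % (100 * scores.mean()))
```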

  17. Adaptive Error Estimation in Linearized Ocean General Circulation Models

    NASA Technical Reports Server (NTRS)

    Chechelnitsky, Michael Y.

    1999-01-01

    Data assimilation methods are routinely used in oceanography. The statistics of the model and measurement errors need to be specified a priori. This study addresses the problem of estimating model and measurement error statistics from observations. We start by testing innovation-based methods of adaptive error estimation with low-dimensional models in the North Pacific (5-60 deg N, 132-252 deg E) to TOPEX/POSEIDON (TIP) sea level anomaly data, acoustic tomography data from the ATOC project, and the MIT General Circulation Model (GCM). A reduced state linear model that describes large scale internal (baroclinic) error dynamics is used. The methods are shown to be sensitive to the initial guess for the error statistics and the type of observations. A new off-line approach is developed, the covariance matching approach (CMA), where covariance matrices of model-data residuals are "matched" to their theoretical expectations using familiar least squares methods. This method uses observations directly instead of the innovations sequence and is shown to be related to the MT method and the method of Fu et al. (1993). Twin experiments using the same linearized MIT GCM suggest that altimetric data are ill-suited to the estimation of internal GCM errors, but that such estimates can in theory be obtained using acoustic data. The CMA is then applied to T/P sea level anomaly data and a linearization of a global GFDL GCM which uses two vertical modes. We show that the CMA method can be used with a global model and a global data set, and that the estimates of the error statistics are robust. We show that the fraction of the GCM-T/P residual variance explained by the model error is larger than that derived in Fukumori et al. (1999) with the method of Fu et al. (1993). Most of the model error is explained by the barotropic mode. However, we find that the impact of the change in the error statistics on the data assimilation estimates is very small. This is explained by the large representation error, i.e. the dominance of the mesoscale eddies in the T/P signal, which are not part of the 21 by 1" GCM. Therefore, the impact of the observations on the assimilation is very small even after the adjustment of the error statistics. This work demonstrates that simultaneous estimation of the model and measurement error statistics for data assimilation with global ocean data sets and linearized GCMs is possible. However, the error covariance estimation problem is in general highly underdetermined, much more so than the state estimation problem. In other words there exist a very large number of statistical models that can be made consistent with the available data. Therefore, methods for obtaining quantitative error estimates, powerful though they may be, cannot replace physical insight. Used in the right context, as a tool for guiding the choice of a small number of model error parameters, covariance matching can be a useful addition to the repertory of tools available to oceanographers.

  18. Identifying mechanisms for superdiffusive dynamics in cell trajectories

    NASA Astrophysics Data System (ADS)

    Passucci, Giuseppe; Brasch, Megan; Henderson, James; Manning, M. Lisa

    Self-propelled particle (SPP) models have been used to explore features of active matter such as motility-induced phase separation, jamming, and flocking, and are often used to model biological cells. However, many cells exhibit super-diffusive trajectories, where displacements scale faster than t^(1/2) in all directions, and these are not captured by traditional SPP models. We extract cell trajectories from image stacks of mouse fibroblast cells moving on 2D substrates and find super-diffusive mean-squared displacements in all directions across varying densities. Two SPP model modifications have been proposed to capture super-diffusive dynamics: Levy walks and heterogeneous motility parameters. In mouse fibroblast cells, displacement probability distributions collapse when time is rescaled by a power greater than 1/2, which is consistent with Levy walks. We show that a simple SPP model with heterogeneous rotational noise can also generate a similar collapse. Furthermore, a close examination of statistics extracted directly from cell trajectories is consistent with a heterogeneous mobility SPP model and inconsistent with a Levy walk model. Our work demonstrates that a simple set of analyses can distinguish between mechanisms for anomalous diffusion in active matter.
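
    Super-diffusion of the kind described above is commonly diagnosed by fitting the scaling exponent of the mean-squared displacement, MSD(t) ~ t^alpha, with alpha > 1. A minimal sketch of extracting alpha from a set of 2D trajectories follows; the trajectories are synthetic drift-plus-noise walks, not the fibroblast data.

```python
import numpy as np

def msd(trajectories, max_lag):
    """Time-and-ensemble averaged MSD for trajectories shaped (n_cells, n_steps, 2)."""
    lags = np.arange(1, max_lag + 1)
    values = []
    for lag in lags:
        disp = trajectories[:, lag:, :] - trajectories[:, :-lag, :]
        values.append(np.mean(np.sum(disp ** 2, axis=-1)))
    return lags, np.array(values)

# Synthetic persistent (drift + noise) walks, a crude stand-in for motile cells
rng = np.random.default_rng(8)
n_cells, n_steps = 50, 500
angles = rng.uniform(0, 2 * np.pi, size=(n_cells, 1))
drift = 0.1 * np.stack([np.cos(angles), np.sin(angles)], axis=-1)   # (n_cells, 1, 2)
steps = rng.normal(scale=0.5, size=(n_cells, n_steps, 2)) + drift
traj = np.cumsum(steps, axis=1)

lags, m = msd(traj, max_lag=100)
alpha = np.polyfit(np.log(lags), np.log(m), 1)[0]   # alpha > 1 indicates super-diffusion
print(round(alpha, 2))
```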

  19. Bayesian hierarchical modelling of North Atlantic windiness

    NASA Astrophysics Data System (ADS)

    Vanem, E.; Breivik, O. N.

    2013-03-01

    Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause of, or a contributing factor in, maritime accidents. Such severe environmental conditions can be taken into account in ship design, and operational windows can be defined that limit hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness are compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.

  20. A review of statistical estimators for risk-adjusted length of stay: analysis of the Australian and new Zealand Intensive Care Adult Patient Data-Base, 2008-2009.

    PubMed

    Moran, John L; Solomon, Patricia J

    2012-05-16

    For the analysis of length-of-stay (LOS) data, which is characteristically right-skewed, a number of statistical estimators have been proposed as alternatives to the traditional ordinary least squares (OLS) regression with log dependent variable. Using a cohort of patients identified in the Australian and New Zealand Intensive Care Society Adult Patient Database, 2008-2009, 12 different methods were used for estimation of intensive care (ICU) length of stay. These encompassed risk-adjusted regression analysis of firstly: log LOS using OLS, linear mixed model [LMM], treatment effects, skew-normal and skew-t models; and secondly: unmodified (raw) LOS via OLS, generalised linear models [GLMs] with log-link and 4 different distributions [Poisson, gamma, negative binomial and inverse-Gaussian], extended estimating equations [EEE] and a finite mixture model including a gamma distribution. A fixed covariate list and ICU-site clustering with robust variance were utilised for model fitting with split-sample determination (80%) and validation (20%) data sets, and model simulation was undertaken to establish over-fitting (Copas test). Indices of model specification using Bayesian information criterion [BIC: lower values preferred] and residual analysis as well as predictive performance (R2, concordance correlation coefficient (CCC), mean absolute error [MAE]) were established for each estimator. The data-set consisted of 111663 patients from 131 ICUs; with mean(SD) age 60.6(18.8) years, 43.0% were female, 40.7% were mechanically ventilated and ICU mortality was 7.8%. ICU length-of-stay was 3.4(5.1) (median 1.8, range (0.17-60)) days and demonstrated marked kurtosis and right skew (29.4 and 4.4 respectively). BIC showed considerable spread, from a maximum of 509801 (OLS-raw scale) to a minimum of 210286 (LMM). R2 ranged from 0.22 (LMM) to 0.17 and the CCC from 0.334 (LMM) to 0.149, with MAE 2.2-2.4. Superior residual behaviour was established for the log-scale estimators. There was a general tendency for over-prediction (negative residuals) and for over-fitting, the exception being the GLM negative binomial estimator. The mean-variance function was best approximated by a quadratic function, consistent with log-scale estimation; the link function was estimated (EEE) as 0.152(0.019, 0.285), consistent with a fractional-root function. For ICU length of stay, log-scale estimation, in particular the LMM, appeared to be the most consistently performing estimator(s). Neither the GLM variants nor the skew-regression estimators dominated.

  1. A Novel Signal Modeling Approach for Classification of Seizure and Seizure-Free EEG Signals.

    PubMed

    Gupta, Anubha; Singh, Pushpendra; Karlekar, Mandar

    2018-05-01

    This paper presents a new signal-modeling-based methodology for automatic seizure detection in EEG signals. The proposed method consists of three stages. First, a multirate filterbank structure is proposed that is constructed using the basis vectors of the discrete cosine transform. The proposed filterbank decomposes EEG signals into their respective brain rhythms: delta, theta, alpha, beta, and gamma. Second, these brain rhythms are statistically modeled with the class of self-similar Gaussian random processes, namely, fractional Brownian motion and fractional Gaussian noises. The statistics of these processes are modeled using a single parameter called the Hurst exponent. In the last stage, the value of the Hurst exponent and autoregressive moving average parameters are used as features to design a binary support vector machine classifier to classify pre-ictal, inter-ictal (epileptic with seizure-free interval), and ictal (seizure) EEG segments. The performance of the classifier is assessed via extensive analysis on two widely used data sets and is observed to provide good accuracy on both data sets. Thus, this paper proposes a novel signal model for EEG data that best captures the attributes of these signals and hence helps to boost the classification accuracy of seizure and seizure-free epochs.
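
    The Hurst exponent used as a feature above can be estimated in several ways; one classical option is rescaled-range (R/S) analysis, in which the range of cumulative deviations divided by the standard deviation scales as n^H with window length n. The generic sketch below (not the filterbank pipeline of the paper) applies it to a synthetic signal; in the classification stage such an estimate would simply become one entry of the feature vector passed to the SVM.

```python
import numpy as np

def hurst_rs(x, window_sizes=(16, 32, 64, 128, 256)):
    """Estimate the Hurst exponent by rescaled-range (R/S) analysis."""
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(x) - n + 1, n):
            seg = x[start:start + n]
            dev = np.cumsum(seg - seg.mean())   # cumulative deviations from the window mean
            r = dev.max() - dev.min()           # range
            s = seg.std(ddof=1)                 # standard deviation
            if s > 0:
                rs_vals.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs_vals)))
    return np.polyfit(log_n, log_rs, 1)[0]      # slope of log(R/S) vs. log(n) = H

rng = np.random.default_rng(9)
white_noise = rng.standard_normal(4096)
print(round(hurst_rs(white_noise), 2))          # expected near 0.5 for uncorrelated noise
```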

  2. A methodology for the stochastic generation of hourly synthetic direct normal irradiation time series

    NASA Astrophysics Data System (ADS)

    Larrañeta, M.; Moreno-Tejera, S.; Lillo-Bravo, I.; Silva-Pérez, M. A.

    2018-02-01

    Many of the available solar radiation databases only provide global horizontal irradiance (GHI), while there is a growing need for extensive databases of direct normal irradiance (DNI), mainly for the development of concentrated solar power and concentrated photovoltaic technologies. In the present work, we propose a methodology for the generation of synthetic DNI hourly data from the hourly average GHI values by dividing the irradiance into a deterministic and a stochastic component, intending to emulate the dynamics of the solar radiation. The deterministic component is modeled through a simple classical model. The stochastic component is fitted to measured data in order to maintain the consistency of the synthetic data with the state of the sky, generating statistically significant DNI data with a cumulative frequency distribution very similar to the measured data. The adaptation and application of the model to the location of Seville shows significant improvements in terms of frequency distribution over the classical models. The proposed methodology, applied to other locations with different climatological characteristics, also yields better results than the classical models in terms of frequency distribution, reaching a reduction of 50% in the Finkelstein-Schafer (FS) and Kolmogorov-Smirnov test integral (KSI) statistics.
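
    The Finkelstein-Schafer and KSI statistics quoted above both measure the discrepancy between the cumulative frequency distributions of synthetic and measured irradiance. A minimal sketch of computing them for two samples follows; the gamma-distributed samples are placeholders for hourly DNI series, not data from Seville.

```python
import numpy as np

def cdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated on `grid`."""
    return np.searchsorted(np.sort(sample), grid, side="right") / len(sample)

def fs_and_ksi(measured, synthetic, n_bins=100):
    """Finkelstein-Schafer statistic (mean |ΔCDF| over bins) and
    KSI (integral of |ΔCDF| over the data range)."""
    grid = np.linspace(min(measured.min(), synthetic.min()),
                       max(measured.max(), synthetic.max()), n_bins)
    diff = np.abs(cdf_on_grid(measured, grid) - cdf_on_grid(synthetic, grid))
    return diff.mean(), np.trapz(diff, grid)

rng = np.random.default_rng(10)
measured = rng.gamma(2.0, 300.0, size=3000)    # stand-in for measured hourly DNI (W/m2)
synthetic = rng.gamma(2.1, 290.0, size=3000)   # stand-in for synthetic DNI
print(fs_and_ksi(measured, synthetic))
```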

  3. Predictive Modeling of Human Perception Subjectivity: Feasibility Study of Mammographic Lesion Similarity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Songhua; Tourassi, Georgia

    2012-01-01

    The majority of clinical content-based image retrieval (CBIR) studies disregard human perception subjectivity, aiming to duplicate the consensus expert assessment of the visual similarity of example cases. The purpose of our study is twofold: (i) to better discern the extent of human perception subjectivity when assessing the visual similarity of two images with similar semantic content, and (ii) to explore the feasibility of personalized predictive modeling of visual similarity. We conducted a human observer study in which five observers of various expertise were shown ninety-nine triplets of mammographic masses with similar BI-RADS descriptors and were asked to select the two masses with the highest visual relevance. Pairwise agreement ranged between poor and fair among the five observers, as assessed by the kappa statistic. The observers' self-consistency rate was remarkably low, based on repeated questions in which either the orientation or the presentation order of a mass was changed. Various machine learning algorithms were explored to determine whether they can predict each observer's personalized selection using textural features. Many algorithms performed with accuracy that exceeded each observer's self-consistency rate, as determined using a cross-validation scheme. This accuracy was statistically significantly higher than would be expected by chance alone (two-tailed p-value ranged between 0.001 and 0.01 for all five personalized models). The study confirmed that human perception subjectivity should be taken into account when developing CBIR-based medical applications.
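
    A minimal sketch of the pairwise agreement measure referred to above, Cohen's kappa between observers' selections, computed with scikit-learn; the observer choices below are made up for illustration.

    ```python
    # Pairwise Cohen's kappa between hypothetical observers' triplet selections.
    from itertools import combinations
    from sklearn.metrics import cohen_kappa_score

    # Each list holds, per triplet, the index (0 or 1) of the mass judged most similar.
    observers = {
        "obs1": [0, 1, 1, 0, 1, 0, 0, 1, 1, 0],
        "obs2": [0, 1, 0, 0, 1, 1, 0, 1, 0, 0],
        "obs3": [1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    }
    for a, b in combinations(observers, 2):
        kappa = cohen_kappa_score(observers[a], observers[b])
        print(f"kappa({a}, {b}) = {kappa:.2f}")
    ```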

  4. Validation of the Brazilian version of the 'Spanish Burnout Inventory' in teachers.

    PubMed

    Gil-Monte, Pedro R; Carlotto, Mary Sandra; Câmara, Sheila Gonçalves

    2010-02-01

    To assess factorial validity and internal consistency of the Brazilian version of the 'Spanish Burnout Inventory' (SBI). The translation process of the SBI into Brazilian Portuguese included translation, back translation, and semantic equivalence. A confirmatory factor analysis was carried out using a four-factor model, which was similar to the original SBI. The sample consisted of 714 teachers working in schools in the metropolitan area of the city of Porto Alegre, Southern Brazil, in 2008. The instrument comprises 20 items and four subscales: Enthusiasm towards job (5 items), Psychological exhaustion (4 items), Indolence (6 items), and Guilt (5 items). The model was analyzed using LISREL 8. Goodness-of-Fit statistics showed that the hypothesized model had adequate fit: chi2(164) = 605.86 (p<0.000); Goodness-of-Fit Index = 0.92; Adjusted Goodness-of-Fit Index = 0.90; Root Mean Square Error of Approximation = 0.062; Nonnormed Fit Index = 0.91; Comparative Fit Index = 0.92; and Parsimony Normed Fit Index = 0.77. Cronbach's alpha measures for all subscales were higher than 0.70. The study showed that the SBI has adequate factorial validity and internal consistency to assess burnout in Brazilian teachers.
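
    A small illustration of the internal-consistency statistic reported above: the function implements the standard Cronbach's alpha formula, and the simulated five-item responses are placeholders, not SBI data.

    ```python
    # Cronbach's alpha = k/(k-1) * (1 - sum(item variances) / variance of the total score).
    import numpy as np

    def cronbach_alpha(items):
        """items: 2-D array, respondents x items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1.0)) * (1.0 - item_vars / total_var)

    rng = np.random.default_rng(1)
    latent = rng.normal(size=(714, 1))                          # common factor
    responses = latent + rng.normal(scale=0.8, size=(714, 5))   # simulated 5-item subscale
    print(f"alpha = {cronbach_alpha(responses):.2f}")           # comes out well above 0.70 here
    ```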

  5. Geospace environment modeling 2008--2009 challenge: Dst index

    USGS Publications Warehouse

    Rastätter, L.; Kuznetsova, M.M.; Glocer, A.; Welling, D.; Meng, X.; Raeder, J.; Wittberger, M.; Jordanova, V.K.; Yu, Y.; Zaharia, S.; Weigel, R.S.; Sazykin, S.; Boynton, R.; Wei, H.; Eccles, V.; Horton, W.; Mays, M.L.; Gannon, J.

    2013-01-01

    This paper reports the metrics-based results of the Dst index part of the 2008–2009 GEM Metrics Challenge. The 2008–2009 GEM Metrics Challenge asked modelers to submit results for four geomagnetic storm events and five different types of observations that can be modeled by statistical, climatological or physics-based models of the magnetosphere-ionosphere system. We present the results of 30 model settings that were run at the Community Coordinated Modeling Center and at the institutions of various modelers for these events. To measure the performance of each of the models against the observations, we use comparisons of 1 hour averaged model data with the Dst index issued by the World Data Center for Geomagnetism, Kyoto, Japan, and direct comparison of 1 minute model data with the 1 minute Dst index calculated by the United States Geological Survey. The latter index can be used to calculate spectral variability of model outputs in comparison to the index. We find that model rankings vary widely by skill score used. None of the models consistently perform best for all events. We find that empirical models perform well in general. Magnetohydrodynamics-based models of the global magnetosphere with inner magnetosphere physics (ring current model) included and stand-alone ring current models with properly defined boundary conditions perform well and are able to match or surpass results from empirical models. Unlike in similar studies, the statistical models used in this study found their challenge in the weakest events rather than the strongest events.

  6. The discounting model selector: Statistical software for delay discounting applications.

    PubMed

    Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A

    2017-05-01

    Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executed approximate Bayesian model selection methods on user-supplied temporal discounting data and computed the effective delay 50 (ED50) from the best performing model. The software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). Independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of model-selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-source software are discussed, and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.
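
    As an illustration (not the published software), the sketch below fits Mazur's hyperbolic discounting model V = 1/(1 + kD), one of the candidate models typically included in such analyses, and reports ED50 = 1/k, the delay at which subjective value falls to half; the indifference points are invented.

    ```python
    # Fit a hyperbolic discounting curve and compute ED50 = 1/k.
    import numpy as np
    from scipy.optimize import curve_fit

    def hyperbolic(delay, k):
        return 1.0 / (1.0 + k * delay)

    delays = np.array([1, 7, 30, 90, 180, 365], dtype=float)       # days (made up)
    indiff = np.array([0.95, 0.80, 0.55, 0.35, 0.22, 0.15])        # value as fraction of A

    (k_hat,), _ = curve_fit(hyperbolic, delays, indiff, p0=[0.01], bounds=(1e-6, 10))
    print(f"k = {k_hat:.4f} per day, ED50 = {1.0 / k_hat:.1f} days")
    ```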

  7. Using statistical model to simulate the impact of climate change on maize yield with climate and crop uncertainties

    NASA Astrophysics Data System (ADS)

    Zhang, Yi; Zhao, Yanxia; Wang, Chunyi; Chen, Sining

    2017-11-01

    Assessing the impact of climate change on crop production while accounting for uncertainties is essential for properly identifying and deciding on sustainable agricultural practices. In this study, we employed 24 climate projections, consisting of the combinations of eight GCMs and three emission scenarios, to represent climate projection uncertainty, and two statistical crop models with 100 parameter sets each to represent parameter uncertainty within the crop models. The goal of this study was to evaluate the impact of climate change on maize (Zea mays L.) yield at three locations (Benxi, Changling, and Hailun) across Northeast China (NEC) in the periods 2010-2039 and 2040-2069, taking 1976-2005 as the baseline period. The multi-model ensemble method is an effective way to deal with these uncertainties. The results of the ensemble simulations showed that maize yield reductions were less than 5% in both future periods relative to the baseline. To further understand the contributions of individual sources of uncertainty, such as climate projections and crop model parameters, to the ensemble yield simulations, variance decomposition was performed. The results indicated that the uncertainty from climate projections was much larger than that contributed by crop model parameters. Increased ensemble yield variance revealed the increasing uncertainty in the yield simulations in the future periods.
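
    A hedged sketch of the kind of variance decomposition mentioned above: by the law of total variance, the spread of a projection-by-parameter yield ensemble is split into a climate-projection component and a crop-parameter component. The array sizes mirror the 24 projections and 100 parameter sets, but the yields themselves are simulated placeholders, not the study's output.

    ```python
    # Decompose synthetic ensemble yield variance into projection and parameter parts.
    import numpy as np

    rng = np.random.default_rng(2)
    n_proj, n_param = 24, 100
    proj_effect = rng.normal(0.0, 0.8, size=(n_proj, 1))     # climate projections
    param_effect = rng.normal(0.0, 0.3, size=(1, n_param))   # crop-model parameters
    yields = 8.0 + proj_effect + param_effect + rng.normal(0.0, 0.1, size=(n_proj, n_param))

    total_var = yields.var()
    var_projection = yields.mean(axis=1).var()    # between-projection component
    var_parameters = yields.mean(axis=0).var()    # between-parameter component
    print(f"climate projections: {var_projection / total_var:.0%} of total variance")
    print(f"crop parameters:     {var_parameters / total_var:.0%} of total variance")
    ```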

  8. Response-Order Effects in Survey Methods: A Randomized Controlled Crossover Study in the Context of Sport Injury Prevention.

    PubMed

    Chan, Derwin K; Ivarsson, Andreas; Stenling, Andreas; Yang, Sophie X; Chatzisarantis, Nikos L; Hagger, Martin S

    2015-12-01

    Consistency tendency is characterized by the propensity of participants to respond to subsequent items in a survey in a manner consistent with their responses to previous items. This method effect might contaminate the results of sport psychology surveys that use a cross-sectional design. We present a randomized controlled crossover study examining the effect of consistency tendency on the motivational pathway (i.e., autonomy support → autonomous motivation → intention) of self-determination theory in the context of sport injury prevention. Athletes from Sweden (N = 341) responded to the survey printed with either low interitem distance (IID; consistency tendency likely) or high IID (consistency tendency suppressed) on two separate occasions, with a one-week interval. Participants were randomly allocated into two groups, and they received surveys of different IID on each occasion. Bayesian structural equation modeling showed that the low IID condition had stronger parameter estimates than the high IID condition, but the differences were not statistically significant.

  9. Impact of Satellite Viewing-Swath Width on Global and Regional Aerosol Optical Thickness Statistics and Trends

    NASA Technical Reports Server (NTRS)

    Colarco, P. R.; Kahn, R. A.; Remer, L. A.; Levy, R. C.

    2014-01-01

    We use the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite aerosol optical thickness (AOT) product to assess the impact of reduced swath width on global and regional AOT statistics and trends. Along-track and across-track sampling strategies are employed, in which the full MODIS data set is sub-sampled with various narrow-swath (approximately 400-800 km) and single-pixel-width (approximately 10 km) configurations. Although view-angle artifacts in the MODIS AOT retrieval confound direct comparisons between averages derived from different sub-samples, careful analysis shows that, with many portions of the Earth essentially unobserved, spatial sampling introduces uncertainty in the derived seasonal-regional mean AOT. These AOT spatial sampling artifacts comprise up to 60% of the full-swath AOT value under moderate aerosol loading, and can be as large as 0.1 in some regions under high aerosol loading. Compared to full-swath observations, narrower-swath and single-pixel-width sampling exhibits a reduced ability to detect AOT trends with statistical significance. On the other hand, estimates of the global, annual mean AOT do not vary significantly from the full-swath values as spatial sampling is reduced. Aggregation of the MODIS data at coarse grid scales (10 deg) shows consistency in the aerosol trends across sampling strategies, with increased statistical confidence, but quantitative errors in the derived trends are found even for the full-swath data when compared to high-spatial-resolution (0.5 deg) aggregations. Using results from a model-derived aerosol reanalysis, we find consistency in our conclusions about a seasonal-regional spatial sampling artifact in AOT. Furthermore, the model shows that reduced spatial sampling can introduce uncertainty in computed shortwave top-of-atmosphere aerosol radiative forcing of 2-3 W m^-2. These artifacts represent lower bounds, as other sampling strategies not considered here could perform less well. These results suggest that future aerosol satellite missions with significantly less than full-swath viewing are unlikely to sample the true AOT distribution well enough to obtain the statistics needed to reduce uncertainty in aerosol direct forcing of climate.

  10. For whom will the Bayesian agents vote?

    NASA Astrophysics Data System (ADS)

    Caticha, Nestor; Cesar, Jonatas; Vicente, Renato

    2015-04-01

    Within an agent-based model where moral classifications are socially learned, we ask whether a population of agents behaves in a way that may be compared with conservative or liberal positions in the real political spectrum. We assume that agents first experience a formative period, in which they adjust their learning style, acting as supervised Bayesian adaptive learners. The formative phase is followed by a period of social influence by reinforcement learning. By comparing data generated by the agents with data from a sample of 15,000 Moral Foundations questionnaires, we found the following. 1. The number of information exchanges in the formative phase correlates positively with statistics identifying liberals in the social influence phase. This is consistent with recent evidence that connects the dopamine receptor D4-7R gene, political orientation and early-age social clique size. 2. The learning algorithms that result from the formative phase vary in the way they treat novelty and corroborative information, with more conservative-like agents treating them more equally than liberal-like agents. This is consistent with the correlation between political affiliation and the Openness personality trait reported in the literature. 3. As a model parameter interpreted as external pressure increases, the statistics of liberal agents come to resemble those of conservative agents more closely, consistent with reports on the consequences of external threats on measures of conservatism. We also show that in the social influence phase liberal-like agents readapt much faster than conservative-like agents when subjected to changes in the relevant set of moral issues. This suggests a verifiable dynamical criterion for attaching liberal or conservative labels to groups.

  11. Smart climate ensemble exploring approaches: the example of climate impacts on air pollution in Europe.

    NASA Astrophysics Data System (ADS)

    Lemaire, Vincent; Colette, Augustin; Menut, Laurent

    2016-04-01

    Because air pollution is sensitive to weather patterns, climate change will have an impact on air pollution, so that, in the future, a climate penalty could jeopardize the expected efficiency of air pollution mitigation measures. A common method to assess the impact of climate on air quality consists of implementing chemistry-transport models forced by climate projections. However, at present, such impact assessments lack multi-model ensemble approaches to address uncertainties because of the substantial computing cost. Therefore, as a preliminary step towards exploring large climate ensembles with air quality models, we developed an ensemble exploration technique in order to point out the climate models that should be investigated in priority. Using a training dataset from a deterministic projection of climate and air quality over Europe, we identified the main meteorological drivers of air quality for 8 regions in Europe and developed statistical models that could be used to estimate future air pollutant concentrations. Applying this statistical model to the whole EuroCordex ensemble of climate projections, we find a climate penalty for six subregions out of eight (Eastern Europe, France, Iberian Peninsula, Mid Europe and Northern Italy). On the contrary, a climate benefit for PM2.5 was identified for three regions (Eastern Europe, Mid Europe and Northern Italy). The uncertainty of this statistical model, however, limits the confidence we can attribute to the associated quantitative projections. The technique does, however, allow the selection of a subset of relevant regional climate model members that should be used in priority for future deterministic projections, providing adequate coverage of the uncertainties. We thereby propose a smart ensemble exploration strategy that can also be used for other impact studies beyond air quality.

  12. Lattice QCD Thermodynamics and RHIC-BES Particle Production within Generic Nonextensive Statistics

    NASA Astrophysics Data System (ADS)

    Tawfik, Abdel Nasser

    2018-05-01

    The current status of implementing Tsallis (nonextensive) statistics in high-energy physics is briefly reviewed. The remarkably low freezeout temperature, which apparently fails to reproduce the first-principle lattice QCD thermodynamics and the measured particle ratios, etc., is discussed. The present work suggests a novel interpretation for the so-called "Tsallis temperature". It is proposed that the low Tsallis temperature is due to incomplete implementation of Tsallis algebra, through exponential and logarithmic functions, in high-energy particle production. Substituting Tsallis algebra into the grand-canonical partition function of the hadron resonance gas model does not seem to assure full incorporation of nonextensivity or correlations in that model. The statistics describing the phase-space volume, the number of states and the possible changes in the elementary cells should rather be modified due to the interacting correlated subsystems of which the phase space consists. Alternatively, two asymptotic properties, each associated with a scaling function, are utilized to classify a generalized entropy for such a system with a large ensemble (produced particles) and strong correlations. Both scaling exponents define equivalence classes for all interacting and noninteracting systems and unambiguously characterize any statistical system in its thermodynamic limit. We conclude that the nature of lattice QCD simulations is apparently extensive and accordingly Boltzmann-Gibbs statistics is fully fulfilled. Furthermore, we found that the ratios of various particle yields at the extreme high and extreme low energies of RHIC-BES are likely nonextensive but not necessarily of Tsallis type.
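
    For reference, the generic (Tsallis) nonextensive entropy and the q-exponential that replaces the ordinary exponential in such fits are, in standard textbook form (not expressions taken from this paper),

    $$ S_q = k\,\frac{1 - \sum_i p_i^{\,q}}{q - 1}, \qquad \exp_q(x) = \bigl[1 + (1 - q)\,x\bigr]^{1/(1-q)}, $$

    both of which reduce to the Boltzmann-Gibbs entropy and the ordinary exponential in the limit q → 1.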

  13. Assessment of six dissimilarity metrics for climate analogues

    NASA Astrophysics Data System (ADS)

    Grenier, Patrick; Parent, Annie-Claude; Huard, David; Anctil, François; Chaumont, Diane

    2013-04-01

    Spatial analogue techniques consist of identifying locations whose recent-past climate is similar in some aspects to the future climate anticipated at a reference location. When identifying analogues, one key step is the quantification of the dissimilarity between two climates separated in time and space, which involves the choice of a metric. In this communication, spatial analogues and their usefulness are briefly discussed. Next, six metrics are presented (the standardized Euclidean distance, the Kolmogorov-Smirnov statistic, the nearest-neighbor distance, the Zech-Aslan energy statistic, the Friedman-Rafsky runs statistic and the Kullback-Leibler divergence), along with a set of criteria used for their assessment. The related case study involves the use of numerical simulations performed with the Canadian Regional Climate Model (CRCM-v4.2.3), from which three annual indicators (total precipitation, heating degree-days and cooling degree-days) are calculated over 30-year periods (1971-2000 and 2041-2070). Results indicate that the six metrics identify comparable analogue regions at a relatively large scale, but the best analogues may differ substantially. For the best analogues, it is also shown that the uncertainty stemming from the choice of metric generally does not exceed that stemming from the choice of simulation or model. A synthesis of the advantages and drawbacks of each metric is finally presented, in which the Zech-Aslan energy statistic stands out as the most recommended metric for analogue studies, whereas the Friedman-Rafsky runs statistic is the least recommended, based on this case study.
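
    For concreteness, the sketch below computes two of the six metrics on synthetic 30-year indicator samples for a reference and a candidate site: the standardized Euclidean distance between climatological means and the per-indicator Kolmogorov-Smirnov statistic. The numbers are invented, and the remaining four metrics are not implemented here.

    ```python
    # Two dissimilarity metrics between a reference climate and a candidate analogue.
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(3)
    # 30 annual values x 3 indicators (precipitation, HDD, CDD) per location (synthetic).
    future_ref = rng.normal([900.0, 4200.0, 300.0], [120.0, 350.0, 60.0], size=(30, 3))
    candidate  = rng.normal([950.0, 4000.0, 340.0], [110.0, 300.0, 70.0], size=(30, 3))

    # Standardized Euclidean distance between the 30-year mean climates.
    scale = future_ref.std(axis=0, ddof=1)
    sed = np.sqrt((((future_ref.mean(axis=0) - candidate.mean(axis=0)) / scale) ** 2).sum())

    # Kolmogorov-Smirnov statistic for each indicator's 30-year distribution.
    ks = [ks_2samp(future_ref[:, j], candidate[:, j]).statistic for j in range(3)]

    print(f"standardized Euclidean distance: {sed:.2f}")
    print("KS statistics per indicator:", np.round(ks, 2))
    ```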

  14. Robust Dehaze Algorithm for Degraded Image of CMOS Image Sensors.

    PubMed

    Qu, Chen; Bi, Du-Yan; Sui, Ping; Chao, Ai-Nong; Wang, Yun-Fei

    2017-09-22

    The CMOS (Complementary Metal-Oxide-Semiconductor) sensor is a new type of solid-state image sensor device widely used in object tracking, object recognition, intelligent navigation, and other fields. However, images captured by outdoor CMOS sensor devices are usually affected by suspended atmospheric particles (such as haze), causing a reduction in image contrast, color distortion, and other problems. In view of this, we propose a novel dehazing approach based on a locally consistent Markov random field (MRF) framework. The neighboring clique in the traditional MRF is extended to a non-neighboring clique, defined on locally consistent blocks based on two cues, in which both the atmospheric light and the transmission map satisfy local consistency. In this framework, our model can strengthen constraints across the whole image while incorporating more sophisticated statistical priors, resulting in greater modeling expressiveness and thus effectively addressing inadequate detail recovery and alleviating color distortion. Moreover, the locally consistent MRF framework recovers detail while maintaining better dehazing results, which effectively improves the quality of images captured by the CMOS image sensor. Experimental results verified that the proposed method combines the advantages of detail recovery and color preservation.

  15. Simpler score of routine laboratory tests predicts liver fibrosis in patients with chronic hepatitis B.

    PubMed

    Zhou, Kun; Gao, Chun-Fang; Zhao, Yun-Peng; Liu, Hai-Lin; Zheng, Rui-Dan; Xian, Jian-Chun; Xu, Hong-Tao; Mao, Yi-Min; Zeng, Min-De; Lu, Lun-Gen

    2010-09-01

    In recent years, great interest has been devoted to the development of noninvasive predictive models to substitute for liver biopsy in fibrosis assessment and follow-up. Our aim was to provide a simpler model consisting of routine laboratory markers for predicting liver fibrosis in patients chronically infected with hepatitis B virus (HBV) in order to optimize their clinical management. Liver fibrosis was staged in 386 chronic HBV carriers who underwent liver biopsy and routine laboratory testing. Correlations between routine laboratory markers and fibrosis stage were statistically assessed. After logistic regression analysis, a novel predictive model was constructed. This S index was validated in an independent cohort of 146 chronic HBV carriers in comparison to the SLFG model, Fibrometer, Hepascore, the Hui model, the Forns score and APRI using receiver operating characteristic (ROC) curves. The diagnostic value of each marker panel was better than that of single routine laboratory markers. The S index, consisting of gamma-glutamyltransferase (GGT), platelets (PLT) and albumin (ALB) (S index = 1000 x GGT/(PLT x ALB^2)), had a higher diagnostic accuracy in predicting the degree of fibrosis than any other mathematical model tested. The areas under the ROC curves (AUROC) were 0.812 and 0.890 for predicting significant fibrosis and cirrhosis in the validation cohort, respectively. The S index, a simpler mathematical model consisting of routine laboratory markers, predicts significant fibrosis and cirrhosis in patients with chronic HBV infection with a high degree of accuracy, potentially decreasing the need for liver biopsy.
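
    The S index defined above is a direct formula and can be transcribed as a one-line function; units and decision cutoffs follow the original paper and are not restated here, and the example inputs are made up.

    ```python
    # S index = 1000 * GGT / (PLT * ALB^2), with units as in the source paper.
    def s_index(ggt, plt, alb):
        return 1000.0 * ggt / (plt * alb ** 2)

    print(f"S index = {s_index(ggt=60, plt=180, alb=40.0):.3f}")  # made-up inputs
    ```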

  16. Consistency among distance measurements: transparency, BAO scale and accelerated expansion

    NASA Astrophysics Data System (ADS)

    Avgoustidis, Anastasios; Verde, Licia; Jimenez, Raul

    2009-06-01

    We explore consistency among different distance measures, including Supernovae Type Ia data, measurements of the Hubble parameter, and determinations of the baryon acoustic oscillation scale. We present new constraints on cosmic transparency combining H(z) data together with the latest Supernovae Type Ia data compilation. This combination, in the context of a flat ΛCDM model, improves current constraints by nearly an order of magnitude, although the constraints presented here are parametric rather than non-parametric. We re-examine the recently reported tension between the baryon acoustic oscillation scale and Supernovae data in light of possible deviations from transparency, concluding that the source of the discrepancy is most likely to be found among systematic effects in the modelling of the low-redshift data or a simple ~2-σ statistical fluke, rather than in exotic physics. Finally, we attempt to draw model-independent conclusions about the recent accelerated expansion, determining the acceleration redshift to be z_acc = 0.35 (+0.20, -0.13) at 1-σ.

  17. Statistical performance and information content of time lag analysis and redundancy analysis in time series modeling.

    PubMed

    Angeler, David G; Viedma, Olga; Moreno, José M

    2009-11-01

    Time lag analysis (TLA) is a distance-based approach used to study the temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other, more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinates of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to decrease the statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of, and information content provided by, RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change in ecological (pseudo-)communities.

  18. An evaluation of three statistical estimation methods for assessing health policy effects on prescription drug claims.

    PubMed

    Mittal, Manish; Harrison, Donald L; Thompson, David M; Miller, Michael J; Farmer, Kevin C; Ng, Yu-Tze

    2016-01-01

    While the choice of analytical approach affects study results and their interpretation, there is no consensus to guide the choice of statistical approaches to evaluate public health policy change. This study compared and contrasted three statistical estimation procedures in the assessment of a U.S. Food and Drug Administration (FDA) suicidality warning, communicated in January 2008 and implemented in May 2009, on antiepileptic drug (AED) prescription claims. Longitudinal designs were utilized to evaluate Oklahoma (U.S. State) Medicaid claim data from January 2006 through December 2009. The study included 9289 continuously eligible individuals with prevalent diagnoses of epilepsy and/or psychiatric disorder. Segmented regression models using three estimation procedures [i.e., generalized linear models (GLM), generalized estimation equations (GEE), and generalized linear mixed models (GLMM)] were used to estimate trends of AED prescription claims across three time periods: before (January 2006-January 2008); during (February 2008-May 2009); and after (June 2009-December 2009) the FDA warning. All three statistical procedures estimated an increasing trend (P < 0.0001) in AED prescription claims before the FDA warning period. No procedures detected a significant change in trend during (GLM: -30.0%, 99% CI: -60.0% to 10.0%; GEE: -20.0%, 99% CI: -70.0% to 30.0%; GLMM: -23.5%, 99% CI: -58.8% to 1.2%) and after (GLM: 50.0%, 99% CI: -70.0% to 160.0%; GEE: 80.0%, 99% CI: -20.0% to 200.0%; GLMM: 47.1%, 99% CI: -41.2% to 135.3%) the FDA warning when compared to pre-warning period. Although the three procedures provided consistent inferences, the GEE and GLMM approaches accounted appropriately for correlation. Further, marginal models estimated using GEE produced more robust and valid population-level estimations. Copyright © 2016 Elsevier Inc. All rights reserved.
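
    A hedged sketch of one of the three estimation procedures compared above: a segmented (interrupted time-series) Poisson model fitted with GEE in statsmodels. The monthly claim counts, period indicators and exchangeable working correlation below are illustrative stand-ins, not the study's data or final specification.

    ```python
    # Segmented regression of simulated monthly claim counts with GEE (Poisson, exchangeable).
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    months = np.arange(48)                       # Jan 2006 - Dec 2009
    warning = (months >= 25).astype(int)         # Feb 2008 onward ("during" period)
    post = (months >= 41).astype(int)            # Jun 2009 onward ("after" period)
    rows = []
    for sid in range(200):                       # hypothetical subjects
        lam = np.exp(-1.0 + 0.01 * months - 0.05 * warning - 0.02 * post
                     + rng.normal(0, 0.2))       # subject-level heterogeneity
        rows.append(pd.DataFrame({"id": sid, "month": months, "warning": warning,
                                  "post": post, "claims": rng.poisson(lam)}))
    df = pd.concat(rows, ignore_index=True)

    gee = smf.gee("claims ~ month + warning + warning:month + post + post:month",
                  groups="id", data=df,
                  family=sm.families.Poisson(),
                  cov_struct=sm.cov_struct.Exchangeable()).fit()
    print(gee.summary())
    ```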

  19. Adaptation of a Fast Optimal Interpolation Algorithm to the Mapping of Oceanographic Data

    NASA Technical Reports Server (NTRS)

    Menemenlis, Dimitris; Fieguth, Paul; Wunsch, Carl; Willsky, Alan

    1997-01-01

    A fast, recently developed, multiscale optimal interpolation algorithm has been adapted to the mapping of hydrographic and other oceanographic data. This algorithm produces solution and error estimates which are consistent with those obtained from exact least squares methods, but at a small fraction of the computational cost. Problems whose solution would be completely impractical using exact least squares, that is, problems with tens or hundreds of thousands of measurements and estimation grid points, can easily be solved on a small workstation using the multiscale algorithm. In contrast to methods previously proposed for solving large least squares problems, our approach provides estimation error statistics while permitting long-range correlations, using all measurements, and permitting arbitrary measurement locations. The multiscale algorithm itself, published elsewhere, is not the focus of this paper. However, the algorithm requires statistical models having a very particular multiscale structure; it is the development of a class of multiscale statistical models, appropriate for oceanographic mapping problems, with which we concern ourselves in this paper. The approach is illustrated by mapping temperature in the northeastern Pacific. The number of hydrographic stations is kept deliberately small to show that multiscale and exact least squares results are comparable. A portion of the data were not used in the analysis; these data serve to test the multiscale estimates. A major advantage of the present approach is the ability to repeat the estimation procedure a large number of times for sensitivity studies, parameter estimation, and model testing. We have made available by anonymous Ftp a set of MATLAB-callable routines which implement the multiscale algorithm and the statistical models developed in this paper.

  20. Grain boundary oxidation and fatigue crack growth at elevated temperatures

    NASA Technical Reports Server (NTRS)

    Liu, H. W.; Oshida, Y.

    1986-01-01

    Fatigue crack growth rate at elevated temperatures can be accelerated by grain boundary oxidation. Grain boundary oxidation kinetics and the statistical distribution of grain boundary oxide penetration depth were studied. At a constant delta K-level and at a constant test temperature, fatigue crack growth rate, da/dN, is a function of cyclic frequency, nu. A fatigue crack growth model of intermittent micro-ruptures of grain boundary oxide is constructed. The model is consistent with the experimental observations that, in the low frequency region, da/dN is inversely proportional to nu, and fatigue crack growth is intergranular.
