Sample records for statistical model relating

  1. A Statistical Test for Comparing Nonnested Covariance Structure Models.

    ERIC Educational Resources Information Center

    Levy, Roy; Hancock, Gregory R.

    While statistical procedures are well known for comparing hierarchically related (nested) covariance structure models, statistical tests for comparing nonhierarchically related (nonnested) models have proven more elusive. While isolated attempts have been made, none exists within the commonly used maximum likelihood estimation framework, thereby…

  2. Statistical Learning is Related to Early Literacy-Related Skills

    PubMed Central

    Spencer, Mercedes; Kaschak, Michael P.; Jones, John L.; Lonigan, Christopher J.

    2015-01-01

    It has been demonstrated that statistical learning, or the ability to use statistical information to learn the structure of one’s environment, plays a role in young children’s acquisition of linguistic knowledge. Although most research on statistical learning has focused on language acquisition processes, such as the segmentation of words from fluent speech and the learning of syntactic structure, some recent studies have explored the extent to which individual differences in statistical learning are related to literacy-relevant knowledge and skills. The present study extends this literature by investigating the relations between two measures of statistical learning and multiple measures of skills that are critical to the development of literacy—oral language, vocabulary knowledge, and phonological processing—within a single model. Our sample included a total of 553 typically developing children from prekindergarten through second grade. Structural equation modeling revealed that statistical learning accounted for a unique portion of the variance in these literacy-related skills. Practical implications for instruction and assessment are discussed. PMID:26478658

  3. Crash Lethality Model

    DTIC Science & Technology

    2012-06-06

    [Table-of-contents fragments from the source document: Statistical Data; Parametric Model for Rotor Wing Debris Area; Skid Distance Statistical Data.] …The curve that related the BC value to the probability of skull fracture resulted in a tight confidence interval and a two-tailed statistical p…

  4. Assessment of corneal properties based on statistical modeling of OCT speckle.

    PubMed

    Jesus, Danilo A; Iskander, D Robert

    2017-01-01

    A new approach to assess the properties of the corneal micro-structure in vivo, based on the statistical modeling of speckle obtained from Optical Coherence Tomography (OCT), is presented. A number of statistical models were proposed to fit the corneal speckle data obtained from raw OCT images. Short-term changes in corneal properties were studied by inducing corneal swelling, whereas age-related changes were observed by analyzing data of sixty-five subjects aged between twenty-four and seventy-three years. The Generalized Gamma distribution was shown to be the best model, in terms of Akaike's Information Criterion, for fitting the OCT corneal speckle. Its parameters showed statistically significant differences (Kruskal-Wallis, p < 0.001) for both short-term and age-related corneal changes. In addition, it was observed that age-related changes influence the corneal biomechanical behaviour when corneal swelling is induced. This study shows that the Generalized Gamma distribution can be utilized to model corneal speckle in OCT in vivo, providing complementary quantitative information where the micro-structure of corneal tissue is of essence.
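
The model-selection step described above — fitting candidate distributions to speckle amplitudes and ranking them by Akaike's Information Criterion — can be sketched as follows. The data here are synthetic draws, not OCT measurements, and the candidate set is an illustrative assumption:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-in for OCT speckle amplitudes (no real data is used here).
data = stats.gengamma.rvs(a=2.0, c=1.5, scale=0.8, size=5000, random_state=rng)

def aic(dist, data):
    """Fit `dist` by maximum likelihood (location fixed at 0) and return AIC."""
    params = dist.fit(data, floc=0)
    loglik = np.sum(dist.logpdf(data, *params))
    return 2 * len(params) - 2 * loglik   # crude count: includes the fixed loc

candidates = {"gengamma": stats.gengamma, "gamma": stats.gamma,
              "rayleigh": stats.rayleigh, "lognorm": stats.lognorm}
scores = {name: aic(d, data) for name, d in candidates.items()}
best = min(scores, key=scores.get)        # lowest AIC wins
```

With real speckle data the same loop would simply replace the synthetic `data` array.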

  5. Evaluating measurement models in clinical research: covariance structure analysis of latent variable models of self-conception.

    PubMed

    Hoyle, R H

    1991-02-01

    Indirect measures of psychological constructs are vital to clinical research. On occasion, however, the meaning of indirect measures of psychological constructs is obfuscated by statistical procedures that do not account for the complex relations between items and latent variables and among latent variables. Covariance structure analysis (CSA) is a statistical procedure for testing hypotheses about the relations among items that indirectly measure a psychological construct and relations among psychological constructs. This article introduces clinical researchers to the strengths and limitations of CSA as a statistical procedure for conceiving and testing structural hypotheses that are not tested adequately with other statistical procedures. The article is organized around two empirical examples that illustrate the use of CSA for evaluating measurement models with correlated error terms, higher-order factors, and measured and latent variables.

  6. Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions.

    PubMed

    Canary, Jana D; Blizzard, Leigh; Barry, Ronald P; Hosmer, David W; Quinn, Stephen J

    2016-05-01

    Generalized linear models (GLM) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLM. Their properties make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLM with continuous covariates (GLMCC) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of link function chosen. We generalize the Tsiatis GOF statistic originally developed for logistic GLMCCs, (TG), so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J(2) ) statistics can be applied directly. In a simulation study, TG, HL, and J(2) were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The TG statistic consistently maintained Type I error rates, while those of HL and J(2) were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, TG had more power than HL or J(2) . © 2015 John Wiley & Sons Ltd/London School of Economics.
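
For readers unfamiliar with the Hosmer-Lemeshow construction against which the TG and J(2) statistics are compared, a minimal numpy sketch (group by fitted probability, chi-square over observed vs. expected event counts) looks like this; it is a generic illustration, not the paper's generalized TG statistic:

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p, g=10):
    """HL summary GOF statistic: sort by fitted probability, split into g
    groups, and compare observed vs. expected event counts in each group."""
    order = np.argsort(p)
    stat = 0.0
    for idx in np.array_split(order, g):
        n, obs, exp = len(idx), y[idx].sum(), p[idx].sum()
        pbar = exp / n
        stat += (obs - exp) ** 2 / (n * pbar * (1 - pbar))
    return stat, chi2.sf(stat, df=g - 2)   # conventional df for g groups

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))  # correctly specified logit model
y = rng.binomial(1, p_true)
stat, pval = hosmer_lemeshow(y, p_true)
```

In practice `p` would be the fitted probabilities from an estimated GLM rather than the true ones used here for brevity.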

  7. “Plateau”-related summary statistics are uninformative for comparing working memory models

    PubMed Central

    van den Berg, Ronald; Ma, Wei Ji

    2014-01-01

    Performance on visual working memory tasks decreases as more items need to be remembered. Over the past decade, a debate has unfolded between proponents of slot models and slotless models of this phenomenon. Zhang and Luck (2008) and Anderson, Vogel, and Awh (2011) noticed that as more items need to be remembered, “memory noise” seems to first increase and then reach a “stable plateau.” They argued that three summary statistics characterizing this plateau are consistent with slot models, but not with slotless models. Here, we assess the validity of their methods. We generated synthetic data both from a leading slot model and from a recent slotless model and quantified model evidence using log Bayes factors. We found that the summary statistics provided, at most, 0.15% of the expected model evidence in the raw data. In a model recovery analysis, a total of more than a million trials were required to achieve 99% correct recovery when models were compared on the basis of summary statistics, whereas fewer than 1,000 trials were sufficient when raw data were used. At realistic numbers of trials, plateau-related summary statistics are completely unreliable for model comparison. Applying the same analyses to subject data from Anderson et al. (2011), we found that the evidence in the summary statistics was, at most, 0.12% of the evidence in the raw data and far too weak to warrant any conclusions. These findings call into question claims about working memory that are based on summary statistics. PMID:24719235
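
The gap between evidence in raw data and in a summary statistic can be reproduced in a toy Gaussian setting: two point hypotheses about the variance, with the sample mean as a nearly uninformative summary. These models are illustrative assumptions, not the memory models compared in the paper:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 500
x = rng.normal(0.0, 1.0, size=n)      # data generated under model M1

# Log Bayes factor (M1: sd=1 vs M2: sd=2) computed from the raw data.
lbf_raw = norm.logpdf(x, 0, 1).sum() - norm.logpdf(x, 0, 2).sum()

# The same comparison using only the sample mean as a summary statistic.
t = x.mean()                          # t ~ N(0, sd^2/n) under either model
lbf_summary = (norm.logpdf(t, 0, np.sqrt(1 / n))
               - norm.logpdf(t, 0, np.sqrt(4 / n)))
# lbf_raw is large; lbf_summary is tiny: the summary discards the evidence.
```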

  8. Rasch fit statistics and sample size considerations for polytomous data.

    PubMed

    Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael

    2008-05-29

    Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire - 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.
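
The infit and outfit mean squares under discussion can be sketched for the simpler dichotomous Rasch case (the study itself concerns polytomous models); true abilities and difficulties are used here in place of estimates, an assumption for brevity:

```python
import numpy as np

rng = np.random.default_rng(3)
n_persons, n_items = 400, 10
theta = rng.normal(0, 1, n_persons)           # person abilities
b = np.linspace(-1.5, 1.5, n_items)           # item difficulties

p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))  # Rasch probabilities
x = rng.binomial(1, p)                        # simulated item responses

w = p * (1 - p)                               # model variance per response
z2 = (x - p) ** 2 / w                         # squared standardized residuals
outfit = z2.mean(axis=0)                      # unweighted mean square, per item
infit = (w * z2).sum(axis=0) / w.sum(axis=0)  # information-weighted mean square
# Values near 1 indicate fit; published ranges (e.g. 0.7-1.3) flag misfit.
```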

  9. Rasch fit statistics and sample size considerations for polytomous data

    PubMed Central

    Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael

    2008-01-01

    Background Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Methods Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire – 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. Results The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. Conclusion It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges. PMID:18510722

  10. Assessment of corneal properties based on statistical modeling of OCT speckle

    PubMed Central

    Jesus, Danilo A.; Iskander, D. Robert

    2016-01-01

    A new approach to assess the properties of the corneal micro-structure in vivo, based on the statistical modeling of speckle obtained from Optical Coherence Tomography (OCT), is presented. A number of statistical models were proposed to fit the corneal speckle data obtained from raw OCT images. Short-term changes in corneal properties were studied by inducing corneal swelling, whereas age-related changes were observed by analyzing data of sixty-five subjects aged between twenty-four and seventy-three years. The Generalized Gamma distribution was shown to be the best model, in terms of Akaike's Information Criterion, for fitting the OCT corneal speckle. Its parameters showed statistically significant differences (Kruskal-Wallis, p < 0.001) for both short-term and age-related corneal changes. In addition, it was observed that age-related changes influence the corneal biomechanical behaviour when corneal swelling is induced. This study shows that the Generalized Gamma distribution can be utilized to model corneal speckle in OCT in vivo, providing complementary quantitative information where the micro-structure of corneal tissue is of essence. PMID:28101409

  11. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  12. Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

    PubMed

    Dai, Qi; Yang, Yanchun; Wang, Tianming

    2008-10-15

    Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, Markov model or both. Motivated by adding k-word distributions to Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating curve) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measure is used for phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into a Markov model, are more efficient.
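
The flavor of combining k-word counts with a Markov background can be conveyed by a D2*-style toy measure on background-centred counts; this is a simplified cousin of, not a reproduction of, the paper's wre.k.r and S2.k.r measures:

```python
import numpy as np
from itertools import product

def kmer_counts(seq, k):
    """Observed k-word counts over the ACGT alphabet."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    idx = {m: i for i, m in enumerate(kmers)}
    c = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        c[idx[seq[i:i + k]]] += 1
    return c

def expected_counts(seq, k):
    """Expected k-word counts under a 0th-order Markov (i.i.d.) background."""
    base = np.array([seq.count(ch) for ch in "ACGT"], float)
    base /= base.sum()
    n = len(seq) - k + 1
    return n * np.array([np.prod([base["ACGT".index(ch)] for ch in p])
                         for p in product("ACGT", repeat=k)])

def d2_star(s1, s2, k=3):
    """Correlation of background-centred k-word counts of two sequences."""
    c1 = kmer_counts(s1, k) - expected_counts(s1, k)
    c2 = kmer_counts(s2, k) - expected_counts(s2, k)
    return float(c1 @ c2 / (np.linalg.norm(c1) * np.linalg.norm(c2)))

rng = np.random.default_rng(9)
a = "".join(rng.choice(list("ACGT"), 1000))
mut = list(a)
for i in rng.choice(1000, 50, replace=False):   # 5% point mutations
    mut[i] = "ACGT"[rng.integers(4)]
related, unrelated = "".join(mut), "".join(rng.choice(list("ACGT"), 1000))
```

A lightly mutated copy scores higher than an unrelated random sequence, which is the property similarity search relies on.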

  13. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

    In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.
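
One of the simplest checks behind such comparisons is graphical: under proportional hazards, the log cumulative-hazard curves of two groups differ by a constant. A numpy sketch with synthetic exponential data (true hazard ratio 2, no censoring — an assumption for brevity):

```python
import numpy as np

def nelson_aalen(times):
    """Nelson-Aalen cumulative hazard estimate for uncensored event times."""
    t = np.sort(times)
    at_risk = len(t) - np.arange(len(t))
    return t, np.cumsum(1.0 / at_risk)

rng = np.random.default_rng(4)
g1 = rng.exponential(1.0, 3000)    # baseline group, hazard 1
g2 = rng.exponential(0.5, 3000)    # hazard 2: proportional hazards hold exactly

t1, H1 = nelson_aalen(g1)
t2, H2 = nelson_aalen(g2)
grid = np.linspace(0.1, 1.0, 50)
diff = np.log(np.interp(grid, t2, H2)) - np.log(np.interp(grid, t1, H1))
# Under PH, `diff` is flat (close to log 2 here); a trend or crossing in
# `diff` is the kind of departure the formal tests above are built to detect.
```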

  14. "Plateau"-related summary statistics are uninformative for comparing working memory models.

    PubMed

    van den Berg, Ronald; Ma, Wei Ji

    2014-10-01

    Performance on visual working memory tasks decreases as more items need to be remembered. Over the past decade, a debate has unfolded between proponents of slot models and slotless models of this phenomenon (Ma, Husain, & Bays, Nature Neuroscience 17, 347-356, 2014). Zhang and Luck (Nature 453, (7192), 233-235, 2008) and Anderson, Vogel, and Awh (Attention, Perception, Psychophys 74, (5), 891-910, 2011) noticed that as more items need to be remembered, "memory noise" seems to first increase and then reach a "stable plateau." They argued that three summary statistics characterizing this plateau are consistent with slot models, but not with slotless models. Here, we assess the validity of their methods. We generated synthetic data both from a leading slot model and from a recent slotless model and quantified model evidence using log Bayes factors. We found that the summary statistics provided at most 0.15 % of the expected model evidence in the raw data. In a model recovery analysis, a total of more than a million trials were required to achieve 99 % correct recovery when models were compared on the basis of summary statistics, whereas fewer than 1,000 trials were sufficient when raw data were used. Therefore, at realistic numbers of trials, plateau-related summary statistics are highly unreliable for model comparison. Applying the same analyses to subject data from Anderson et al. (Attention, Perception, Psychophys 74, (5), 891-910, 2011), we found that the evidence in the summary statistics was at most 0.12 % of the evidence in the raw data and far too weak to warrant any conclusions. The evidence in the raw data, in fact, strongly favored the slotless model. These findings call into question claims about working memory that are based on summary statistics.

  15. A Mediation Model to Explain the Role of Mathematics Skills and Probabilistic Reasoning on Statistics Achievement

    ERIC Educational Resources Information Center

    Primi, Caterina; Donati, Maria Anna; Chiesi, Francesca

    2016-01-01

    Among the wide range of factors related to the acquisition of statistical knowledge, competence in basic mathematics, including basic probability, has received much attention. In this study, a mediation model was estimated to derive the total, direct, and indirect effects of mathematical competence on statistics achievement taking into account…
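
The total/direct/indirect decomposition mentioned above reduces, in the linear case, to a pair of regressions. A numpy sketch with simulated (hypothetical) math, reasoning, and achievement scores:

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients, intercept first."""
    Xd = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Xd, y, rcond=None)[0]

rng = np.random.default_rng(6)
n = 1000
math_skill = rng.normal(size=n)                       # predictor
reasoning = 0.6 * math_skill + rng.normal(size=n)     # mediator
achievement = 0.3 * math_skill + 0.5 * reasoning + rng.normal(size=n)

total = ols(math_skill, achievement)[1]               # c
coefs = ols(np.column_stack([math_skill, reasoning]), achievement)
direct, b = coefs[1], coefs[2]                        # c' and b
a = ols(math_skill, reasoning)[1]
indirect = a * b                                      # mediated effect a*b
# In linear OLS the decomposition is exact: total == direct + indirect.
```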

  16. Hyperparameterization of soil moisture statistical models for North America with Ensemble Learning Models (Elm)

    NASA Astrophysics Data System (ADS)

    Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.

    2017-12-01

    Hyperparameterization of statistical models, i.e. automated model scoring and selection via evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There, Elm uses the NSGA-2 multiobjective optimization algorithm to optimize statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used to automate selection of soil moisture forecast statistical models for North America.
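
Stripped of Elm's infrastructure, the core loop the abstract describes — sample hyperparameters, score each candidate model on held-out data, keep the best — fits in a few lines. Ridge regression on synthetic data stands in for the soil moisture models:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 20))
w_true = np.zeros(20); w_true[:3] = [1.5, -2.0, 1.0]      # sparse signal
y = X @ w_true + rng.normal(scale=2.0, size=300)
X_tr, X_va, y_tr, y_va = X[:200], X[200:], y[:200], y[200:]

def ridge_fit(X, y, lam):
    """Closed-form ridge regression coefficients."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def score(lam):
    """Held-out mean squared error for regularization strength `lam`."""
    w = ridge_fit(X_tr, y_tr, lam)
    return np.mean((X_va @ w - y_va) ** 2)

# Randomized search over a log-uniform range, as in automated model scoring.
candidates = 10 ** rng.uniform(-3, 3, size=40)
best_lam = min(candidates, key=score)
```

Evolutionary schemes such as NSGA-2 replace the random draw with a population that is mutated and recombined between scoring rounds, but the score-and-select skeleton is the same.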

  17. Statistical models for predicting pair dispersion and particle clustering in isotropic turbulence and their applications

    NASA Astrophysics Data System (ADS)

    Zaichik, Leonid I.; Alipchenkov, Vladimir M.

    2009-10-01

    The purpose of this paper is twofold: (i) to advance and extend the statistical two-point models of pair dispersion and particle clustering in isotropic turbulence that were previously proposed by Zaichik and Alipchenkov (2003 Phys. Fluids 15 1776-87; 2007 Phys. Fluids 19 113308) and (ii) to present some applications of these models. The models developed are based on a kinetic equation for the two-point probability density function of the relative velocity distribution of two particles. These models predict the pair relative velocity statistics and the preferential accumulation of heavy particles in stationary and decaying homogeneous isotropic turbulent flows. Moreover, the models are applied to predict the effect of particle clustering on turbulent collisions, sedimentation and intensity of microwave radiation as well as to calculate the mean filtered subgrid stress of the particulate phase. Model predictions are compared with direct numerical simulations and experimental measurements.

  18. Cognitive Components Underpinning the Development of Model-Based Learning

    PubMed Central

    Potter, Tracey C.S.; Bryce, Nessa V.; Hartley, Catherine A.

    2016-01-01

    Reinforcement learning theory distinguishes “model-free” learning, which fosters reflexive repetition of previously rewarded actions, from “model-based” learning, which recruits a mental model of the environment to flexibly select goal-directed actions. Whereas model-free learning is evident across development, recruitment of model-based learning appears to increase with age. However, the cognitive processes underlying the development of model-based learning remain poorly characterized. Here, we examined whether age-related differences in cognitive processes underlying the construction and flexible recruitment of mental models predict developmental increases in model-based choice. In a cohort of participants aged 9–25, we examined whether the abilities to infer sequential regularities in the environment (“statistical learning”), maintain information in an active state (“working memory”) and integrate distant concepts to solve problems (“fluid reasoning”) predicted age-related improvements in model-based choice. We found that age-related improvements in statistical learning performance did not mediate the relationship between age and model-based choice. Ceiling performance on our working memory assay prevented examination of its contribution to model-based learning. However, age-related improvements in fluid reasoning statistically mediated the developmental increase in the recruitment of a model-based strategy. These findings suggest that gradual development of fluid reasoning may be a critical component process underlying the emergence of model-based learning. PMID:27825732

  19. Cognitive components underpinning the development of model-based learning.

    PubMed

    Potter, Tracey C S; Bryce, Nessa V; Hartley, Catherine A

    2017-06-01

    Reinforcement learning theory distinguishes "model-free" learning, which fosters reflexive repetition of previously rewarded actions, from "model-based" learning, which recruits a mental model of the environment to flexibly select goal-directed actions. Whereas model-free learning is evident across development, recruitment of model-based learning appears to increase with age. However, the cognitive processes underlying the development of model-based learning remain poorly characterized. Here, we examined whether age-related differences in cognitive processes underlying the construction and flexible recruitment of mental models predict developmental increases in model-based choice. In a cohort of participants aged 9-25, we examined whether the abilities to infer sequential regularities in the environment ("statistical learning"), maintain information in an active state ("working memory") and integrate distant concepts to solve problems ("fluid reasoning") predicted age-related improvements in model-based choice. We found that age-related improvements in statistical learning performance did not mediate the relationship between age and model-based choice. Ceiling performance on our working memory assay prevented examination of its contribution to model-based learning. However, age-related improvements in fluid reasoning statistically mediated the developmental increase in the recruitment of a model-based strategy. These findings suggest that gradual development of fluid reasoning may be a critical component process underlying the emergence of model-based learning. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. Quantum probability, choice in large worlds, and the statistical structure of reality.

    PubMed

    Ross, Don; Ladyman, James

    2013-06-01

    Classical probability models of incentive response are inadequate in "large worlds," where the dimensions of relative risk and the dimensions of similarity in outcome comparisons typically differ. Quantum probability models for choice in large worlds may be motivated pragmatically - there is no third theory - or metaphysically: statistical processing in the brain adapts to the true scale-relative structure of the universe.

  21. An order statistics approach to the halo model for galaxies

    NASA Astrophysics Data System (ADS)

    Paul, Niladri; Paranjape, Aseem; Sheth, Ravi K.

    2017-04-01

    We use the halo model to explore the implications of assuming that galaxy luminosities in groups are randomly drawn from an underlying luminosity function. We show that even the simplest of such order statistics models - one in which this luminosity function p(L) is universal - naturally produces a number of features associated with previous analyses based on the 'central plus Poisson satellites' hypothesis. These include the monotonic relation of mean central luminosity with halo mass, the lognormal distribution around this mean and the tight relation between the central and satellite mass scales. However, in stark contrast to observations of galaxy clustering, this model predicts no luminosity dependence of large-scale clustering. We then show that an extended version of this model, based on the order statistics of a halo mass dependent luminosity function p(L|m), is in much better agreement with the clustering data as well as satellite luminosities, but systematically underpredicts central luminosities. This brings into focus the idea that central galaxies constitute a distinct population that is affected by different physical processes than are the satellites. We model this physical difference as a statistical brightening of the central luminosities, over and above the order statistics prediction. The magnitude gap between the brightest and second brightest group galaxy is predicted as a by-product, and is also in good agreement with observations. We propose that this order statistics framework provides a useful language in which to compare the halo model for galaxies with more physically motivated galaxy formation models.
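
The paper's basic mechanism — draw group luminosities from a universal p(L); the brightest draw is the "central" — is easy to simulate. A lognormal p(L) is an illustrative stand-in for the actual luminosity function:

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_central(n_gal, n_groups=20000):
    """Mean luminosity of the brightest of n_gal draws from a universal p(L)."""
    L = rng.lognormal(mean=0.0, sigma=1.0, size=(n_groups, n_gal))
    return L.max(axis=1).mean()          # order statistic: the maximum

richness = [1, 2, 5, 10, 20]             # proxy for increasing halo mass
centrals = [mean_central(n) for n in richness]
# `centrals` rises monotonically with richness: the monotonic mean
# central-luminosity vs. halo-mass relation falls out of order statistics.
```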

  22. Statistically significant relational data mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann

    This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

  23. Testing prediction methods: Earthquake clustering versus the Poisson model

    USGS Publications Warehouse

    Michael, A.J.

    1997-01-01

    Testing earthquake prediction methods requires statistical techniques that compare observed success to random chance. One technique is to produce simulated earthquake catalogs and measure the relative success of predicting real and simulated earthquakes. The accuracy of these tests depends on the validity of the statistical model used to simulate the earthquakes. This study tests the effect of clustering in the statistical earthquake model on the results. Three simulation models were used to produce significance levels for a VLF earthquake prediction method. As the degree of simulated clustering increases, the statistical significance drops. Hence, the use of a seismicity model with insufficient clustering can lead to overly optimistic results. A successful method must pass the statistical tests with a model that fully replicates the observed clustering. However, a method can be rejected based on tests with a model that contains insufficient clustering. U.S. copyright. Published in 1997 by the American Geophysical Union.
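
The effect of clustering on the null model can be demonstrated directly: simulate a Poisson catalog and a clustered one and compare the dispersion of event counts. The aftershock mechanism below is a deliberately crude assumption, not an ETAS-style seismicity model:

```python
import numpy as np

rng = np.random.default_rng(8)
T = 1000.0

# Poisson catalog: event times uniform on [0, T].
poisson_times = np.sort(rng.uniform(0, T, 1000))

# Clustered catalog: each mainshock spawns a short burst of aftershocks.
mains = np.sort(rng.uniform(0, T, 250))
afters = np.concatenate([m + rng.exponential(0.5, 3) for m in mains])
clustered_times = np.sort(np.concatenate([mains, afters]))

def dispersion(times, nbins=100):
    """Index of dispersion of bin counts; equals 1 for a Poisson process."""
    counts, _ = np.histogram(times, bins=nbins, range=(0, T))
    return counts.var() / counts.mean()

d_poisson = dispersion(poisson_times)      # close to 1
d_clustered = dispersion(clustered_times)  # well above 1
# A simulated null with too little clustering (dispersion too near 1)
# understates the variance of chance successes, inflating significance.
```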

  24. GIA Model Statistics for GRACE Hydrology, Cryosphere, and Ocean Science

    NASA Astrophysics Data System (ADS)

    Caron, L.; Ivins, E. R.; Larour, E.; Adhikari, S.; Nilsson, J.; Blewitt, G.

    2018-03-01

    We provide a new analysis of glacial isostatic adjustment (GIA) with the goal of assembling the model uncertainty statistics required for rigorously extracting trends in surface mass from the Gravity Recovery and Climate Experiment (GRACE) mission. Such statistics are essential for deciphering sea level, ocean mass, and hydrological changes because the latter signals can be relatively small (≤2 mm/yr water height equivalent) over very large regions, such as major ocean basins and watersheds. With abundant new >7 year continuous measurements of vertical land motion (VLM) reported by Global Positioning System stations on bedrock and new relative sea level records, our new statistical evaluation of GIA uncertainties incorporates Bayesian methodologies. A unique aspect of the method is that both the ice history and 1-D Earth structure vary through a total of 128,000 forward models. We find that best fit models poorly capture the statistical inferences needed to correctly invert for lower mantle viscosity and that GIA uncertainty exceeds the uncertainty ascribed to trends from 14 years of GRACE data in polar regions.

  5. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.

    PubMed

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei

    2016-02-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazards models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models in which the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and with the sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have power higher than or similar to that of the Cox SKAT LRT, except when 50%/50% of causal variants have negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than the Cox BT LRT. The models and related test statistics can be useful in whole-genome and whole-exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.
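
    The LRT machinery described above can be illustrated with a minimal sketch: a Cox partial log-likelihood for a single hypothetical burden covariate, maximized over a grid, with the likelihood-ratio statistic referred to a chi-square(1) distribution. The data, the binary burden score, and the grid search are illustrative assumptions, not the authors' functional-regression implementation.

```python
import math
import numpy as np

def cox_partial_loglik(beta, time, event, x):
    """Breslow partial log-likelihood for a single covariate (no tied times)."""
    order = np.argsort(-time)            # sort descending so risk sets accumulate
    d, z = event[order], x[order]
    eta = beta * z
    risk = np.cumsum(np.exp(eta))        # sum of exp(eta) over each risk set
    return float(np.sum((eta - np.log(risk))[d]))

# Toy data: a hypothetical binary burden score (e.g. rare-variant carrier status)
rng = np.random.default_rng(0)
n = 200
x = rng.binomial(1, 0.3, n).astype(float)
time = rng.exponential(1.0 / np.exp(0.7 * x))   # true log-hazard ratio 0.7
event = np.ones(n, dtype=bool)                  # no censoring in this sketch

# Maximize the partial likelihood over a grid and form the LRT against beta = 0
betas = np.linspace(-2.0, 2.0, 401)
ll = np.array([cox_partial_loglik(b, time, event, x) for b in betas])
beta_hat = betas[ll.argmax()]
lrt = 2.0 * (ll.max() - cox_partial_loglik(0.0, time, event, x))
p_value = math.erfc(math.sqrt(lrt / 2.0))       # chi-square(1) upper tail
```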

  6. Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions

    PubMed Central

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei

    2015-01-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazards models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models in which the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and with the sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have power higher than or similar to that of the Cox SKAT LRT, except when 50%/50% of causal variants have negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than the Cox BT LRT. The models and related test statistics can be useful in whole-genome and whole-exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979

  7. Two statistics for evaluating parameter identifiability and error reduction

    USGS Publications Warehouse

    Doherty, John; Hunt, Randall J.

    2009-01-01

    Two statistics are presented that can be used to rank input parameters utilized by a model in terms of their relative identifiability based on a given or possible future calibration dataset. Identifiability is defined here as the capability of model calibration to constrain parameters used by a model. Both statistics require that the sensitivity of each model parameter be calculated for each model output for which there are actual or presumed field measurements. Singular value decomposition (SVD) of the weighted sensitivity matrix is then undertaken to quantify the relation between the parameters and observations that, in turn, allows selection of calibration solution and null spaces spanned by unit orthogonal vectors. The first statistic presented, "parameter identifiability", is quantitatively defined as the direction cosine between a parameter and its projection onto the calibration solution space. This varies between zero and one, with zero indicating complete non-identifiability and one indicating complete identifiability. The second statistic, "relative error reduction", indicates the extent to which the calibration process reduces error in estimation of a parameter from its pre-calibration level, where its value must be assigned purely on the basis of prior expert knowledge. This is more sophisticated than identifiability, in that it takes greater account of the noise associated with the calibration dataset. Like identifiability, it has a maximum value of one (which can only be achieved if there is no measurement noise). Conceptually it can fall to zero, and even below zero, if a calibration problem is poorly posed. An example, based on a coupled groundwater/surface-water model, is included that demonstrates the utility of the statistics. © 2009 Elsevier B.V.
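
    The identifiability statistic can be sketched in a few lines: take the SVD of a weighted sensitivity matrix, truncate to the calibration solution space, and compute the direction cosine of each parameter axis with its projection onto that space. The matrix and the rank threshold below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical weighted sensitivity (Jacobian) matrix: one row per observation,
# one column per parameter. The last parameter deliberately has zero sensitivity.
rng = np.random.default_rng(1)
J = rng.normal(size=(30, 5))
J[:, 4] = 0.0

U, s, Vt = np.linalg.svd(J, full_matrices=False)
k = int(np.sum(s > s[0] * 1e-8))   # dimension of the calibration solution space
V_sol = Vt[:k].T                   # solution-space basis (rows index parameters)

# Identifiability of parameter i: direction cosine between the i-th parameter
# axis and its projection onto the solution space, i.e. the norm of that
# projection. It equals 1 for fully identifiable parameters and 0 for
# completely non-identifiable ones.
identifiability = np.linalg.norm(V_sol, axis=1)
```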

  8. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    PubMed Central

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain, making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. Several methods are available to analyse FNI data, and none is optimal for all purposes. In order to make optimal use of the methods available, it is important to know their limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview of some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  9. A Stochastic Model of Space-Time Variability of Mesoscale Rainfall: Statistics of Spatial Averages

    NASA Technical Reports Server (NTRS)

    Kundu, Prasun K.; Bell, Thomas L.

    2003-01-01

    A characteristic feature of rainfall statistics is that they depend on the space and time scales over which rain data are averaged. A previously developed spectral model of rain statistics, designed to capture this property, predicts power-law scaling behavior for the second-moment statistics of area-averaged rain rate as the averaging length scale L → 0. In the present work a more efficient method of estimating the model parameters is presented and used to fit the model to the statistics of area-averaged rain rate derived from gridded radar precipitation data from TOGA COARE. Statistical properties of the data and the model predictions are compared over a wide range of averaging scales. An extension of the spectral model scaling relations to describe the dependence of the average fraction of grid boxes within an area containing nonzero rain (the "rainy area fraction") on the grid scale L is also explored.
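
    The scaling exponent in a relation of the form var(L) ~ c * L**(-gamma) can be recovered by a log-log least-squares fit. The numbers below are synthetic stand-ins, not TOGA COARE statistics.

```python
import numpy as np

# Synthetic second-moment statistics of area-averaged rain rate at several
# averaging length scales L (km); generated from var ~ c * L**(-gamma) with
# small multiplicative perturbations standing in for sampling noise.
L_km = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
variance = 5.0 * L_km**-0.8 * (1.0 + 0.02 * np.array([1.0, -1.0, 0.5, -0.5, 0.0]))

# Power-law scaling exponent from an ordinary log-log least-squares fit
slope, intercept = np.polyfit(np.log(L_km), np.log(variance), 1)
gamma_hat = -slope
```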

  10. 2015 Workplace and Gender Relations Survey of Reserve Component Members: Statistical Methodology Report

    DTIC Science & Technology

    2016-03-17

    RESERVE COMPONENT MEMBERS: STATISTICAL METHODOLOGY REPORT Defense Research, Surveys, and Statistics Center (RSSC) Defense Manpower Data Center...Defense Manpower Data Center (DMDC) is indebted to numerous people for their assistance with the 2015 Workplace and Gender Relations Survey of Reserve...outcomes were modeled as a function of an extensive set of administrative variables available for both respondents and nonrespondents, resulting in six

  11. The lz(p)* Person-Fit Statistic in an Unfolding Model Context.

    PubMed

    Tendeiro, Jorge N

    2017-01-01

    Although person-fit analysis has a long-standing tradition within item response theory, it has been applied in combination with dominance response models almost exclusively. In this article, a popular log likelihood-based parametric person-fit statistic under the framework of the generalized graded unfolding model is used. Results from a simulation study indicate that the person-fit statistic performed relatively well in detecting midpoint response style patterns and not so well in detecting extreme response style patterns.

  12. A smoothed residual based goodness-of-fit statistic for nest-survival models

    Treesearch

    Rodney X. Sturdivant; Jay J. Rotella; Robin E. Russell

    2008-01-01

    Estimating nest success and identifying important factors related to nest-survival rates is an essential goal for many wildlife researchers interested in understanding avian population dynamics. Advances in statistical methods have led to a number of estimation methods and approaches to modeling this problem. Recently developed models allow researchers to include a...

  13. Modeling the spatial distribution of landslide-prone colluvium and shallow groundwater on hillslopes of Seattle, WA

    USGS Publications Warehouse

    Schulz, W.H.; Lidke, D.J.; Godt, J.W.

    2008-01-01

    Landslides in partially saturated colluvium on Seattle, WA, hillslopes have resulted in property damage and human casualties. We developed statistical models of colluvium and shallow-groundwater distributions to aid landslide hazard assessments. The models were developed using a geographic information system, digital geologic maps, digital topography, subsurface exploration results, the groundwater flow modeling software VS2DI and regression analyses. Input to the colluvium model includes slope, distance to a hillslope-crest escarpment, and escarpment slope and height. We developed different statistical relations for thickness of colluvium on four landforms. Groundwater model input includes colluvium basal slope and distance from the Fraser aquifer. This distance was used to estimate hydraulic conductivity based on the assumption that addition of finer-grained material from down-section would result in lower conductivity. Colluvial groundwater is perched, so we estimated its saturated thickness. We used VS2DI to establish relations between saturated thickness and the hydraulic conductivity and basal slope of the colluvium. We developed different statistical relations for three groundwater flow regimes. All model results were validated using observational data that were excluded from calibration. Eighty percent of colluvium thickness predictions were within 25% of observed values and 88% of saturated thickness predictions were within 20% of observed values. The models are based on conditions common to many areas, so our method can provide accurate results for similar regions; relations in our statistical models require calibration for new regions. Our results suggest that Seattle landslides occur in native deposits and colluvium, ultimately in response to surface-water erosion of hillslope toes.
Regional groundwater conditions do not appear to strongly affect the general distribution of Seattle landslides; historical landslides were equally dispersed within and outside of the area potentially affected by regional groundwater conditions.

  14. Random walk to a nonergodic equilibrium concept

    NASA Astrophysics Data System (ADS)

    Bel, G.; Barkai, E.

    2006-01-01

    Random walk models, such as the trap model, continuous time random walks, and comb models, exhibit weak ergodicity breaking, when the average waiting time is infinite. The open question is, what statistical mechanical theory replaces the canonical Boltzmann-Gibbs theory for such systems? In this paper a nonergodic equilibrium concept is investigated, for a continuous time random walk model in a potential field. In particular we show that in the nonergodic phase the distribution of the occupation time of the particle in a finite region of space approaches U- or W-shaped distributions related to the arcsine law. We show that when conditions of detailed balance are applied, these distributions depend on the partition function of the problem, thus establishing a relation between the nonergodic dynamics and canonical statistical mechanics. In the ergodic phase the distribution function of the occupation times approaches a δ function centered on the value predicted based on standard Boltzmann-Gibbs statistics. The relation of our work to single-molecule experiments is briefly discussed.
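
    The U-shaped occupation-time distributions mentioned above have a classic finite-mean analogue: the fraction of time an unbiased simple random walk spends in x > 0 follows the arcsine law, which a short simulation makes visible. This is an illustrative cousin of the paper's CTRW model, not its computation.

```python
import numpy as np

# Occupation-time statistics of an unbiased simple random walk: the fraction
# of steps spent in x > 0 follows the arcsine law, whose density is U-shaped.
rng = np.random.default_rng(2)
n_walks, n_steps = 2000, 1000
paths = np.cumsum(rng.choice([-1, 1], size=(n_walks, n_steps)), axis=1)
frac_positive = (paths > 0).mean(axis=1)

# U-shape: extreme occupation fractions are far more likely than balanced ones
hist, _ = np.histogram(frac_positive, bins=10, range=(0.0, 1.0))
```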

  15. Effects of alcohol tax increases on alcohol-related disease mortality in Alaska: time-series analyses from 1976 to 2004.

    PubMed

    Wagenaar, Alexander C; Maldonado-Molina, Mildred M; Wagenaar, Bradley H

    2009-08-01

    We evaluated the effects of tax increases on alcoholic beverages in 1983 and 2002 on alcohol-related disease mortality in Alaska. We used a quasi-experimental design with quarterly measures of mortality from 1976 through 2004, and we included other states for comparison. Our statistical approach combined an autoregressive integrated moving average model with structural parameters in interrupted time-series models. We observed statistically significant reductions in the numbers and rates of deaths caused by alcohol-related disease beginning immediately after the 1983 and 2002 alcohol tax increases in Alaska. In terms of effect size, the reductions were -29% (Cohen's d = -0.57) and -11% (Cohen's d = -0.52) for the 2 tax increases. Statistical tests of temporary-effect models versus long-term-effect models showed little dissipation of the effect over time. Increases in alcohol excise tax rates were associated with immediate and sustained reductions in alcohol-related disease mortality in Alaska. Reductions in mortality occurred after 2 tax increases almost 20 years apart. Taxing alcoholic beverages is an effective public health strategy for reducing the burden of alcohol-related disease.
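
    The interrupted time-series design can be sketched with ordinary least squares and a step dummy at the intervention quarter. The quarterly data below are simulated, and plain OLS ignores the ARIMA error structure the study actually modeled; this is only the level-shift idea.

```python
import numpy as np

# Simulated quarterly alcohol-related mortality rate with a sustained level
# drop of 4 units at a hypothetical tax increase in quarter 30 (of 116).
rng = np.random.default_rng(3)
n = 116
t = np.arange(n)
step = (t >= 30).astype(float)
rate = 20.0 - 4.0 * step + rng.normal(0.0, 1.0, n)

# Design matrix: intercept, linear trend, and the intervention step
X = np.column_stack([np.ones(n), t / n, step])
coef, *_ = np.linalg.lstsq(X, rate, rcond=None)
effect = coef[2]   # estimated immediate, sustained level change
```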

  16. Different Manhattan project: automatic statistical model generation

    NASA Astrophysics Data System (ADS)

    Yap, Chee Keng; Biermann, Henning; Hertzmann, Aaron; Li, Chen; Meyer, Jon; Pao, Hsing-Kuo; Paxia, Salvatore

    2002-03-01

    We address the automatic generation of large geometric models. This is important in visualization for several reasons. First, many applications need access to large but interesting data models. Second, we often need such data sets with particular characteristics (e.g., urban models, park and recreation landscape). Thus we need the ability to generate models with different parameters. We propose a new approach for generating such models. It is based on a top-down propagation of statistical parameters. We illustrate the method in the generation of a statistical model of Manhattan. But the method is generally applicable in the generation of models of large geographical regions. Our work is related to the literature on generating complex natural scenes (smoke, forests, etc.) based on procedural descriptions. The difference in our approach stems from three characteristics: modeling with statistical parameters, integration of ground truth (actual map data), and a library-based approach for texture mapping.

  17. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
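
    A minimal random-walk Metropolis sampler for a toy linear source-estimation problem shows the mechanics of the MCMC approach. The transport operator, noise level, and priors below are all illustrative assumptions, far simpler than the TransCom3 setup.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy linear observation model: y = H s + noise, with a 3-observation,
# 2-source "transport" operator H (purely illustrative numbers).
H = np.array([[1.0, 0.3], [0.2, 1.0], [0.5, 0.5]])
s_true = np.array([2.0, -1.0])
y = H @ s_true + rng.normal(0.0, 0.1, 3)

def log_post(s):
    """Gaussian likelihood (sd 0.1) plus a weak Gaussian prior (sd 10)."""
    resid = y - H @ s
    return -0.5 * resid @ resid / 0.1**2 - 0.5 * s @ s / 10.0**2

# Random-walk Metropolis sampling of the posterior over the two sources
s, lp = np.zeros(2), log_post(np.zeros(2))
samples = []
for _ in range(20000):
    prop = s + rng.normal(0.0, 0.05, 2)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        s, lp = prop, lp_prop
    samples.append(s.copy())
post_mean = np.mean(samples[5000:], axis=0)   # discard burn-in
```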

  18. A question of separation: disentangling tracer bias and gravitational non-linearity with counts-in-cells statistics

    NASA Astrophysics Data System (ADS)

    Uhlemann, C.; Feix, M.; Codis, S.; Pichon, C.; Bernardeau, F.; L'Huillier, B.; Kim, J.; Hong, S. E.; Laigle, C.; Park, C.; Shin, J.; Pogosyan, D.

    2018-02-01

    Starting from a very accurate model for density-in-cells statistics of dark matter based on large deviation theory, a bias model for the tracer density in spheres is formulated. It adopts a mean bias relation based on a quadratic bias model to relate the log-densities of dark matter to those of mass-weighted dark haloes in real and redshift space. The validity of the parametrized bias model is established using a parametrization-independent extraction of the bias function. This average bias model is then combined with the dark matter PDF, neglecting any scatter around it: it nevertheless yields an excellent model for densities-in-cells statistics of mass tracers that is parametrized in terms of the underlying dark matter variance and three bias parameters. The procedure is validated on measurements of both the one- and two-point statistics of subhalo densities in the state-of-the-art Horizon Run 4 simulation showing excellent agreement for measured dark matter variance and bias parameters. Finally, it is demonstrated that this formalism allows for a joint estimation of the non-linear dark matter variance and the bias parameters using solely the statistics of subhaloes. Having verified that galaxy counts in hydrodynamical simulations sampled on a scale of 10 Mpc h-1 closely resemble those of subhaloes, this work provides important steps towards making theoretical predictions for density-in-cells statistics applicable to upcoming galaxy surveys like Euclid or WFIRST.

  19. MULTIVARIATE STATISTICAL MODELS FOR EFFECTS OF PM AND COPOLLUTANTS IN A DAILY TIME SERIES EPIDEMIOLOGY STUDY

    EPA Science Inventory

    Most analyses of daily time series epidemiology data relate mortality or morbidity counts to PM and other air pollutants by means of single-outcome regression models using multiple predictors, without taking into account the complex statistical structure of the predictor variable...

  20. Stochastic modeling of sunshine number data

    NASA Astrophysics Data System (ADS)

    Brabec, Marek; Paulescu, Marius; Badescu, Viorel

    2013-11-01

    In this paper, we will present a unified statistical modeling framework for estimation and forecasting of sunshine number (SSN) data. The sunshine number was proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and has since been shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be fitted relatively easily via a maximum likelihood approach. This is optimal in many respects, and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also the related uncertainties, as well as to test various hypotheses of practical interest. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly, from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time) and about details of the dynamic model (order and functional shape of the Markov kernel). Therefore, using a generalized additive model (GAM) approach, we can fit and compare models of various complexity while keeping a physical interpretation of the statistical model and its parts.
After introducing the Markovian model and general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.
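
    For a homogeneous first-order chain, the logistic-regression maximum-likelihood estimates reduce to transition counts, as the sketch below shows with simulated binary SSN data. The transition probabilities are assumed values for illustration, not Timisoara estimates.

```python
import math
import numpy as np

# Simulate a binary sunshine-number series from a homogeneous first-order
# Markov chain with assumed transition probabilities.
rng = np.random.default_rng(5)
p_stay_sunny, p_turn_sunny = 0.8, 0.3
ssn = [1]
for _ in range(5000):
    p = p_stay_sunny if ssn[-1] == 1 else p_turn_sunny
    ssn.append(int(rng.uniform() < p))
ssn = np.array(ssn)

# For a homogeneous chain, the MLE of each transition probability is a count:
prev, curr = ssn[:-1], ssn[1:]
p11 = curr[prev == 1].mean()     # estimated P(sunny_t | sunny_{t-1})
p01 = curr[prev == 0].mean()     # estimated P(sunny_t | cloudy_{t-1})

# Equivalent logistic-regression coefficients: logit P(1 | prev) = b0 + b1*prev
logit = lambda q: math.log(q / (1.0 - q))
b0, b1 = logit(p01), logit(p11) - logit(p01)
```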

  1. Stochastic modeling of sunshine number data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brabec, Marek, E-mail: mbrabec@cs.cas.cz; Paulescu, Marius; Badescu, Viorel

    2013-11-13

    In this paper, we will present a unified statistical modeling framework for estimation and forecasting of sunshine number (SSN) data. The sunshine number was proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and has since been shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be fitted relatively easily via a maximum likelihood approach. This is optimal in many respects, and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also the related uncertainties, as well as to test various hypotheses of practical interest. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly, from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time) and about details of the dynamic model (order and functional shape of the Markov kernel). Therefore, using a generalized additive model (GAM) approach, we can fit and compare models of various complexity while keeping a physical interpretation of the statistical model and its parts.
After introducing the Markovian model and general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.

  2. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Gonzalez, Oscar; MacKinnon, David P.

    2018-01-01

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to…

  3. 75 FR 16202 - Notice of Issuance of Regulatory Guide

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-31

    ..., Revision 2, ``An Acceptable Model and Related Statistical Methods for the Analysis of Fuel Densification.... Introduction The U.S. Nuclear Regulatory Commission (NRC) is issuing a revision to an existing guide in the... nuclear power reactors. To meet these objectives, the guide describes statistical methods related to...

  4. Traveling front solutions to directed diffusion-limited aggregation, digital search trees, and the Lempel-Ziv data compression algorithm.

    PubMed

    Majumdar, Satya N

    2003-08-01

    We use the traveling front approach to derive exact asymptotic results for the statistics of the number of particles in a class of directed diffusion-limited aggregation models on a Cayley tree. We point out that some aspects of these models are closely connected to two different problems in computer science, namely, the digital search tree problem in data structures and the Lempel-Ziv algorithm for data compression. The statistics of the number of particles studied here is related to the statistics of height in digital search trees which, in turn, is related to the statistics of the length of the longest word formed by the Lempel-Ziv algorithm. Implications of our results to these computer science problems are pointed out.
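
    The "longest word" statistic mentioned above arises from incremental (LZ78-style) parsing, in which each new phrase extends the longest previously seen phrase by one symbol; the parsing itself is a few lines of code. The input string below is an arbitrary example.

```python
def lz78_phrases(s):
    """Incremental (LZ78-style) parsing: each new phrase extends the longest
    previously seen phrase by one symbol."""
    seen = {""}
    phrases, w = [], ""
    for ch in s:
        if w + ch in seen:
            w += ch                # keep extending the current match
        else:
            phrases.append(w + ch) # emit a new phrase and restart
            seen.add(w + ch)
            w = ""
    if w:                          # leftover suffix that matched an earlier phrase
        phrases.append(w)
    return phrases

phrases = lz78_phrases("ababababababbbbaaab")
longest_word = max(len(p) for p in phrases)
```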

  5. Traveling front solutions to directed diffusion-limited aggregation, digital search trees, and the Lempel-Ziv data compression algorithm

    NASA Astrophysics Data System (ADS)

    Majumdar, Satya N.

    2003-08-01

    We use the traveling front approach to derive exact asymptotic results for the statistics of the number of particles in a class of directed diffusion-limited aggregation models on a Cayley tree. We point out that some aspects of these models are closely connected to two different problems in computer science, namely, the digital search tree problem in data structures and the Lempel-Ziv algorithm for data compression. The statistics of the number of particles studied here is related to the statistics of height in digital search trees which, in turn, is related to the statistics of the length of the longest word formed by the Lempel-Ziv algorithm. Implications of our results to these computer science problems are pointed out.

  6. A method for automatic feature points extraction of human vertebrae three-dimensional model

    NASA Astrophysics Data System (ADS)

    Wu, Zhen; Wu, Junsheng

    2017-05-01

    A method for automatic extraction of the feature points of the human vertebrae three-dimensional model is presented. Firstly, the statistical model of vertebrae feature points is established based on the results of manual vertebrae feature points extraction. Then anatomical axial analysis of the vertebrae model is performed according to the physiological and morphological characteristics of the vertebrae. Using the axial information obtained from the analysis, a projection relationship between the statistical model and the vertebrae model to be extracted is established. According to the projection relationship, the statistical model is matched with the vertebrae model to get the estimated position of the feature point. Finally, by analyzing the curvature in the spherical neighborhood with the estimated position of feature points, the final position of the feature points is obtained. According to the benchmark result on multiple test models, the mean relative errors of feature point positions are less than 5.98%. At more than half of the positions, the error rate is less than 3% and the minimum mean relative error is 0.19%, which verifies the effectiveness of the method.

  7. The epistemology of mathematical and statistical modeling: a quiet methodological revolution.

    PubMed

    Rodgers, Joseph Lee

    2010-01-01

    A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. Copyright 2009 APA, all rights reserved.

  8. Manifold parametrization of the left ventricle for a statistical modelling of its complete anatomy

    NASA Astrophysics Data System (ADS)

    Gil, D.; Garcia-Barnes, J.; Hernández-Sabate, A.; Marti, E.

    2010-03-01

    Distortion of Left Ventricle (LV) external anatomy is related to some dysfunctions, such as hypertrophy. The architecture of myocardial fibers determines LV electromechanical activation patterns as well as mechanics. Thus, their joint modelling would allow the design of specific interventions (such as pacemaker implantation and LV remodelling) and therapies (such as resynchronization). On one hand, accurate modelling of external anatomy requires either a dense sampling or a continuous infinite dimensional approach, which requires non-Euclidean statistics. On the other hand, computation of fiber models requires statistics on Riemannian spaces. Most approaches compute separate statistical models for external anatomy and fibers architecture. In this work we propose a general mathematical framework based on differential geometry concepts for computing a statistical model including, both, external and fiber anatomy. Our framework provides a continuous approach to external anatomy supporting standard statistics. We also provide a straightforward formula for the computation of the Riemannian fiber statistics. We have applied our methodology to the computation of complete anatomical atlas of canine hearts from diffusion tensor studies. The orientation of fibers over the average external geometry agrees with the segmental description of orientations reported in the literature.

  9. Comparative evaluation of statistical and mechanistic models of Escherichia coli at beaches in southern Lake Michigan

    USGS Publications Warehouse

    Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.

    2016-01-01

    Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.
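
    The parsimonious turbidity-based model the abstract describes can be sketched as a simple regression of log bacteria concentration on turbidity. The paired observations, fitted coefficients, and the commonly cited 235 CFU/100 mL advisory threshold below are illustrative assumptions, not values from the study.

    ```python
    # Hypothetical paired observations: turbidity (NTU) and log10 E. coli
    # concentration (CFU/100 mL); values are illustrative, not from the study.
    turbidity = [5.0, 12.0, 30.0, 45.0, 80.0]
    log_ecoli = [1.2, 1.6, 2.1, 2.4, 3.0]

    def ols_fit(x, y):
        """Ordinary least-squares slope and intercept for y = a + b*x."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
        a = my - b * mx
        return a, b

    a, b = ols_fit(turbidity, log_ecoli)

    def predict_advisory(ntu, threshold_log=2.37):
        """Flag a beach advisory when predicted log10 E. coli exceeds an
        illustrative 235 CFU/100 mL standard, i.e. log10(235) ~ 2.37."""
        return a + b * ntu > threshold_log
    ```

    A real-time implementation would refit `a` and `b` as new monitoring data arrive, which is what makes an easily measured predictor such as turbidity attractive.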

  10. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment

    PubMed Central

    Hashim, Mazlan

    2015-01-01

    This research presents the results of GIS-based statistical models for generating landslide susceptibility maps using geographic information system (GIS) and remote-sensing data for the Cameron Highlands area in Malaysia. Ten factors, including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road, were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten related factors were identified using GIS-based statistical models: the analytical hierarchy process (AHP), weighted linear combination (WLC), and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map, which has a total of 92 landslide locations, was created from numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% for validation purposes. The validation results using the relative landslide density index (R-index) and receiver operating characteristic (ROC) demonstrated that the SMCE model (96% accuracy) is better in prediction than the AHP (91%) and WLC (89%) models. These landslide susceptibility maps would be useful for hazard mitigation purposes and regional planning. PMID:25898919
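
    Of the three models, the weighted linear combination (WLC) is the simplest to sketch: normalized factor scores are combined with expert weights into a susceptibility index. The factor names, weights, and cell scores below are hypothetical.

    ```python
    # Illustrative weighted linear combination (WLC) for landslide
    # susceptibility: factor scores normalized to [0, 1] are combined with
    # expert weights. Factor names and weight values are hypothetical.
    def wlc_score(factors, weights):
        """Susceptibility index as a weight-normalized sum of factor scores."""
        total_w = sum(weights[k] for k in factors)
        return sum(weights[k] * factors[k] for k in factors) / total_w

    weights = {"slope": 0.30, "lithology": 0.25, "land_cover": 0.20,
               "dist_to_drainage": 0.15, "precipitation": 0.10}

    # One grid cell with normalized factor scores (0 = stable, 1 = prone).
    cell = {"slope": 0.8, "lithology": 0.6, "land_cover": 0.4,
            "dist_to_drainage": 0.9, "precipitation": 0.7}

    susceptibility = wlc_score(cell, weights)  # value in [0, 1]
    ```

    Applying the same function over every cell of the factor rasters yields the susceptibility map that is then validated against the held-out inventory.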

  11. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment.

    PubMed

    Shahabi, Himan; Hashim, Mazlan

    2015-04-22

    This research presents the results of GIS-based statistical models for generating landslide susceptibility maps using geographic information system (GIS) and remote-sensing data for the Cameron Highlands area in Malaysia. Ten factors, including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road, were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten related factors were identified using GIS-based statistical models: the analytical hierarchy process (AHP), weighted linear combination (WLC), and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map, which has a total of 92 landslide locations, was created from numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% for validation purposes. The validation results using the relative landslide density index (R-index) and receiver operating characteristic (ROC) demonstrated that the SMCE model (96% accuracy) is better in prediction than the AHP (91%) and WLC (89%) models. These landslide susceptibility maps would be useful for hazard mitigation purposes and regional planning.

  12. A Model for Investigating Predictive Validity at Highly Selective Institutions.

    ERIC Educational Resources Information Center

    Gross, Alan L.; And Others

    A statistical model for investigating predictive validity at highly selective institutions is described. When the selection ratio is small, one must typically deal with a data set containing relatively large amounts of missing data on both criterion and predictor variables. Standard statistical approaches are based on the strong assumption that…

  13. A Method of Relating General Circulation Model Simulated Climate to the Observed Local Climate. Part I: Seasonal Statistics.

    NASA Astrophysics Data System (ADS)

    Karl, Thomas R.; Wang, Wei-Chyung; Schlesinger, Michael E.; Knight, Richard W.; Portman, David

    1990-10-01

    Important surface observations such as the daily maximum and minimum temperature, daily precipitation, and cloud ceilings often have localized characteristics that are difficult to reproduce with the current resolution and the physical parameterizations in state-of-the-art General Circulation climate Models (GCMs). Many of the difficulties can be partially attributed to mismatches in scale, local topography, regional geography, and boundary conditions between models and surface-based observations. Here, we present a method, called climatological projection by model statistics (CPMS), to relate GCM grid-point free-atmosphere statistics, the predictors, to these important local surface observations. The method can be viewed as a generalization of the model output statistics (MOS) and perfect prog (PP) procedures used in numerical weather prediction (NWP) models. It consists of the application of three statistical methods: 1) principal component analysis (PCA), 2) canonical correlation, and 3) inflated regression analysis. The PCA reduces the redundancy of the predictors. The canonical correlation is used to develop simultaneous relationships between linear combinations of the predictors, the canonical variables, and the surface-based observations. Finally, inflated regression is used to relate the important canonical variables to each of the surface-based observed variables. We demonstrate that even an early version of the Oregon State University two-level atmospheric GCM (with prescribed sea surface temperature) produces free-atmosphere statistics that can, when standardized using the model's internal means and variances (the MOS-like version of CPMS), closely approximate the observed local climate. When the model data are standardized by the observed free-atmosphere means and variances (the PP version of CPMS), however, the model does not reproduce the observed surface climate as well. 
Our results indicate that in the MOS-like version of CPMS the differences between the output of a ten-year GCM control run and the surface-based observations are often smaller than the differences between the observations of two ten-year periods. Such positive results suggest that GCMs may already contain important climatological information that can be used to infer the local climate.
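
    The MOS-like CPMS chain (standardize the predictors by the model's own moments, reduce redundancy with PCA, then regress the surface variable on the leading components) can be sketched as follows. The synthetic data are stand-ins, and ordinary least squares is substituted for the canonical-correlation and inflated-regression steps of the actual method.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic stand-ins: 120 "days" of 6 free-atmosphere predictors and one
    # surface variable (e.g. daily max temperature). Purely illustrative.
    n, p = 120, 6
    X = rng.normal(size=(n, p))
    y = X @ rng.normal(size=p) + 0.3 * rng.normal(size=n)

    # Step 1 (MOS-like standardization): use the predictor set's own means
    # and variances, as in the MOS-like version of CPMS.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)

    # Step 2 (PCA): reduce predictor redundancy; keep components explaining
    # ~95% of the variance.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, 0.95)) + 1
    scores = Xs @ Vt[:k].T

    # Step 3 (simplified): least-squares fit of the surface variable on the
    # leading components, in place of canonical correlation + inflation.
    design = np.column_stack([np.ones(n), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ coef
    ```

    Swapping the standardization in Step 1 to observed free-atmosphere moments would give the PP-style variant the abstract contrasts with.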

  14. Statistical Modeling of Natural Backgrounds in Hyperspectral LWIR Data

    DTIC Science & Technology

    2016-09-06

    extremely important for studying performance trades. First, we study the validity of this model using real hyperspectral data, and compare the relative...difficult to validate any statistical model created for a target of interest. However, since background measurements are plentiful, it is reasonable to...Golden, S., Less, D., Jin, X., and Rynes, P., “ Modeling and analysis of LWIR signature variability associated with 3d and BRDF effects,” 98400P (May 2016

  15. Statistical Modelling of Temperature and Moisture Uptake of Biochars Exposed to Selected Relative Humidity of Air.

    PubMed

    Bastistella, Luciane; Rousset, Patrick; Aviz, Antonio; Caldeira-Pires, Armando; Humbert, Gilles; Nogueira, Manoel

    2018-02-09

    New experimental techniques, as well as modern variants on known methods, have recently been employed to investigate the fundamental reactions underlying the oxidation of biochar. The purpose of this paper was to experimentally and statistically study how the relative humidity of air, mass, and particle size of four biochars influenced the adsorption of water and the increase in temperature. A random factorial design was employed using the intuitive statistical software Xlstat. A simple linear regression model and an analysis of variance with a pairwise comparison were performed. The experimental study was carried out on the wood of Quercus pubescens, Cyclobalanopsis glauca, Trigonostemon huangmosun, and Bambusa vulgaris, and involved five relative humidity conditions (22, 43, 75, 84, and 90%), two sample masses (0.1 and 1 g), and two particle sizes (powder and piece). Two response variables, water adsorption and temperature increase, were analyzed and discussed. The temperature did not increase linearly with the adsorption of water. Temperature was modeled by nine explanatory variables, while water adsorption was modeled by eight. Five variables, including factors and their interactions, were found to be common to the two models. Sample mass and relative humidity influenced both response variables, while particle size and biochar type only influenced the temperature.

  16. Multiplicative point process as a model of trading activity

    NASA Astrophysics Data System (ADS)

    Gontis, V.; Kaulakys, B.

    2004-11-01

    Signals consisting of a sequence of pulses show that an inherent origin of 1/f noise is the Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting power spectral density S(f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such a model system exhibits power-law spectral density S(f) ∼ 1/f^β for various values of β, including β = 1/2, 1, and 3/2. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events is analyzed analytically and numerically as well. The specific interest of our analysis is related to the financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces spectral properties of the real markets and explains the mechanism of the power-law distribution of trading activity. The study provides evidence that the statistical properties of the financial markets are enclosed in the statistics of the time intervals between trades. A multiplicative point process serves as a consistent model generating these statistics.
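
    A multiplicative point process of this kind can be sketched by letting the interevent time evolve multiplicatively with noise and accumulating event times. The drift/noise form and parameter values below are illustrative stand-ins, not the exact equations of Gontis and Kaulakys.

    ```python
    import random

    random.seed(42)

    # Minimal sketch: interevent time tau evolves with a multiplicative
    # stochastic update; events accumulate at times t_1 < t_2 < ...
    # The parameters gamma, sigma, mu and the bounds are illustrative.
    gamma, sigma, mu = 0.01, 0.1, 0.5
    tau_min, tau_max = 1e-3, 1.0

    tau, t = 0.1, 0.0
    event_times = []
    for _ in range(5000):
        t += tau
        event_times.append(t)
        eps = random.gauss(0.0, 1.0)
        tau = tau + gamma * tau ** (2 * mu - 1) + sigma * tau ** mu * eps
        tau = min(max(tau, tau_min), tau_max)  # keep tau in a positive band

    # Counting statistics: number of events per unit-time window, the
    # quantity the abstract relates to the number of transactions.
    window = 1.0
    counts = {}
    for s in event_times:
        counts[int(s // window)] = counts.get(int(s // window), 0) + 1
    ```

    Estimating the power spectrum of the resulting event-count series is how one would check the claimed S(f) ∼ 1/f^β scaling numerically.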

  17. Beyond δ : Tailoring marked statistics to reveal modified gravity

    NASA Astrophysics Data System (ADS)

    Valogiannis, Georgios; Bean, Rachel

    2018-01-01

    Models that seek to explain cosmic acceleration through modifications to general relativity (GR) evade stringent Solar System constraints through a restoring screening mechanism. Down-weighting the high-density, screened regions in favor of the low-density, unscreened ones offers the potential to enhance the amount of information carried in such modified gravity models. In this work, we assess the performance of a new "marked" transformation and perform a systematic comparison with the clipping and logarithmic transformations, in the context of ΛCDM and the symmetron and f(R) modified gravity models. Performance is measured in terms of the fractional boost in the Fisher information and the signal-to-noise ratio (SNR) for these models relative to the statistics derived from the standard density distribution. We find that all three statistics provide improved Fisher boosts over the basic density statistics. The model parameters for the marked and clipped transformations that best enhance signals and the Fisher boosts are determined. We also show that the mark is useful both as a Fourier- and real-space transformation; a marked correlation function also enhances the SNR relative to the standard correlation function, and can on mildly nonlinear scales show a significant difference between the ΛCDM and the modified gravity models. Our results demonstrate how a series of simple analytical transformations could dramatically increase the predicted information extracted on deviations from GR from large-scale surveys, and give the prospect for a much more feasible potential detection.
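
    A density-dependent mark that down-weights high-density (screened) regions can be sketched with the commonly used functional form m(δ) = ((1 + δ_s) / (1 + δ_s + δ))^p. The parameter values below are illustrative, not the tuned values from this work.

    ```python
    # Sketch of a density mark that up-weights unscreened, low-density
    # regions. delta_s and p are illustrative; delta is the overdensity
    # rho/rho_bar - 1 of a cell.
    def mark(delta, delta_s=0.25, p=2.0):
        return ((1.0 + delta_s) / (1.0 + delta_s + delta)) ** p

    # Overdensity field on a few cells, from deep void to strong cluster.
    deltas = [-0.9, -0.5, 0.0, 2.0, 10.0]
    marks = [mark(d) for d in deltas]
    # Low-density cells receive large marks, screened high-density cells
    # small ones, which is what re-weights the statistics toward the
    # regions where modified gravity is unscreened.
    ```

    The marked field m(δ)·(1 + δ) is then analyzed with the usual power spectrum or correlation function machinery.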

  18. Statistical Cost Estimation in Higher Education: Some Alternatives.

    ERIC Educational Resources Information Center

    Brinkman, Paul T.; Niwa, Shelley

    Recent developments in econometrics that are relevant to the task of estimating costs in higher education are reviewed. The relative effectiveness of alternative statistical procedures for estimating costs is also tested. Statistical cost estimation involves three basic parts: a model, a data set, and an estimation procedure. Actual data are used…

  19. Lack of quantitative training among early-career ecologists: a survey of the problem and potential solutions

    PubMed Central

    Ezard, Thomas H.G.; Jørgensen, Peter S.; Zimmerman, Naupaka; Chamberlain, Scott; Salguero-Gómez, Roberto; Curran, Timothy J.; Poisot, Timothée

    2014-01-01

    Proficiency in mathematics and statistics is essential to modern ecological science, yet few studies have assessed the level of quantitative training received by ecologists. To do so, we conducted an online survey. The 937 respondents were mostly early-career scientists who studied biology as undergraduates. We found a clear self-perceived lack of quantitative training: 75% were not satisfied with their understanding of mathematical models; 75% felt that the level of mathematics was “too low” in their ecology classes; 90% wanted more mathematics classes for ecologists; and 95% more statistics classes. Respondents thought that 30% of classes in ecology-related degrees should be focused on quantitative disciplines, which is likely higher than for most existing programs. The main suggestion to improve quantitative training was to relate theoretical and statistical modeling to applied ecological problems. Improving quantitative training will require dedicated, quantitative classes for ecology-related degrees that contain good mathematical and statistical practice. PMID:24688862

  20. Statistics on continuous IBD data: Exact distribution evaluation for a pair of full(half)-sibs and a pair of a (great-) grandchild with a (great-) grandparent

    PubMed Central

    Stefanov, Valeri T

    2002-01-01

    Background Pairs of related individuals are widely used in linkage analysis. Most of the tests for linkage analysis are based on statistics associated with identity by descent (IBD) data. The current biotechnology provides data on very densely packed loci, and therefore, it may provide almost continuous IBD data for pairs of closely related individuals. Therefore, the distribution theory for statistics on continuous IBD data is of interest. In particular, distributional results which allow the evaluation of p-values for relevant tests are of importance. Results A technology is provided for numerical evaluation, with any given accuracy, of the cumulative probabilities of some statistics on continuous genome data for pairs of closely related individuals. In the case of a pair of full-sibs, the following statistics are considered: (i) the proportion of genome with 2 (at least 1) haplotypes shared identical-by-descent (IBD) on a chromosomal segment, (ii) the number of distinct pieces (subsegments) of a chromosomal segment, on each of which exactly 2 (at least 1) haplotypes are shared IBD. The natural counterparts of these statistics for the other relationships are also considered. Relevant Maple codes are provided for a rapid evaluation of the cumulative probabilities of such statistics. The genomic continuum model, with Haldane's model for the crossover process, is assumed. Conclusions A technology, together with relevant software codes for its automated implementation, are provided for exact evaluation of the distributions of relevant statistics associated with continuous genome data on closely related individuals. PMID:11996673

  1. Local dependence in random graph models: characterization, properties and statistical inference

    PubMed Central

    Schweinberger, Michael; Handcock, Mark S.

    2015-01-01

    Summary Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142

  2. Risk Estimation for Lung Cancer in Libya: Analysis Based on Standardized Morbidity Ratio, Poisson-Gamma Model, BYM Model and Mixture Model

    PubMed

    Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley

    2017-03-01

    Cancer is the most rapidly spreading disease in the world, especially in developing countries, including Libya. Cancer represents a significant burden on patients, families, and their societies. This disease can be controlled if detected early. Therefore, disease mapping has recently become an important method in the fields of public health research and disease epidemiology. The correct choice of statistical model is a very important step in producing a good map of a disease. Libya was selected for this work in order to examine its geographical variation in the incidence of lung cancer. The objective of this paper is to estimate the relative risk for lung cancer. Four statistical models for estimating the relative risk for lung cancer, together with population censuses of the study area for the period 2006 to 2011, were used in this work: the standardized morbidity ratio (SMR), the most popular statistic in the field of disease mapping; the Poisson-gamma model, one of the earliest applications of Bayesian methodology; the Besag, York and Mollié (BYM) model; and the mixture model. As an initial step, this study begins by providing a review of all proposed models, which we then apply to lung cancer data in Libya. Maps, tables, graphs, and goodness-of-fit (GOF) statistics were used to compare and present the preliminary results; GOF measures are commonly used in statistical modelling to compare fitted models. The main general results presented in this study show that the Poisson-gamma, BYM, and mixture models can overcome the problem of the first model (SMR) when there is no observed lung cancer case in certain districts. Results show that the mixture model is most robust and provides better relative risk estimates across a range of models.
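
    The SMR is the simplest of the four models to sketch: SMR_i = O_i / E_i, with expected counts E_i from each district's population and the overall rate. The districts and counts below are hypothetical.

    ```python
    # Illustrative standardized morbidity ratio (SMR) for disease mapping.
    # Districts, case counts, and populations are hypothetical.
    observed = {"A": 12, "B": 0, "C": 30}
    population = {"A": 50_000, "B": 20_000, "C": 90_000}

    overall_rate = sum(observed.values()) / sum(population.values())
    expected = {k: population[k] * overall_rate for k in population}
    smr = {k: observed[k] / expected[k] for k in observed}

    # District B illustrates the weakness the abstract mentions: with no
    # observed cases, the SMR is exactly 0 and gives an unstable risk
    # estimate, which Poisson-gamma (and BYM/mixture) smoothing addresses.
    ```

    Bayesian smoothing models shrink such extreme district estimates toward the overall rate in proportion to how little data the district contributes.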

  3. Risk Estimation for Lung Cancer in Libya: Analysis Based on Standardized Morbidity Ratio, Poisson-Gamma Model, BYM Model and Mixture Model

    PubMed Central

    Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley

    2017-01-01

    Cancer is the most rapidly spreading disease in the world, especially in developing countries, including Libya. Cancer represents a significant burden on patients, families, and their societies. This disease can be controlled if detected early. Therefore, disease mapping has recently become an important method in the fields of public health research and disease epidemiology. The correct choice of statistical model is a very important step in producing a good map of a disease. Libya was selected for this work in order to examine its geographical variation in the incidence of lung cancer. The objective of this paper is to estimate the relative risk for lung cancer. Four statistical models for estimating the relative risk for lung cancer, together with population censuses of the study area for the period 2006 to 2011, were used in this work: the standardized morbidity ratio (SMR), the most popular statistic in the field of disease mapping; the Poisson-gamma model, one of the earliest applications of Bayesian methodology; the Besag, York and Mollié (BYM) model; and the mixture model. As an initial step, this study begins by providing a review of all proposed models, which we then apply to lung cancer data in Libya. Maps, tables, graphs, and goodness-of-fit (GOF) statistics were used to compare and present the preliminary results; GOF measures are commonly used in statistical modelling to compare fitted models. The main general results presented in this study show that the Poisson-gamma, BYM, and mixture models can overcome the problem of the first model (SMR) when there is no observed lung cancer case in certain districts. Results show that the mixture model is most robust and provides better relative risk estimates across a range of models. PMID:28440974

  4. Importance of regional variation in conservation planning: A rangewide example of the Greater Sage-Grouse

    USGS Publications Warehouse

    Doherty, Kevin E.; Evans, Jeffrey S.; Coates, Peter S.; Juliusson, Lara; Fedy, Bradley C.

    2016-01-01

    We developed rangewide population and habitat models for Greater Sage-Grouse (Centrocercus urophasianus) that account for regional variation in habitat selection and relative densities of birds for use in conservation planning and risk assessments. We developed a probabilistic model of occupied breeding habitat by statistically linking habitat characteristics within 4 miles of an occupied lek using a nonlinear machine learning technique (Random Forests). Habitat characteristics used were quantified in GIS and represent standard abiotic and biotic variables related to sage-grouse biology. Statistical model fit was high (mean correctly classified = 82.0%, range = 75.4–88.0%) as were cross-validation statistics (mean = 80.9%, range = 75.1–85.8%). We also developed a spatially explicit model to quantify the relative density of breeding birds across each Greater Sage-Grouse management zone. The models demonstrate distinct clustering of relative abundance of sage-grouse populations across all management zones. On average, approximately half of the breeding population is predicted to be within 10% of the occupied range. We also found that 80% of sage-grouse populations were contained in 25–34% of the occupied range within each management zone. Our rangewide population and habitat models account for regional variation in habitat selection and the relative densities of birds, and thus, they can serve as a consistent and common currency to assess how sage-grouse habitat and populations overlap with conservation actions or threats over the entire sage-grouse range. We also quantified differences in functional habitat responses and disturbance thresholds across the Western Association of Fish and Wildlife Agencies (WAFWA) management zones using statistical relationships identified during habitat modeling. 
Even for a species as specialized as Greater Sage-Grouse, our results show that ecological context matters in both the strength of habitat selection (i.e., functional response curves) and response to disturbance.
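
    The cross-validated "% correctly classified" statistics reported above can be sketched with a generic k-fold loop. The one-variable threshold classifier and synthetic covariate below are stand-ins for the Random Forests model and GIS habitat data.

    ```python
    import random

    random.seed(1)

    # Synthetic stand-in data: one hypothetical covariate (sagebrush cover)
    # and an occupancy label generated by a hypothetical truth rule.
    def make_data(n=200):
        data = []
        for _ in range(n):
            sage_cover = random.random()
            occupied = sage_cover > 0.4
            data.append((sage_cover, occupied))
        return data

    def fit_threshold(train):
        """Pick the threshold (over a coarse grid) maximizing train accuracy."""
        best_t, best_acc = 0.5, 0.0
        for i in range(1, 20):
            t = i / 20
            acc = sum((x > t) == y for x, y in train) / len(train)
            if acc > best_acc:
                best_t, best_acc = t, acc
        return best_t

    def kfold_accuracy(data, k=5):
        """k-fold cross-validation: hold out each fold once, average accuracy."""
        fold = len(data) // k
        accs = []
        for i in range(k):
            test = data[i * fold:(i + 1) * fold]
            train = data[:i * fold] + data[(i + 1) * fold:]
            t = fit_threshold(train)
            accs.append(sum((x > t) == y for x, y in test) / len(test))
        return sum(accs) / len(accs)

    accuracy = kfold_accuracy(make_data())
    ```

    In the real study the classifier inside the loop is a Random Forest over many abiotic and biotic covariates; the cross-validation scaffolding is the same.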

  5. Rainfall Downscaling Conditional on Upper-air Atmospheric Predictors: Improved Assessment of Rainfall Statistics in a Changing Climate

    NASA Astrophysics Data System (ADS)

    Langousis, Andreas; Mamalakis, Antonis; Deidda, Roberto; Marrocu, Marino

    2015-04-01

    To improve the skill of Global Climate Models (GCMs) and Regional Climate Models (RCMs) in reproducing the statistics of rainfall at a basin level and at hydrologically relevant temporal scales (e.g. daily), two types of statistical approaches have been suggested. One is the statistical correction of climate model rainfall outputs using historical series of precipitation. The other is the use of stochastic models of rainfall to conditionally simulate precipitation series, based on large-scale atmospheric predictors produced by climate models (e.g. geopotential height, relative vorticity, divergence, mean sea level pressure). The latter approach, usually referred to as statistical rainfall downscaling, aims at reproducing the statistical character of rainfall, while accounting for the effects of large-scale atmospheric circulation (and, therefore, climate forcing) on rainfall statistics. While promising, statistical rainfall downscaling has not attracted much attention in recent years, since the suggested approaches involved complex (i.e. subjective or computationally intense) identification procedures of the local weather, in addition to demonstrating limited success in reproducing several statistical features of rainfall, such as seasonal variations, the distributions of dry and wet spell lengths, the distribution of the mean rainfall intensity inside wet periods, and the distribution of rainfall extremes. In an effort to remedy those shortcomings, Langousis and Kaleris (2014) developed a statistical framework for simulation of daily rainfall intensities conditional on upper-air variables, which accurately reproduces the statistical character of rainfall at multiple time-scales. 
Here, we study the relative performance of: a) quantile-quantile (Q-Q) correction of climate model rainfall products, and b) the statistical downscaling scheme of Langousis and Kaleris (2014), in reproducing the statistical structure of rainfall, as well as rainfall extremes, at a regional level. This is done for an intermediate-sized catchment in Italy, i.e. the Flumendosa catchment, using climate model rainfall and atmospheric data from the ENSEMBLES project (http://ensembleseu.metoffice.com). In doing so, we split the historical rainfall record of mean areal precipitation (MAP) into 15-year calibration and 45-year validation periods, and compare the historical rainfall statistics to those obtained from: a) Q-Q corrected climate model rainfall products, and b) synthetic rainfall series generated by the suggested downscaling scheme. To our knowledge, this is the first time that climate model rainfall and statistically downscaled precipitation are compared to catchment-averaged MAP at a daily resolution. The obtained results are promising, since the proposed downscaling scheme is more accurate and robust in reproducing a number of historical rainfall statistics, independent of the climate model used and the length of the calibration period. This is particularly the case for the yearly rainfall maxima, where direct statistical correction of climate model rainfall outputs shows increased sensitivity to the length of the calibration period and the climate model used. The robustness of the suggested downscaling scheme in modeling rainfall extremes at a daily resolution is a notable feature that can effectively be used to assess hydrologic risk at a regional level under changing climatic conditions. 
Acknowledgments The research project is implemented within the framework of the Action «Supporting Postdoctoral Researchers» of the Operational Program "Education and Lifelong Learning" (Action's Beneficiary: General Secretariat for Research and Technology), and is co-financed by the European Social Fund (ESF) and the Greek State. CRS4 highly acknowledges the contribution of the Sardinian regional authorities.
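
    The Q-Q correction compared in this record can be sketched as an empirical quantile mapping: pass each raw model value through the model's calibration-period CDF, then back through the observed empirical quantile function. The series, function names, and linear interpolation below are illustrative assumptions, not the exact scheme used with the ENSEMBLES data.

    ```python
    # Minimal sketch of quantile-quantile (Q-Q) bias correction of daily
    # rainfall (mm). Calibration series are illustrative.
    def empirical_quantile(sorted_vals, q):
        """Linear-interpolation empirical quantile for q in [0, 1]."""
        pos = q * (len(sorted_vals) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(sorted_vals) - 1)
        frac = pos - lo
        return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

    def qq_correct(model_cal, obs_cal, value):
        """Correct one model value using calibration-period quantiles."""
        m = sorted(model_cal)
        o = sorted(obs_cal)
        # Empirical non-exceedance probability of the value in the model series.
        q = sum(v <= value for v in m) / len(m)
        return empirical_quantile(o, min(q, 1.0))

    model_cal = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]   # model "climatology"
    obs_cal = [0.0, 1.0, 2.0, 5.0, 9.0, 15.0]    # observed climatology

    corrected = qq_correct(model_cal, obs_cal, 4.0)
    ```

    The sensitivity the abstract reports comes from exactly this dependence on the calibration-period quantiles: with a short record, the upper quantiles (and hence corrected extremes) are poorly constrained.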

  6. Predicting network modules of cell cycle regulators using relative protein abundance statistics.

    PubMed

    Oguz, Cihan; Watson, Layne T; Baumann, William T; Tyson, John J

    2017-02-28

    Parameter estimation in systems biology is typically done by enforcing experimental observations through an objective function as the parameter space of a model is explored by numerical simulations. Past studies have shown that one usually finds a set of "feasible" parameter vectors that fit the available experimental data equally well, and that these alternative vectors can make different predictions under novel experimental conditions. In this study, we characterize the feasible region of a complex model of the budding yeast cell cycle under a large set of discrete experimental constraints in order to test whether the statistical features of relative protein abundance predictions are influenced by the topology of the cell cycle regulatory network. Using differential evolution, we generate an ensemble of feasible parameter vectors that reproduce the phenotypes (viable or inviable) of wild-type yeast cells and 110 mutant strains. We use this ensemble to predict the phenotypes of 129 mutant strains for which experimental data is not available. We identify 86 novel mutants that are predicted to be viable and then rank the cell cycle proteins in terms of their contributions to cumulative variability of relative protein abundance predictions. Proteins involved in "regulation of cell size" and "regulation of G1/S transition" contribute most to predictive variability, whereas proteins involved in "positive regulation of transcription involved in exit from mitosis," "mitotic spindle assembly checkpoint" and "negative regulation of cyclin-dependent protein kinase by cyclin degradation" contribute the least. These results suggest that the statistics of these predictions may be generating patterns specific to individual network modules (START, S/G2/M, and EXIT). To test this hypothesis, we develop random forest models for predicting the network modules of cell cycle regulators using relative abundance statistics as model inputs. 
Predictive performance is assessed by the areas under receiver operating characteristics curves (AUC). Our models generate an AUC range of 0.83-0.87 as opposed to randomized models with AUC values around 0.50. By using differential evolution and random forest modeling, we show that the model prediction statistics generate distinct network module-specific patterns within the cell cycle network.
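
    The AUC values reported above can be computed directly from predicted scores and binary labels via the rank-sum (Mann-Whitney) identity; a minimal sketch (the labels and scores below are invented for illustration, not taken from the study):

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    AUC = P(score of a random positive > score of a random negative),
    counting ties as 1/2."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give AUC = 1.0; uninformative scores sit near 0.5.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # → 1.0
```

    An AUC of 0.83-0.87, as reported for the random forest models, means a randomly chosen in-module protein outscores a randomly chosen out-of-module protein 83-87% of the time.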

  7. ModelTest Server: a web-based tool for the statistical selection of models of nucleotide substitution online

    PubMed Central

    Posada, David

    2006-01-01

    ModelTest server is a web-based application for the selection of models of nucleotide substitution using the program ModelTest. The server takes as input a text file with likelihood scores for the set of candidate models. Models can be selected with hierarchical likelihood ratio tests, or with the Akaike or Bayesian information criteria. The output includes several statistics for the assessment of model selection uncertainty, for model averaging or to estimate the relative importance of model parameters. The server can be accessed at . PMID:16845102
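
    The information criteria the server uses follow standard definitions; a small sketch of AIC, BIC, and Akaike weights computed from candidate-model likelihood scores (the log-likelihoods and parameter counts below are invented for illustration):

```python
import math

def aic(lnL, k):
    """Akaike information criterion for a model with k free parameters."""
    return -2.0 * lnL + 2.0 * k

def bic(lnL, k, n):
    """Bayesian information criterion; n is the number of observations (sites)."""
    return -2.0 * lnL + k * math.log(n)

def akaike_weights(aic_scores):
    """Model weights used for model averaging and selection-uncertainty
    assessment: relative likelihoods normalized to sum to 1."""
    best = min(aic_scores)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_scores]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical log-likelihoods and free-parameter counts for three
# substitution models (counts are illustrative, not exact).
candidates = [("JC69", -3100.0, 0), ("HKY85", -3050.0, 4), ("GTR", -3048.0, 8)]
aics = [aic(lnL, k) for (_, lnL, k) in candidates]
weights = akaike_weights(aics)  # here HKY85 carries most of the weight
```

    The weights also give the "relative importance" of a parameter mentioned in the abstract: sum the weights of all candidate models that include it.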

  8. Comparing geological and statistical approaches for element selection in sediment tracing research

    NASA Astrophysics Data System (ADS)

    Laceby, J. Patrick; McMahon, Joe; Evrard, Olivier; Olley, Jon

    2015-04-01

    Elevated suspended sediment loads reduce reservoir capacity and significantly increase the cost of operating water treatment infrastructure, making the management of sediment supply to reservoirs increasingly important. Sediment fingerprinting techniques can be used to determine the relative contributions of different sources of sediment accumulating in reservoirs. The objective of this research is to compare geological and statistical approaches to element selection for sediment fingerprinting modelling. Time-integrated samplers (n=45) were used to obtain source samples from four major subcatchments flowing into the Baroon Pocket Dam in South East Queensland, Australia. The geochemistry of potential sources was compared to the geochemistry of sediment cores (n=12) sampled in the reservoir. The geochemical approach selected elements for modelling that provided expected, observed and statistical discrimination between sediment sources. Two statistical approaches selected elements for modelling with the Kruskal-Wallis H-test and Discriminant Function Analysis (DFA). In particular, two different significance levels (0.05 & 0.35) for the DFA were included to investigate the importance of element selection on modelling results. A distribution model determined the relative contributions of different sources to sediment sampled in the Baroon Pocket Dam. Elemental discrimination was expected between one subcatchment (Obi Obi Creek) and the remaining subcatchments (Lexys, Falls and Bridge Creek). Six major elements were expected to provide discrimination. Of these six, only Fe2O3 and SiO2 provided expected, observed and statistical discrimination. Modelling results with this geological approach indicated 36% (+/- 9%) of sediment sampled in the reservoir cores were from mafic-derived sources and 64% (+/- 9%) were from felsic-derived sources.
    The geological and the first statistical approach (DFA0.05) differed by only 1% (σ 5%) for 5 out of 6 model groupings, with only the Lexys Creek modelling results differing significantly (35%). The statistical model with expanded elemental selection (DFA0.35) differed from the geological model by an average of 30% for all 6 models. Elemental selection for sediment fingerprinting therefore has the potential to impact modelling results. Accordingly, it is important to incorporate both robust geological and statistical approaches when selecting elements for sediment fingerprinting. For the Baroon Pocket Dam, management should focus on reducing the supply of sediments derived from felsic sources in each of the subcatchments.
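
    One of the statistical element-selection steps above, the Kruskal-Wallis H-test, screens each element for concentration differences across candidate sources; a minimal sketch (the element concentrations below are invented, not the study's data):

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic over two or more groups, using average
    ranks for ties. A large H for an element indicates its concentrations
    differ across candidate sediment sources."""
    values = [v for g in groups for v in g]
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    h, idx = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[idx:idx + len(g)])
        h += rank_sum * rank_sum / len(g)
        idx += len(g)
    return 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)

# Hypothetical concentrations of one element in three candidate source groups.
h = kruskal_wallis_h([1.2, 1.4, 1.1], [2.0, 2.2, 2.4], [3.1, 3.0, 3.3])  # ≈ 7.2
```

    H is then compared against a chi-squared threshold at the chosen significance level to decide whether the element is retained for the mixing model.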

  9. Using meta-regression models to systematically evaluate data in the published literature: relative contributions of agricultural drift, para-occupational, and residential use exposure pathways to house dust pesticide concentrations

    EPA Science Inventory

    Background: Data reported in the published literature have been used qualitatively to aid exposure assessment activities in epidemiologic studies. Analyzing these data in computational models presents statistical challenges because these data are often reported as summary statist...

  10. Avalanches, loading and finite size effects in 2D amorphous plasticity: results from a finite element model

    NASA Astrophysics Data System (ADS)

    Sandfeld, Stefan; Budrikis, Zoe; Zapperi, Stefano; Fernandez Castellanos, David

    2015-02-01

    Crystalline plasticity is strongly interlinked with dislocation mechanics and nowadays is relatively well understood. Concepts and physical models of plastic deformation in amorphous materials on the other hand—where the concept of linear lattice defects is not applicable—still are lagging behind. We introduce an eigenstrain-based finite element lattice model for simulations of shear band formation and strain avalanches. Our model allows us to study the influence of surfaces and finite size effects on the statistics of avalanches. We find that even with relatively complex loading conditions and open boundary conditions, critical exponents describing avalanche statistics are unchanged, which validates the use of simpler scalar lattice-based models to study these phenomena.

  11. Statistical relations of salt and selenium loads to geospatial characteristics of corresponding subbasins of the Colorado and Gunnison Rivers in Colorado

    USGS Publications Warehouse

    Leib, Kenneth J.; Linard, Joshua I.; Williams, Cory A.

    2012-01-01

    Elevated loads of salt and selenium can impair the quality of water for both anthropogenic and natural uses. Understanding the environmental processes controlling how salt and selenium are introduced to streams is critical to managing and mitigating the effects of elevated loads. Dominant relations between salt and selenium loads and environmental characteristics can be established by using geospatial data. The U.S. Geological Survey, in cooperation with the Bureau of Reclamation, investigated statistical relations between seasonal salt or selenium loads emanating from the Upper Colorado River Basin and geospatial data. Salt and selenium loads measured during the irrigation and nonirrigation seasons were related to geospatial variables for 168 subbasins within the Gunnison and Colorado River Basins. These geospatial variables represented subbasin characteristics of the physical environment, precipitation, geology, land use, and the irrigation network. All subbasin variables with units of area had statistically significant relations with load. The few variables that were not in units of area but were statistically significant helped to identify types of geospatial data that might influence salt and selenium loading. Following a stepwise approach, combinations of these statistically significant variables were used to develop multiple linear regression models. The models can be used to help prioritize areas where salt and selenium control projects might be most effective.
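
    The regression models above relate loads to geospatial variables; the single-predictor case can be sketched with ordinary least squares (the subbasin areas and loads below are invented, chosen to lie on an exact line for checkability):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x, the one-predictor case of
    the multiple linear regression models relating loads to subbasin
    characteristics. Returns (intercept a, slope b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical subbasin irrigated area (km^2) vs seasonal salt load (tonnes);
# the points lie exactly on load = 30 + 9 * area.
areas = [10.0, 25.0, 40.0, 60.0]
loads = [120.0, 255.0, 390.0, 570.0]
a, b = fit_line(areas, loads)
```

    A stepwise procedure, as used in the study, repeatedly adds or drops predictors like this one based on their statistical significance in the multi-predictor fit.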

  12. Statistical representation of multiphase flow

    NASA Astrophysics Data System (ADS)

    Subramaniam

    2000-11-01

    The relationship between two common statistical representations of multiphase flow, namely, the single-point Eulerian statistical representation of two-phase flow (D. A. Drew, Ann. Rev. Fluid Mech. (15), 1983), and the Lagrangian statistical representation of a spray using the droplet distribution function (F. A. Williams, Phys. Fluids 1 (6), 1958), is established for spherical dispersed-phase elements. This relationship is based on recent work which relates the droplet distribution function to single-droplet pdfs starting from a Liouville description of a spray (Subramaniam, Phys. Fluids 10 (12), 2000). The Eulerian representation, which is based on a random-field model of the flow, is shown to contain different statistical information from the Lagrangian representation, which is based on a point-process model. The two descriptions are shown to be simply related for spherical, monodisperse elements in statistically homogeneous two-phase flow, whereas such a simple relationship is precluded by the inclusion of polydispersity and statistical inhomogeneity. The common origin of these two representations is traced to a more fundamental statistical representation of a multiphase flow, whose concepts derive from a theory for dense sprays recently proposed by Edwards (Atomization and Sprays 10 (3-5), 2000). The issue of what constitutes a minimally complete statistical representation of a multiphase flow is resolved.

  13. Statistical Mechanics of Prion Diseases

    NASA Astrophysics Data System (ADS)

    Slepoy, A.; Singh, R. R.; Pázmándi, F.; Kulkarni, R. V.; Cox, D. L.

    2001-07-01

    We present a two-dimensional, lattice based, protein-level statistical mechanical model for prion diseases (e.g., mad cow disease) with concomitant prion protein misfolding and aggregation. Our studies lead us to the hypothesis that the observed broad incubation time distribution in epidemiological data reflects fluctuation dominated growth seeded by a few nanometer scale aggregates, while much narrower incubation time distributions for inoculated lab animals arise from statistical self-averaging. We model "species barriers" to prion infection and assess a related treatment protocol.

  14. Statistical fluctuations in pedestrian evacuation times and the effect of social contagion

    NASA Astrophysics Data System (ADS)

    Nicolas, Alexandre; Bouzat, Sebastián; Kuperman, Marcelo N.

    2016-08-01

    Mathematical models of pedestrian evacuation and the associated simulation software have become essential tools for the assessment of the safety of public facilities and buildings. While a variety of models is now available, their calibration and test against empirical data are generally restricted to global averaged quantities; the statistics compiled from the time series of individual escapes ("microscopic" statistics) measured in recent experiments are thus overlooked. In the same spirit, much research has primarily focused on the average global evacuation time, whereas the whole distribution of evacuation times over some set of realizations should matter. In the present paper we propose and discuss the validity of a simple relation between this distribution and the microscopic statistics, which is theoretically valid in the absence of correlations. To this purpose, we develop a minimal cellular automaton, with features that afford a semiquantitative reproduction of the experimental microscopic statistics. We then introduce a process of social contagion of impatient behavior in the model and show that the simple relation under test may dramatically fail at high contagion strengths, the latter being responsible for the emergence of strong correlations in the system. We conclude with comments on the potential practical relevance for safety science of calculations based on microscopic statistics.
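
    The relation under test, between the distribution of global evacuation times and the microscopic escape statistics, can be illustrated with a toy single-exit process (this is a stand-in for the authors' cellular automaton, not a reproduction of it; all parameters are invented):

```python
import random

def evacuation_time(n_agents, p_exit, rng):
    """One realization of a toy single-exit evacuation: at each time step the
    agent at the door escapes with probability p_exit (a crude stand-in for
    the blocking dynamics of a real cellular automaton)."""
    t, escaped = 0, 0
    while escaped < n_agents:
        t += 1
        if rng.random() < p_exit:
            escaped += 1
    return t

rng = random.Random(42)
times = [evacuation_time(20, 0.5, rng) for _ in range(2000)]
mean_time = sum(times) / len(times)
# In the absence of correlations the total time is a sum of independent
# geometric escape intervals, so the mean sits near n_agents / p_exit = 40.
```

    Social contagion of impatience would correlate successive escape intervals, which is exactly the regime where the paper shows the simple convolution relation between microscopic and global statistics breaks down.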

  15. Statistical physics studies of multilayer adsorption isotherm in food materials and pore size distribution

    NASA Astrophysics Data System (ADS)

    Aouaini, F.; Knani, S.; Ben Yahia, M.; Ben Lamine, A.

    2015-08-01

    Water sorption isotherms of foodstuffs are very important in different areas of food science engineering, such as the design, modeling and optimization of many processes. The equilibrium moisture content is an important parameter in models used to predict changes in the moisture content of a product during storage. A multilayer model with two energy levels was formulated on the basis of statistical physics and theoretical considerations, using the grand canonical ensemble; some physicochemical parameters related to the adsorption process appear directly in the analytical model expression. Literature data on water adsorption at different temperatures on chickpea seeds, lentil seeds, potato and green peppers were described with the most popular models applied in food science, as well as with the newly proposed model. Among the studied models, the proposed model gives the best description of the data over the whole range of relative humidity. Using this model, we were able to determine the thermodynamic functions. Measurement of desorption isotherms, in particular of a gas over a porous solid, also gives access to the pore size distribution (PSD).
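
    The abstract does not give the two-energy-level model's closed form; as a reference point, the classical BET multilayer isotherm that such statistical-physics models generalize can be evaluated as follows (parameter values are invented, and this is explicitly not the authors' model):

```python
def bet_moisture(aw, m0, c):
    """Classical BET multilayer isotherm: equilibrium moisture content as a
    function of water activity aw, where m0 is the monolayer moisture content
    and c an energy constant. Illustrative stand-in for the two-level model."""
    if not 0.0 <= aw < 1.0:
        raise ValueError("water activity must lie in [0, 1)")
    return m0 * c * aw / ((1.0 - aw) * (1.0 + (c - 1.0) * aw))

# Hypothetical parameters: m0 = 0.06 kg water / kg dry matter, c = 15.
curve = [bet_moisture(a / 100.0, 0.06, 15.0) for a in range(5, 50, 5)]
```

    Fitting such an expression to sorption data at several temperatures is what allows the thermodynamic functions mentioned above to be extracted.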

  16. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment.

    PubMed

    Berkes, Pietro; Orbán, Gergo; Lengyel, Máté; Fiser, József

    2011-01-07

    The brain maintains internal models of its environment to interpret sensory inputs and to prepare actions. Although behavioral studies have demonstrated that these internal models are optimally adapted to the statistics of the environment, the neural underpinning of this adaptation is unknown. Using a Bayesian model of sensory cortical processing, we related stimulus-evoked and spontaneous neural activities to inferences and prior expectations in an internal model and predicted that they should match if the model is statistically optimal. To test this prediction, we analyzed visual cortical activity of awake ferrets during development. Similarity between spontaneous and evoked activities increased with age and was specific to responses evoked by natural scenes. This demonstrates the progressive adaptation of internal models to the statistics of natural stimuli at the neural level.
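
    Comparing spontaneous and evoked activity amounts to measuring the divergence between two probability distributions over activity patterns; a minimal sketch using the Kullback-Leibler divergence (the paper's exact similarity measure may differ, and the distributions below are invented):

```python
import math

def kl_divergence(p, q):
    """KL divergence D(P||Q) between two discrete distributions given as
    aligned probability lists; 0 iff the distributions match exactly."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical pattern distributions: evoked vs spontaneous activity.
evoked = [0.40, 0.30, 0.20, 0.10]
spontaneous_young = [0.70, 0.15, 0.10, 0.05]   # poor match early in development
spontaneous_adult = [0.42, 0.28, 0.21, 0.09]   # close match after adaptation
```

    Under the paper's hypothesis, the divergence between evoked and spontaneous distributions shrinks with age for natural stimuli, as in the adult case above.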

  17. Definitions and Models of Statistical Literacy: A Literature Review

    ERIC Educational Resources Information Center

    Sharma, Sashi

    2017-01-01

    Despite statistical literacy being relatively new in statistics education research, it needs special attention as attempts are being made to enhance the teaching, learning and assessing of this sub-strand. It is important that teachers and researchers are aware of the challenges of teaching this literacy. In this article, the growing importance of…

  18. Sampling methods to the statistical control of the production of blood components.

    PubMed

    Pereira, Paulo; Seghatchian, Jerard; Caldeira, Beatriz; Santos, Paula; Castro, Rosa; Fernandes, Teresa; Xavier, Sandra; de Sousa, Gracinda; de Almeida E Sousa, João Paulo

    2017-12-01

    The control of blood components specifications is a requirement generalized in Europe by the European Commission Directives and in the US by the AABB standards. The use of a statistical process control methodology is recommended in the related literature, including the EDQM guideline. The reliability of the control depends on the sampling. However, a correct sampling methodology seems not to be systematically applied. Commonly, the sampling is intended solely to comply with the 1% specification for the produced blood components. Nevertheless, from a purely statistical viewpoint, this model arguably does not correspond to a consistent sampling technique. This could severely limit the ability to detect abnormal patterns and to assure that the production has a non-significant probability of producing nonconforming components. This article discusses what is happening in blood establishments. Three statistical methodologies are proposed: simple random sampling, sampling based on the proportion of a finite population, and sampling based on the inspection level. The empirical results demonstrate that these models are practicable in blood establishments, contributing to the robustness of sampling and related statistical process control decisions for the purpose they are suggested for. Copyright © 2017 Elsevier Ltd. All rights reserved.
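
    The second proposed methodology, sampling based on the proportion of a finite population, can be sketched with Cochran's sample-size formula plus a finite-population correction (the z, p, and margin values below are common defaults, not the article's exact choices):

```python
import math

def sample_size(population, z=1.96, p=0.5, margin=0.05):
    """Sample size for estimating a proportion in a finite production lot:
    Cochran's n0 = z^2 p (1-p) / margin^2, corrected for population size."""
    n0 = z * z * p * (1.0 - p) / (margin * margin)
    return math.ceil(n0 / (1.0 + (n0 - 1.0) / population))

# A large lot needs ~370 units; a small lot of 100 still needs 80,
# far more than a flat 1% rule would suggest.
large_lot = sample_size(10000)
small_lot = sample_size(100)
```

    This is the core of the argument above: a flat 1% sample is statistically inconsistent, because the required sample fraction depends strongly on lot size and the desired confidence.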

  19. Improving the Document Development Process: Integrating Relational Data and Statistical Process Control.

    ERIC Educational Resources Information Center

    Miller, John

    1994-01-01

    Presents an approach to document numbering, document titling, and process measurement which, when used with fundamental techniques of statistical process control, reveals meaningful process-element variation as well as nominal productivity models. (SR)

  20. Relational event models for longitudinal network data with an application to interhospital patient transfers.

    PubMed

    Vu, Duy; Lomi, Alessandro; Mascia, Daniele; Pallotti, Francesca

    2017-06-30

    The main objective of this paper is to introduce and illustrate relational event models, a new class of statistical models for the analysis of time-stamped data with complex temporal and relational dependencies. We outline the main differences between recently proposed relational event models and more conventional network models based on the graph-theoretic formalism typically adopted in empirical studies of social networks. Our main contribution involves the definition and implementation of a marked point process extension of currently available models. According to this approach, the sequence of events of interest is decomposed into two components: (a) event time and (b) event destination. This decomposition transforms the problem of selection of event destination in relational event models into a conditional multinomial logistic regression problem. The main advantages of this formulation are the possibility of controlling for the effect of event-specific data and a significant reduction in the estimation time of currently available relational event models. We demonstrate the empirical value of the model in an analysis of interhospital patient transfers within a regional community of health care organizations. We conclude with a discussion of how the models we presented help to overcome some of the limitations of statistical models for networks that are currently available. Copyright © 2017 John Wiley & Sons, Ltd.
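
    Once destination selection is cast as conditional multinomial logistic regression, the probability that an event goes to each candidate destination is a softmax over linear predictors; a minimal sketch (the utilities below are invented):

```python
import math

def destination_probs(utilities):
    """Conditional-logit probabilities that the next event (e.g., a patient
    transfer) goes to each candidate destination, given linear predictors."""
    m = max(utilities)                        # subtract max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear predictors for three candidate hospitals.
probs = destination_probs([1.2, 0.3, -0.5])
```

    Estimating the coefficients behind these utilities is then a standard multinomial logit fit, which is what yields the reduction in estimation time mentioned above.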

  1. Modeling Count Outcomes from HIV Risk Reduction Interventions: A Comparison of Competing Statistical Models for Count Responses

    PubMed Central

    Xia, Yinglin; Morrison-Beedy, Dianne; Ma, Jingming; Feng, Changyong; Cross, Wendi; Tu, Xin

    2012-01-01

    Modeling count data from sexual behavioral outcomes involves many challenges, especially when the data exhibit a preponderance of zeros and overdispersion. In particular, the popular Poisson log-linear model is not appropriate for modeling such outcomes. Although alternatives exist for addressing both issues, they are not widely and effectively used in sex health research, especially in HIV prevention intervention and related studies. In this paper, we discuss how to analyze count outcomes distributed with an excess of zeros and overdispersion and introduce appropriate model-fit indices for comparing the performance of competing models, using data from a real study on HIV prevention intervention. The in-depth look at these common issues arising from studies involving behavioral outcomes will promote sound statistical analyses and facilitate research in this and other related areas. PMID:22536496
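
    The zero-inflated Poisson (ZIP) model is the standard alternative for the excess-zero problem described above; a minimal sketch of its probability mass function (the parameter values are invented for illustration):

```python
import math

def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson: with probability pi the response is a structural
    zero (e.g., no sexual activity at all); otherwise it follows Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (1 if y == 0 else 0) + (1.0 - pi) * poisson

# ZIP puts far more mass at zero than a Poisson with the same rate:
p0_zip = zip_pmf(0, 2.0, 0.3)   # 0.3 + 0.7 * exp(-2) ≈ 0.395
p0_poisson = math.exp(-2.0)     # ≈ 0.135
```

    Competing models (Poisson, negative binomial, ZIP, zero-inflated negative binomial) are then compared with fit indices such as AIC computed from their respective log-likelihoods.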

  2. Inferring general relations between network characteristics from specific network ensembles.

    PubMed

    Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan

    2012-01-01

    Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to which degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks in good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.

  3. Ethnicity, Effort, Self-Efficacy, Worry, and Statistics Achievement in Malaysia: A Construct Validation of the State-Trait Motivation Model

    ERIC Educational Resources Information Center

    Awang-Hashim, Rosa; O'Neil, Harold F., Jr.; Hocevar, Dennis

    2002-01-01

    The relations between motivational constructs, effort, self-efficacy and worry, and statistics achievement were investigated in a sample of 360 undergraduates in Malaysia. Both trait (cross-situational) and state (task-specific) measures of each construct were used to test a mediational trait (r) state (r) performance (TSP) model. As hypothesized,…

  4. Comparisons between physics-based, engineering, and statistical learning models for outdoor sound propagation.

    PubMed

    Hart, Carl R; Reznicek, Nathan J; Wilson, D Keith; Pettit, Chris L; Nykaza, Edward T

    2016-05-01

    Many outdoor sound propagation models exist, ranging from highly complex physics-based simulations to simplified engineering calculations, and more recently, highly flexible statistical learning methods. Several engineering and statistical learning models are evaluated by using a particular physics-based model, namely, a Crank-Nicholson parabolic equation (CNPE), as a benchmark. Narrowband transmission loss values predicted with the CNPE, based upon a simulated data set of meteorological, boundary, and source conditions, act as simulated observations. In the simulated data set sound propagation conditions span from downward refracting to upward refracting, for acoustically hard and soft boundaries, and low frequencies. Engineering models used in the comparisons include the ISO 9613-2 method, Harmonoise, and Nord2000 propagation models. Statistical learning methods used in the comparisons include bagged decision tree regression, random forest regression, boosting regression, and artificial neural network models. Computed skill scores are relative to sound propagation in a homogeneous atmosphere over a rigid ground. Overall skill scores for the engineering noise models are 0.6%, -7.1%, and 83.8% for the ISO 9613-2, Harmonoise, and Nord2000 models, respectively. Overall skill scores for the statistical learning models are 99.5%, 99.5%, 99.6%, and 99.6% for bagged decision tree, random forest, boosting, and artificial neural network regression models, respectively.
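
    The skill scores quoted above measure each model's error relative to a reference prediction; a common mean-squared-error-based form (possibly not the paper's exact definition) is:

```python
def skill_score(pred, ref, obs):
    """Skill of a model relative to a reference: 1 - MSE(model) / MSE(reference).
    1.0 is a perfect prediction, 0.0 is no better than the reference, and
    negative values are worse than the reference."""
    def mse(estimates):
        return sum((e - o) ** 2 for e, o in zip(estimates, obs)) / len(obs)
    return 1.0 - mse(pred) / mse(ref)

# Hypothetical transmission-loss values (dB): observations, a candidate model,
# and a homogeneous-atmosphere reference prediction.
obs = [60.0, 65.0, 72.0, 80.0]
ref = [55.0, 55.0, 55.0, 55.0]
good = [59.0, 66.0, 71.0, 79.0]
s = skill_score(good, ref, obs)   # close to 1: far better than the reference
```

    On this scale the Harmonoise result of -7.1% means slightly worse than the homogeneous-atmosphere reference, while the statistical learning models at ~99.5% are nearly perfect on the simulated data.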

  5. Predictive data modeling of human type II diabetes related statistics

    NASA Astrophysics Data System (ADS)

    Jaenisch, Kristina L.; Jaenisch, Holger M.; Handley, James W.; Albritton, Nathaniel G.

    2009-04-01

    During the course of routine Type II diabetes treatment of one of the authors, it was decided to derive predictive analytical Data Models of the daily sampled vital statistics, namely weight, blood pressure, and blood sugar, to determine whether the covariance among the observed variables could yield a descriptive equation-based model, or better still, a predictive analytical model that could forecast the expected future trend of the variables and possibly reduce the number of finger sticks required to monitor blood sugar levels. The personal history and analysis with resulting models are presented.

  6. Population activity statistics dissect subthreshold and spiking variability in V1.

    PubMed

    Bányai, Mihály; Koman, Zsombor; Orbán, Gergő

    2017-07-01

    Response variability, as measured by fluctuating responses upon repeated performance of trials, is a major component of neural responses, and its characterization is key to interpret high dimensional population recordings. Response variability and covariability display predictable changes upon changes in stimulus and cognitive or behavioral state, providing an opportunity to test the predictive power of models of neural variability. Still, there is little agreement on which model to use as a building block for population-level analyses, and models of variability are often treated as a subject of choice. We investigate two competing models, the doubly stochastic Poisson (DSP) model assuming stochasticity at spike generation, and the rectified Gaussian (RG) model tracing variability back to membrane potential variance, to analyze stimulus-dependent modulation of both single-neuron and pairwise response statistics. Using a pair of model neurons, we demonstrate that the two models predict similar single-cell statistics. However, DSP and RG models have contradicting predictions on the joint statistics of spiking responses. To test the models against data, we build a population model to simulate stimulus change-related modulations in pairwise response statistics. We use single-unit data from the primary visual cortex (V1) of monkeys to show that while model predictions for variance are qualitatively similar to experimental data, only the RG model's predictions are compatible with joint statistics. These results suggest that models using Poisson-like variability might fail to capture important properties of response statistics. We argue that membrane potential-level modeling of stochasticity provides an efficient strategy to model correlations. NEW & NOTEWORTHY Neural variability and covariability are puzzling aspects of cortical computations. 
For efficient decoding and prediction, models of information encoding in neural populations hinge on an appropriate model of variability. Our work shows that stimulus-dependent changes in pairwise but not in single-cell statistics can differentiate between two widely used models of neuronal variability. Contrasting model predictions with neuronal data provides hints on the noise sources in spiking and provides constraints on statistical models of population activity. Copyright © 2017 the American Physiological Society.
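
    The DSP model's key single-neuron signature, spike-count variance exceeding the mean when the underlying rate fluctuates across trials, can be illustrated with a toy simulation (this is a schematic of doubly stochastic Poisson variability, not the paper's population model; rates and trial counts are invented):

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fano(counts):
    """Fano factor: spike-count variance divided by the mean count."""
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return v / m

rng = random.Random(7)
# Plain Poisson responses at a fixed rate: Fano factor near 1.
plain = [poisson_sample(10.0, rng) for _ in range(4000)]
# Doubly stochastic Poisson: the rate itself fluctuates across trials,
# inflating the variance relative to the mean (Fano factor well above 1).
dsp = [poisson_sample(rng.choice([5.0, 15.0]), rng) for _ in range(4000)]
```

    As the abstract notes, such single-cell statistics look similar under both DSP and rectified Gaussian models; it is the pairwise (joint) statistics that discriminate between them.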

  7. Detecting changes in dynamic and complex acoustic environments

    PubMed Central

    Boubenec, Yves; Lawlor, Jennifer; Górska, Urszula; Shamma, Shihab; Englitz, Bernhard

    2017-01-01

    Natural sounds, such as wind or rain, are characterized by the statistical occurrence of their constituents. Despite their complexity, listeners readily detect changes in these contexts. We here address the neural basis of statistical decision-making using a combination of psychophysics, EEG and modelling. In a texture-based, change-detection paradigm, human performance and reaction times improved with longer pre-change exposure, consistent with improved estimation of baseline statistics. Change-locked and decision-related EEG responses were found in a centro-parietal scalp location, whose slope depended on change size, consistent with sensory evidence accumulation. The potential's amplitude scaled with the duration of pre-change exposure, suggesting a time-dependent decision threshold. Auditory cortex-related potentials showed no response to the change. A dual timescale, statistical estimation model accounted for subjects' performance. Furthermore, a decision-augmented auditory cortex model accounted for performance and reaction times, suggesting that the primary cortical representation requires little post-processing to enable change-detection in complex acoustic environments. DOI: http://dx.doi.org/10.7554/eLife.24910.001 PMID:28262095

  8. Statistical modeling of natural backgrounds in hyperspectral LWIR data

    NASA Astrophysics Data System (ADS)

    Truslow, Eric; Manolakis, Dimitris; Cooley, Thomas; Meola, Joseph

    2016-09-01

    Hyperspectral sensors operating in the long wave infrared (LWIR) have a wealth of applications including remote material identification and rare target detection. While statistical models for modeling surface reflectance in visible and near-infrared regimes have been well studied, models for the temperature and emissivity in the LWIR have not been rigorously investigated. In this paper, we investigate modeling hyperspectral LWIR data using a statistical mixture model for the emissivity and surface temperature. Statistical models for the surface parameters can be used to simulate surface radiances and at-sensor radiance which drives the variability of measured radiance and ultimately the performance of signal processing algorithms. Thus, having models that adequately capture data variation is extremely important for studying performance trades. The purpose of this paper is twofold. First, we study the validity of this model using real hyperspectral data, and compare the relative variability of hyperspectral data in the LWIR and visible and near-infrared (VNIR) regimes. Second, we illustrate how materials that are easily distinguished in the VNIR, may be difficult to separate when imaged in the LWIR.

  9. Fully Bayesian inference for structural MRI: application to segmentation and statistical analysis of T2-hypointensities.

    PubMed

    Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark

    2013-01-01

    Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
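
    The Markov Chain Monte Carlo machinery used above can be illustrated with a minimal random-walk Metropolis sampler for a one-parameter posterior (the data, priors, and step size below are invented and have nothing to do with the MRI model):

```python
import math
import random

def log_posterior(theta, data, prior_sd=10.0):
    """Unnormalized log posterior: N(theta, 1) likelihood, N(0, prior_sd^2) prior."""
    loglik = -0.5 * sum((y - theta) ** 2 for y in data)
    logprior = -0.5 * (theta / prior_sd) ** 2
    return loglik + logprior

def metropolis(data, n_samples=20000, step=0.5, seed=1):
    """Random-walk Metropolis: simulate correlated draws from the posterior."""
    rng = random.Random(seed)
    theta, samples = 0.0, []
    lp = log_posterior(theta, data)
    for _ in range(n_samples):
        prop = theta + rng.gauss(0.0, step)
        lp_prop = log_posterior(prop, data)
        if math.log(rng.random()) < lp_prop - lp:   # accept with prob min(1, ratio)
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples

data = [1.8, 2.2, 2.0, 1.9, 2.1]
draws = metropolis(data)[5000:]             # discard burn-in
posterior_mean = sum(draws) / len(draws)    # sits near the sample mean of 2.0
```

    The paper's Bayesian mixture and Gaussian Markov random field models are sampled with the same accept/reject principle, just over far higher-dimensional parameter spaces.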

  10. Empirical Reference Distributions for Networks of Different Size

    PubMed Central

    Smith, Anna; Calder, Catherine A.; Browning, Christopher R.

    2016-01-01

    Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556

  11. Tracing the source of numerical climate model uncertainties in precipitation simulations using a feature-oriented statistical model

    NASA Astrophysics Data System (ADS)

    Xu, Y.; Jones, A. D.; Rhoades, A.

    2017-12-01

Precipitation is a key component of hydrologic cycles, and changing precipitation regimes contribute to more intense and frequent drought and flood events around the world. Numerical climate modeling is a powerful tool to study climatology and to predict future changes. Despite continuous improvement in numerical models, long-term precipitation prediction remains a challenge, especially at regional scales. To improve numerical simulations of precipitation, it is important to determine where the uncertainty in precipitation simulations comes from. There are two types of uncertainty in numerical model predictions. One is related to uncertainty in the input data, such as the model's boundary and initial conditions. These uncertainties would propagate to the final model outcomes even if the numerical model exactly replicated the true world. But a numerical model cannot exactly replicate the true world. The other type of model uncertainty is therefore related to errors in the model physics, such as the parameterization of sub-grid-scale processes, i.e., given precise input conditions, how much error could be generated by the imprecise model. Here, we build two statistical models based on a neural network algorithm to predict long-term variation of precipitation over California: one uses "true world" information derived from observations, and the other uses "modeled world" information using model inputs and outputs from the North American Coordinated Regional Downscaling Experiment (NA-CORDEX). We derive multiple climate feature metrics as predictors for the statistical model to represent the impact of global climate on local hydrology, and include topography as a predictor to represent the local control. We first compare the predictors between the true world and the modeled world to determine the errors contained in the input data.
By perturbing the predictors in the statistical model, we estimate how much uncertainty in the model's final outcomes is accounted for by each predictor. By comparing the statistical model derived from true world information and modeled world information, we assess the errors lying in the physics of the numerical models. This work provides a unique insight to assess the performance of numerical climate models, and can be used to guide improvement of precipitation prediction.
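One common way to realize the "perturb each predictor" step is permutation importance; the sketch below uses a linear fit as a stand-in for the neural network, with fully synthetic data. Names, sizes, and coefficients are illustrative, not from the study.

```python
import numpy as np

def permutation_importance(X, y, predict, n_rep=20, seed=0):
    """Increase in mean squared error when each predictor column is
    shuffled, holding the fitted model fixed."""
    rng = np.random.default_rng(seed)
    base = np.mean((predict(X) - y) ** 2)
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        errs = []
        for _ in range(n_rep):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # perturb predictor j only
            errs.append(np.mean((predict(Xp) - y) ** 2))
        imp[j] = np.mean(errs) - base
    return imp

# synthetic stand-in: predictor 0 dominates the response
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.2, 500)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
imp = permutation_importance(X, y, lambda Z: Z @ coef)
```

Predictors whose perturbation barely changes the error account for little of the model's output uncertainty.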

  12. Loop Braiding Statistics and Interacting Fermionic Symmetry-Protected Topological Phases in Three Dimensions

    NASA Astrophysics Data System (ADS)

    Cheng, Meng; Tantivasadakarn, Nathanan; Wang, Chenjie

    2018-01-01

    We study Abelian braiding statistics of loop excitations in three-dimensional gauge theories with fermionic particles and the closely related problem of classifying 3D fermionic symmetry-protected topological (FSPT) phases with unitary symmetries. It is known that the two problems are related by turning FSPT phases into gauge theories through gauging the global symmetry of the former. We show that there exist certain types of Abelian loop braiding statistics that are allowed only in the presence of fermionic particles, which correspond to 3D "intrinsic" FSPT phases, i.e., those that do not stem from bosonic SPT phases. While such intrinsic FSPT phases are ubiquitous in 2D systems and in 3D systems with antiunitary symmetries, their existence in 3D systems with unitary symmetries was not confirmed previously due to the fact that strong interaction is necessary to realize them. We show that the simplest unitary symmetry to support 3D intrinsic FSPT phases is Z2×Z4. To establish the results, we first derive a complete set of physical constraints on Abelian loop braiding statistics. Solving the constraints, we obtain all possible Abelian loop braiding statistics in 3D gauge theories, including those that correspond to intrinsic FSPT phases. Then, we construct exactly soluble state-sum models to realize the loop braiding statistics. These state-sum models generalize the well-known Crane-Yetter and Dijkgraaf-Witten models.

  13. Predicting lettuce canopy photosynthesis with statistical and neural network models

    NASA Technical Reports Server (NTRS)

    Frick, J.; Precetti, C.; Mitchell, C. A.

    1998-01-01

An artificial neural network (NN) and a statistical regression model were developed to predict canopy photosynthetic rates (Pn) for 'Waldman's Green' leaf lettuce (Lactuca sativa L.). All data used to develop and test the models were collected for crop stands grown hydroponically and under controlled-environment conditions. In the NN and regression models, canopy Pn was predicted as a function of three independent variables: shoot-zone CO2 concentration (600 to 1500 µmol·mol⁻¹), photosynthetic photon flux (PPF) (600 to 1100 µmol·m⁻²·s⁻¹), and canopy age (10 to 20 days after planting). The models were used to determine the combinations of CO2 and PPF setpoints required each day to maintain maximum canopy Pn. The statistical model (a third-order polynomial) predicted Pn more accurately than the simple NN (a three-layer, fully connected net). Over an 11-day validation period, the average percent difference between predicted and actual Pn was 12.3% and 24.6% for the statistical and NN models, respectively. Both models lost considerable accuracy when used to make relatively long-range Pn predictions (≥ 6 days into the future).
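A minimal sketch of fitting a third-order polynomial response surface like the statistical model described: the variable ranges follow the abstract, but the data and the response function are synthetic stand-ins, not the study's measurements.

```python
import numpy as np
from itertools import combinations_with_replacement

def cubic_features(X):
    """All monomials of degree <= 3 in the columns of X (plus an intercept)."""
    n, d = X.shape
    cols = [np.ones(n)]
    for deg in (1, 2, 3):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, idx], axis=1))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
# hypothetical stand-ins for shoot-zone CO2, PPF, and canopy age
X = np.column_stack([rng.uniform(600, 1500, 200),
                     rng.uniform(600, 1100, 200),
                     rng.uniform(10, 20, 200)])
# synthetic canopy Pn: linear CO2 and PPF effects plus a quadratic age effect
y = (1e-3 * X[:, 0] + 2e-3 * X[:, 1]
     - 0.05 * (X[:, 2] - 15.0) ** 2 + rng.normal(0.0, 0.1, 200))
coef, *_ = np.linalg.lstsq(cubic_features(X), y, rcond=None)
pred = cubic_features(X) @ coef
```

Since the synthetic response is itself low-order polynomial, the cubic fit recovers it up to the noise level.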

  14. Probabilistic Modeling and Visualization of the Flexibility in Morphable Models

    NASA Astrophysics Data System (ADS)

    Lüthi, M.; Albrecht, T.; Vetter, T.

Statistical shape models, and in particular morphable models, have gained widespread use in computer vision, computer graphics and medical imaging. Researchers have started to build models of almost any anatomical structure in the human body. While these models provide a useful prior for many image analysis tasks, relatively little of the information about the shape represented by the morphable model is exploited. We propose a method for computing and visualizing the remaining flexibility when a part of the shape is fixed. Our method, which is based on Probabilistic PCA, not only leads to an approach for reconstructing the full shape from partial information, but also allows us to investigate and visualize the uncertainty of a reconstruction. To show the feasibility of our approach, we performed experiments on a statistical model of the human face and the femur bone. The visualization of the remaining flexibility allows for greater insight into the statistical properties of the shape.
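The conditional-reconstruction step can be sketched with the standard probabilistic PCA posterior (Tipping and Bishop's formulas); the function and variable names below are ours, and the tiny synthetic "shape" stands in for a face or femur model.

```python
import numpy as np

def ppca_reconstruct(W, mu, sigma2, idx_fixed, x_fixed):
    """Posterior mean and covariance of the unobserved coordinates of a
    shape under probabilistic PCA, given the fixed coordinates.
    W: (d, q) loading matrix; mu: (d,) mean; sigma2: noise variance."""
    q = W.shape[1]
    Wf = W[idx_fixed]                              # rows for fixed coords
    M = Wf.T @ Wf + sigma2 * np.eye(q)
    z_mean = np.linalg.solve(M, Wf.T @ (x_fixed - mu[idx_fixed]))
    z_cov = sigma2 * np.linalg.inv(M)              # latent posterior covariance
    free = np.setdiff1d(np.arange(W.shape[0]), idx_fixed)
    Wr = W[free]
    x_mean = mu[free] + Wr @ z_mean
    # "remaining flexibility": predictive covariance of the free coordinates
    x_cov = Wr @ z_cov @ Wr.T + sigma2 * np.eye(free.size)
    return x_mean, x_cov, free
```

The diagonal of `x_cov` is exactly the per-coordinate flexibility one would visualize, and `x_mean` is the reconstruction from partial information.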

  15. SU-E-T-503: IMRT Optimization Using Monte Carlo Dose Engine: The Effect of Statistical Uncertainty.

    PubMed

    Tian, Z; Jia, X; Graves, Y; Uribe-Sanchez, A; Jiang, S

    2012-06-01

With the development of ultra-fast GPU-based Monte Carlo (MC) dose engines, it becomes clinically realistic to compute the dose-deposition coefficients (DDC) for IMRT optimization using MC simulation. However, it is still time-consuming to compute the DDC with small statistical uncertainty. This work studies the effects of the statistical error in the DDC matrix on IMRT optimization. The MC-computed DDC matrices are simulated here by adding statistical uncertainties at a desired level to the ones generated with a finite-size pencil beam algorithm. A statistical uncertainty model for MC dose calculation is employed. We adopt a penalty-based quadratic optimization model and a gradient descent method to optimize the fluence map and then recalculate the corresponding actual dose distribution using the noise-free DDC matrix. The impacts of DDC noise are assessed in terms of the deviation of the resulting dose distributions. We have also used stochastic perturbation theory to theoretically estimate the statistical errors of dose distributions on a simplified optimization model. A head-and-neck case is used to investigate the perturbation to the IMRT plan due to MC statistical uncertainty. The relative errors of the final dose distributions of the optimized IMRT plan are found to be much smaller than those in the DDC matrix, which is consistent with our theoretical estimation. When the history number is decreased from 10⁸ to 10⁶, the dose-volume histograms are still very similar to the error-free DVHs, while the error in the DDC matrix is about 3.8%. The results illustrate that the statistical errors in the DDC matrix have a relatively small effect on IMRT optimization in the dose domain. This indicates that we can use a relatively small number of histories to obtain the DDC matrix with MC simulation within a reasonable amount of time, without considerably compromising the accuracy of the optimized treatment plan. This work is supported by Varian Medical Systems through a Master Research Agreement.
© 2012 American Association of Physicists in Medicine.
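A toy numerical version of the experiment described above, assuming a random stand-in DDC matrix, a quadratic penalty objective, and projected gradient descent; dimensions and the 3.8% noise level are taken from the abstract's numbers, everything else is illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n_vox, n_blt = 60, 20
D = 0.1 * rng.random((n_vox, n_blt))        # stand-in noise-free DDC matrix
x_true = rng.uniform(1.0, 3.0, n_blt)       # reference fluence map
d_rx = D @ x_true                           # prescribed (realizable) dose

def optimize_fluence(Ddc, n_iter=20000, lr=0.3):
    """Minimize the quadratic penalty ||Ddc x - d_rx||^2 by projected
    gradient descent, keeping the fluence non-negative."""
    x = np.ones(Ddc.shape[1])
    for _ in range(n_iter):
        x = np.maximum(x - lr * (Ddc.T @ (Ddc @ x - d_rx)), 0.0)
    return x

# DDC matrix carrying ~3.8% relative statistical noise, as in the abstract
Dn = D * (1.0 + 0.038 * rng.standard_normal(D.shape))
x_clean, x_noisy = optimize_fluence(D), optimize_fluence(Dn)
# recalculate the "actual" dose of the noisy plan with the noise-free matrix
rel_dose_err = (np.linalg.norm(D @ x_noisy - D @ x_clean)
                / np.linalg.norm(D @ x_clean))
```

Because each voxel's dose sums over many beamlets, independent DDC noise partially averages out, so the relative dose error comes out well below the 3.8% matrix-level noise, mirroring the study's qualitative finding.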

  16. Assessment of credit risk based on fuzzy relations

    NASA Astrophysics Data System (ADS)

    Tsabadze, Teimuraz

    2017-06-01

The purpose of this paper is to develop a new approach for the assessment of the credit risk of corporate borrowers. There are different models for borrower risk assessment, divided into two groups: statistical and theoretical. When assessing the credit risk of corporate borrowers, a statistical model is unacceptable due to the lack of a sufficiently large history of defaults. At the same time, we cannot use some theoretical models due to the lack of a stock exchange. In cases where a particular borrower must be studied and no statistical base exists, the decision-making process is always of an expert nature. The paper describes a new approach that may be used in group decision-making. An example of the application of the proposed approach is given.
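For context, the basic operation underlying fuzzy-relation models of this kind is the max-min composition of fuzzy relations; a minimal sketch follows, with illustrative matrices rather than the paper's data.

```python
import numpy as np

def maxmin_composition(R, S):
    """Max-min composition of fuzzy relations:
    T[i, k] = max_j min(R[i, j], S[j, k])."""
    return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

# hypothetical memberships: borrowers x criteria, criteria x risk grades
R = np.array([[0.8, 0.3],
              [0.4, 0.9]])
S = np.array([[0.5, 0.7],
              [0.6, 0.2]])
T = maxmin_composition(R, S)   # borrowers x risk grades
```

Composing expert-supplied membership matrices in this way propagates graded judgments without requiring a default history.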

  17. North American Extreme Temperature Events and Related Large Scale Meteorological Patterns: A Review of Statistical Methods, Dynamics, Modeling, and Trends

    NASA Technical Reports Server (NTRS)

Grotjahn, Richard; Black, Robert; Leung, Ruby; Wehner, Michael F.; Barlow, Mathew; Bosilovich, Michael G.; Gershunov, Alexander; Gutowski, William J., Jr.; Gyakum, John R.; Katz, Richard W.; et al.

    2015-01-01

The objective of this paper is to review statistical methods, dynamics, modeling efforts, and trends related to temperature extremes, with a focus upon extreme events of short duration that affect parts of North America. These events are associated with large scale meteorological patterns (LSMPs). The statistics, dynamics, and modeling sections of this paper are written to be autonomous and so can be read separately. Methods to define extreme event statistics and to identify and connect LSMPs to extreme temperature events are presented. Recent advances in statistical techniques connect LSMPs to extreme temperatures through appropriately defined covariates that supplement more straightforward analyses. Various LSMPs, ranging from synoptic to planetary scale structures, are associated with extreme temperature events. Current knowledge about the synoptics and the dynamical mechanisms leading to the associated LSMPs is incomplete. Systematic studies of the physics of LSMP life cycles, comprehensive model assessment of LSMP-extreme temperature event linkages, and LSMP properties are needed. Generally, climate models capture observed properties of heat waves and cold air outbreaks with some fidelity. However, they overestimate warm wave frequency and underestimate cold air outbreak frequency, and underestimate the collective influence of low-frequency modes on temperature extremes. Modeling studies have identified the impact of large-scale circulation anomalies and land-atmosphere interactions on changes in extreme temperatures. However, few studies have examined changes in LSMPs to more specifically understand the role of LSMPs in past and future extreme temperature changes. Even though LSMPs are resolvable by global and regional climate models, they are not necessarily well simulated. The paper concludes with unresolved issues and research questions.

  18. Addressing Economic Development Goals through Innovative Teaching of University Statistics: A Case Study of Statistical Modelling in Nigeria

    ERIC Educational Resources Information Center

    Ezepue, Patrick Oseloka; Ojo, Adegbola

    2012-01-01

    A challenging problem in some developing countries such as Nigeria is inadequate training of students in effective problem solving using the core concepts of their disciplines. Related to this is a disconnection between their learning and socio-economic development agenda of a country. These problems are more vivid in statistical education which…

  19. Condensate statistics and thermodynamics of weakly interacting Bose gas: Recursion relation approach

    NASA Astrophysics Data System (ADS)

    Dorfman, K. E.; Kim, M.; Svidzinsky, A. A.

    2011-03-01

We study the condensate statistics and thermodynamics of a weakly interacting Bose gas with a fixed total number N of particles in a cubic box. We find the exact recursion relation for the canonical ensemble partition function. Using this relation, we calculate the distribution function of condensate particles for N=200. We also calculate the distribution function based on a multinomial expansion of the characteristic function. Similar to the ideal gas, both approaches give exact statistical moments for all temperatures in the framework of the Bogoliubov model. We compare them with the results of the unconstrained canonical ensemble quasiparticle formalism and the hybrid master equation approach. The present recursion relation can be used for any external potential and boundary conditions. We investigate the temperature dependence of the first few statistical moments of condensate fluctuations, as well as thermodynamic potentials and heat capacity, analytically and numerically in the whole temperature range.
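The canonical recursion the record refers to has the well-known ideal-gas form Z_N(β) = (1/N) Σ_{k=1}^{N} Z_1(kβ) Z_{N−k}(β), with Z_0 = 1. The sketch below implements it for an ideal (non-interacting) Bose gas in a cubic box as a simplified stand-in; the Bogoliubov interactions of the paper are omitted, the spectrum is truncated, and all names are ours.

```python
import numpy as np

def single_particle_energies(nmax=8):
    """Particle-in-a-box levels (units of h^2 / (8 m L^2)), truncated at
    nmax per axis and shifted so the ground state sits at zero."""
    n = np.arange(1, nmax + 1)
    e = (n[:, None, None] ** 2 + n[None, :, None] ** 2
         + n[None, None, :] ** 2).ravel()
    return np.sort(e - e.min()).astype(float)

def canonical_Z(N, beta, energies):
    """Z_0..Z_N via the bosonic recursion Z_M = (1/M) sum_k Z1(k*beta) Z_{M-k}."""
    Z1 = np.array([np.exp(-k * beta * energies).sum() for k in range(1, N + 1)])
    Z = np.zeros(N + 1)
    Z[0] = 1.0
    for M in range(1, N + 1):
        Z[M] = np.dot(Z1[:M], Z[M - 1::-1]) / M
    return Z

def condensate_occupation(N, beta, energies):
    """Mean ground-state occupation <n_0> = sum_k exp(-k beta e0) Z_{N-k} / Z_N."""
    Z = canonical_Z(N, beta, energies)
    k = np.arange(1, N + 1)
    return np.sum(np.exp(-k * beta * energies[0]) * Z[N - k]) / Z[N]
```

At low temperature nearly all N particles occupy the ground state; at high temperature the occupation drops toward the classical value, which is the qualitative behavior the distribution-function calculations in the record quantify.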

  20. Modeling Cell Size Regulation: From Single-Cell-Level Statistics to Molecular Mechanisms and Population-Level Effects.

    PubMed

    Ho, Po-Yi; Lin, Jie; Amir, Ariel

    2018-05-20

    Most microorganisms regulate their cell size. In this article, we review some of the mathematical formulations of the problem of cell size regulation. We focus on coarse-grained stochastic models and the statistics that they generate. We review the biologically relevant insights obtained from these models. We then describe cell cycle regulation and its molecular implementations, protein number regulation, and population growth, all in relation to size regulation. Finally, we discuss several future directions for developing understanding beyond phenomenological models of cell size regulation.

  1. Peer Review Documents Related to the Evaluation of ...

    EPA Pesticide Factsheets

BMDS is one of the Agency's premier tools for risk assessment; therefore, the validity and reliability of its statistical models are of paramount importance. This page provides links to peer reviews and expert summaries of the BMDS application and its models as they were developed and eventually released, documenting the rigorous review process taken to provide the best science tools available for statistical modeling.

  2. Applications of spatial statistical network models to stream data

    USGS Publications Warehouse

    Isaak, Daniel J.; Peterson, Erin E.; Ver Hoef, Jay M.; Wenger, Seth J.; Falke, Jeffrey A.; Torgersen, Christian E.; Sowder, Colin; Steel, E. Ashley; Fortin, Marie-Josée; Jordan, Chris E.; Ruesch, Aaron S.; Som, Nicholas; Monestiez, Pascal

    2014-01-01

    Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for terrestrial applications and are not optimized for streams. A new class of spatial statistical model, based on valid covariance structures for stream networks, can be used with many common types of stream data (e.g., water quality attributes, habitat conditions, biological surveys) through application of appropriate distributions (e.g., Gaussian, binomial, Poisson). The spatial statistical network models account for spatial autocorrelation (i.e., nonindependence) among measurements, which allows their application to databases with clustered measurement locations. Large amounts of stream data exist in many areas where spatial statistical analyses could be used to develop novel insights, improve predictions at unsampled sites, and aid in the design of efficient monitoring strategies at relatively low cost. We review the topic of spatial autocorrelation and its effects on statistical inference, demonstrate the use of spatial statistics with stream datasets relevant to common research and management questions, and discuss additional applications and development potential for spatial statistics on stream networks. Free software for implementing the spatial statistical network models has been developed that enables custom applications with many stream databases.

  3. The use of imputed sibling genotypes in sibship-based association analysis: on modeling alternatives, power and model misspecification.

    PubMed

    Minică, Camelia C; Dolan, Conor V; Hottenga, Jouke-Jan; Willemsen, Gonneke; Vink, Jacqueline M; Boomsma, Dorret I

    2013-05-01

When phenotypic, but no genotypic, data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of two statistical approaches suitable for modeling imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, the size of the phenotypic correlations among siblings, imputation accuracy, and the minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we inquired whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with continuous phenotype data (height) and with a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approach are equally powerful and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that the inclusion of imputed sibling genotypes in association analysis does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes: if the power benefit is small, i.e., the change in the distribution of the test statistic under the alternative is relatively small, the probability of obtaining a smaller test statistic is greater.
As the genetic effects are typically hypothesized to be small, in practice, the decision on whether family-based imputation could be used as a means to increase power should be informed by prior power calculations and by the consideration of the background correlation.

  4. The epistemological status of general circulation models

    NASA Astrophysics Data System (ADS)

    Loehle, Craig

    2018-03-01

    Forecasts of both likely anthropogenic effects on climate and consequent effects on nature and society are based on large, complex software tools called general circulation models (GCMs). Forecasts generated by GCMs have been used extensively in policy decisions related to climate change. However, the relation between underlying physical theories and results produced by GCMs is unclear. In the case of GCMs, many discretizations and approximations are made, and simulating Earth system processes is far from simple and currently leads to some results with unknown energy balance implications. Statistical testing of GCM forecasts for degree of agreement with data would facilitate assessment of fitness for use. If model results need to be put on an anomaly basis due to model bias, then both visual and quantitative measures of model fit depend strongly on the reference period used for normalization, making testing problematic. Epistemology is here applied to problems of statistical inference during testing, the relationship between the underlying physics and the models, the epistemic meaning of ensemble statistics, problems of spatial and temporal scale, the existence or not of an unforced null for climate fluctuations, the meaning of existing uncertainty estimates, and other issues. Rigorous reasoning entails carefully quantifying levels of uncertainty.
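The reference-period sensitivity raised above can be illustrated with a deterministic toy example (synthetic trends, illustrative numbers only): normalizing to anomalies removes a constant model bias, but the resulting model-observation mismatch metric still depends on which period is used for normalization.

```python
import numpy as np

t = np.arange(1900, 2020)
obs = 0.010 * (t - 1900)               # synthetic observed trend
mod = 0.020 * (t - 1900) + 0.4         # model: steeper trend plus constant bias

def anomaly_rmse(ref_lo, ref_hi):
    """RMSE between model and observed anomalies, both normalized to
    their means over the reference period [ref_lo, ref_hi)."""
    ref = (t >= ref_lo) & (t < ref_hi)
    return np.sqrt(np.mean(((obs - obs[ref].mean())
                            - (mod - mod[ref].mean())) ** 2))

early = anomaly_rmse(1900, 1930)       # early reference period
late = anomaly_rmse(1980, 2010)        # late reference period
```

The constant 0.4 bias cancels in both cases, yet the two reference periods give different RMSE values for the same trend mismatch, which is exactly the testing problem the abstract describes.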

  5. Patch-Based Generative Shape Model and MDL Model Selection for Statistical Analysis of Archipelagos

    NASA Astrophysics Data System (ADS)

    Ganz, Melanie; Nielsen, Mads; Brandt, Sami

    We propose a statistical generative shape model for archipelago-like structures. These kind of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radio graphs. The generative model is constructed by (1) learning a patch-based dictionary for possible shapes, (2) building up a time-homogeneous Markov model to model the neighbourhood correlations between the patches, and (3) automatic selection of the model complexity by the minimum description length principle. The generative shape model is proposed as a probability distribution of a binary image where the model is intended to facilitate sequential simulation. Our results show that a relatively simple model is able to generate structures visually similar to calcifications. Furthermore, we used the shape model as a shape prior in the statistical segmentation of calcifications, where the area overlap with the ground truth shapes improved significantly compared to the case where the prior was not used.

  6. A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. Part 2: Theoretical development of a dynamic model and application to rain fade durations and tolerable control delays for fade countermeasures

    NASA Technical Reports Server (NTRS)

    Manning, Robert M.

    1987-01-01

A dynamic rain attenuation prediction model is developed for use in obtaining the temporal characteristics, on time scales of minutes or hours, of satellite communication link availability. Analogous to the associated static rain attenuation model, which yields yearly attenuation predictions, this dynamic model is applicable at any location in the world that is characterized by the static rain attenuation statistics peculiar to the geometry of the satellite link and the rain statistics of the location. Such statistics are calculated by employing the formalism of Part I of this report. In fact, the dynamic model presented here is an extension of the static model and reduces to the static model in the appropriate limit. By assuming that rain attenuation is dynamically described by a first-order stochastic differential equation in time and that this random attenuation process is a Markov process, an expression for the associated transition probability is obtained by solving the related forward Kolmogorov equation. This transition probability is then used to obtain such temporal rain attenuation statistics as attenuation durations and allowable attenuation margins versus control system delay.
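A first-order Markov model of this kind is commonly realized as an Ornstein-Uhlenbeck process in log-attenuation (the classic Maseng-Bakken formulation); the sketch below simulates such a process and extracts a fade-duration statistic. The parameters are illustrative, not the report's.

```python
import numpy as np

def simulate_fade(A_m=0.5, sigma=1.0, beta=0.01, dt=1.0, n=100_000, seed=3):
    """ln(A) follows an Ornstein-Uhlenbeck process with median A_m,
    log-std sigma, and relaxation rate beta; returns attenuation A."""
    rng = np.random.default_rng(seed)
    m = np.log(A_m)
    c = np.exp(-beta * dt)                 # one-step autocorrelation
    eps = sigma * np.sqrt(1.0 - c * c) * rng.standard_normal(n)
    x = np.empty(n)
    x[0] = m
    for i in range(1, n):
        x[i] = m + c * (x[i - 1] - m) + eps[i]
    return np.exp(x)

def mean_fade_duration(A, thresh, dt=1.0):
    """Average length of contiguous runs with A >= thresh."""
    above = A >= thresh
    edges = np.diff(above.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1) + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        ends = np.r_[ends, above.size]
    return dt * np.mean(ends - starts) if starts.size else 0.0
```

Fades above higher attenuation thresholds are shorter on average, which is the kind of duration statistic the transition probability yields analytically.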

  7. A Statistical-Physics Approach to Language Acquisition and Language Change

    NASA Astrophysics Data System (ADS)

    Cassandro, Marzio; Collet, Pierre; Galves, Antonio; Galves, Charlotte

    1999-02-01

The aim of this paper is to explain why Statistical Physics can help in understanding two related linguistic questions. The first question is how to model first language acquisition by a child. The second question is how language change proceeds in time. Our approach is based on a Gibbsian model for the interface between syntax and prosody. We also present a simulated annealing model of language acquisition, which extends the Triggering Learning Algorithm recently introduced in the linguistic literature.
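A toy Metropolis-annealing sketch in the spirit of parameter-setting models of acquisition: binary "syntax parameters" are flipped one at a time and accepted by the Metropolis rule under a geometric cooling schedule. The cost function and all names are illustrative, not the paper's model.

```python
import numpy as np

def anneal_parameters(target, n_steps=5000, seed=0):
    """Simulated annealing over a binary parameter vector: the cost is
    the number of parameters that disagree with the target grammar."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, target.size)
    cost = int(np.sum(x != target))
    T = 1.0
    for _ in range(n_steps):
        j = rng.integers(target.size)
        dc = 1 if x[j] == target[j] else -1   # cost change if bit j flips
        if dc <= 0 or rng.random() < np.exp(-dc / T):
            x[j] ^= 1
            cost += dc
        T *= 0.999                             # geometric cooling
    return x, cost
```

Early, high-temperature steps explore freely (modeling noisy triggering data); as the temperature drops, the learner freezes into the target parameter setting.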

  8. Applying Statistical Models and Parametric Distance Measures for Music Similarity Search

    NASA Astrophysics Data System (ADS)

    Lukashevich, Hanna; Dittmar, Christian; Bastuck, Christoph

Automatically deriving similarity relations between music pieces is an inherent field of music information retrieval research. Due to the nearly unrestricted amount of musical data, real-world similarity search algorithms have to be highly efficient and scalable. A possible solution is to represent each music excerpt with a statistical model (e.g., a Gaussian mixture model) and thus to reduce the computational costs by applying parametric distance measures between the models. In this paper we discuss combinations of different parametric modelling techniques and distance measures and weigh the benefits of each against the others.

  9. Statistical inference, the bootstrap, and neural-network modeling with application to foreign exchange rates.

    PubMed

    White, H; Racine, J

    2001-01-01

    We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
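A hedged sketch of the resampling idea: bootstrap replicates compare a full model against one with a candidate input removed, scoring how often the restriction does not hurt out-of-bag error. An ordinary least-squares fit stands in for the feedforward network here, and the decision rule and names are ours, not the authors' test statistics.

```python
import numpy as np

def bootstrap_irrelevance_score(X, y, j, n_boot=200, seed=0):
    """Fraction of bootstrap replicates in which dropping input j does
    not worsen out-of-bag MSE; values near 0 suggest j is relevant."""
    rng = np.random.default_rng(seed)
    n = len(y)
    wins = trials = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                  # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)        # out-of-bag points
        if oob.size == 0:
            continue
        cf, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        cr, *_ = np.linalg.lstsq(np.delete(X[idx], j, axis=1), y[idx],
                                 rcond=None)
        e_full = np.mean((X[oob] @ cf - y[oob]) ** 2)
        e_restr = np.mean((np.delete(X[oob], j, axis=1) @ cr - y[oob]) ** 2)
        wins += e_restr <= e_full
        trials += 1
    return wins / trials
```

A relevant input almost never survives removal without an out-of-bag error increase, while an irrelevant input does so about half the time.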

  10. Occupation times and ergodicity breaking in biased continuous time random walks

    NASA Astrophysics Data System (ADS)

    Bel, Golan; Barkai, Eli

    2005-12-01

    Continuous time random walk (CTRW) models are widely used to model diffusion in condensed matter. There are two classes of such models, distinguished by the convergence or divergence of the mean waiting time. Systems with finite average sojourn time are ergodic and thus Boltzmann-Gibbs statistics can be applied. We investigate the statistical properties of CTRW models with infinite average sojourn time; in particular, the occupation time probability density function is obtained. It is shown that in the non-ergodic phase the distribution of the occupation time of the particle on a given lattice point exhibits bimodal U or trimodal W shape, related to the arcsine law. The key points are as follows. (a) In a CTRW with finite or infinite mean waiting time, the distribution of the number of visits on a lattice point is determined by the probability that a member of an ensemble of particles in equilibrium occupies the lattice point. (b) The asymmetry parameter of the probability distribution function of occupation times is related to the Boltzmann probability and to the partition function. (c) The ensemble average is given by Boltzmann-Gibbs statistics for either finite or infinite mean sojourn time, when detailed balance conditions hold. (d) A non-ergodic generalization of the Boltzmann-Gibbs statistical mechanics for systems with infinite mean sojourn time is found.
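The U-shaped occupation-time statistics can be reproduced with a minimal two-site CTRW simulation using Pareto waiting times (infinite mean for alpha < 1); this is an illustrative sketch, not the authors' derivation.

```python
import numpy as np

def occupation_fractions(n_walkers=400, T=2000.0, alpha=0.5, seed=2):
    """Fraction of time spent on site 0 for a symmetric two-site CTRW
    with Pareto(alpha) waiting times; alpha < 1 gives infinite mean
    sojourn time and breaks ergodicity."""
    rng = np.random.default_rng(seed)
    fracs = np.empty(n_walkers)
    for i in range(n_walkers):
        t, on0, site = 0.0, 0.0, 0
        while t < T:
            w = rng.pareto(alpha) + 1.0    # heavy-tailed wait, >= 1
            stay = min(w, T - t)           # truncate the final sojourn
            if site == 0:
                on0 += stay
            t += stay
            site = 1 - site                # hop to the other site
        fracs[i] = on0 / T
    return fracs
```

For alpha = 0.5 the histogram of these fractions piles up near 0 and 1 rather than near the ergodic value 1/2, the bimodal U shape related to the arcsine law.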

  11. Chern-Simons Term: Theory and Applications.

    NASA Astrophysics Data System (ADS)

    Gupta, Kumar Sankar

    1992-01-01

We investigate the quantization and applications of Chern-Simons theories to several systems of interest. Elementary canonical methods are employed for the quantization of abelian and nonabelian Chern-Simons actions using ideas from gauge theories and quantum gravity. When the spatial slice is a disc, it yields quantum states at the edge of the disc carrying a representation of the Kac-Moody algebra. We next include sources in this model and their quantum states are shown to be those of a conformal family. Vertex operators for both abelian and nonabelian sources are constructed. The regularized abelian Wilson line is proved to be a vertex operator. The spin-statistics theorem is established for Chern-Simons dynamics using purely geometrical techniques. Chern-Simons action is associated with exotic spin and statistics in 2 + 1 dimensions. We study several systems in which the Chern-Simons action affects the spin and statistics. The first class of systems we study consists of G/H models. The solitons of these models are shown to obey anyonic statistics in the presence of a Chern-Simons term. The second system deals with the effect of the Chern-Simons term in a model for high temperature superconductivity. The coefficient of the Chern-Simons term is shown to be quantized, one of its possible values giving fermionic statistics to the solitons of this model. Finally, we study a system of spinning particles interacting with 2 + 1 gravity, the latter being described by an ISO(2,1) Chern-Simons term. An effective action for the particles is obtained by integrating out the gauge fields. Next we construct operators which exchange the particles. They are shown to satisfy the braid relations. There are ambiguities in the quantization of this system which can be exploited to give anyonic statistics to the particles. We also point out that at the level of the first quantized theory, the usual spin-statistics relation need not apply to these particles.

  12. Computer-aided auditing of prescription drug claims.

    PubMed

    Iyengar, Vijay S; Hermiz, Keith B; Natarajan, Ramesh

    2014-09-01

    We describe a methodology for identifying and ranking candidate audit targets from a database of prescription drug claims. The relevant audit targets may include various entities such as prescribers, patients and pharmacies, who exhibit certain statistical behavior indicative of potential fraud and abuse over the prescription claims during a specified period of interest. Our overall approach is consistent with related work in statistical methods for detection of fraud and abuse, but has a relative emphasis on three specific aspects: first, based on the assessment of domain experts, certain focus areas are selected and data elements pertinent to the audit analysis in each focus area are identified; second, specialized statistical models are developed to characterize the normalized baseline behavior in each focus area; and third, statistical hypothesis testing is used to identify entities that diverge significantly from their expected behavior according to the relevant baseline model. The application of this overall methodology to a prescription claims database from a large health plan is considered in detail.
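
    As a minimal sketch of the third step (hypothesis testing against a baseline), the toy example below flags prescribers whose claim counts diverge from a Poisson baseline. The counts, the baseline model, and the significance level are hypothetical stand-ins for the paper's specialized focus-area models.

```python
import numpy as np
from scipy import stats

def flag_outliers(observed, expected, alpha=0.01):
    """Flag entities whose observed claim counts exceed a Poisson
    baseline at significance level alpha (one-sided upper tail)."""
    # P(X >= observed) under Poisson(expected); sf(k-1) = P(X > k-1)
    pvals = stats.poisson.sf(observed - 1, expected)
    return pvals, pvals < alpha

# Hypothetical prescribers: expected counts come from a baseline model
expected = np.array([100.0, 100.0, 100.0, 100.0])
observed = np.array([95, 110, 160, 102])

pvals, flagged = flag_outliers(observed, expected)
for i, (p, f) in enumerate(zip(pvals, flagged)):
    print(f"prescriber {i}: p={p:.4f} flagged={f}")
```

    In a real audit system the baseline would be fit per focus area, and multiple-testing corrections would be applied before ranking candidates.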

  13. Partial Coordination Numbers in Binary Metallic Glasses (Postprint)

    DTIC Science & Technology

    2011-12-07

    structural differences related to relative atom size and quench rate. The magnitude of chemical interactions between the atoms, eij, might also influence...previous calculations.[2] A statistical approach is used to develop the Zij equations from the product of four terms: (1) the number of reference sites...within experimental scatter. The development of equations for Zij from the ECP model uses a statistical view of topology, and the Zij values

  14. Modeling Complex Phenomena Using Multiscale Time Sequences

    DTIC Science & Technology

    2009-08-24

    measures based on Hurst and Holder exponents, auto-regressive methods and Fourier and wavelet decomposition methods. The applications for this technology...relate to each other. This can be done by combining a set of statistical fractal measures based on Hurst and Holder exponents, auto-regressive...different scales and how these scales relate to each other.

  15. Three Dimensional Object Recognition Using an Unsupervised Neural Network: Understanding the Distinguishing Features

    DTIC Science & Technology

    1992-12-23

    predominance of structural models of recognition, of which a recent example is the Recognition By Components (RBC) theory (Biederman, 1987). Structural...related to recent statistical theory (Huber, 1985; Friedman, 1987) and is derived from a biologically motivated computational theory (Bienenstock et...dimensional object recognition (Intrator and Gold, 1991).

  16. Statistically Modeling I-V Characteristics of CNT-FET with LASSO

    NASA Astrophysics Data System (ADS)

    Ma, Dongsheng; Ye, Zuochang; Wang, Yan

    2017-08-01

    With the advent of the internet of things (IoT), the need for studying new materials and devices for various applications is increasing. Traditionally, compact models for transistors are built on the basis of physics. But physical models are expensive to develop and need a very long time to adjust for non-ideal effects. As the vision for the application of many novel devices is not certain or the manufacturing process is not mature, deriving generalized, accurate physical models for such devices is very strenuous, whereas statistical modeling is becoming a potential method because of its data-oriented property and fast implementation. In this paper, one classical statistical regression method, LASSO, is used to model the I-V characteristics of a CNT-FET, and a pseudo-PMOS inverter simulation based on the trained model is implemented in Cadence. The normalized relative mean square prediction error of the trained model versus experimental sample data and the simulation results show that the model is acceptable for digital circuit static simulation. Such a modeling methodology can be extended to general devices.
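
    A minimal sketch of this kind of statistical I-V modeling is shown below: a small coordinate-descent LASSO (written from scratch, not the paper's implementation) selects terms from a hypothetical polynomial feature library to fit synthetic current-voltage data.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal coordinate-descent LASSO:
    minimize ||y - X b||^2 / (2n) + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual excluding j
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

# Synthetic I-V data: drain current vs. gate voltage, with measurement noise
rng = np.random.default_rng(0)
v = np.linspace(0.0, 1.0, 80)
i_d = 0.5 * v + 2.0 * v**2 + 0.01 * rng.standard_normal(80)

# Hypothetical feature library; LASSO keeps the terms that matter
X = np.column_stack([v, v**2, v**3, v**4, np.sin(v)])
coef = lasso_cd(X, i_d, lam=1e-3)
print(np.round(coef, 2))
```

    In practice the feature library and the regularization strength would be tuned against measured device data rather than fixed as here.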

  17. Clusters in the distribution of pulsars in period, pulse-width, and age. [statistical analysis/statistical distributions

    NASA Technical Reports Server (NTRS)

    Baker, K. B.; Sturrock, P. A.

    1975-01-01

    The question of whether pulsars form a single group or whether pulsars come in two or more different groups is discussed. It is proposed that such groups might be related to several factors such as the initial creation of the neutron star, or the orientation of the magnetic field axis with the spin axis. Various statistical models are examined.

  18. Statistical downscaling of general-circulation-model- simulated average monthly air temperature to the beginning of flowering of the dandelion (Taraxacum officinale) in Slovenia

    NASA Astrophysics Data System (ADS)

    Bergant, Klemen; Kajfež-Bogataj, Lučka; Črepinšek, Zalika

    2002-02-01

    Phenological observations are a valuable source of information for investigating the relationship between climate variation and plant development. Potential climate change in the future will shift the occurrence of phenological phases. Information about future climate conditions is needed in order to estimate this shift. General circulation models (GCM) provide the best information about future climate change. They are able to simulate reliably the most important mean features on a large scale, but they fail on a regional scale because of their low spatial resolution. A common approach to bridging the scale gap is statistical downscaling, which was used to relate the beginning of flowering of Taraxacum officinale in Slovenia with the monthly mean near-surface air temperature for January, February and March in Central Europe. Statistical models were developed and tested with NCAR/NCEP Reanalysis predictor data and EARS predictand data for the period 1960-1999. Prior to developing statistical models, empirical orthogonal function (EOF) analysis was employed on the predictor data. Multiple linear regression was used to relate the beginning of flowering with expansion coefficients of the first three EOF for the January, February and March air temperatures, and a strong correlation was found between them. The developed statistical models were applied to the results of two GCM (HadCM3 and ECHAM4/OPYC3) to estimate the potential shifts in the beginning of flowering for the periods 1990-2019 and 2020-2049 in comparison with the period 1960-1989. The HadCM3 model predicts, on average, a 4-day earlier occurrence of flowering in the period 1990-2019, and ECHAM4/OPYC3 a 5-day earlier occurrence. The analogous results for the period 2020-2049 are a 10- and 11-day earlier occurrence.
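
    The EOF-plus-regression pipeline described above can be sketched as follows. The temperature field, the flowering dates, and their linear dependence on the first expansion coefficient are all synthetic stand-ins for the NCAR/NCEP and EARS data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic predictor field: 40 years x 100 grid points of
# Jan-Mar temperature anomalies (stand-in for reanalysis data)
years, points = 40, 100
field = rng.standard_normal((years, points))

# EOF analysis = SVD of the (column-centered) anomaly matrix
anom = field - field.mean(axis=0)
U, s, Vt = np.linalg.svd(anom, full_matrices=False)
pcs = U[:, :3] * s[:3]            # expansion coefficients of first 3 EOFs

# Synthetic predictand: day-of-year of first flowering, driven by PC1
flowering = 110.0 - 2.0 * pcs[:, 0] + rng.standard_normal(years)

# Multiple linear regression on the three expansion coefficients
A = np.column_stack([np.ones(years), pcs])
beta, *_ = np.linalg.lstsq(A, flowering, rcond=None)
pred = A @ beta
r = np.corrcoef(pred, flowering)[0, 1]
print(f"regression coefficients: {np.round(beta, 2)}, r = {r:.2f}")
```

    The fitted regression would then be driven with GCM-simulated temperatures (projected onto the same EOFs) to estimate future shifts in flowering.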

  19. Progress of statistical analysis in biomedical research through the historical review of the development of the Framingham score.

    PubMed

    Ignjatović, Aleksandra; Stojanović, Miodrag; Milošević, Zoran; Anđelković Apostolović, Marija

    2017-12-02

    The interest in developing risk models in medicine is not only appealing but also associated with many obstacles in different aspects of predictive model development. Initially, the association of one or more biomarkers with a specific outcome was established by statistical significance, but novel and demanding questions required the development of new and more complex statistical techniques. The progress of statistical analysis in biomedical research can best be observed through the history of the Framingham study and the development of the Framingham score. Evaluation of a predictive model rests on a combination of several metrics. Alongside logistic regression and Cox proportional hazards regression analysis, the calibration test and ROC curve analysis should be mandatory and eliminatory, with the central place taken by some newer statistical techniques. To obtain complete information about a new marker in the model, it is now recommended to use reclassification tables, calculating the net reclassification index and the integrated discrimination improvement. Decision curve analysis is a novel method for evaluating the clinical usefulness of a predictive model. It may be noted that customizing and fine-tuning of the Framingham risk score initiated the development of statistical analysis. A clinically applicable predictive model should be a trade-off between all the abovementioned statistical metrics: a trade-off between calibration and discrimination, accuracy and decision-making, costs and benefits, and quality and quantity of a patient's life.

  20. VMT-based traffic impact assessment : development of a trip length model.

    DOT National Transportation Integrated Search

    2010-06-01

    This report develops models that relate trip lengths to the land-use characteristics at the trip ends (both production- and attraction-ends). Separate models were developed by trip purpose. The results indicate several statistically significa...

  1. Solar granulation and statistical crystallography: A modeling approach using size-shape relations

    NASA Technical Reports Server (NTRS)

    Noever, D. A.

    1994-01-01

    The irregular polygonal pattern of solar granulation is analyzed for size-shape relations using statistical crystallography. In contrast to previous work which has assumed perfectly hexagonal patterns for granulation, more realistic accounting of cell (granule) shapes reveals a broader basis for quantitative analysis. Several features emerge as noteworthy: (1) a linear correlation between number of cell-sides and neighboring shapes (called Aboav-Weaire's law); (2) a linear correlation between both average cell area and perimeter and the number of cell-sides (called Lewis's law and a perimeter law, respectively) and (3) a linear correlation between cell area and squared perimeter (called convolution index). This statistical picture of granulation is consistent with a finding of no correlation in cell shapes beyond nearest neighbors. A comparative calculation between existing model predictions taken from luminosity data and the present analysis shows substantial agreements for cell-size distributions. A model for understanding grain lifetimes is proposed which links convective times to cell shape using crystallographic results.

  2. Statistics-related and reliability-physics-related failure processes in electronics devices and products

    NASA Astrophysics Data System (ADS)

    Suhir, E.

    2014-05-01

    The well-known and widely used experimental reliability "passport" of a mass-manufactured electronic or photonic product — the bathtub curve — reflects the combined contribution of the statistics-related and reliability-physics (physics-of-failure)-related processes. As time progresses, the first process results in a decreasing failure rate, while the second process, associated with material aging and degradation, leads to an increasing failure rate. An attempt has been made in this analysis to assess the level of the reliability-physics-related aging process from the available bathtub curve (diagram). It is assumed that the products of interest underwent burn-in testing and therefore the obtained bathtub curve does not contain the infant mortality portion. It has also been assumed that the two random processes in question are statistically independent, and that the failure rate of the physical process can be obtained by deducting the theoretically assessed statistical failure rate from the bathtub curve ordinates. In the carried-out numerical example, the Rayleigh distribution was used for the statistical failure rate, for the sake of a relatively simple illustration. The developed methodology can be used in reliability-physics evaluations when there is a need to better understand the roles of the statistics-related and reliability-physics-related irreversible random processes in reliability evaluations. Future work should include investigations of how the powerful and flexible methods and approaches of statistical mechanics can be effectively employed, in addition to reliability-physics techniques, to model the operational reliability of electronic and photonic products.
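
    Under the stated independence assumption, the deduction step can be illustrated with a toy bathtub curve. The curve shape and the Rayleigh parameter below are invented for illustration only.

```python
import numpy as np

# Toy post-burn-in bathtub curve: total failure rate lambda(t)
t = np.linspace(0.1, 10.0, 50)          # time, arbitrary units
bathtub = 0.05 + 0.002 * t**2           # flat part plus wear-out rise

# Rayleigh hazard rate h(t) = t / sigma^2, taken here as the
# statistics-related component (the illustrative choice above)
sigma = 20.0
h_stat = t / sigma**2

# Statistical independence: rates add, so the physics-of-failure
# (aging) component is what remains after the deduction
h_phys = bathtub - h_stat
print(f"aging rate at t=10: {h_phys[-1]:.3f}")
```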

  3. Equilibrium statistical-thermal models in high-energy physics

    NASA Astrophysics Data System (ADS)

    Tawfik, Abdel Nasser

    2014-05-01

    We review some recent highlights from the applications of statistical-thermal models to different experimental measurements and lattice QCD thermodynamics that have been made during the last decade. We start with a short review of the historical milestones on the path of constructing statistical-thermal models for heavy-ion physics. We find that Heinz Koppe formulated, in 1948, an almost complete recipe for the statistical-thermal models. In 1950, Enrico Fermi generalized this statistical approach, in which he started with a general cross-section formula and inserted into it the simplifying assumptions about the matrix element of the interaction process that likely reflect many features of the high-energy reactions dominated by density in the phase space of final states. In 1964, Hagedorn systematically analyzed the high-energy phenomena using all tools of statistical physics and introduced the concept of limiting temperature based on the statistical bootstrap model. It turns out that, quite often, many-particle systems can be studied with the help of statistical-thermal methods. The analysis of yield multiplicities in high-energy collisions gives overwhelming evidence for chemical equilibrium in the final state. The strange particles might be an exception, as they are suppressed at lower beam energies; however, their relative yields fulfill statistical equilibrium as well. We review the equilibrium statistical-thermal models for particle production, fluctuations and collective flow in heavy-ion experiments. We also review their reproduction of the lattice QCD thermodynamics at vanishing and finite chemical potential. During the last decade, five conditions have been suggested to describe the universal behavior of the chemical freeze-out parameters. The higher-order moments of multiplicity have also been discussed; they offer deep insights into particle production and critical fluctuations. Therefore, we use them to describe the freeze-out parameters and suggest the location of the QCD critical endpoint. Various extensions have been proposed in order to take into consideration possible deviations from the ideal hadron gas. We highlight various types of interactions, dissipative properties and location dependences (spatial rapidity). Furthermore, we review three models combining hadronic with partonic phases: the quasi-particle model, the linear sigma model with Polyakov potentials and the compressible bag model.

  4. Performance of Reclassification Statistics in Comparing Risk Prediction Models

    PubMed Central

    Paynter, Nina P.

    2012-01-01

    Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information. PMID:21294152
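
    For readers unfamiliar with these reclassification measures, the sketch below computes a category-based NRI and the IDI on a tiny invented dataset. The risk cutoffs and values are hypothetical, not those used in the simulations.

```python
import numpy as np

def nri_idi(p_old, p_new, event, cuts=(0.1, 0.2)):
    """Category-based NRI and IDI for two risk models.
    p_old, p_new: predicted risks; event: 0/1 outcomes."""
    cat_old = np.digitize(p_old, cuts)
    cat_new = np.digitize(p_new, cuts)
    up, down = cat_new > cat_old, cat_new < cat_old
    ev, ne = event == 1, event == 0
    # NRI: net upward movement among events plus net downward among non-events
    nri = (up[ev].mean() - down[ev].mean()) + (down[ne].mean() - up[ne].mean())
    # IDI: improvement in mean risk separation between events and non-events
    idi = (p_new[ev].mean() - p_old[ev].mean()) - (p_new[ne].mean() - p_old[ne].mean())
    return nri, idi

# Hypothetical risks: the new model moves events up, non-events down
p_old = np.array([0.15, 0.15, 0.25, 0.05, 0.15, 0.15])
p_new = np.array([0.25, 0.25, 0.25, 0.05, 0.05, 0.05])
event = np.array([1, 1, 1, 0, 0, 0])

nri, idi = nri_idi(p_old, p_new, event)
print(f"NRI = {nri:.2f}, IDI = {idi:.2f}")
```

    As the abstract notes, these statistics answer different questions (discrimination vs. overall improvement), so they complement rather than replace calibration tests.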

  5. Does competition improve financial stability of the banking sector in ASEAN countries? An empirical analysis.

    PubMed

    Noman, Abu Hanifa Md; Gee, Chan Sok; Isa, Che Ruhana

    2017-01-01

    This study examines the influence of competition on the financial stability of the commercial banks of Association of Southeast Asian Nation (ASEAN) over the 1990 to 2014 period. Panzar-Rosse H-statistic, Lerner index and Herfindahl-Hirschman Index (HHI) are used as measures of competition, while Z-score, non-performing loan (NPL) ratio and equity ratio are used as measures of financial stability. Two-step system Generalized Method of Moments (GMM) estimates demonstrate that competition measured by H-statistic is positively related to Z-score and equity ratio, and negatively related to non-performing loan ratio. Conversely, market power measured by Lerner index is negatively related to Z-score and equity ratio and positively related to NPL ratio. These results strongly support the competition-stability view for ASEAN banks. We also capture the non-linear relationship between competition and financial stability by incorporating a quadratic term of competition in our models. The results show that the coefficient of the quadratic term of H-statistic is negative for the Z-score model given a positive coefficient of the linear term in the same model. These results support the non-linear relationship between competition and financial stability of the banking sector. The study contains significant policy implications for improving the financial stability of the commercial banks.

  6. Does competition improve financial stability of the banking sector in ASEAN countries? An empirical analysis

    PubMed Central

    Gee, Chan Sok; Isa, Che Ruhana

    2017-01-01

    This study examines the influence of competition on the financial stability of the commercial banks of Association of Southeast Asian Nation (ASEAN) over the 1990 to 2014 period. Panzar-Rosse H-statistic, Lerner index and Herfindahl-Hirschman Index (HHI) are used as measures of competition, while Z-score, non-performing loan (NPL) ratio and equity ratio are used as measures of financial stability. Two-step system Generalized Method of Moments (GMM) estimates demonstrate that competition measured by H-statistic is positively related to Z-score and equity ratio, and negatively related to non-performing loan ratio. Conversely, market power measured by Lerner index is negatively related to Z-score and equity ratio and positively related to NPL ratio. These results strongly support the competition-stability view for ASEAN banks. We also capture the non-linear relationship between competition and financial stability by incorporating a quadratic term of competition in our models. The results show that the coefficient of the quadratic term of H-statistic is negative for the Z-score model given a positive coefficient of the linear term in the same model. These results support the non-linear relationship between competition and financial stability of the banking sector. The study contains significant policy implications for improving the financial stability of the commercial banks. PMID:28486548

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kogalovskii, M.R.

    This paper presents a review of problems related to statistical database systems, which are widespread in various fields of activity. Statistical databases (SDB) are databases whose data are used for statistical analysis. Topics under consideration are: SDB peculiarities, properties of data models adequate for SDB requirements, metadata functions, null-value problems, SDB compromise protection problems, stored data compression techniques, and statistical data representation means. Also examined is whether present Database Management Systems (DBMS) satisfy SDB requirements. Some current research directions in SDB systems are considered.

  8. Statistical genetics concepts and approaches in schizophrenia and related neuropsychiatric research.

    PubMed

    Schork, Nicholas J; Greenwood, Tiffany A; Braff, David L

    2007-01-01

    Statistical genetics is a research field that focuses on mathematical models and statistical inference methodologies that relate genetic variations (ie, naturally occurring human DNA sequence variations or "polymorphisms") to particular traits or diseases (phenotypes) usually from data collected on large samples of families or individuals. The ultimate goal of such analysis is the identification of genes and genetic variations that influence disease susceptibility. Although of extreme interest and importance, the fact that many genes and environmental factors contribute to neuropsychiatric diseases of public health importance (eg, schizophrenia, bipolar disorder, and depression) complicates relevant studies and suggests that very sophisticated mathematical and statistical modeling may be required. In addition, large-scale contemporary human DNA sequencing and related projects, such as the Human Genome Project and the International HapMap Project, as well as the development of high-throughput DNA sequencing and genotyping technologies have provided statistical geneticists with a great deal of very relevant and appropriate information and resources. Unfortunately, the use of these resources and their interpretation are not straightforward when applied to complex, multifactorial diseases such as schizophrenia. In this brief and largely nonmathematical review of the field of statistical genetics, we describe many of the main concepts, definitions, and issues that motivate contemporary research. We also provide a discussion of the most pressing contemporary problems that demand further research if progress is to be made in the identification of genes and genetic variations that predispose to complex neuropsychiatric diseases.

  9. Relative mass distributions of neutron-rich thermally fissile nuclei within a statistical model

    NASA Astrophysics Data System (ADS)

    Kumar, Bharat; Kannan, M. T. Senthil; Balasubramaniam, M.; Agrawal, B. K.; Patra, S. K.

    2017-09-01

    We study the binary mass distribution for the recently predicted thermally fissile neutron-rich uranium and thorium nuclei using a statistical model. The level density parameters needed for the study are evaluated from the excitation energies of the temperature-dependent relativistic mean field formalism. The excitation energy and the level density parameter for a given temperature are employed in the convolution integral method to obtain the probability of the particular fragmentation. As representative cases, we present the results for the binary yields of 250U and 254Th. The relative yields are presented for three different temperatures: T =1 , 2, and 3 MeV.

  10. New Probe of Departures from General Relativity Using Minkowski Functionals.

    PubMed

    Fang, Wenjuan; Li, Baojiu; Zhao, Gong-Bo

    2017-05-05

    The morphological properties of the large scale structure of the Universe can be fully described by four Minkowski functionals (MFs), which provide important complementary information to other statistical observables such as the widely used 2-point statistics in configuration and Fourier spaces. In this work, for the first time, we present the differences in the morphology of the large scale structure caused by modifications to general relativity (to address the cosmic acceleration problem), by measuring the MFs from N-body simulations of modified gravity and general relativity. We find strong statistical power when using the MFs to constrain modified theories of gravity: with a galaxy survey that has survey volume ∼0.125(h^{-1}  Gpc)^{3} and galaxy number density ∼1/(h^{-1}  Mpc)^{3}, the two normal-branch Dvali-Gabadadze-Porrati models and the F5 f(R) model that we simulated can be discriminated from the ΛCDM model at a significance level ≳5σ with an individual MF measurement. Therefore, the MF of the large scale structure is potentially a powerful probe of gravity, and its application to real data deserves active exploration.

  11. Applying quantitative adiposity feature analysis models to predict benefit of bevacizumab-based chemotherapy in ovarian cancer patients

    NASA Astrophysics Data System (ADS)

    Wang, Yunzhi; Qiu, Yuchen; Thai, Theresa; More, Kathleen; Ding, Kai; Liu, Hong; Zheng, Bin

    2016-03-01

    How to rationally identify epithelial ovarian cancer (EOC) patients who will benefit from bevacizumab or other antiangiogenic therapies is a critical issue in EOC treatment. The motivation of this study is to quantitatively measure adiposity features from CT images and investigate the feasibility of predicting the potential benefit for EOC patients with or without bevacizumab-based chemotherapy using multivariate statistical models built on quantitative adiposity image features. A dataset involving CT images from 59 advanced EOC patients was included. Among them, 32 patients received maintenance bevacizumab after primary chemotherapy and the remaining 27 patients did not. We developed a computer-aided detection (CAD) scheme to automatically segment subcutaneous fat areas (SFA) and visceral fat areas (VFA) and then extracted 7 adiposity-related quantitative features. Three multivariate data analysis models (linear regression, logistic regression and Cox proportional hazards regression) were applied respectively to investigate the potential association between the model-generated prediction results and the patients' progression-free survival (PFS) and overall survival (OS). The results show that for all 3 statistical models, a statistically significant association was detected between the model-generated results and both clinical outcomes in the group of patients receiving maintenance bevacizumab (p<0.01), while there was no significant association with either PFS or OS in the group of patients not receiving maintenance bevacizumab. Therefore, this study demonstrated the feasibility of using statistical prediction models based on quantitative adiposity-related CT image features to generate a new clinical marker and predict the clinical outcome of EOC patients receiving maintenance bevacizumab-based chemotherapy.

  12. Cancer Related-Knowledge - Small Area Estimates

    Cancer.gov

    These model-based estimates are produced using statistical models that combine data from the Health Information National Trends Survey, and auxiliary variables obtained from relevant sources and borrow strength from other areas with similar characteristics.

  13. Plan Recognition using Statistical Relational Models

    DTIC Science & Technology

    2014-08-25

    arguments. Section 4 describes several variants of MLNs for plan recognition. All MLN models were implemented using Alchemy (Kok et al., 2010), an...For both MLN approaches, we used MC-SAT (Poon and Domingos, 2006) as implemented in the Alchemy system on both Monroe and Linux. Evaluation Metric We...Singla P, Poon H, Lowd D, Wang J, Nath A, Domingos P. The Alchemy System for Statistical Relational AI. Technical Report; Department of Computer Science

  14. Occupational Decision-Related Processes for Amotivated Adolescents: Confirmation of a Model

    ERIC Educational Resources Information Center

    Jung, Jae Yup; McCormick, John

    2011-01-01

    This study developed and (statistically) confirmed a new model of the occupational decision-related processes of adolescents, in terms of the extent to which they may be amotivated about choosing a future occupation. A theoretical framework guided the study. A questionnaire that had previously been administered to an Australian adolescent sample…

  15. On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis.

    PubMed

    Li, Bing; Chun, Hyonho; Zhao, Hongyu

    2014-09-01

    We introduce a nonparametric method for estimating non-Gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use a one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the Gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the Gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis.

  16. Data-Driven Learning of Q-Matrix

    PubMed Central

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2013-01-01

    The recent surge of interests in cognitive assessment has led to developments of novel statistical models for diagnostic classification. Central to many such models is the well-known Q-matrix, which specifies the item–attribute relationships. This article proposes a data-driven approach to identification of the Q-matrix and estimation of related model parameters. A key ingredient is a flexible T-matrix that relates the Q-matrix to response patterns. The flexibility of the T-matrix allows the construction of a natural criterion function as well as a computationally amenable algorithm. Simulations results are presented to demonstrate usefulness and applicability of the proposed method. Extension to handling of the Q-matrix with partial information is presented. The proposed method also provides a platform on which important statistical issues, such as hypothesis testing and model selection, may be formally addressed. PMID:23926363

  17. Bladder cancer mapping in Libya based on standardized morbidity ratio and log-normal model

    NASA Astrophysics Data System (ADS)

    Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley

    2017-05-01

    Disease mapping comprises a set of statistical techniques that produce maps of rates based on estimated mortality, morbidity, and prevalence. A traditional approach to measuring the relative risk of a disease is the Standardized Morbidity Ratio (SMR), the ratio of the observed to the expected number of cases in an area, which has the greatest uncertainty if the disease is rare or if the geographical area is small. Therefore, Bayesian models or statistical smoothing based on the log-normal model are introduced, which might solve the SMR problem. This study estimates the relative risk for bladder cancer incidence in Libya from 2006 to 2007 based on the SMR and the log-normal model, which were fitted to data using WinBUGS software. The study starts with a brief review of these models, beginning with the SMR method and followed by the log-normal model, which is then applied to bladder cancer incidence in Libya. All results are compared using maps and tables. The study concludes that the log-normal model gives better relative risk estimates compared to the classical method: the log-normal model can overcome the SMR problem when there are no observed bladder cancer cases in an area.
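
    The SMR itself is just an observed-to-expected ratio per area. A tiny sketch with invented counts (below) also shows the instability the log-normal model targets: an area with zero observed cases gets SMR = 0 regardless of its expected count.

```python
import numpy as np

# Hypothetical areas: observed cancer cases and expected counts
# derived from the region's age-standardised rates
observed = np.array([4, 0, 12, 30])
expected = np.array([5.0, 1.2, 10.0, 28.0])

smr = observed / expected        # relative risk estimate per area
print(np.round(smr, 2))
```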

  18. Landau's statistical mechanics for quasi-particle models

    NASA Astrophysics Data System (ADS)

    Bannur, Vishnu M.

    2014-04-01

    Landau's formalism of statistical mechanics [following L. D. Landau and E. M. Lifshitz, Statistical Physics (Pergamon Press, Oxford, 1980)] is applied to the quasi-particle model of quark-gluon plasma. Here, one starts from the expression for pressure and develops all of thermodynamics. It is a general formalism and consistent with our earlier studies [V. M. Bannur, Phys. Lett. B647, 271 (2007)] based on Pathria's formalism [following R. K. Pathria, Statistical Mechanics (Butterworth-Heinemann, Oxford, 1977)]. In Pathria's formalism, one starts from the expression for energy density and develops thermodynamics. Both formalisms are consistent with thermodynamics and statistical mechanics. Under certain conditions, wrongly called the thermodynamic consistency relation, we recover another formalism of the quasi-particle system, like that in M. I. Gorenstein and S. N. Yang, Phys. Rev. D52, 5206 (1995), widely studied for quark-gluon plasma.

  19. Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

    PubMed Central

    McDermott, Josh H.; Simoncelli, Eero P.

    2014-01-01

    Rainstorms, insect swarms, and galloping horses produce “sound textures” – the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures. However, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation. PMID:21903084

  20. Statistical wind analysis for near-space applications

    NASA Astrophysics Data System (ADS)

    Roney, Jason A.

    2007-09-01

    Statistical wind models were developed from existing observational wind data for near-space altitudes between 60,000 and 100,000 ft (18-30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not previously been applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data, such as the mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated that, on a yearly average, there is a lull or “knee” in the wind between 65,000 and 72,000 ft AGL (20-22 km). The standard statistics showed substantial seasonal variation in the mean wind speed at these heights at both locations. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were made by building a Weibull model on the 2004 data and comparing it with the 2003 and 2005 data; the 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative distribution function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
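    The percentile winds mentioned in this record follow directly from the Weibull quantile function, x(q) = λ(-ln(1-q))^(1/k). The sketch below uses hypothetical shape and scale parameters, not the paper's fitted values:

```python
import math

def weibull_ppf(q, shape, scale):
    """Quantile (inverse CDF) of the two-parameter Weibull distribution."""
    return scale * (-math.log(1.0 - q)) ** (1.0 / shape)

# Hypothetical fitted parameters for one month of near-space winds (m/s):
k, lam = 2.0, 10.0
for q in (0.50, 0.95, 0.99):
    print(f"{q:.0%} wind: {weibull_ppf(q, k, lam):.1f} m/s")
```

    The gap between the 50% and 99% winds is what makes a standard-deviation-only estimate risky for sizing station-keeping power, as the abstract notes.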

  1. Path statistics, memory, and coarse-graining of continuous-time random walks on networks

    PubMed Central

    Kion-Crosby, Willow; Morozov, Alexandre V.

    2015-01-01

    Continuous-time random walks (CTRWs) on discrete state spaces, ranging from regular lattices to complex networks, are ubiquitous across physics, chemistry, and biology. Models with coarse-grained states (for example, those employed in studies of molecular kinetics) or spatial disorder can give rise to memory and non-exponential distributions of waiting times and first-passage statistics. However, existing methods for analyzing CTRWs on complex energy landscapes do not address these effects. Here we use statistical mechanics of the nonequilibrium path ensemble to characterize first-passage CTRWs on networks with arbitrary connectivity, energy landscape, and waiting time distributions. Our approach can be applied to calculating higher moments (beyond the mean) of path length, time, and action, as well as statistics of any conservative or non-conservative force along a path. For homogeneous networks, we derive exact relations between length and time moments, quantifying the validity of approximating a continuous-time process with its discrete-time projection. For more general models, we obtain recursion relations, reminiscent of transfer matrix and exact enumeration techniques, to efficiently calculate path statistics numerically. We have implemented our algorithm in PathMAN (Path Matrix Algorithm for Networks), a Python script that users can apply to their model of choice. We demonstrate the algorithm on a few representative examples which underscore the importance of non-exponential distributions, memory, and coarse-graining in CTRWs. PMID:26646868
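    As a minimal illustration of first-passage statistics for a CTRW with exponential waiting times (a simple special case, not the PathMAN algorithm itself), the mean first-passage times t to a target satisfy the linear system t_i = τ_i + Σ_j P_ij t_j over the transient states, where τ_i is the mean waiting time and P the jump matrix:

```python
# Mean first-passage time (MFPT) to node 2 on the chain 0-1-2 with unit
# hopping rates, solved via (I - P) t = tau over the transient states {0, 1}.
import numpy as np

tau = np.array([1.0, 0.5])        # mean waiting time = 1 / total exit rate
P = np.array([[0.0, 1.0],         # jump probabilities among transient states
              [0.5, 0.0]])        # (jumps into the absorbing target drop out)
t = np.linalg.solve(np.eye(2) - P, tau)
print(t)  # [3. 2.]  -> MFPT from node 0 is 3, from node 1 is 2
```

    Higher moments and non-exponential waiting times require the recursion relations described in the abstract rather than this single linear solve.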

  2. Dynamic modelling of n-of-1 data: powerful and flexible data analytics applied to individualised studies.

    PubMed

    Vieira, Rute; McDonald, Suzanne; Araújo-Soares, Vera; Sniehotta, Falko F; Henderson, Robin

    2017-09-01

    N-of-1 studies are based on repeated observations within an individual or unit over time and are acknowledged as an important research method for generating scientific evidence about the health or behaviour of an individual. Statistical analyses of n-of-1 data require accurate modelling of the outcome while accounting for its distribution, time-related trend and error structures (e.g., autocorrelation), as well as reporting readily usable, contextualised effect sizes for decision-making. A number of statistical approaches have been documented, but no consensus exists on which method is most appropriate for which type of n-of-1 design. We discuss the statistical considerations for analysing n-of-1 studies and briefly review some currently used methodologies. We describe dynamic regression modelling as a flexible and powerful approach, adaptable to different types of outcomes and capable of dealing with the different challenges inherent in n-of-1 statistical modelling. Dynamic modelling borrows ideas from longitudinal and event history methodologies, which explicitly incorporate the role of time and the influence of the past on the future. We also present an illustrative example of the use of dynamic regression for monitoring physical activity during the retirement transition. Dynamic modelling has the potential to expand researchers' access to robust and user-friendly statistical methods for individualised studies.

  3. Hydrological responses to dynamically and statistically downscaled climate model output

    USGS Publications Warehouse

    Wilby, R.L.; Hay, L.E.; Gutowski, W.J.; Arritt, R.W.; Takle, E.S.; Pan, Z.; Leavesley, G.H.; Clark, M.P.

    2000-01-01

    Daily rainfall and surface temperature series were simulated for the Animas River basin, Colorado, using dynamically and statistically downscaled output from the National Center for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) re-analysis. A distributed hydrological model was then applied to the downscaled data. Relative to raw NCEP output, downscaled climate variables provided more realistic simulations of basin-scale hydrology. However, the results highlight the sensitivity of modeled processes to the choice of downscaling technique, and point to the need for caution when interpreting future hydrological scenarios.

  4. Constructing and Modifying Sequence Statistics for relevent Using informR in 𝖱

    PubMed Central

    Marcum, Christopher Steven; Butts, Carter T.

    2015-01-01

    The informR package greatly simplifies the analysis of complex event histories in 𝖱 by providing user-friendly tools to build sufficient statistics for the relevent package. Historically, building sufficient statistics to model event sequences (of the form a→b) using the egocentric generalization of Butts’ (2008) relational event framework for modeling social action has been cumbersome. The informR package simplifies the construction of the complex list of arrays needed by the rem() model-fitting function for a variety of cases involving egocentric event data, multiple event types, and/or support constraints. This paper introduces these tools using examples from real data extracted from the American Time Use Survey. PMID:26185488

  5. Star-triangle and star-star relations in statistical mechanics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baxter, R.J.

    1997-01-20

    The homogeneous three-layer Zamolodchikov model is equivalent to a four-state model on the checkerboard lattice which closely resembles the four-state critical Potts model, but with some of its Boltzmann weights negated. Here the author shows that it satisfies a star-to-reverse-star (or simply star-star) relation, even though no star-triangle relation is known for this model. For any nearest-neighbor checkerboard model, it is shown that this star-star relation is sufficient to ensure that the decimated model (in which half the spins have been summed over) satisfies a twisted Yang-Baxter relation. This ensures that the transfer matrices of the original model commute in pairs, which is an adequate condition for solvability.

  6. Global Sensitivity Analysis of Environmental Systems via Multiple Indices based on Statistical Moments of Model Outputs

    NASA Astrophysics Data System (ADS)

    Guadagnini, A.; Riva, M.; Dell'Oca, A.

    2017-12-01

    We propose to base the sensitivity analysis of the uncertain parameters of environmental models on a set of indices derived from the main statistical moments, i.e., mean, variance, skewness and kurtosis, of the probability density function (pdf) of a target model output. This enables us to perform Global Sensitivity Analysis (GSA) of a model in terms of multiple statistical moments and yields a quantification of the impact of model parameters on the features driving the shape of the pdf of the model output. Our GSA approach can be coupled with the construction of a reduced-complexity model that approximates the full model response at a reduced computational cost. We demonstrate our approach through a variety of test cases: a commonly used analytical benchmark, a simplified model representing pumping in a coastal aquifer, a laboratory-scale tracer experiment, and the migration of fracturing fluid through a naturally fractured reservoir (source) to an overlying formation (target). Our strategy allows discriminating the relative importance of model parameters to each of the four statistical moments considered. We also provide an appraisal of the error introduced in our sensitivity metrics when the original system model is replaced by the selected surrogate model. Our results suggest that one might need to construct surrogate models of increasing accuracy depending on the statistical moment considered in the GSA. The methodological framework we propose can assist the development of analysis techniques targeted at model calibration, design of experiments, uncertainty quantification and risk assessment.
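    A rough Monte Carlo sketch of the moment-based idea (with simplified, illustrative index definitions, not the authors' indices): estimate how much conditioning on each parameter shifts a chosen statistical moment of the output, here by binning the parameter's samples:

```python
# Toy moment-based sensitivity: average shift of a moment of y when each
# input is (approximately) fixed via quantile binning, relative to the
# unconditional moment. Larger values = larger influence on that moment.
import numpy as np

rng = np.random.default_rng(0)
n, bins = 20000, 20
X = rng.uniform(-1, 1, size=(n, 2))
y = X[:, 0] + 5.0 * X[:, 1] ** 2      # toy model: x2 dominates the variance

def moment_sensitivity(X, y, moment=np.var):
    m0 = moment(y)
    out = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, bins + 1))
        idx = np.clip(np.searchsorted(edges, X[:, j], side="right") - 1, 0, bins - 1)
        shifts = [abs(moment(y[idx == b]) - m0) for b in range(bins)]
        out.append(np.mean(shifts) / abs(m0))
    return out

s_var = moment_sensitivity(X, y, np.var)
print(s_var)  # sensitivity of Var[y] to x1 and x2; x2's index is larger
```

    Repeating with `moment=np.mean` or a skewness estimator gives the per-moment rankings the abstract describes.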

  7. Removing an intersubject variance component in a general linear model improves multiway factoring of event-related spectral perturbations in group EEG studies.

    PubMed

    Spence, Jeffrey S; Brier, Matthew R; Hart, John; Ferree, Thomas C

    2013-03-01

    Linear statistical models are used very effectively to assess task-related differences in EEG power spectral analyses. Mixed models, in particular, accommodate more than one variance component in a multisubject study, where many trials of each condition of interest are measured on each subject. Generally, intra- and intersubject variances are both important to determine correct standard errors for inference on functions of model parameters, but it is often assumed that intersubject variance is the most important consideration in a group study. In this article, we show that, under common assumptions, estimates of some functions of model parameters, including estimates of task-related differences, are properly tested relative to the intrasubject variance component only. A substantial gain in statistical power can arise from the proper separation of variance components when there is more than one source of variability. We first develop this result analytically, then show how it benefits a multiway factoring of spectral, spatial, and temporal components from EEG data acquired in a group of healthy subjects performing a well-studied response inhibition task. Copyright © 2011 Wiley Periodicals, Inc.

  8. Halo models of HI selected galaxies

    NASA Astrophysics Data System (ADS)

    Paul, Niladri; Choudhury, Tirthankar Roy; Paranjape, Aseem

    2018-06-01

    Modelling the distribution of neutral hydrogen (HI) in dark matter halos is important for studying galaxy evolution in the cosmological context. We use a novel approach to infer the HI-dark matter connection at the massive end (M_HI > 10^{9.8} M_⊙) from radio HI emission surveys, using optical properties of low-redshift galaxies as an intermediary. In particular, we use a previously calibrated optical HOD describing the luminosity- and colour-dependent clustering of SDSS galaxies and describe the HI content using a statistical scaling relation between the optical properties and HI mass. This allows us to compute the abundance and clustering properties of HI-selected galaxies and compare with data from the ALFALFA survey. We apply an MCMC-based statistical analysis to constrain the free parameters related to the scaling relation. The resulting best-fit scaling relation identifies massive HI galaxies primarily with optically faint blue centrals, consistent with expectations from galaxy formation models. We compare the HI-stellar mass relation predicted by our model with independent observations from matched HI-optical galaxy samples, finding reasonable agreement. As a further application, we make some preliminary forecasts for future observations of HI and optical galaxies in the expected overlap volume of SKA and Euclid/LSST.

  9. Unified risk analysis of fatigue failure in ductile alloy components during all three stages of fatigue crack evolution process.

    PubMed

    Patankar, Ravindra

    2003-10-01

    Statistical fatigue life of a ductile alloy specimen is traditionally divided into three stages: crack nucleation, small crack growth, and large crack growth. Crack nucleation and small crack growth show wide variation and hence a large spread on the cycles-versus-crack-length graph; large crack growth, by comparison, shows less variation. Different models are therefore fitted to the different stages of the fatigue evolution process, treating the stages as distinct phenomena. With such independent models, it is impossible to predict one phenomenon from information about another. Experimentally, it is easier to measure the lengths of large cracks than of nucleating or small cracks; thus, it is easier to collect statistical data for large crack growth than to undertake the painstaking effort required to collect statistical data for crack nucleation and small crack growth. This article presents a fracture mechanics-based stochastic model of fatigue crack growth in ductile alloys that are commonly encountered in mechanical structures and machine components. The model was validated for crack propagation by Ray (1998) against various statistical fatigue data. Based on the model, this article proposes a technique that predicts statistical information about fatigue crack nucleation and small crack growth from the statistical properties of large crack growth under constant-amplitude stress excitation, which can be obtained via experiments.

  10. A phylogenetic transform enhances analysis of compositional microbiota data.

    PubMed

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-02-15

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.
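    For illustration, a plain isometric log-ratio (ilr) transform with a generic orthonormal (Helmert-type) basis is sketched below; PhILR itself derives the basis from a phylogenetic tree, which is omitted here:

```python
# Isometric log-ratio (ilr) transform: maps a composition (positive parts
# summing to 1) from the D-simplex to unconstrained R^(D-1), so standard
# statistical tools can be applied without compositional artifacts.
import numpy as np

def ilr(x):
    x = np.asarray(x, dtype=float)
    clr = np.log(x) - np.log(x).mean()          # centred log-ratio
    D = x.size
    # Orthonormal Helmert-type basis of the clr hyperplane (rows sum to 0):
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i
        V[i - 1, i] = -1.0
        V[i - 1] *= np.sqrt(i / (i + 1.0))
    return V @ clr

print(ilr([0.25, 0.25, 0.25, 0.25]))  # the uniform composition maps to the origin
```

    Because the basis rows are orthonormal, the transform is an isometry on clr coordinates, which is what lets distance- and variance-based analyses be applied "off the shelf".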

  11. Rigorous force field optimization principles based on statistical distance minimization

    DOE PAGES

    Vlcek, Lukas; Chialvo, Ariel A.

    2015-10-12

    We use the concept of statistical distance to define a measure of distinguishability between a pair of statistical mechanical systems, i.e., a model and its target, and show that its minimization leads to general convergence of the model's static measurable properties to those of the target. Here we exploit this feature to define a rigorous basis for the development of accurate and robust effective molecular force fields that are inherently compatible with coarse-grained experimental data. The new model optimization principles and their efficient implementation are illustrated through selected examples, whose outcome demonstrates the higher robustness and predictive accuracy of the approach compared to other currently used methods, such as force matching and relative entropy minimization. We also discuss relations between the newly developed principles and established thermodynamic concepts, which include the Gibbs-Bogoliubov inequality and the thermodynamic length.

  12. Forecasting runout of rock and debris avalanches

    USGS Publications Warehouse

    Iverson, Richard M.; Evans, S.G.; Mugnozza, G.S.; Strom, A.; Hermanns, R.L.

    2006-01-01

    Physically based mathematical models and statistically based empirical equations each may provide useful means of forecasting runout of rock and debris avalanches. This paper compares the foundations, strengths, and limitations of a physically based model and a statistically based forecasting method, both of which were developed to predict runout across three-dimensional topography. The chief advantage of the physically based model results from its ties to physical conservation laws and well-tested axioms of soil and rock mechanics, such as the Coulomb friction rule and effective-stress principle. The output of this model provides detailed information about the dynamics of avalanche runout, at the expense of high demands for accurate input data, numerical computation, and experimental testing. In comparison, the statistical method requires relatively modest computation and no input data except identification of prospective avalanche source areas and a range of postulated avalanche volumes. Like the physically based model, the statistical method yields maps of predicted runout, but it provides no information on runout dynamics. Although the two methods differ significantly in their structure and objectives, insights gained from one method can aid refinement of the other.

  13. A statistical mechanics model for free-for-all airplane passenger boarding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steffen, Jason H.; /Fermilab

    2008-08-01

    I discuss a model for free-for-all passenger boarding which is employed by some discount air carriers. The model is based on the principles of statistical mechanics, where each seat in the aircraft has an associated energy which reflects the preferences of travelers. As each passenger enters the airplane, they select their seat using Boltzmann statistics, proceed to that location, load their luggage, and sit down, and the partition function seen by the remaining passengers is modified to reflect this fact. I discuss the various model parameters and make qualitative comparisons of this passenger boarding model with those that involve assigned seats. The model can be used to predict the probability that certain seats will be occupied at different times during the boarding process. These results might provide a useful description of this boarding method. The model is a relatively unusual application of undergraduate-level physics and describes a situation familiar to many students and faculty.

  14. Bayesian inference of physiologically meaningful parameters from body sway measurements.

    PubMed

    Tietäväinen, A; Gutmann, M U; Keski-Vakkuri, E; Corander, J; Hæggström, E

    2017-06-19

    The control of the human body sway by the central nervous system, muscles, and conscious brain is of interest since body sway carries information about the physiological status of a person. Several models have been proposed to describe body sway in an upright standing position, however, due to the statistical intractability of the more realistic models, no formal parameter inference has previously been conducted and the expressive power of such models for real human subjects remains unknown. Using the latest advances in Bayesian statistical inference for intractable models, we fitted a nonlinear control model to posturographic measurements, and we showed that it can accurately predict the sway characteristics of both simulated and real subjects. Our method provides a full statistical characterization of the uncertainty related to all model parameters as quantified by posterior probability density functions, which is useful for comparisons across subjects and test settings. The ability to infer intractable control models from sensor data opens new possibilities for monitoring and predicting body status in health applications.

  15. Evaluation of Model Fit in Cognitive Diagnosis Models

    ERIC Educational Resources Information Center

    Hu, Jinxiang; Miller, M. David; Huggins-Manley, Anne Corinne; Chen, Yi-Hsin

    2016-01-01

    Cognitive diagnosis models (CDMs) estimate student ability profiles using latent attributes. Model fit to the data needs to be ascertained in order to determine whether inferences from CDMs are valid. This study investigated the usefulness of some popular model fit statistics to detect CDM fit including relative fit indices (AIC, BIC, and CAIC),…

  16. Verification of statistical method CORN for modeling of microfuel in the case of high grain concentration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chukbar, B. K., E-mail: bchukbar@mail.ru

    Two methods of modeling a double-heterogeneity fuel are studied: deterministic positioning and the statistical method CORN of the MCU software package. The effect of the distribution of microfuel in a pebble bed on the calculation results is studied. The results of verification of the statistical method CORN for microfuel concentrations up to 170 cm⁻³ in a pebble bed are presented. The admissibility of homogenizing the microfuel coating with the graphite matrix is studied. The dependence of the reactivity on the relative location of fuel and graphite spheres in a pebble bed is found.

  17. Application of real rock pore-throat statistics to a regular pore network model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rakibul, M.; Sarker, H.; McIntyre, D.

    2011-01-01

    This work reports the application of real rock statistical data to a previously developed regular pore network model in an attempt to produce an accurate simulation tool with low computational overhead. A core plug from the St. Peter Sandstone formation in Indiana was scanned with a high-resolution micro-CT scanner. The pore-throat statistics of the three-dimensional reconstructed rock were extracted, and the distribution of pore-throat sizes was applied to the regular pore network model. To keep the equivalent model regular, only the throat area (throat radius) was varied. Ten realizations of randomly distributed throat sizes were generated to simulate the drainage process, and relative permeability was calculated and compared with the experimentally determined values for the original rock sample. The numerical and experimental procedures are explained in detail, and the performance of the model relative to the experimental data is discussed and analyzed. Petrophysical properties such as relative permeability are important in many applied fields, including production of petroleum fluids, enhanced oil recovery, carbon dioxide sequestration, and groundwater flow. Relative permeability data are used for a wide range of conventional reservoir engineering calculations and in numerical reservoir simulation. Two-phase oil-water relative permeability data were generated on the same core plug by both the pore network model and the experimental procedure. The shapes and sizes of the relative permeability curves were compared and analyzed; a good match was observed for the wetting-phase relative permeability, but the simulated non-wetting-phase results deviated from the experimental ones. Numerical techniques for determining the petrophysical properties of rocks aim to eliminate the need for routine core analysis, which can be time-consuming and expensive, so a numerical technique is expected to be fast and to produce reliable results. In applied engineering, a quick result with reasonable accuracy is sometimes preferable to more time-consuming results. The present work is an effort to check the accuracy and validity of a previously developed pore network model for obtaining important petrophysical properties of rocks from cutting-sized sample data.

  19. Grain-Size Based Additivity Models for Scaling Multi-rate Uranyl Surface Complexation in Subsurface Sediments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.

    This study statistically analyzed a grain-size based additivity model that has been proposed to scale reaction rates and parameters from laboratory to field. The additivity model assumed that reaction properties in a sediment, including surface area, reactive site concentration, and reaction rate and extent, can be predicted from the field-scale grain size distribution by linearly adding the reaction properties of the individual grain size fractions. This study focused on the statistical analysis of the additivity model with respect to reaction rate constants, using multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment as an example. Experimental data on rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of the multi-rate parameters for individual grain size fractions. These statistical properties were then used to analyze the additivity model's prediction of rate-limited U(VI) desorption in the composite sediment, and to evaluate the relative importance of individual grain size fractions to the overall U(VI) desorption. The results indicated that the additivity model provided a good prediction of U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model, and U(VI) desorption in individual grain size fractions had to be simulated in order to apply it. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated, and was found to provide a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel size fraction (2-8 mm), which is often ignored in modeling U(VI) sorption and desorption, is statistically significant to the U(VI) desorption in the sediment.

  20. Age related neuromuscular changes in sEMG of m. Tibialis Anterior using higher order statistics (Gaussianity & linearity test).

    PubMed

    Siddiqi, Ariba; Arjunan, Sridhar P; Kumar, Dinesh K

    2016-08-01

    Age-associated changes in the surface electromyogram (sEMG) of the Tibialis Anterior (TA) muscle can be attributed to neuromuscular alterations that precede strength loss. We used our sEMG model of the Tibialis Anterior to interpret these age-related changes and compared the simulations with experimental sEMG. Eighteen young (20-30 years) and eighteen older (60-85 years) participants performed isometric dorsiflexion at six percentage levels of maximum voluntary contraction (MVC), and their sEMG from the TA muscle was recorded. Six different age-related changes in the neuromuscular system were simulated using the sEMG model at the same MVC levels as the experiment. The maximal power of the spectrum and the Gaussianity and linearity test statistics were computed from the simulated and experimental sEMG. A correlation analysis at α=0.05 was performed between the simulated and experimental age-related changes in the sEMG features. The results show that the loss of motor units was distinguished by the Gaussianity and linearity test statistics, while the maximal power of the PSD distinguished between the muscular factors. The simulated condition of 40% loss of motor units combined with a halving of the number of fast fibers correlated best with the age-related changes observed in the higher-order statistical features of the experimental sEMG. This simulated aging condition corresponds to the moderate motor unit remodelling and negligible strength loss reported in the literature for cohorts aged 60-70 years.

  1. Modeling Cross-Situational Word–Referent Learning: Prior Questions

    PubMed Central

    Yu, Chen; Smith, Linda B.

    2013-01-01

    Both adults and young children possess powerful statistical computation capabilities—they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of associative learning. This article describes a series of simulation studies and analyses designed to understand the different learning mechanisms posited by the 2 classes of models and their relation to each other. Variants of a hypothesis-testing model and a simple or dumb associative mechanism were examined under different specifications of information selection, computation, and decision. Critically, these 3 components of the models interact in complex ways. The models illustrate a fundamental tradeoff between amount of data input and powerful computations: With the selection of more information, dumb associative models can mimic the powerful learning that is accomplished by hypothesis-testing models with fewer data. However, because of the interactions among the component parts of the models, the associative model can mimic various hypothesis-testing models, producing the same learning patterns but through different internal components. The simulations argue for the importance of a compositional approach to human statistical learning: the experimental decomposition of the processes that contribute to statistical learning in human learners and models with the internal components that can be evaluated independently and together. PMID:22229490
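    A minimal version of the "dumb" associative mechanism contrasted with hypothesis testing above can be written as a bare co-occurrence counter. The trial structure and vocabulary here are invented for illustration:

```python
from collections import defaultdict

# Minimal "dumb" associative learner for cross-situational word learning:
# on each trial a set of words co-occurs with a set of referents, the model
# accumulates co-occurrence counts, and the referent of a word is guessed
# as the one with the highest accumulated association.

class AssociativeLearner:
    def __init__(self):
        self.assoc = defaultdict(float)

    def observe(self, words, referents):
        for w in words:
            for r in referents:
                self.assoc[(w, r)] += 1.0

    def guess(self, word, candidates):
        return max(candidates, key=lambda r: self.assoc[(word, r)])

# Hypothetical trials: each word always co-occurs with its true referent
# plus a distractor that varies across situations.
trials = [(["ball", "dog"], ["BALL", "DOG"]),
          (["ball", "cup"], ["BALL", "CUP"]),
          (["dog", "cup"],  ["DOG", "CUP"])]
learner = AssociativeLearner()
for words, refs in trials:
    learner.observe(words, refs)
print(learner.guess("ball", ["BALL", "DOG", "CUP"]))  # "BALL"
```

    Cross-trial aggregation resolves what no single ambiguous trial can: "ball"-BALL accumulates two counts while each spurious pairing accumulates only one.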

  2. An emission-weighted proximity model for air pollution exposure assessment.

    PubMed

    Zou, Bin; Wilson, J Gaines; Zhan, F Benjamin; Zeng, Yongnian

    2009-08-15

Among the most common spatial models for estimating personal exposure are Traditional Proximity Models (TPMs). Though TPMs are straightforward to configure and interpret, they are prone to extensive errors in exposure estimates and do not provide prospective estimates. To resolve these inherent problems, we introduce here a novel Emission Weighted Proximity Model (EWPM) that takes into consideration the emissions from all sources potentially influencing a receptor. EWPM performance was evaluated by comparing the normalized exposure risk values of sulfur dioxide (SO2) calculated by EWPM with those calculated by TPM and with monitored observations over a one-year period in two large Texas counties. To investigate whether the limitations of TPM in predicting potential exposure risk without recorded incidence can be overcome, we also introduce a hybrid framework, the 'Geo-statistical EWPM', a synthesis of ordinary kriging geo-statistical interpolation and EWPM. The prediction results are presented as two potential exposure risk prediction maps, whose performance in predicting individual SO2 exposure risk was validated with 10 virtual cases in prospective exposure scenarios. Risk values from EWPM were clearly more agreeable with the observed concentrations than those from TPM. Over the entire study area, the mean SO2 exposure risk from EWPM was higher than that from TPM (1.00 vs. 0.91). The mean bias of the exposure risk values of the 10 virtual cases between EWPM and 'Geo-statistical EWPM' was much smaller than that between TPM and 'Geo-statistical TPM' (5.12 vs. 24.63). EWPM appears to portray individual exposure more accurately than TPM. The 'Geo-statistical EWPM' effectively augments the standard proximity model and makes it possible to predict individual risk in future exposure scenarios involving adverse health effects from environmental pollution.
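    The contrast between the two model families can be sketched with inverse-distance scoring. Source locations and emission rates below are invented, and the bare 1/d weighting is only a stand-in for whatever distance-decay form the operational EWPM uses:

```python
import math

# TPM: exposure scored by distance to the nearest source only, emissions
# ignored.  EWPM: every source contributes, weighted by its emission rate.

def tpm_risk(receptor, sources):
    # inverse distance to the nearest source
    return 1.0 / min(math.dist(receptor, s["xy"]) for s in sources)

def ewpm_risk(receptor, sources):
    # emission-weighted inverse-distance sum over all sources
    return sum(s["emission"] / math.dist(receptor, s["xy"]) for s in sources)

# Hypothetical scene: a weak source nearby, a strong source farther away.
sources = [{"xy": (0.0, 0.0), "emission": 10.0},
           {"xy": (4.0, 0.0), "emission": 200.0}]
receptor = (1.0, 0.0)
print(tpm_risk(receptor, sources), ewpm_risk(receptor, sources))
```

    The TPM sees only the nearby weak source; the EWPM lets the distant high-emission stack dominate, which is exactly the error mode the abstract attributes to proximity-only models.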

  3. Helping Students Assess the Relative Importance of Different Intermolecular Interactions

    ERIC Educational Resources Information Center

    Jasien, Paul G.

    2008-01-01

    A semi-quantitative model has been developed to estimate the relative effects of dispersion, dipole-dipole interactions, and H-bonding on the normal boiling points ("T[subscript b]") for a subset of simple organic systems. The model is based upon a statistical analysis using multiple linear regression on a series of straight-chain organic…
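    The multiple-linear-regression machinery behind such a model can be sketched via the normal equations. The descriptor values and coefficients below are invented purely to show the mechanics, not the article's fitted values:

```python
# Toy regression of boiling point Tb on descriptors for dispersion,
# dipole-dipole interaction, and H-bonding, solved by the normal equations.

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Least squares via the normal equations X'X beta = X'y."""
    Xa = [[1.0] + row for row in X]          # intercept column
    p = len(Xa[0])
    XtX = [[sum(r[i] * r[j] for r in Xa) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xa, y)) for i in range(p)]
    return solve(XtX, Xty)

# Synthetic data generated from Tb = 200 + 30*disp + 15*dipole + 40*hbond.
X = [[d, p, h] for d in (0.0, 1.0, 2.0) for p in (0.0, 1.0) for h in (0.0, 1.0)]
y = [200 + 30 * d + 15 * p + 40 * h for d, p, h in X]
beta = fit_ols(X, y)   # recovers [200, 30, 15, 40] exactly (noise-free data)
```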

  4. DOSE-RESPONSE ASSESSMENT FOR DEVELOPMENTAL TOXICITY III. STATISTICAL MODELS

    EPA Science Inventory

Although quantitative modeling has been central to cancer risk assessment for years, the concept of dose-response modeling for developmental effects is relatively new. The benchmark dose (BMD) approach has been proposed for use with developmental (as well as other noncancer) endpo...

  5. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  6. Risk prediction model: Statistical and artificial neural network approach

    NASA Astrophysics Data System (ADS)

    Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

    2017-04-01

Prediction models are increasingly gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand suitable approaches and the development and validation process of such models. A qualitative review of the aims, methods, and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields was performed. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, artificial neural network approaches to developing prediction models were more accurate than statistical approaches; however, only limited published literature currently discusses which approach is more accurate for risk prediction model development.

  7. Risk assessment model for development of advanced age-related macular degeneration.

    PubMed

    Klein, Michael L; Francis, Peter J; Ferris, Frederick L; Hamon, Sara C; Clemons, Traci E

    2011-12-01

    To design a risk assessment model for development of advanced age-related macular degeneration (AMD) incorporating phenotypic, demographic, environmental, and genetic risk factors. We evaluated longitudinal data from 2846 participants in the Age-Related Eye Disease Study. At baseline, these individuals had all levels of AMD, ranging from none to unilateral advanced AMD (neovascular or geographic atrophy). Follow-up averaged 9.3 years. We performed a Cox proportional hazards analysis with demographic, environmental, phenotypic, and genetic covariates and constructed a risk assessment model for development of advanced AMD. Performance of the model was evaluated using the C statistic and the Brier score and externally validated in participants in the Complications of Age-Related Macular Degeneration Prevention Trial. The final model included the following independent variables: age, smoking history, family history of AMD (first-degree member), phenotype based on a modified Age-Related Eye Disease Study simple scale score, and genetic variants CFH Y402H and ARMS2 A69S. The model did well on performance measures, with very good discrimination (C statistic = 0.872) and excellent calibration and overall performance (Brier score at 5 years = 0.08). Successful external validation was performed, and a risk assessment tool was designed for use with or without the genetic component. We constructed a risk assessment model for development of advanced AMD. The model performed well on measures of discrimination, calibration, and overall performance and was successfully externally validated. This risk assessment tool is available for online use.
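    The C statistic used above to judge discrimination has a simple pairwise definition that can be computed directly: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counting one half. The risk scores below are invented, not Age-Related Eye Disease Study data:

```python
# Concordance (C) statistic for a binary outcome from predicted risk scores.

def c_statistic(scores, outcomes):
    cases = [s for s, o in zip(scores, outcomes) if o == 1]
    controls = [s for s, o in zip(scores, outcomes) if o == 0]
    concordant = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return concordant / (len(cases) * len(controls))

scores   = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # illustrative predicted risks
outcomes = [1,   1,   0,   1,   0,   0]     # 1 = developed advanced AMD
print(c_statistic(scores, outcomes))        # 8 of 9 case/control pairs concordant
```

    A value of 0.5 is chance-level discrimination and 1.0 is perfect ranking, which is why the reported 0.872 counts as very good.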

  8. Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.

    PubMed

    Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K

    2018-02-01

Type 2 diabetes drug tablets of various brands containing voglibose at dose strengths of 0.2 and 0.3 mg have been examined using the laser-induced breakdown spectroscopy (LIBS) technique. Statistical methods, namely principal component analysis (PCA) and partial least squares regression (PLSR), were applied to the LIBS spectral data for classifying the drug samples and developing calibration models. We developed a ratio-based calibration model applying PLSR, in which the relative spectral intensity ratios H/C, H/N, and O/N are used. The developed model was then employed to predict the relative concentrations of elements in unknown drug samples. The experiment was performed in air and in an argon atmosphere, and the results obtained were compared. The present model provides a rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement processes in a wide variety of pharmaceutical industrial applications.

  9. Statistical Model of Dynamic Markers of the Alzheimer's Pathological Cascade.

    PubMed

    Balsis, Steve; Geraci, Lisa; Benge, Jared; Lowe, Deborah A; Choudhury, Tabina K; Tirso, Robert; Doody, Rachelle S

    2018-05-05

Alzheimer's disease (AD) is a progressive disease reflected in markers across assessment modalities, including neuroimaging, cognitive testing, and evaluation of adaptive function. Identifying a single continuum of decline across assessment modalities in a single sample is statistically challenging because of the multivariate nature of the data. To address this challenge, we implemented advanced statistical analyses designed specifically to model complex data along a single continuum. We analyzed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 1,056), focusing on indicators from the assessments of magnetic resonance imaging (MRI) volume, fluorodeoxyglucose positron emission tomography (FDG-PET) metabolic activity, cognitive performance, and adaptive function. Item response theory was used to identify the continuum of decline. Then, through a process of statistical scaling, indicators across all modalities were linked to that continuum and analyzed. Findings revealed that measures of MRI volume, FDG-PET metabolic activity, and adaptive function added measurement precision beyond that provided by cognitive measures, particularly in the relatively mild range of disease severity. More specifically, MRI volume and FDG-PET metabolic activity become compromised in the very mild range of severity, followed by cognitive performance and finally adaptive function. Our statistically derived models of the AD pathological cascade are consistent with existing theoretical models.

  10. Ecological statistics of Gestalt laws for the perceptual organization of contours.

    PubMed

    Elder, James H; Goldberg, Richard M

    2002-01-01

    Although numerous studies have measured the strength of visual grouping cues for controlled psychophysical stimuli, little is known about the statistical utility of these various cues for natural images. In this study, we conducted experiments in which human participants trace perceived contours in natural images. These contours are automatically mapped to sequences of discrete tangent elements detected in the image. By examining relational properties between pairs of successive tangents on these traced curves, and between randomly selected pairs of tangents, we are able to estimate the likelihood distributions required to construct an optimal Bayesian model for contour grouping. We employed this novel methodology to investigate the inferential power of three classical Gestalt cues for contour grouping: proximity, good continuation, and luminance similarity. The study yielded a number of important results: (1) these cues, when appropriately defined, are approximately uncorrelated, suggesting a simple factorial model for statistical inference; (2) moderate image-to-image variation of the statistics indicates the utility of general probabilistic models for perceptual organization; (3) these cues differ greatly in their inferential power, proximity being by far the most powerful; and (4) statistical modeling of the proximity cue indicates a scale-invariant power law in close agreement with prior psychophysics.
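    Finding the cues approximately uncorrelated is what licenses the simple factorial model mentioned in result (1): the posterior odds that two tangents belong to the same contour is the prior odds times the product of per-cue likelihood ratios. A sketch with invented numbers:

```python
# Factorial (naive-Bayes style) cue combination for contour grouping:
# independent cues contribute multiplicative likelihood ratios.

def posterior_prob_same_contour(prior_odds, likelihood_ratios):
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr              # independence => likelihood ratios multiply
    return odds / (1.0 + odds)

# Hypothetical values: grouping is a priori unlikely (odds 0.05); proximity
# strongly favors it (LR 40), the weaker cues only mildly (LR 1.5, 1.2).
p = posterior_prob_same_contour(0.05, [40.0, 1.5, 1.2])
print(p)   # proximity alone carries most of the inferential weight
```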

  11. Clinical study of the Erlanger silver catheter--data management and biometry.

    PubMed

    Martus, P; Geis, C; Lugauer, S; Böswald, M; Guggenbichler, J P

    1999-01-01

The clinical evaluation of venous catheters for catheter-induced infections must conform to a strict biometric methodology. The statistical planning of the study (target population, design, degree of blinding), data management (database design, definition of variables, coding), quality assurance (data inspection at several levels), and the biometric evaluation of the Erlanger silver catheter project are described. The three-step data flow included: 1) primary data from the hospital, 2) a relational database, and 3) files accessible for statistical evaluation. Two different statistical models were compared: analyzing only the first catheter of each patient (independent data) and analyzing several catheters from the same patient (dependent data) by means of the generalized estimating equations (GEE) method. The main result of the study was based on the comparison of both statistical models.

  12. Implications of the methodological choices for hydrologic portrayals of climate change over the contiguous United States: Statistically downscaled forcing data and hydrologic models

    USGS Publications Warehouse

    Mizukami, Naoki; Clark, Martyn P.; Gutmann, Ethan D.; Mendoza, Pablo A.; Newman, Andrew J.; Nijssen, Bart; Livneh, Ben; Hay, Lauren E.; Arnold, Jeffrey R.; Brekke, Levi D.

    2016-01-01

    Continental-domain assessments of climate change impacts on water resources typically rely on statistically downscaled climate model outputs to force hydrologic models at a finer spatial resolution. This study examines the effects of four statistical downscaling methods [bias-corrected constructed analog (BCCA), bias-corrected spatial disaggregation applied at daily (BCSDd) and monthly scales (BCSDm), and asynchronous regression (AR)] on retrospective hydrologic simulations using three hydrologic models with their default parameters (the Community Land Model, version 4.0; the Variable Infiltration Capacity model, version 4.1.2; and the Precipitation–Runoff Modeling System, version 3.0.4) over the contiguous United States (CONUS). Biases of hydrologic simulations forced by statistically downscaled climate data relative to the simulation with observation-based gridded data are presented. Each statistical downscaling method produces different meteorological portrayals including precipitation amount, wet-day frequency, and the energy input (i.e., shortwave radiation), and their interplay affects estimations of precipitation partitioning between evapotranspiration and runoff, extreme runoff, and hydrologic states (i.e., snow and soil moisture). The analyses show that BCCA underestimates annual precipitation by as much as −250 mm, leading to unreasonable hydrologic portrayals over the CONUS for all models. Although the other three statistical downscaling methods produce a comparable precipitation bias ranging from −10 to 8 mm across the CONUS, BCSDd severely overestimates the wet-day fraction by up to 0.25, leading to different precipitation partitioning compared to the simulations with other downscaled data. Overall, the choice of downscaling method contributes to less spread in runoff estimates (by a factor of 1.5–3) than the choice of hydrologic model with use of the default parameters if BCCA is excluded.

  13. Computer Administering of the Psychological Investigations: Set-Relational Representation

    NASA Astrophysics Data System (ADS)

    Yordzhev, Krasimir

Computer administering of a psychological investigation is the computer representation of the entire procedure of psychological assessment: test construction, test implementation, results evaluation, storage and maintenance of the developed database, and its statistical processing, analysis, and interpretation. A mathematical description of psychological assessment with the aid of personality tests is discussed in this article, using set theory and relational algebra. A relational model of the data needed to design a computer system for automating certain psychological assessments is given, and the finite sets and relations on them that are necessary for creating a personality psychological test are described. The described model could be used to develop real software for computer administering of any psychological test, with full automation of the whole process: test construction, test implementation, result evaluation, storage of the developed database, statistical processing, analysis, and interpretation. A software project for computer administering of personality psychological tests is suggested.

  14. Statistical model to perform error analysis of curve fits of wind tunnel test data using the techniques of analysis of variance and regression analysis

    NASA Technical Reports Server (NTRS)

    Alston, D. W.

    1981-01-01

The objective of this research was to design a statistical model capable of performing an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the resulting statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to remove the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
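    One step of the strategy described above can be sketched: fit a least-squares line, then check the residuals for a leftover quadratic trend; a strong trend signals that the order of the curve fit should be raised. The data here are synthetic, not wind tunnel measurements:

```python
# Detect a quadratic effect in the residuals of a linear least-squares fit.

def line_fit(xs, ys):
    """Closed-form OLS line fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def corr(u, v):
    """Pearson correlation coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

xs = [float(i) for i in range(-5, 6)]
ys = [2.0 + 1.0 * x + 0.5 * x * x for x in xs]   # hidden quadratic term
a, b = line_fit(xs, ys)
resid = [y - (a + b * x) for x, y in zip(xs, ys)]
print(corr(resid, [x * x for x in xs]))   # near 1: raise the fit order
```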

  15. Monitoring Method of Cow Anthrax Based on Gis and Spatial Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Li, Lin; Yang, Yong; Wang, Hongbin; Dong, Jing; Zhao, Yujun; He, Jianbin; Fan, Honggang

Geographic information system (GIS) is a computer application system that can manipulate spatial information and has been used in many fields involving spatial information management. Many methods and models have been established for analyzing animal disease distributions and temporal-spatial transmission. Great benefits have been gained from the application of GIS in animal disease epidemiology, and GIS is now a very important tool in animal disease epidemiological research. The spatial analysis functions of GIS can be widened and strengthened by spatial statistical analysis, allowing deeper exploration, analysis, manipulation, and interpretation of the spatial pattern and spatial correlation of animal disease. In this paper, we analyzed the spatial distribution characteristics of cow anthrax in a target district (called district A because the epidemic data are confidential) by combining spatial statistical analysis with the GIS of cow anthrax established for this district. Cow anthrax is a biogeochemical disease whose geographical distribution is related closely to the environmental factors of its habitats and shows spatial characteristics; correct analysis of its spatial distribution therefore plays a very important role in monitoring, prevention, and control. However, the application of classical statistical methods is very difficult in some areas because of the pastoral nomadic context: the high mobility of livestock and the lack of suitable sampling currently make it nearly impossible to apply rigorous random sampling methods. It is thus necessary to develop an alternative sampling method that can overcome the lack of samples and still meet the requirements of randomness.
The GIS software ArcGIS 9.1 was used to overcome the lack of data on sampling sites. Using ArcGIS 9.1 and GeoDa to analyze the spatial distribution of cow anthrax in district A, we reached two conclusions about anthrax density: (1) it follows a spatial clustering model, and (2) it shows intense spatial autocorrelation. We established a prediction model to estimate the anthrax distribution based on the spatial characteristics of cow anthrax density. Compared with the true distribution, the prediction model shows good agreement and is feasible in application. The GIS-based method can contribute significantly to cow anthrax monitoring and investigation, and the spatial-statistics-based prediction model provides a foundation for other studies on spatially related animal diseases.
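    Spatial autocorrelation of the kind reported for the anthrax density is commonly quantified with Moran's I (the statistic GeoDa computes). A minimal version on an invented chain of districts with a binary contiguity weight matrix:

```python
# Moran's I: positive values indicate that similar disease densities
# cluster among neighboring spatial units.

def morans_i(values, weights):
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    w_sum = sum(sum(row) for row in weights)
    return (n / w_sum) * (num / den)

# Four hypothetical districts in a chain; neighbors share a border.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = [10.0, 9.0, 1.0, 2.0]    # high densities sit next to each other
print(morans_i(clustered, W))        # positive: spatial clustering
```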

  16. Quantification of model uncertainty in aerosol optical thickness retrieval from Ozone Monitoring Instrument (OMI) measurements

    NASA Astrophysics Data System (ADS)

    Määttä, A.; Laine, M.; Tamminen, J.; Veefkind, J. P.

    2013-09-01

We study uncertainty quantification in remote sensing of atmospheric aerosols using top-of-atmosphere reflectance measurements from the nadir-viewing Ozone Monitoring Instrument (OMI). The focus is on the uncertainty in selecting among pre-calculated aerosol models and on the statistical modelling of model inadequacies. The aim is to apply statistical methodologies that improve the uncertainty estimates of the aerosol optical thickness (AOT) retrieval by propagating model selection and model error related uncertainties more realistically. We utilise Bayesian model selection and model averaging methods for the model selection problem and use Gaussian processes to model the smooth systematic discrepancies between the modelled and observed reflectance. The systematic model error is learned from an ensemble of operational retrievals. The operational OMI multi-wavelength aerosol retrieval algorithm OMAERO is used for cloud-free, over-land pixels of the OMI instrument, augmented with the Bayesian model selection and model discrepancy techniques. The method is demonstrated with four examples with different aerosol properties: weakly absorbing aerosols, forest fires over Greece and Russia, and Saharan desert dust. The presented statistical methodology is general; it is not restricted to this particular satellite retrieval application.
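    The Bayesian model-averaging idea can be sketched in a few lines: each candidate aerosol model gets a posterior weight from its evidence against the observed reflectance, and the AOT estimate is the weighted combination. The model names, log-evidences, and AOT values below are invented for illustration:

```python
import math

# Bayesian model averaging over candidate aerosol models.
# candidates: list of (aot_estimate, log_evidence) pairs, one per model.

def bma(candidates):
    max_le = max(le for _, le in candidates)
    weights = [math.exp(le - max_le) for _, le in candidates]  # stable shift
    total = sum(weights)
    weights = [w / total for w in weights]
    aot = sum(w * a for w, (a, _) in zip(weights, candidates))
    return aot, weights

models = [(0.30, -12.1),   # hypothetical weakly absorbing model
          (0.42, -10.3),   # hypothetical biomass-burning model
          (0.55, -15.8)]   # hypothetical dust model
aot, w = bma(models)
print(aot, w)   # averaged AOT dominated by the best-fitting model
```

    Reporting the averaged estimate (with the spread of the weights) propagates model-selection uncertainty instead of committing to a single pre-calculated model.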

  17. Huffman and linear scanning methods with statistical language models.

    PubMed

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
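    The core of Huffman scanning is assigning shorter switch sequences to likelier symbols, so the expected number of scan steps falls below fixed-order linear scanning. A self-contained sketch with illustrative letter probabilities (not the probabilities from the study's language model):

```python
import heapq

def huffman_code(probs):
    """Map symbols to binary codewords with Huffman's algorithm."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"e": 0.45, "t": 0.25, "a": 0.15, "o": 0.10, "q": 0.05}
code = huffman_code(probs)

# Expected switch activations: codeword length under Huffman scanning
# versus position in a fixed most-frequent-first linear scan.
expected_huffman = sum(p * len(code[s]) for s, p in probs.items())
expected_linear = sum(p * (i + 1) for i, (s, p) in enumerate(probs.items()))
print(expected_huffman, expected_linear)
```

    Even on this tiny alphabet the Huffman assignment needs fewer expected steps; with a full statistical language model re-ranking the alphabet at every keystroke, the gap widens, which is the speedup the study reports.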

  18. Statistical dielectronic recombination rates for multielectron ions in plasma

    NASA Astrophysics Data System (ADS)

    Demura, A. V.; Leont'iev, D. S.; Lisitsa, V. S.; Shurygin, V. A.

    2017-10-01

    We describe the general analytic derivation of the dielectronic recombination (DR) rate coefficient for multielectron ions in a plasma based on the statistical theory of an atom in terms of the spatial distribution of the atomic electron density. The dielectronic recombination rates for complex multielectron tungsten ions are calculated numerically in a wide range of variation of the plasma temperature, which is important for modern nuclear fusion studies. The results of statistical theory are compared with the data obtained using level-by-level codes ADPAK, FAC, HULLAC, and experimental results. We consider different statistical DR models based on the Thomas-Fermi distribution, viz., integral and differential with respect to the orbital angular momenta of the ion core and the trapped electron, as well as the Rost model, which is an analog of the Frank-Condon model as applied to atomic structures. In view of its universality and relative simplicity, the statistical approach can be used for obtaining express estimates of the dielectronic recombination rate coefficients in complex calculations of the parameters of the thermonuclear plasmas. The application of statistical methods also provides information for the dielectronic recombination rates with much smaller computer time expenditures as compared to available level-by-level codes.

  19. Focus on Statistical Physics Modeling in Economics and Finance

    NASA Astrophysics Data System (ADS)

    Mantegna, Rosario N.; Kertész, János

    2011-02-01

    This focus issue presents a collection of papers on recent results in statistical physics modeling in economics and finance, commonly known as econophysics. We touch briefly on the history of this relatively new multi-disciplinary field, summarize the motivations behind its emergence and try to characterize its specific features. We point out some research aspects that must be improved and briefly discuss the topics the research field is moving toward. Finally, we give a short account of the papers collected in this issue.

  20. Detecting Mixtures from Structural Model Differences Using Latent Variable Mixture Modeling: A Comparison of Relative Model Fit Statistics

    ERIC Educational Resources Information Center

    Henson, James M.; Reise, Steven P.; Kim, Kevin H.

    2007-01-01

The accuracy of structural model parameter estimates in latent variable mixture modeling was explored with a 3 (sample size) × 3 (exogenous latent mean difference) × 3 (endogenous latent mean difference) × 3 (correlation between factors) × 3 (mixture proportions) factorial design. In addition, the efficacy of several…

  1. The Structure of Human Intelligence: It Is Verbal, Perceptual, and Image Rotation (VPR), Not Fluid and Crystallized

    ERIC Educational Resources Information Center

    Johnson, W.; Bouchard, T.J.

    2005-01-01

In a heterogeneous sample of 436 adult individuals who completed 42 mental ability tests, we evaluated the relative statistical performance of three major psychometric models of human intelligence: the Cattell-Horn fluid-crystallized model, Vernon's verbal-perceptual model, and Carroll's three-stratum model. The verbal-perceptual model fit…

  2. Associations of Social Support, Friends Only Known Through the Internet, and Health-Related Quality of Life with Internet Gaming Disorder in Adolescence.

    PubMed

    Wartberg, Lutz; Kriston, Levente; Kammerl, Rudolf

    2017-07-01

Internet Gaming Disorder (IGD) has been included in the current edition of the Diagnostic and Statistical Manual of Mental Disorders-Fifth Edition (DSM-5). In the present study, the relationship among social support, friends only known through the Internet, health-related quality of life, and IGD in adolescence was explored for the first time. For this purpose, 1,095 adolescents aged from 12 to 14 years were surveyed with a standardized questionnaire concerning IGD, self-perceived social support, proportion of friends only known through the Internet, and health-related quality of life. The authors conducted unpaired t-tests, a chi-square test, as well as correlation and logistic regression analyses. According to the statistical analyses, adolescents with IGD reported lower self-perceived social support, more friends only known through the Internet, and a lower health-related quality of life compared with the group without IGD. Both in bivariate and multivariate logistic regression models, statistically significant associations between IGD and male gender, a higher proportion of friends only known through the Internet, and a lower health-related quality of life (multivariate model: Nagelkerke's R² = 0.37) were revealed. Lower self-perceived social support was related to IGD in the bivariate model only. In summary, quality of life and social aspects seem to be important factors for IGD in adolescence and therefore should be incorporated in further (longitudinal) studies. The findings of the present survey may provide starting points for the development of prevention and intervention programs for adolescents affected by IGD.
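    Nagelkerke's R², the fit measure quoted for the multivariate model, rescales the Cox-Snell R² so that its maximum is 1; it needs only the log-likelihoods of the null and fitted logistic models and the sample size. The log-likelihood values used here are invented for illustration (only n = 1,095 matches the study):

```python
import math

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R^2 from null and fitted model log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)   # ceiling of Cox-Snell R^2
    return cox_snell / max_r2

# Hypothetical log-likelihoods for a sample of n = 1095 adolescents.
r2 = nagelkerke_r2(ll_null=-700.0, ll_model=-520.0, n=1095)
print(r2)
```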

  3. A note about high blood pressure in childhood

    NASA Astrophysics Data System (ADS)

    Teodoro, M. Filomena; Simão, Carla

    2017-06-01

In the medical, behavioral, and social sciences it is common for the outcome to be binary. In the present work, information was collected in which some of the outcomes are binary variables (1 = 'yes' / 0 = 'no'). In [14], a preliminary study of caregivers' perception of pediatric hypertension was introduced: an experimental questionnaire was designed to be answered by the caregivers of attendees of routine pediatric consultations at Santa Maria Hospital (HSM). The collected data were statistically analyzed with a descriptive analysis and a predictive model, and significant relations between some socio-demographic variables and the assessed knowledge were obtained. A statistical analysis using partial questionnaire information can be found in [14]. The present article completes that approach by estimating a model for the relevant remaining questions of the questionnaire using Generalized Linear Models (GLM). Exploring the binary outcome issue, we intend to extend this approach using Generalized Linear Mixed Models (GLMM); this work is still ongoing.

  4. Modeling Statistics of Fish Patchiness and Predicting Associated Influence on Statistics of Acoustic Echoes

    DTIC Science & Technology

    2014-12-01

moving relative to the water in which they are immersed, reflecting the true school movement dynamics. There has also been work to implement this... (Woods Hole Oceanographic Institution, Woods Hole, MA) ...were measured with multi-beam sonars and quantified in terms of important aspects of fish dynamics; and predictions were made of echo statistics of a

  5. Statistics Graduate Students' Professional Development for Teaching: A Communities of Practice Model

    NASA Astrophysics Data System (ADS)

    Justice, Nicola

    Graduate teaching assistants (GTAs) are responsible for instructing approximately 25% of introductory statistics courses in the United States (Blair, Kirkman, & Maxwell, 2013). Most research on GTA professional development focuses on structured activities (e.g., courses, workshops) that have been developed to improve GTAs' pedagogy and content knowledge. Few studies take into account the social contexts of GTAs' professional development. However, GTAs perceive their social interactions with other GTAs to be a vital part of their preparation and support for teaching (e.g., Staton & Darling, 1989). Communities of practice (CoPs) are one way to bring together the study of the social contexts and structured activities of GTA professional development. CoPs are defined as groups of practitioners who deepen their knowledge and expertise by interacting with each other on an ongoing basis (e.g., Lave & Wenger, 1991). Graduate students may participate in CoPs related to teaching in many ways, including attending courses or workshops, participating in weekly meetings, engaging in informal discussions about teaching, or participating in e-mail conversations related to teaching tasks. This study explored the relationship between statistics graduate students' experiences in CoPs and the extent to which they hold student-centered teaching beliefs. A framework for characterizing GTAs' experiences in CoPs was described and a theoretical model relating these characteristics to GTAs' beliefs was developed. To gather data to test the model, the Graduate Students' Experiences Teaching Statistics (GETS) Inventory was created. Items were written to collect information about GTAs' current teaching beliefs, teaching beliefs before entering their degree programs, characteristics of GTAs' experiences in CoPs, and demographic information. Using an online program, the GETS Inventory was administered to N =218 statistics graduate students representing 37 institutions in 24 different U.S. 
states. The data gathered from the national survey suggest that statistics graduate students often experience CoPs through required meetings and voluntary discussions about teaching. Participants feel comfortable disagreeing with the people they perceive to be most influential on their teaching beliefs. Most participants perceive a faculty member to have the most influential role in shaping their teaching beliefs. The survey data did not provide evidence to support the proposed theoretical model relating characteristics of experiences in CoPs and beliefs about teaching statistics. Based on cross-validation results, prior beliefs about teaching statistics was the best predictor of current beliefs. Additional models were retained that included student characteristics suggested by previous literature to be associated with student-centered or traditional teaching beliefs (e.g., prior teaching experience, international student status). The results of this study can be used to inform future efforts to help promote student-centered teaching beliefs and teaching practices among statistics GTAs. Modifications to the GETS Inventory are suggested for use in future research designed to gather information about GTAs, their teaching beliefs, and their experiences in CoPs. Suggestions are also made for aspects of CoPs that might be studied further in order to learn how CoPs can promote teaching beliefs and practices that support student learning.

  6. Thermodynamic Model of Spatial Memory

    NASA Astrophysics Data System (ADS)

    Kaufman, Miron; Allen, P.

    1998-03-01

    We develop and test a thermodynamic model of spatial memory. Our model is an application of statistical thermodynamics to cognitive science. It is related to applications of the statistical mechanics framework in parallel distributed processing research. Our macroscopic model allows us to evaluate an entropy associated with spatial memory tasks. We find that older adults exhibit higher levels of entropy than younger adults. Thurstone's Law of Categorical Judgment, according to which the discriminal processes along the psychological continuum produced by presentations of a single stimulus are normally distributed, is explained by using a Hooke spring model of spatial memory. We have also analyzed a nonlinear modification of the ideal spring model of spatial memory. This work is supported by NIH/NIA grant AG09282-06.

  7. Frequency-selective fading statistics of shallow-water acoustic communication channel with a few multipaths

    NASA Astrophysics Data System (ADS)

    Bae, Minja; Park, Jihyun; Kim, Jongju; Xue, Dandan; Park, Kyu-Chil; Yoon, Jong Rak

    2016-07-01

    The bit error rate of an underwater acoustic communication system is related to multipath fading statistics, which determine the signal-to-noise ratio. The amplitude and delay of each path depend on sea surface roughness, propagation medium properties, and source-to-receiver range as a function of frequency. Therefore, received signals will show frequency-dependent fading. A shallow-water acoustic communication channel generally shows a few strong multipaths that interfere with each other and the resulting interference affects the fading statistics model. In this study, frequency-selective fading statistics are modeled on the basis of the phasor representation of the complex path amplitude. The fading statistics distribution is parameterized by the frequency-dependent constructive or destructive interference of multipaths. At a 16 m depth with a muddy bottom, a wave height of 0.2 m, and source-to-receiver ranges of 100 and 400 m, fading statistics tend to show a Rayleigh distribution at a destructive interference frequency, but a Rice distribution at a constructive interference frequency. The theoretical fading statistics well matched the experimental ones.
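
    The Rayleigh-versus-Rice behaviour described above can be reproduced with a toy phasor model: a deterministic direct component plus complex Gaussian multipath scatter. The sketch below (invented numbers, not the experiment's data) estimates the Rician K-factor from envelope moments; K near zero indicates Rayleigh fading (destructive interference), while a large K indicates a Rice distribution (constructive interference).

```python
import math
import random

def envelope_samples(a_direct, sigma, n=20000, seed=7):
    # |direct-path phasor + complex Gaussian multipath scatter|
    rng = random.Random(seed)
    env = []
    for _ in range(n):
        re = a_direct + rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        env.append(math.hypot(re, im))
    return env

def k_factor(env):
    # moment-based Rician K estimate: with m2 = E[r^2] and m4 = E[r^4],
    # sqrt(2 - m4/m2^2) = K / (1 + K)
    m2 = sum(r * r for r in env) / len(env)
    m4 = sum(r ** 4 for r in env) / len(env)
    s = math.sqrt(max(0.0, 2.0 - m4 / (m2 * m2)))
    return s / (1.0 - s) if s < 1.0 else float("inf")

k_constructive = k_factor(envelope_samples(3.0, 1.0))  # strong direct path
k_destructive = k_factor(envelope_samples(0.0, 1.0))   # direct path cancelled
```

    The constructive case has a true K of A²/(2σ²) = 4.5, while the destructive case collapses to Rayleigh (K near 0).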

  8. Predicting relative species composition within mixed conifer forest pixels using zero‐inflated models and Landsat imagery

    Treesearch

    Shannon L. Savage; Rick L. Lawrence; John R. Squires

    2015-01-01

    Ecological and land management applications would often benefit from maps of relative canopy cover of each species present within a pixel, instead of traditional remote-sensing based maps of either dominant species or percent canopy cover without regard to species composition. Widely used statistical models for remote sensing, such as randomForest (RF),...

  9. Improving the Validity of Activity of Daily Living Dependency Risk Assessment

    PubMed Central

    Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

    2015-01-01

    Objectives Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867
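
    The c-statistic cited above is the probability that a randomly chosen person who became dependent was assigned a higher predicted risk than a randomly chosen person who did not (equivalently, the area under the ROC curve). A minimal pairwise implementation, with invented risks and outcomes:

```python
def c_statistic(risks, events):
    # fraction of (event, non-event) pairs in which the person with the
    # event received the higher predicted risk; ties count 1/2
    cases = [r for r, e in zip(risks, events) if e == 1]
    ctrls = [r for r, e in zip(risks, events) if e == 0]
    conc = ties = 0
    for rc in cases:
        for rn in ctrls:
            if rc > rn:
                conc += 1
            elif rc == rn:
                ties += 1
    return (conc + 0.5 * ties) / (len(cases) * len(ctrls))

risks = [0.9, 0.8, 0.3, 0.2]   # invented predicted risks
events = [1, 0, 1, 0]          # 1 = became ADL dependent
auc = c_statistic(risks, events)   # → 0.75
```

    A c-statistic of 0.5 means no discrimination and 1.0 means perfect discrimination; the values around 0.8 reported above indicate strong predictive validity.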

  10. Regionalisation of statistical model outputs creating gridded data sets for Germany

    NASA Astrophysics Data System (ADS)

    Höpp, Simona Andrea; Rauthe, Monika; Deutschländer, Thomas

    2016-04-01

    The goal of the German research program ReKliEs-De (regional climate projection ensembles for Germany, http://.reklies.hlug.de) is to distribute robust information about the range and the extremes of future climate for Germany and its neighbouring river catchment areas. This joint research project is supported by the German Federal Ministry of Education and Research (BMBF) and was initiated by the German Federal States. The project results are meant to support the development of adaptation strategies to mitigate the impacts of future climate change. The aim of our part of the project is to adapt and transfer the regionalisation methods of the gridded hydrological data set (HYRAS) from daily station data to the station-based statistical regional climate model output of WETTREG (a regionalisation method based on weather patterns). The WETTREG model output covers the period 1951 to 2100 with daily temporal resolution. From it, we generate a gridded data set of the WETTREG output for precipitation, air temperature and relative humidity with a spatial resolution of 12.5 km x 12.5 km, which is common for regional climate models. This regionalisation thus allows comparing statistical to dynamical climate model outputs. The HYRAS data set was developed by the German Meteorological Service within the German research program KLIWAS (www.kliwas.de) and consists of daily gridded data for Germany and its neighbouring river catchment areas. It has a spatial resolution of 5 km x 5 km for the entire domain for the hydro-meteorological elements precipitation, air temperature and relative humidity, and covers the period 1951 to 2006. After conservative remapping, the HYRAS data set is also suitable for the validation of climate models.
The presentation will consist of two parts presenting the current state of the adaptation of the HYRAS regionalisation methods to the statistical regional climate model WETTREG. First, an overview of the HYRAS data set and the regionalisation methods for precipitation (the REGNIE method, based on a combination of multiple linear regression with five predictors and inverse distance weighting), air temperature and relative humidity (optimal interpolation) will be given. Then, results of the regionalisation of the WETTREG model output will be shown.

  11. Statistical sensitivity analysis of a simple nuclear waste repository model

    NASA Astrophysics Data System (ADS)

    Ronen, Y.; Lucius, J. L.; Blow, E. M.

    1980-06-01

    This work is a preliminary step in a comprehensive sensitivity analysis of the modeling of a nuclear waste repository. The purpose of the complete analysis is to determine which modeling parameters and physical data are most important in determining key design performance criteria, and then to obtain the uncertainty in the design for safety considerations. The theory for a statistical screening design methodology is developed for later use in the overall program. The theory was applied to a test case: determining the relative importance of the sensitivity of the near-field temperature distribution in a single-level salt repository to modeling parameters. The exact values of the sensitivities to these physical and modeling parameters were then obtained using direct methods of recalculation. The sensitivity coefficients found to be important for the sample problem were the thermal loading, the distance between the spent fuel canisters, and their radius. Other important parameters were those related to salt properties at a point of interest in the repository.

  12. Optimization of Analytical Potentials for Coarse-Grained Biopolymer Models.

    PubMed

    Mereghetti, Paolo; Maccari, Giuseppe; Spampinato, Giulia Lia Beatrice; Tozzini, Valentina

    2016-08-25

    The increasing trend in the recent literature on coarse-grained (CG) models testifies to their impact in the study of complex systems. However, the CG model landscape is variegated: even at a given resolution level, the force fields are very heterogeneous and are optimized with very different parametrization procedures. Along the road toward standardization of CG models for biopolymers, here we describe a strategy to aid the building and optimization of statistics-based analytical force fields and its implementation in the software package AsParaGS (Assisted Parameterization platform for coarse Grained modelS). Our method is based on the use of analytical potentials, optimized by targeting the statistical distributions of internal variables by means of a combination of different algorithms (i.e., relative-entropy-driven stochastic exploration of the parameter space and iterative Boltzmann inversion). This allows designing a custom model that endows the force field terms with a physically sound meaning. Furthermore, the level of transferability and accuracy can be tuned through the choice of the statistical data set composition. The method, illustrated by means of applications to helical polypeptides, also involves the analysis of two- and three-variable distributions, and allows handling issues related to correlations among force field terms. AsParaGS is interfaced with general-purpose molecular dynamics codes and currently implements the "minimalist" subclass of CG models (i.e., one bead per amino acid, Cα based). Extensions to nucleic acids and different levels of coarse graining are in progress.
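
    One of the algorithms named above, iterative Boltzmann inversion, refines a tabulated potential so that the model's radial distribution function g(r) converges to a target distribution. A schematic single update step on toy numbers (not AsParaGS code; bin values are invented):

```python
import math

KBT = 1.0  # thermal energy in units of k_B * T

def ibi_update(u, g_model, g_target):
    # one iterative Boltzmann inversion step on a tabulated potential:
    # U_{k+1}(r) = U_k(r) + k_B T * ln(g_k(r) / g_target(r))
    return [ui + KBT * math.log(gm / gt)
            for ui, gm, gt in zip(u, g_model, g_target)]

g_target = [0.2, 1.5, 1.1, 1.0]                  # toy target RDF values
u0 = [-KBT * math.log(g) for g in g_target]      # start from potential of mean force
u1 = ibi_update(u0, g_target, g_target)          # matching RDF: no change
u2 = ibi_update(u0, [1.2 * g for g in g_target], g_target)
# over-populated distances get a more repulsive (raised) potential
```

    The fixed point of the iteration is reached exactly when the simulated distribution matches the target, which is what makes the scheme a distribution-targeting optimizer.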

  13. A geostatistical state-space model of animal densities for stream networks.

    PubMed

    Hocking, Daniel J; Thorson, James T; O'Neil, Kyle; Letcher, Benjamin H

    2018-06-21

    Population dynamics are often correlated in space and time due to correlations in environmental drivers as well as synchrony induced by individual dispersal. Many statistical analyses of populations ignore potential autocorrelations and assume that survey methods (distance and time between samples) eliminate these correlations, allowing samples to be treated independently. If these assumptions are incorrect, results and therefore inference may be biased and uncertainty under-estimated. We developed a novel statistical method to account for spatio-temporal correlations within dendritic stream networks, while accounting for imperfect detection in the surveys. Through simulations, we found this model decreased predictive error relative to standard statistical methods when data were spatially correlated based on stream distance and performed similarly when data were not correlated. We found that increasing the number of years surveyed substantially improved the model accuracy when estimating spatial and temporal correlation coefficients, especially from 10 to 15 years. Increasing the number of survey sites within the network improved the performance of the non-spatial model but only marginally improved the density estimates in the spatio-temporal model. We applied this model to Brook Trout data from the West Susquehanna Watershed in Pennsylvania collected over 34 years from 1981 - 2014. We found the model including temporal and spatio-temporal autocorrelation best described young-of-the-year (YOY) and adult density patterns. YOY densities were positively related to forest cover and negatively related to spring temperatures with low temporal autocorrelation and moderately-high spatio-temporal correlation. Adult densities were less strongly affected by climatic conditions and less temporally variable than YOY but with similar spatio-temporal correlation and higher temporal autocorrelation. This article is protected by copyright. All rights reserved. 

  14. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P [Livermore, CA; Brandt, James M [Dublin, CA; Gentile, Ann C [Dublin, CA; Marzouk, Youssef M [Oakland, CA; Hale, Darrian J [San Jose, CA; Thompson, David C [Livermore, CA

    2011-01-04

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.
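
    The idea of computing relative probabilities of attribute values across a collection of statistically similar components can be sketched with a robust outlier test: a component raises an alarm when its sensed value is improbable relative to its peers. This is an illustrative stand-in with invented readings, not the patented method:

```python
import statistics

def alarm_flags(values, z_thresh=3.0):
    # flag components whose sensed value is improbable relative to the
    # population of statistically similar components (robust z-score,
    # using the median absolute deviation scaled to the normal sigma)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values]) or 1e-9
    return [abs(v - med) / (1.4826 * mad) > z_thresh for v in values]

readings = [50.1, 49.8, 50.0, 50.2, 49.9, 71.5]  # invented sensor values
flags = alarm_flags(readings)                     # only the last one alarms
```

    Using the median and MAD rather than mean and standard deviation keeps the normal-behavior estimate from being corrupted by the very anomaly being detected.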

  15. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P [Livermore, CA; Brandt, James M [Dublin, CA; Gentile, Ann C [Dublin, CA; Marzouk, Youssef M [Oakland, CA; Hale, Darrian J [San Jose, CA; Thompson, David C [Livermore, CA

    2011-01-25

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.

  16. System and method for statistically monitoring and analyzing sensed conditions

    DOEpatents

    Pebay, Philippe P [Livermore, CA; Brandt, James M [Dublin, CA; Gentile, Ann C [Dublin, CA; Marzouk, Youssef M [Oakland, CA; Hale, Darrian J [San Jose, CA; Thompson, David C [Livermore, CA

    2010-07-13

    A system and method of monitoring and analyzing a plurality of attributes for an alarm condition is disclosed. The attributes are processed and/or unprocessed values of sensed conditions of a collection of a statistically significant number of statistically similar components subjected to varying environmental conditions. The attribute values are used to compute the normal behaviors of some of the attributes and also used to infer parameters of a set of models. Relative probabilities of some attribute values are then computed and used along with the set of models to determine whether an alarm condition is met. The alarm conditions are used to prevent or reduce the impact of impending failure.

  17. Quantifying falsifiability of scientific theories

    NASA Astrophysics Data System (ADS)

    Nemenman, Ilya

    I argue that the notion of falsifiability, a key concept in defining a valid scientific theory, can be quantified using Bayesian Model Selection, which is a standard tool in modern statistics. This relates falsifiability to the quantitative version of the statistical Occam's razor, and allows transforming some long-running arguments about validity of scientific theories from philosophical discussions to rigorous mathematical calculations.
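
    Bayesian model selection penalises a theory's extra degrees of freedom: a model flexible enough to fit anything is barely falsifiable and pays an Occam penalty. The sketch below approximates the Bayes factor via BIC for a fair-coin model (zero free parameters) versus a free-bias model on invented data; this is an illustration of the Occam's-razor mechanism, not the paper's own calculation.

```python
import math

def log_lik_binom(heads, n, p):
    # Bernoulli log-likelihood of observing `heads` in `n` flips
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

def bic(log_l, k_params, n):
    # Bayesian information criterion: k ln n - 2 ln L
    return k_params * math.log(n) - 2.0 * log_l

heads, n = 52, 100                              # invented, near-fair data
ll_fair = log_lik_binom(heads, n, 0.5)          # falsifiable: 0 free params
ll_free = log_lik_binom(heads, n, heads / n)    # flexible: 1 free param
# approximate log Bayes factor in favour of the fair-coin model
log_bf = 0.5 * (bic(ll_free, 1, n) - bic(ll_fair, 0, n))
```

    The free-bias model always fits at least as well, yet on data consistent with a fair coin the penalty term outweighs the fit gain, so the Bayes factor favours the more falsifiable model.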

  18. Comparisons of Student Achievement Levels by District Performance and Poverty. ACT Research Report Series 2016-11

    ERIC Educational Resources Information Center

    Dougherty, Chrys; Shaw, Teresa

    2016-01-01

    This report looks at student achievement levels in Arkansas school districts disaggregated by district poverty and by the district's performance relative to other districts. We estimated district performance statistics by subject and grade level (4, 8, and 11-12) for longitudinal student cohorts, using statistical models that adjusted for district…

  19. The intermediates take it all: asymptotics of higher criticism statistics and a powerful alternative based on equal local levels.

    PubMed

    Gontscharuk, Veronika; Landwehr, Sandra; Finner, Helmut

    2015-01-01

    The higher criticism (HC) statistic, which can be seen as a normalized version of the famous Kolmogorov-Smirnov statistic, has a long history, dating back to the mid-seventies. Originally, HC statistics were used in connection with goodness-of-fit (GOF) tests, but they recently gained attention in the context of testing the global null hypothesis in high-dimensional data. The continuing interest in HC seems to be inspired by a series of nice asymptotic properties related to this statistic. For example, unlike Kolmogorov-Smirnov tests, GOF tests based on the HC statistic are known to be asymptotically sensitive in the moderate tails; hence it is favorably applied for detecting the presence of signals in sparse mixture models. However, some questions around the asymptotic behavior of the HC statistic are still open. We focus on two of them, namely, why a specific intermediate range is crucial for GOF tests based on the HC statistic and why the convergence of the HC distribution to the limiting one is extremely slow. Moreover, the inconsistency between the asymptotic and finite-sample behavior of the HC statistic prompts us to provide a new HC test that has better finite-sample properties than the original HC test while showing the same asymptotics. This test is motivated by the asymptotic behavior of the so-called local levels related to the original HC test. By means of numerical calculations and simulations we show that the new HC test is typically more powerful than the original HC test in normal mixture models. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
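
    The HC statistic maximises a standardised gap between the empirical distribution of ordered p-values and the uniform null over an intermediate range of order statistics (the range the abstract calls crucial). A minimal version of one common variant follows; normalisations differ across the literature, and the example p-values are constructed, not drawn from the paper.

```python
import math

def higher_criticism(pvals, frac=0.5):
    # HC statistic: maximal standardised discrepancy between the empirical
    # CDF of p-values and the uniform null over order statistics i <= frac*n
    p = sorted(pvals)
    n = len(p)
    hc = float("-inf")
    for i, pi in enumerate(p, start=1):
        if i > frac * n:
            break
        if 0.0 < pi < 1.0:
            hc = max(hc, math.sqrt(n) * (i / n - pi)
                     / math.sqrt(pi * (1.0 - pi)))
    return hc

n = 200
null_p = [(i + 1) / (n + 1) for i in range(n)]            # uniform grid
signal_p = [0.001 * (i + 1) for i in range(20)] + null_p[20:]  # sparse signal
hc_null = higher_criticism(null_p)
hc_signal = higher_criticism(signal_p)
```

    A small fraction of depressed p-values drives HC far above its null level, which is why the statistic is favoured for sparse mixture detection.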

  20. Development of a funding, cost, and spending model for satellite projects

    NASA Technical Reports Server (NTRS)

    Johnson, Jesse P.

    1989-01-01

    The need for a predictive budget/funding model is obvious. The current models used by the Resource Analysis Office (RAO) are used to predict the total costs of satellite projects. An effort to extend the modeling capabilities from total budget analysis to analysis of total budget and budget outlays over time was conducted. A statistically based, data-driven methodology was used to derive and develop the model. The budget data for the last 18 GSFC-sponsored satellite projects were analyzed and used to build a funding model which would describe the historical spending patterns. This raw data consisted of dollars spent in each specific year and their 1989-dollar equivalent. The data was converted to the standard format used by the RAO group and placed in a database. A simple statistical analysis was performed to calculate the gross statistics associated with project length and project cost and the conditional statistics on project length and project cost. The modeling approach used is derived from the theory of embedded statistics, which states that properly analyzed data will produce the underlying generating function. The process of funding large-scale projects over extended periods of time is described by Life Cycle Cost Models (LCCM). The data was analyzed to find a model in the generic form of an LCCM. The model developed is based on a Weibull function whose parameters are found by both nonlinear optimization and nonlinear regression. In order to use this model it is necessary to transform the problem from a dollar/time space to a percentage-of-total-budget/time space. This transformation is equivalent to moving to a probability space. By using the basic rules of probability, the validity of both the optimization and the regression steps is ensured. This statistically significant model is then integrated and inverted. The resulting output represents a project schedule which relates the amount of money spent to the percentage of project completion.
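
    The Weibull-based spending profile can be written as a cumulative distribution over the project timeline, F(t) = 1 - exp(-(t/λ)^k), mapping elapsed time to the fraction of total budget spent. The sketch below recovers shape and scale from synthetic (invented) yearly fractions, with a crude grid search standing in for the report's nonlinear optimization and regression:

```python
import math

def weibull_cdf(t, shape, scale):
    # fraction of total budget spent by time t
    return 1.0 - math.exp(-((t / scale) ** shape))

def fit_weibull(ts, fracs):
    # crude grid search for (shape, scale) minimising squared error;
    # a stand-in for the report's nonlinear optimisation / regression
    best = None
    for i in range(1, 41):
        for j in range(5, 106):
            shape, scale = 0.1 * i, 0.1 * j
            err = sum((weibull_cdf(t, shape, scale) - f) ** 2
                      for t, f in zip(ts, fracs))
            if best is None or err < best[0]:
                best = (err, shape, scale)
    return best[1], best[2]

years = [1, 2, 3, 4, 5, 6]
spent = [weibull_cdf(t, 2.0, 3.5) for t in years]   # synthetic spending profile
shape, scale = fit_weibull(years, spent)
```

    Working in percentage-of-total-budget space is what makes F(t) behave like a probability distribution, so standard distributional machinery applies.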

  1. Evaluation of high-resolution sea ice models on the basis of statistical and scaling properties of Arctic sea ice drift and deformation

    NASA Astrophysics Data System (ADS)

    Girard, L.; Weiss, J.; Molines, J. M.; Barnier, B.; Bouillon, S.

    2009-08-01

    Sea ice drift and deformation from models are evaluated on the basis of statistical and scaling properties. These properties are derived from two observation data sets: the RADARSAT Geophysical Processor System (RGPS) and buoy trajectories from the International Arctic Buoy Program (IABP). Two simulations obtained with the Louvain-la-Neuve Ice Model (LIM) coupled to a high-resolution ocean model and a simulation obtained with the Los Alamos Sea Ice Model (CICE) were analyzed. Model ice drift compares well with observations in terms of large-scale velocity field and distributions of velocity fluctuations although a significant bias on the mean ice speed is noted. On the other hand, the statistical properties of ice deformation are not well simulated by the models: (1) The distributions of strain rates are incorrect: RGPS distributions of strain rates are power law tailed, i.e., exhibit "wild randomness," whereas models distributions remain in the Gaussian attraction basin, i.e., exhibit "mild randomness." (2) The models are unable to reproduce the spatial and temporal correlations of the deformation fields: In the observations, ice deformation follows spatial and temporal scaling laws that express the heterogeneity and the intermittency of deformation. These relations do not appear in simulated ice deformation. Mean deformation in models is almost scale independent. The statistical properties of ice deformation are a signature of the ice mechanical behavior. The present work therefore suggests that the mechanical framework currently used by models is inappropriate. A different modeling framework based on elastic interactions could improve the representation of the statistical and scaling properties of ice deformation.

  2. Statistical colour models: an automated digital image analysis method for quantification of histological biomarkers.

    PubMed

    Shu, Jie; Dolman, G E; Duan, Jiang; Qiu, Guoping; Ilyas, Mohammad

    2016-04-27

    Colour is the most important feature used in quantitative immunohistochemistry (IHC) image analysis; IHC is used to provide information relating to aetiology and to confirm malignancy. Statistical modelling is a technique widely used for colour detection in computer vision. We have developed a statistical model of colour detection applicable to detection of stain colour in digital IHC images. The model was first trained on a large set of colour pixels collected semi-automatically. To speed up the training and detection processes, we removed the luminance (Y) channel of the YCbCr colour space and chose 128 histogram bins, which we found to be the optimal number. A maximum likelihood classifier is used to classify pixels in digital slides into positively or negatively stained pixels automatically. The model-based tool was developed within ImageJ to quantify targets identified using IHC and histochemistry. The purpose of the evaluation was to compare the computer model with human evaluation. Several large datasets were prepared from human oesophageal cancer, colon cancer and liver cirrhosis with different colour stains. Experimental results have demonstrated that the model-based tool achieves more accurate results than colour deconvolution and the CMYK model in the detection of brown colour, and is comparable to colour deconvolution in the detection of pink colour. We have also demonstrated that the proposed model has little inter-dataset variation. A robust and effective statistical model is introduced in this paper. The model-based interactive tool in ImageJ, which can create a visual representation of the statistical model and detect a specified colour automatically, is easy to use and available freely at http://rsb.info.nih.gov/ij/plugins/ihc-toolbox/index.html . Testing of the tool by different users showed only minor inter-observer variations in results.
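
    The histogram-based maximum-likelihood idea can be sketched as follows: build normalised 2-D chroma histograms from labelled positive- and negative-stain pixels (luminance dropped, as in the paper), then label a new pixel by whichever class gives it the higher likelihood. The bin count and sample chroma values here are invented for illustration; this is not the published tool's code.

```python
BINS = 16  # invented bin count; the paper reports 128 bins as optimal

def bin_key(cb, cr):
    # map 8-bit Cb/Cr chroma values to a 2-D histogram bin
    return (min(cb * BINS // 256, BINS - 1), min(cr * BINS // 256, BINS - 1))

def train_hist(samples):
    # normalised 2-D chroma histogram as a likelihood model P(Cb, Cr | class)
    counts = {}
    for cb, cr in samples:
        k = bin_key(cb, cr)
        counts[k] = counts.get(k, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def is_positive(pixel, pos_hist, neg_hist, eps=1e-6):
    # maximum likelihood decision: positively stained iff P(x|pos) > P(x|neg)
    k = bin_key(*pixel)
    return pos_hist.get(k, eps) > neg_hist.get(k, eps)

# invented chroma samples: stain-like cluster vs. counterstain-like cluster
pos_hist = train_hist([(120, 160), (122, 158), (119, 162)])
neg_hist = train_hist([(130, 120), (128, 122), (131, 118)])
```

    Dropping the Y channel makes the model insensitive to staining intensity and illumination, leaving classification to chroma alone.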

  3. Statistical analyses of the relative risk.

    PubMed Central

    Gart, J J

    1979-01-01

    Let P1 be the probability of a disease in one population and P2 be the probability of the disease in a second population. The ratio of these quantities, R = P1/P2, is termed the relative risk. We consider first the analysis of the relative risk from retrospective studies. The relation between the relative risk and the odds ratio (or cross-product ratio) is developed. The odds ratio can be considered a parameter of an exponential model possessing sufficient statistics. This permits the development of exact significance tests and confidence intervals in the conditional space. Unconditional tests and intervals are also considered briefly. The consequences of misclassification errors and of ignoring matching or stratifying are also considered. The various methods are extended to the combination of results over the strata. Examples of case-control studies testing the association between HL-A frequencies and cancer illustrate the techniques. The parallel analyses of prospective studies are given. If P1 and P2 are small with large sample sizes, the appropriate model is a Poisson distribution. This yields an exponential model with sufficient statistics. Exact conditional tests and confidence intervals can then be developed. Here we consider the case where two populations are compared adjusting for sex differences as well as for strata (or covariate) differences such as age. The methods are applied to two examples: (1) testing in the two sexes the ratio of relative risks of skin cancer in people living at different latitudes, and (2) testing over time the ratio of the relative risks of cancer in two cities, one of which fluoridated its drinking water and one of which did not. PMID:540589
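
    For the retrospective (case-control) setting described, the odds ratio is estimated from a 2x2 table as (a*d)/(b*c), and it approximates the relative risk when the disease is rare. The sketch below uses Woolf's large-sample log-scale interval as a simple alternative to the exact conditional intervals the paper develops; the counts are invented.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    # 2x2 table: a = exposed cases,    b = unexposed cases,
    #            c = exposed controls, d = unexposed controls
    or_hat = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf's log-scale SE
    lo = math.exp(math.log(or_hat) - z * se)
    hi = math.exp(math.log(or_hat) + z * se)
    return or_hat, lo, hi

# invented case-control counts
or_hat, lo, hi = odds_ratio_ci(30, 70, 10, 90)
```

    An interval excluding 1 indicates a statistically significant association on the chosen level; exact conditional methods are preferred for small or sparse tables.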

  4. A model to predict accommodations needed by disabled persons.

    PubMed

    Babski-Reeves, Kari; Williams, Sabrina; Waters, Tzer Nan; Crumpton-Young, Lesia L; McCauley-Bell, Pamela

    2005-09-01

    In this paper, several approaches to assist employers in the accommodation process for disabled employees are discussed and a mathematical model is proposed to assist employers in predicting the accommodation level needed by an individual with a mobility-related disability. This study investigates the validity and reliability of this model in assessing the accommodation level needed by individuals utilizing data collected from twelve individuals with mobility-related disabilities. Based on the results of the statistical analyses, this proposed model produces a feasible preliminary measure for assessing the accommodation level needed for persons with mobility-related disabilities. Suggestions for practical application of this model in an industrial setting are addressed.

  5. Discrete ellipsoidal statistical BGK model and Burnett equations

    NASA Astrophysics Data System (ADS)

    Zhang, Yu-Dong; Xu, Ai-Guo; Zhang, Guang-Cai; Chen, Zhi-Hua; Wang, Pei

    2018-06-01

    A new discrete Boltzmann model, the discrete ellipsoidal statistical Bhatnagar-Gross-Krook (ES-BGK) model, is proposed to simulate nonequilibrium compressible flows. Compared with the original discrete BGK model, the discrete ES-BGK model has a flexible Prandtl number. For the discrete ES-BGK model at the Burnett level, two kinds of discrete velocity models are introduced and the relations between nonequilibrium quantities and the viscous stress and heat flux at the Burnett level are established. The model is verified via four benchmark tests. In addition, a new idea is introduced to recover the actual distribution function through the macroscopic quantities and their space derivatives. The recovery scheme works not only for discrete Boltzmann simulations but also for hydrodynamic ones, for example, those based on the Navier-Stokes or the Burnett equations.

  6. Assessment of the long-lead probabilistic prediction for the Asian summer monsoon precipitation (1983-2011) based on the APCC multimodel system and a statistical model

    NASA Astrophysics Data System (ADS)

    Sohn, Soo-Jin; Min, Young-Mi; Lee, June-Yi; Tam, Chi-Yung; Kang, In-Sik; Wang, Bin; Ahn, Joong-Bae; Yamagata, Toshio

    2012-02-01

    The performance of the probabilistic multimodel prediction (PMMP) system of the APEC Climate Center (APCC) in predicting the Asian summer monsoon (ASM) precipitation at a four-month lead (with February initial conditions) was compared with that of a statistical model using hindcast data for 1983-2005 and real-time forecasts for 2006-2011. Particular attention was paid to probabilistic precipitation forecasts for the boreal summer after the mature phase of the El Niño-Southern Oscillation (ENSO). Taking into account the fact that coupled models' skill for boreal spring and summer precipitation mainly comes from their ability to capture the ENSO teleconnection, we developed the statistical model using linear regression with the preceding winter ENSO condition as the predictor. Our results reveal several advantages and disadvantages in both forecast systems. First, the PMMP appears to have higher skill for both above- and below-normal categories in the six-year real-time forecast period, whereas the cross-validated statistical model has higher skill during the 23-year hindcast period. This implies that the cross-validated statistical skill may be overestimated. Second, the PMMP is the better tool for capturing atypical ENSO (or non-canonical ENSO-related) teleconnections, which have affected the ASM precipitation during the early 1990s and in the recent decade. Third, the statistical model is more sensitive to the ENSO phase and has an advantage in predicting the ASM precipitation after the mature phase of La Niña.

  7. Evidential evaluation of DNA profiles using a discrete statistical model implemented in the DNA LiRa software.

    PubMed

    Puch-Solis, Roberto; Clayton, Tim

    2014-07-01

    The high sensitivity of the technology for producing profiles means that it has become routine to produce profiles from relatively small quantities of DNA. The profiles obtained from low template DNA (LTDNA) are affected by several phenomena which must be taken into consideration when interpreting and evaluating this evidence. Furthermore, many of the same phenomena affect profiles from higher amounts of DNA (e.g. where complex mixtures have been revealed). In this article we present a statistical model, which forms the basis of the software DNA LiRa and which is able to calculate likelihood ratios where one to four donors are postulated, for any number of replicates. The model can take into account drop-in and allelic dropout for different contributors, template degradation and uncertain allele designations. In this statistical model unknown parameters are treated following the empirical Bayesian paradigm. The performance of LiRa is tested using examples, and the outputs are compared with those generated using two other statistical software packages, likeLTD and LRmix. The concept of ban efficiency is introduced as a measure for assessing model sensitivity. Copyright © 2014. Published by Elsevier Ireland Ltd.

  8. Dynamical Constraints On The Galaxy-Halo Connection

    NASA Astrophysics Data System (ADS)

    Desmond, Harry

    2017-07-01

    Dark matter halos comprise the bulk of the universe's mass, yet must be probed by the luminous galaxies that form within them. A key goal of modern astrophysics, therefore, is to robustly relate the visible and dark mass, which to first order means relating the properties of galaxies and halos. This may be expected not only to improve our knowledge of galaxy formation, but also to enable high-precision cosmological tests using galaxies and hence maximise the utility of future galaxy surveys. As halos are inaccessible to observations (as galaxies are to N-body simulations), this relation requires an additional modelling step. The aim of this thesis is to develop and evaluate models of the galaxy-halo connection using observations of galaxy dynamics. In particular, I build empirical models based on the technique of halo abundance matching for five key dynamical scaling relations of galaxies - the Tully-Fisher, Faber-Jackson, mass-size and mass discrepancy-acceleration relations, and the Fundamental Plane - which relate their baryon distributions and rotation or velocity dispersion profiles. I then develop a statistical scheme based on approximate Bayesian computation to compare the predicted and measured values of a number of summary statistics describing the relations' important features. This not only provides quantitative constraints on the free parameters of the models, but also allows absolute goodness-of-fit measures to be formulated. I find some features to be naturally accounted for by an abundance matching approach and others to impose new constraints on the galaxy-halo connection; the remainder are challenging to account for and may imply galaxy-halo correlations beyond the scope of basic abundance matching. Besides providing concrete statistical tests of specific galaxy formation theories, these results will be of use for guiding the inputs of empirical and semi-analytic galaxy formation models, which require galaxy-halo correlations to be imposed by hand.
As galaxy datasets become larger and more precise in the future, we may expect these methods to continue providing insight into the relation between the visible and dark matter content of the universe and the physical processes that underlie it.

  9. Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.

    PubMed

    Ruhi, S; Karim, M R

    2016-01-01

    A proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object such as a product, plant or infrastructure needs preventive and corrective maintenance. In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected, carry out an analysis and build models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competitive mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function and mean time to failure, are estimated to assess the reliability of the pump. The Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are considered to select the most suitable model among a set of competitive models. The maximum likelihood estimation method via the EM algorithm is applied for estimating the parameters of the models and reliability-related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits the hydraulic pump failure data set well. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period of a hydraulic pump at minimum cost.
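
    The model-selection step described above can be sketched in a few lines of Python. The candidate distributions and simulated lifetimes below are illustrative only (the study itself compared mixture models such as Weibull-Normal-Exponential fitted via EM); here we pick between two single-component models with closed-form maximum likelihood estimates and compare them by AIC:

```python
import math
import random
import statistics

def aic_exponential(data):
    """AIC for an exponential fit; the rate MLE is 1/mean (1 parameter)."""
    lam = 1.0 / statistics.fmean(data)
    loglik = sum(math.log(lam) - lam * x for x in data)
    return 2 * 1 - 2 * loglik

def aic_lognormal(data):
    """AIC for a log-normal fit; MLEs are mean/sd of the log-data (2 parameters)."""
    logs = [math.log(x) for x in data]
    mu, sigma = statistics.fmean(logs), statistics.pstdev(logs)
    loglik = sum(-math.log(x * sigma * math.sqrt(2 * math.pi))
                 - (math.log(x) - mu) ** 2 / (2 * sigma ** 2) for x in data)
    return 2 * 2 - 2 * loglik

random.seed(1)
lifetimes = [random.expovariate(0.01) for _ in range(200)]  # simulated hours to failure
scores = {"exponential": aic_exponential(lifetimes),
          "lognormal": aic_lognormal(lifetimes)}
best = min(scores, key=scores.get)  # lower AIC wins
```

    The same pattern extends to mixture models: compute each candidate's maximized log-likelihood (e.g. via EM), penalize by parameter count, and keep the minimum-AIC model.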

  10. Empirical Correction to the Likelihood Ratio Statistic for Structural Equation Modeling with Many Variables.

    PubMed

    Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu

    2015-06-01

    Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of an SEM model is T_ML, a slight modification of the likelihood ratio statistic. Under the normality assumption, T_ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. In practice, however, p can be rather large while N is limited by the number of available participants. Even with a relatively large N, empirical results show that T_ML rejects the correct model too often when p is not small. Various corrections to T_ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correcting T_ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that the empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T_ML, and they control type I errors reasonably well whenever N ≥ max(50, 2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T_ML as reported in the literature, and they perform well.
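
    The correction principle, rescaling the statistic so that its empirical mean matches the nominal degrees of freedom, can be illustrated directly. The inflation factor and replication counts below are invented for the demonstration:

```python
import random
import statistics

def empirical_correction(t_values, df):
    """Rescale simulated statistics so their mean equals the nominal df,
    in the spirit of a Bartlett correction."""
    factor = df / statistics.fmean(t_values)
    return [t * factor for t in t_values]

random.seed(0)
df = 20
# hypothetical T_ML replications that over-reject: ~30% too large on average
t_ml = [1.3 * random.gammavariate(df / 2, 2) for _ in range(5000)]
t_corrected = empirical_correction(t_ml, df)
mean_corrected = statistics.fmean(t_corrected)  # equals df by construction
```

    In the paper the scaling factor is of course estimated from model and data characteristics rather than from the simulated statistics themselves; this sketch only shows the mean-matching idea.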

  11. Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring

    Treesearch

    Carlos Carroll; Devin S. Johnson; Jeffrey R. Dunk; William J. Zielinski

    2010-01-01

    Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data’s spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and...

  12. Modeling Zero-Inflated and Overdispersed Count Data: An Empirical Study of School Suspensions

    ERIC Educational Resources Information Center

    Desjardins, Christopher David

    2016-01-01

    The purpose of this article is to develop a statistical model that best explains variability in the number of school days suspended. Number of school days suspended is a count variable that may be zero-inflated and overdispersed relative to a Poisson model. Four models were examined: Poisson, negative binomial, Poisson hurdle, and negative…
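
    For reference, the zero-inflated Poisson mentioned above mixes a point mass at zero with an ordinary Poisson count; its probability mass function is small enough to write out directly (parameter names are ours):

```python
import math

def zip_pmf(k, lam, pi):
    """P(K = k) for a zero-inflated Poisson: a structural zero with
    probability pi, otherwise a Poisson(lam) draw."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1 - pi) * poisson

# zero inflation raises P(0) above the plain Poisson value exp(-2) ≈ 0.135
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
```

    A hurdle model differs in drawing the positive counts from a truncated Poisson; the competing fits are then typically compared with information criteria such as AIC.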

  13. Statistical dependency in visual scanning

    NASA Technical Reports Server (NTRS)

    Ellis, Stephen R.; Stark, Lawrence

    1986-01-01

    A method to identify statistical dependencies in the positions of eye fixations is developed and applied to eye movement data from subjects who viewed dynamic displays of air traffic and judged future relative position of aircraft. Analysis of approximately 23,000 fixations on points of interest on the display identified statistical dependencies in scanning that were independent of the physical placement of the points of interest. Identification of these dependencies is inconsistent with random-sampling-based theories used to model visual search and information seeking.
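
    One generic way to detect such sequential dependencies (a sketch, not the authors' exact procedure) is to tabulate transitions between fixated points of interest and test the counts against what independence of successive fixations would predict:

```python
from collections import Counter

def chi2_independence(seq):
    """Pearson chi-square of observed fixation transitions against the
    counts expected if successive fixations were independent."""
    pairs = Counter(zip(seq, seq[1:]))
    row, col = Counter(seq[:-1]), Counter(seq[1:])
    n = len(seq) - 1
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = pairs.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# a scan path that strictly alternates between two points of interest:
# maximal sequential dependency; chi-square equals the number of transitions
chi2 = chi2_independence(list("ABABABABABABABAB"))
```

    A large chi-square relative to its degrees of freedom indicates that where the eye goes next depends on where it is now, contradicting random-sampling models of search.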

  14. The influence of temperature and relative humidity on the development of Lepidoglyphus destructor (Acari: Glycyphagidae) and its production of allergens: a laboratory experiment.

    PubMed

    Danielsen, Charlotte; Hansen, Lise Stengård; Nachman, Gösta; Herling, Christian

    2004-01-01

    Laboratory experiments with Lepidoglyphus destructor on a diet of mainly whole wheat were conducted to study the mite's development and production of a specific allergen, Lep d 2, at four different temperatures (5, 10, 15 and 20 degrees C) and three levels of relative humidity (ca. 70-88%). Statistical models were used to analyse the role played by temperature, relative humidity and time in explaining the observed number of L. destructor and the amount of allergen produced. Moreover, the life stage distributions of the mites were determined and related to the population growth. Based on a statistical model, the intrinsic rate of natural increase, rm, was computed for a range of different temperatures and relative humidities. High relative humidity in combination with temperatures of about 25 degrees C leads to the highest rm (ca. 0.15 per day). The highest concentration of Lep d 2 was 3 micrograms per gram of grain, found at 20 degrees C and high relative humidity at a mite density of 254 mites per gram of grain. The concentration of allergens in the grain was best explained by a model that incorporated both the current and the cumulative numbers of mites.

  15. Differences and discriminatory power of water polo game-related statistics in men in international championships and their relationship with the phase of the competition.

    PubMed

    Escalante, Yolanda; Saavedra, Jose M; Tella, Victor; Mansilla, Mirella; García-Hermoso, Antonio; Domínguez, Ana M

    2013-04-01

    The aims of this study were (a) to compare water polo game-related statistics by context (winning and losing teams) and phase (preliminary, classification, and semifinal/bronze medal/gold medal), and (b) identify characteristics that discriminate performances for each phase. The game-related statistics of the 230 men's matches played in World Championships (2007, 2009, and 2011) and European Championships (2008 and 2010) were analyzed. Differences between contexts (winning or losing teams) in each phase (preliminary, classification, and semifinal/bronze medal/gold medal) were determined using the chi-squared statistic, also calculating the effect sizes of the differences. A discriminant analysis was then performed after the sample-splitting method according to context (winning and losing teams) in each of the 3 phases. It was found that the game-related statistics differentiate the winning from the losing teams in each phase of an international championship. The differentiating variables are both offensive and defensive, including action shots, sprints, goalkeeper-blocked shots, and goalkeeper-blocked action shots. However, the number of discriminatory variables decreases as the phase becomes more demanding and the teams become more equally matched. The discriminant analysis showed the game-related statistics to discriminate performance in all phases (preliminary, classificatory, and semifinal/bronze medal/gold medal phase) with high percentages (91, 90, and 73%, respectively). Again, the model selected both defensive and offensive variables.

  16. Statistical mapping of count survey data

    USGS Publications Warehouse

    Royle, J. Andrew; Link, W.A.; Sauer, J.R.; Scott, J. Michael; Heglund, Patricia J.; Morrison, Michael L.; Haufler, Jonathan B.; Wall, William A.

    2002-01-01

    We apply a Poisson mixed model to the problem of mapping (or predicting) bird relative abundance from counts collected from the North American Breeding Bird Survey (BBS). The model expresses the logarithm of the Poisson mean as a sum of a fixed term (which may depend on habitat variables) and a random effect which accounts for remaining unexplained variation. The random effect is assumed to be spatially correlated, thus providing a more general model than the traditional Poisson regression approach. Consequently, the model is capable of improved prediction when data are autocorrelated. Moreover, formulation of the mapping problem in terms of a statistical model facilitates a wide variety of inference problems which are cumbersome or even impossible using standard methods of mapping. For example, assessment of prediction uncertainty, including the formal comparison of predictions at different locations, or through time, using the model-based prediction variance is straightforward under the Poisson model (not so with many nominally model-free methods). Also, ecologists may generally be interested in quantifying the response of a species to particular habitat covariates or other landscape attributes. Proper accounting for the uncertainty in these estimated effects is crucially dependent on specification of a meaningful statistical model. Finally, the model may be used to aid in sampling design, by modifying the existing sampling plan in a manner which minimizes some variance-based criterion. Model fitting under this model is carried out using a simulation technique known as Markov Chain Monte Carlo. Application of the model is illustrated using Mourning Dove (Zenaida macroura) counts from Pennsylvania BBS routes. We produce both a model-based map depicting relative abundance, and the corresponding map of prediction uncertainty. We briefly address the issue of spatial sampling design under this model. 
Finally, we close with some discussion of mapping in relation to habitat structure. Although our models were fit in the absence of habitat information, the resulting predictions show a strong inverse relation with a map of forest cover in the state, as expected. Consequently, the results suggest that the correlated random effect in the model is broadly representing ecological variation, and that BBS data may be generally useful for studying bird-habitat relationships, even in the presence of observer errors and other widely recognized deficiencies of the BBS.
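
    The fixed-effect backbone of such a model is Poisson regression with a log link. A self-contained Newton-Raphson fit on simulated data (one habitat covariate, no spatial random effect; the coefficients and data are invented) might look like:

```python
import math
import random

def rpois(lam):
    """Poisson sampler via Knuth's product-of-uniforms method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

def poisson_glm(x, y, iters=25):
    """Fit log E[y] = b0 + b1*x by Newton-Raphson on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))                # score, intercept
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))  # score, slope
        h00 = sum(mu)                                             # observed information
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

random.seed(42)
cover = [random.uniform(0.0, 1.0) for _ in range(400)]    # e.g. forest-cover fraction
counts = [rpois(math.exp(1.0 - 1.5 * c)) for c in cover]  # true b0 = 1.0, b1 = -1.5
b0_hat, b1_hat = poisson_glm(cover, counts)
```

    The spatially correlated random effect of the paper would be added to the linear predictor and estimated by Markov Chain Monte Carlo rather than by this simple maximization.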

  17. North American extreme temperature events and related large scale meteorological patterns: A review of statistical methods, dynamics, modeling, and trends

    DOE PAGES

    Grotjahn, Richard; Black, Robert; Leung, Ruby; ...

    2015-05-22

    This paper reviews research approaches and open questions regarding data, statistical analyses, dynamics, modeling efforts, and trends in relation to temperature extremes. Our specific focus is upon extreme events of short duration (roughly less than 5 days) that affect parts of North America. These events are associated with large scale meteorological patterns (LSMPs). Methods used to define extreme event statistics and to identify and connect LSMPs to extreme temperatures are presented. Recent advances in statistical techniques can connect LSMPs to extreme temperatures through appropriately defined covariates that supplement more straightforward analyses. A wide array of LSMPs, ranging from synoptic to planetary scale phenomena, have been implicated as contributors to extreme temperature events. Current knowledge about the physical nature of these contributions and the dynamical mechanisms leading to the implicated LSMPs is incomplete. There is a pressing need for (a) systematic study of the physics of LSMP life cycles and (b) comprehensive model assessment of LSMP-extreme temperature event linkages and LSMP behavior. Generally, climate models capture the observed heat waves and cold air outbreaks with some fidelity. However, they overestimate warm wave frequency, underestimate cold air outbreak frequency, and underestimate the collective influence of low-frequency modes on temperature extremes. Climate models have been used to investigate past changes and project future trends in extreme temperatures. Overall, modeling studies have identified important mechanisms such as the effects of large-scale circulation anomalies and land-atmosphere interactions on changes in extreme temperatures. However, few studies have examined changes in LSMPs more specifically to understand the role of LSMPs in past and future extreme temperature changes. Even though LSMPs are resolvable by global and regional climate models, they are not necessarily well simulated, so more research is needed to understand the limitations of climate models and improve model skill in simulating extreme temperatures and their associated LSMPs. The paper concludes with unresolved issues and research questions.

  18. The volume-mortality relation for radical cystectomy in England: retrospective analysis of hospital episode statistics

    PubMed Central

    Bottle, Alex; Darzi, Ara W; Athanasiou, Thanos; Vale, Justin A

    2010-01-01

    Objectives To investigate the relation between volume and mortality after adjustment for case mix for radical cystectomy in the English healthcare setting using improved statistical methodology, taking into account institutional and surgeon volume effects and institutional structural and process of care factors. Design Retrospective analysis of hospital episode statistics using multilevel modelling. Setting English hospitals carrying out radical cystectomy in the seven financial years 2000/1 to 2006/7. Participants Patients with a primary diagnosis of cancer undergoing an inpatient elective cystectomy. Main outcome measure Mortality within 30 days of cystectomy. Results Compared with low volume institutions, medium volume ones had significantly higher odds of in-hospital and total mortality: odds ratios 1.72 (95% confidence interval 1.00 to 2.98, P=0.05) and 1.82 (1.08 to 3.06, P=0.02). This was only seen in the final model, which included adjustment for structural and process of care factors. The surgeon volume-mortality relation showed weak evidence of reduced odds of in-hospital mortality (by 35%) for high volume surgeons, although this did not reach statistical significance at the 5% level. Conclusions The relation between case volume and mortality after radical cystectomy for bladder cancer became evident only after adjustment for structural and process of care factors, including staffing levels of nurses and junior doctors, in addition to case mix. At least for this relatively uncommon procedure, adjusting for these confounders when examining the volume-outcome relation is critical before considering centralisation of care in a few specialist institutions. Outcomes other than mortality, such as functional morbidity and disease recurrence, may ultimately influence decisions about centralising care. PMID:20305302

  19. The potential of composite cognitive scores for tracking progression in Huntington's disease.

    PubMed

    Jones, Rebecca; Stout, Julie C; Labuschagne, Izelle; Say, Miranda; Justo, Damian; Coleman, Allison; Dumas, Eve M; Hart, Ellen; Owen, Gail; Durr, Alexandra; Leavitt, Blair R; Roos, Raymund; O'Regan, Alison; Langbehn, Doug; Tabrizi, Sarah J; Frost, Chris

    2014-01-01

    Composite scores derived from joint statistical modelling of individual risk factors are widely used to identify individuals who are at increased risk of developing disease or of faster disease progression. We investigated the ability of composite measures developed using statistical models to differentiate progressive cognitive deterioration in Huntington's disease (HD) from natural decline in healthy controls. Using longitudinal data from TRACK-HD, the optimal combinations of quantitative cognitive measures to differentiate premanifest and early stage HD individuals, respectively, from controls were determined using logistic regression. Composite scores were calculated from the parameters of each statistical model. Linear regression models were used to calculate effect sizes (ES) quantifying the difference in longitudinal change over 24 months between the premanifest and early stage HD groups, respectively, and controls. ES for the composites were compared with ES for individual cognitive outcomes and other measures used in HD research. The 0.632 bootstrap was used to eliminate biases that result from developing and testing models in the same sample. In early HD, the composite score from the HD change prediction model produced an ES for the difference in rate of 24-month change relative to controls of 1.14 (95% CI: 0.90 to 1.39), larger than the ES for any individual cognitive outcome and for the UHDRS Total Motor Score and Total Functional Capacity. In addition, this composite gave a statistically significant difference in rate of change in premanifest HD compared to controls over 24 months (ES: 0.24; 95% CI: 0.04 to 0.44), even though none of the individual cognitive outcomes produced statistically significant ES over this period. Composite scores developed using appropriate statistical modelling techniques have the potential to materially reduce required sample sizes for randomised controlled trials.

  20. Relations Between Environmental and Water-Quality Variables and Escherichia coli in the Cuyahoga River With Emphasis on Turbidity as a Predictor of Recreational Water Quality, Cuyahoga Valley National Park, Ohio, 2008

    USGS Publications Warehouse

    Brady, Amie M.G.; Plona, Meg B.

    2009-01-01

    During the recreational season of 2008 (May through August), a regression model relating turbidity to concentrations of Escherichia coli (E. coli) was used to predict recreational water quality in the Cuyahoga River at the historical community of Jaite, within the present city of Brecksville, Ohio, a site centrally located within Cuyahoga Valley National Park. Samples were collected three days per week at Jaite and at three other sites on the river. Concentrations of E. coli were determined and compared to environmental and water-quality measures and to concentrations predicted with a regression model. Linear relations between E. coli concentrations and turbidity, gage height, and rainfall were statistically significant for Jaite. Relations between E. coli concentrations and turbidity were statistically significant for the three additional sites, and relations between E. coli concentrations and gage height were significant at the two sites where gage-height data were available. The turbidity model correctly predicted concentrations of E. coli above or below Ohio's single-sample standard for primary-contact recreation for 77 percent of samples collected at Jaite.
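
    A single-predictor model of this kind is ordinary least squares on log10-transformed values. The data pairs and the 298 CFU/100 mL threshold below are illustrative stand-ins, not the study's actual measurements or Ohio's exact standard:

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# hypothetical paired samples: log10 turbidity (NTU) vs log10 E. coli (CFU/100 mL)
log_turb = [0.5, 0.8, 1.0, 1.3, 1.6, 1.9, 2.1]
log_ecoli = [1.6, 1.9, 2.2, 2.5, 2.9, 3.2, 3.4]
a, b = fit_line(log_turb, log_ecoli)

THRESHOLD = math.log10(298)  # hypothetical single-sample recreational standard

def predicted_exceedance(turbidity_ntu):
    """Does the regression predict E. coli above the recreational standard?"""
    return a + b * math.log10(turbidity_ntu) > THRESHOLD
```

    Working on the log10 scale is conventional for bacterial counts, since concentrations span orders of magnitude and the residuals are closer to normal.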

  1. Statistical properties of exciton fine structure splitting and polarization angles in quantum dot ensembles

    NASA Astrophysics Data System (ADS)

    Gong, Ming; Hofer, B.; Zallo, E.; Trotta, R.; Luo, Jun-Wei; Schmidt, O. G.; Zhang, Chuanwei

    2014-05-01

    We develop an effective model to describe the statistical properties of the exciton fine structure splitting (FSS) and polarization angle in quantum dot ensembles (QDEs) using only a few symmetry-related parameters. The connection between the effective model and random matrix theory is established. This effective model is verified both theoretically and experimentally using several rather different types of QDEs, each of which contains hundreds to thousands of QDs. The model naturally addresses three fundamental issues regarding the FSS and polarization angles of QDEs, which are frequently encountered in both theory and experiment. The answers to these fundamental questions yield an approach to characterizing the optical properties of QDEs. Potential applications of the effective model are also discussed.

  2. Illness-death model: statistical perspective and differential equations.

    PubMed

    Brinks, Ralph; Hoyer, Annika

    2018-01-27

    The aim of this work is to relate the theory of stochastic processes with the differential equations associated with multistate (compartment) models. We show that the Kolmogorov Forward Differential Equations can be used to derive a relation between the prevalence and the transition rates in the illness-death model. Then, we prove mathematical well-definedness and epidemiological meaningfulness of the prevalence of the disease. As an application, we derive the incidence of diabetes from a series of cross-sections.
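
    In a simplified time-homogeneous form, the relation derived from the Kolmogorov forward equations reduces to the scalar ODE p' = (1 - p)(i - p(m1 - m0)), where p is prevalence, i the incidence rate, and m0, m1 the mortality rates of the healthy and diseased states. A forward-Euler sketch (our discretization, illustrative rates):

```python
import math

def prevalence(incidence, m0, m1, years, dt=0.01):
    """Integrate p' = (1 - p) * (incidence - p * (m1 - m0)) by forward Euler,
    starting from a disease-free population."""
    p = 0.0
    for _ in range(int(years / dt)):
        p += dt * (1 - p) * (incidence - p * (m1 - m0))
    return p

# with no excess mortality (m1 == m0) the ODE solves to p(t) = 1 - exp(-i*t)
p_num = prevalence(0.01, m0=0.02, m1=0.02, years=10)
p_exact = 1 - math.exp(-0.01 * 10)
```

    Excess mortality in the diseased state (m1 > m0) removes cases faster than the healthy pool, lowering prevalence; inverting this relation is what lets incidence be derived from cross-sectional prevalence data.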

  3. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis.

    PubMed

    Gonzalez, Oscar; MacKinnon, David P

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to an outcome. However, current methods do not allow researchers to study the relationships of both general and specific aspects of a construct to an outcome simultaneously. This study proposes a bifactor measurement model for the mediating construct as a way to parse variance and represent the general aspect and specific facets of a construct simultaneously. Monte Carlo simulation results are presented to help determine the properties of mediated effect estimation when the mediator has a bifactor structure and a specific facet of the construct is the true mediator. This study also investigates the conditions under which researchers can detect the mediated effect when the multidimensionality of the mediator is ignored and the mediator is treated as unidimensional. Simulation results indicated that the mediation model with a bifactor mediator measurement model yielded unbiased estimates and adequate power to detect the mediated effect with a sample size greater than 500 and medium a- and b-paths. Results also indicate that parameter bias and detection of the mediated effect in both the data-generating model and the misspecified model vary as a function of the amount of facet variance represented in the mediation model. This study contributes to the largely unexplored area of measurement issues in statistical mediation analysis.

  4. How Do Microphysical Processes Influence Large-Scale Precipitation Variability and Extremes?

    DOE PAGES

    Hagos, Samson; Ruby Leung, L.; Zhao, Chun; ...

    2018-02-10

    Convection permitting simulations using the Model for Prediction Across Scales-Atmosphere (MPAS-A) are used to examine how microphysical processes affect large-scale precipitation variability and extremes. An episode of the Madden-Julian Oscillation is simulated using MPAS-A with a refined region at 4-km grid spacing over the Indian Ocean. It is shown that cloud microphysical processes regulate the precipitable water (PW) statistics. Because of the non-linear relationship between precipitation and PW, PW exceeding a certain critical value (PWcr) contributes disproportionately to precipitation variability. However, the frequency of PW exceeding PWcr decreases rapidly with PW, so changes in microphysical processes that shift the column PW statistics relative to PWcr even slightly have large impacts on precipitation variability. Furthermore, precipitation variance and extreme precipitation frequency are approximately linearly related to the difference between the mean and critical PW values. Thus, observed precipitation statistics could be used to directly constrain model microphysical parameters, as this study demonstrates using radar observations from the DYNAMO field campaign.

  5. A phylogenetic transform enhances analysis of compositional microbiota data

    PubMed Central

    Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A

    2017-01-01

    Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities. DOI: http://dx.doi.org/10.7554/eLife.21887.001 PMID:28198697
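
    At the core of PhILR is the isometric log-ratio (ILR) transform, which maps a D-part composition to D-1 orthonormal log-contrast coordinates. PhILR chooses the contrasts from the phylogeny, but the mechanics can be shown with a fixed binary partition of a 3-part composition (our basis choice, not PhILR's):

```python
import math

def ilr3(x1, x2, x3):
    """ILR coordinates of a 3-part composition for the partition
    ((x1 vs x2) vs x3); the coefficients make the log-contrasts orthonormal."""
    z1 = math.sqrt(1 / 2) * math.log(x1 / x2)
    z2 = math.sqrt(2 / 3) * math.log(math.sqrt(x1 * x2) / x3)
    return z1, z2

# compositional data are scale-invariant: only the ratios between parts matter
z_props = ilr3(0.2, 0.3, 0.5)   # proportions
z_counts = ilr3(20, 30, 50)     # raw counts, same ratios
```

    Because the coordinates depend only on ratios, ordinary Euclidean statistics applied to them avoid the spurious correlations that plague raw relative abundances.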

  6. How Do Microphysical Processes Influence Large-Scale Precipitation Variability and Extremes?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hagos, Samson; Ruby Leung, L.; Zhao, Chun

    Convection permitting simulations using the Model for Prediction Across Scales-Atmosphere (MPAS-A) are used to examine how microphysical processes affect large-scale precipitation variability and extremes. An episode of the Madden-Julian Oscillation is simulated using MPAS-A with a refined region at 4-km grid spacing over the Indian Ocean. It is shown that cloud microphysical processes regulate the precipitable water (PW) statistics. Because of the non-linear relationship between precipitation and PW, PW exceeding a certain critical value (PWcr) contributes disproportionately to precipitation variability. However, the frequency of PW exceeding PWcr decreases rapidly with PW, so changes in microphysical processes that shift the column PW statistics relative to PWcr even slightly have large impacts on precipitation variability. Furthermore, precipitation variance and extreme precipitation frequency are approximately linearly related to the difference between the mean and critical PW values. Thus, observed precipitation statistics could be used to directly constrain model microphysical parameters, as this study demonstrates using radar observations from the DYNAMO field campaign.

  7. Mapping irrigated lands at 250-m scale by merging MODIS data and National Agricultural Statistics

    USGS Publications Warehouse

    Pervez, Md Shahriar; Brown, Jesslyn F.

    2010-01-01

    Accurate geospatial information on the extent of irrigated land improves our understanding of agricultural water use, local land surface processes, conservation or depletion of water resources, and components of the hydrologic budget. We have developed a method in a geospatial modeling framework that assimilates irrigation statistics with remotely sensed parameters describing vegetation growth conditions in areas with agricultural land cover to spatially identify irrigated lands at 250-m cell size across the conterminous United States for 2002. The geospatial model result, known as the Moderate Resolution Imaging Spectroradiometer (MODIS) Irrigated Agriculture Dataset (MIrAD-US), identified irrigated lands with reasonable accuracy in California and semiarid Great Plains states with overall accuracies of 92% and 75% and kappa statistics of 0.75 and 0.51, respectively. A quantitative accuracy assessment of MIrAD-US for the eastern region has not yet been conducted, and qualitative assessment shows that model improvements are needed for the humid eastern regions where the distinction in annual peak NDVI between irrigated and non-irrigated crops is minimal and county sizes are relatively small. This modeling approach enables consistent mapping of irrigated lands based upon USDA irrigation statistics and should lead to better understanding of spatial trends in irrigated lands across the conterminous United States. An improved version of the model with revised datasets is planned and will employ 2007 USDA irrigation statistics.
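    The reported accuracy figures combine overall agreement with the kappa statistic, which discounts chance agreement implied by the confusion-matrix marginals. A small illustration (the matrix below is made up, not the MIrAD-US validation data):

```python
def accuracy_and_kappa(confusion):
    # Rows = reference class, columns = mapped class.
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    po = sum(confusion[i][i] for i in range(k)) / n  # overall accuracy
    # Expected chance agreement from the row/column marginals.
    pe = sum(sum(confusion[i]) * sum(row[i] for row in confusion)
             for i in range(k)) / n ** 2
    return po, (po - pe) / (1 - pe)

# Hypothetical 2-class map vs. reference: 90% overall accuracy, kappa 0.8.
po, kappa = accuracy_and_kappa([[45, 5], [5, 45]])
```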

  8. Correcting evaluation bias of relational classifiers with network cross validation

    DOE PAGES

    Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...

    2011-01-04

    Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that will result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1 − Type II error).
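    The test under scrutiny is the ordinary paired t statistic on per-sample performance differences. A sketch of that statistic (the formula is standard; the paper's point is that overlapping, correlated network samples violate its independence assumption, so the denominator underestimates the true variance and the test rejects too often):

```python
import math

def paired_t(scores_a, scores_b):
    # Paired t statistic on per-fold accuracy differences between
    # two classifiers evaluated on the same samples.
    d = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)
```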

  9. Statistical Systems with Z

    NASA Astrophysics Data System (ADS)

    William, Peter

    In this dissertation several two-dimensional statistical systems exhibiting discrete Z(n) symmetries are studied. For this purpose a newly developed algorithm to compute the partition function of these models exactly is utilized. The zeros of the partition function are examined in order to obtain information about the observable quantities at the critical point. This occurs in the form of critical exponents of the order parameters which characterize phenomena at the critical point. The correlation length exponent is found to agree very well with those computed from strong coupling expansions for the mass gap and with Monte Carlo results. In Feynman's path integral formalism the partition function of a statistical system can be related to the vacuum expectation value of the time-ordered product of the observable quantities of the corresponding field theoretic model. Hence the focus is on a generalization of ordinary scale invariance: conformal invariance. This principle applies naturally to two-dimensional statistical models undergoing second-order phase transitions at criticality. The conformal anomaly specifies the universality class to which these models belong. From an evaluation of the partition function, the free energy at criticality is computed to determine the conformal anomaly of these models. The conformal anomalies of all the models considered here are in good agreement with the predicted values.

  10. Lung Cancer Risk Prediction Model Incorporating Lung Function: Development and Validation in the UK Biobank Prospective Cohort Study.

    PubMed

    Muller, David C; Johansson, Mattias; Brennan, Paul

    2017-03-10

    Purpose Several lung cancer risk prediction models have been developed, but none to date have assessed the predictive ability of lung function in a population-based cohort. We sought to develop and internally validate a model incorporating lung function using data from the UK Biobank prospective cohort study. Methods This analysis included 502,321 participants without a previous diagnosis of lung cancer, predominantly between 40 and 70 years of age. We used flexible parametric survival models to estimate the 2-year probability of lung cancer, accounting for the competing risk of death. Models included predictors previously shown to be associated with lung cancer risk, including sex, variables related to smoking history and nicotine addiction, medical history, family history of lung cancer, and lung function (forced expiratory volume in 1 second [FEV1]). Results During accumulated follow-up of 1,469,518 person-years, there were 738 lung cancer diagnoses. A model incorporating all predictors had excellent discrimination (concordance (c)-statistic [95% CI] = 0.85 [0.82 to 0.87]). Internal validation suggested that the model will discriminate well when applied to new data (optimism-corrected c-statistic = 0.84). The full model, including FEV1, also had modestly superior discriminatory power compared with one designed solely on the basis of questionnaire variables (c-statistic = 0.84 [0.82 to 0.86]; optimism-corrected c-statistic = 0.83; p for FEV1 = 3.4 × 10^-13). The full model had better discrimination than standard lung cancer screening eligibility criteria (c-statistic = 0.66 [0.64 to 0.69]). Conclusion A risk prediction model that includes lung function has strong predictive ability, which could improve eligibility criteria for lung cancer screening programs.
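    Discrimination here is summarized by the concordance statistic. For illustration, a pairwise c-statistic for a binary outcome (the study itself uses a survival-data generalization that handles censoring; the risk scores below are invented):

```python
def c_statistic(risk, event):
    # Fraction of (case, non-case) pairs in which the case received
    # the higher predicted risk; ties count one half.
    cases = [r for r, e in zip(risk, event) if e]
    controls = [r for r, e in zip(risk, event) if not e]
    conc = sum(1.0 if c > nc else 0.5 if c == nc else 0.0
               for c in cases for nc in controls)
    return conc / (len(cases) * len(controls))
```

A value of 0.5 is chance-level ranking; 1.0 means every case outranks every non-case.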

  11. A New Method to Compare Statistical Tree Growth Curves: The PL-GMANOVA Model and Its Application with Dendrochronological Data

    PubMed Central

    Ricker, Martin; Peña Ramírez, Víctor M.; von Rosen, Dietrich

    2014-01-01

    Growth curves are monotonically increasing functions obtained by measuring the same subjects repeatedly over time. The classical growth curve model in the statistical literature is the Generalized Multivariate Analysis of Variance (GMANOVA) model. In order to model the tree trunk radius (r) over time (t) of trees on different sites, GMANOVA is combined here with the adapted PL regression model Q = A·T + E, where A is the initial relative growth to be estimated and E is an error term for each tree and time point; the transformation involves the turning point radius (TPR) of a sigmoid curve and an estimated calibrating time-radius point. Advantages of the approach are that growth rates can be compared among growth curves with different turning point radii and different starting points, hidden outliers are easily detectable, the method is statistically robust, and heteroscedasticity of the residuals among time points is allowed. The model was implemented with dendrochronological data of 235 Pinus montezumae trees on ten Mexican volcano sites to calculate comparison intervals for the estimated initial relative growth A. One site (at the Popocatépetl volcano) stood out, with A being 3.9 times the value of the site with the slowest-growing trees. Calculating variance components for the initial relative growth, 34% of the growth variation was found among sites, 31% among trees, and 35% over time. Without the Popocatépetl site, the numbers changed to 7%, 42%, and 51%. Further explanation of differences in growth would need to focus on factors that vary within sites and over time. PMID:25402427

  12. Statistical-mechanics theory of active mode locking with noise.

    PubMed

    Gordon, Ariel; Fischer, Baruch

    2004-05-01

    Actively mode-locked lasers with noise are studied employing statistical mechanics. A mapping of the system to the spherical model (related to the Ising model) of ferromagnets in one dimension that has an exact solution is established. It gives basic features, such as analytical expressions for the correlation function between modes, and the widths and shapes of the pulses [different from the Kuizenga-Siegman expression; IEEE J. Quantum Electron. QE-6, 803 (1970)] and reveals the susceptibility to noise of mode ordering compared with passive mode locking.

  13. Evolution of cosmic string networks

    NASA Technical Reports Server (NTRS)

    Albrecht, Andreas; Turok, Neil

    1989-01-01

    A discussion of the evolution and observable consequences of a network of cosmic strings is given. A simple model for the evolution of the string network is presented, and related to the statistical mechanics of string networks. The model predicts the long string density throughout the history of the universe from a single parameter, which researchers calculate in radiation era simulations. The statistical mechanics arguments indicate a particular thermal form for the spectrum of loops chopped off the network. Detailed numerical simulations of string networks in expanding backgrounds are performed to test the model. Consequences for large scale structure, the microwave and gravity wave backgrounds, nucleosynthesis and gravitational lensing are calculated.

  14. Invariance in the recurrence of large returns and the validation of models of price dynamics

    NASA Astrophysics Data System (ADS)

    Chang, Lo-Bin; Geman, Stuart; Hsieh, Fushing; Hwang, Chii-Ruey

    2013-08-01

    Starting from a robust, nonparametric definition of large returns (“excursions”), we study the statistics of their occurrences, focusing on the recurrence process. The empirical waiting-time distribution between excursions is remarkably invariant to year, stock, and scale (return interval). This invariance is related to self-similarity of the marginal distributions of returns, but the excursion waiting-time distribution is a function of the entire return process and not just its univariate probabilities. Generalized autoregressive conditional heteroskedasticity (GARCH) models, market-time transformations based on volume or trades, and generalized (Lévy) random-walk models all fail to fit the statistical structure of excursions.
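    The recurrence process in question can be sketched directly: flag returns whose magnitude exceeds an empirical quantile and record the gaps between flags (a simplified, symmetric-threshold reading of the paper's nonparametric excursion definition):

```python
def excursion_waiting_times(returns, quantile=0.9):
    # Threshold at the given empirical quantile of absolute returns.
    threshold = sorted(abs(r) for r in returns)[int(quantile * len(returns))]
    hits = [i for i, r in enumerate(returns) if abs(r) >= threshold]
    # Waiting times (in observation steps) between successive excursions.
    return [b - a for a, b in zip(hits, hits[1:])]
```

Because the threshold is a quantile of the series itself, the construction is invariant to rescaling the returns, which is what makes the waiting-time distribution comparable across stocks, years, and return intervals.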

  15. Design of a testing strategy using non-animal based test methods: lessons learnt from the ACuteTox project.

    PubMed

    Kopp-Schneider, Annette; Prieto, Pilar; Kinsner-Ovaskainen, Agnieszka; Stanzel, Sven

    2013-06-01

    In the framework of toxicology, a testing strategy can be viewed as a series of steps which are taken to come to a final prediction about a characteristic of a compound under study. The testing strategy is performed either as a single-step procedure, usually called a test battery, using all information collected on different endpoints simultaneously, or as a tiered approach in which a decision tree is followed. Design of a testing strategy involves statistical considerations, such as the development of a statistical prediction model. During the EU FP6 ACuteTox project, several prediction models were proposed on the basis of statistical classification algorithms, which we illustrate here. The final choice of testing strategies was not based on statistical considerations alone. However, without thorough statistical evaluations a testing strategy cannot be identified. We present here a number of observations made from the statistical viewpoint which relate to the development of testing strategies. The points we make were derived from problems we had to deal with during the evaluation of this large research project. A central issue during the development of a prediction model is the danger of overfitting. Procedures are presented to deal with this challenge. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Modeling normal shock velocity curvature relations for heterogeneous explosives

    NASA Astrophysics Data System (ADS)

    Yoo, Sunhee; Crochet, Michael; Pemberton, Steven

    2017-01-01

    The theory of Detonation Shock Dynamics (DSD) is, in part, an asymptotic method to model a functional form of the relation between the shock normal, its time rate and shock curvature κ. In addition, the shock polar analysis provides a relation between shock angle θ and the detonation velocity Dn that is dependent on the equations of state (EOS) of two adjacent materials. For the axial detonation of an explosive material confined by a cylinder, the shock angle is defined as the angle between the shock normal and the normal to the cylinder liner, located at the intersection of the shock front and cylinder inner wall. Therefore, given an ideal explosive such as PBX-9501 with the two functional models determined, a unique, smooth detonation front shape ψ can be computed that approximates the steady-state detonation shock front of the explosive. However, experimental measurements of the Dn(κ) relation for heterogeneous explosives such as PBXN-111 [D. K. Kennedy, 2000] are challenging due to the non-smoothness and asymmetry usually observed in the experimental streak records of explosion fronts. Among many possibilities, the asymmetry may be attributed to the heterogeneity of the explosives; here, material heterogeneity refers to compositions with multiple components and a grain morphology that can be modeled statistically. Therefore, in extending the formulation of DSD to modern novel explosives, we pose two questions: (1) is there any simple hydrodynamic model that can simulate such an asymmetric shock evolution, and (2) what statistics can be derived for the asymmetry using simulations with defined structural heterogeneity in the unreacted explosive? Saenz, Taylor and Stewart [1] studied constitutive models for derivation of the Dn(κ) relation for porous homogeneous explosives and carried out simulations in a spherical coordinate frame.
In this paper we extend their model to account for heterogeneity and present shock evolutions in heterogeneous explosives using 2-D hydrodynamic simulations with some statistical examination. As an initial work, we assume that the heterogeneity comes from the local density variation or porosity only.

  17. Modified retrieval algorithm for three types of precipitation distribution using x-band synthetic aperture radar

    NASA Astrophysics Data System (ADS)

    Xie, Yanan; Zhou, Mingliang; Pan, Dengke

    2017-10-01

    The forward-scattering model is introduced to describe the response of the normalized radar cross section (NRCS) of precipitation with synthetic aperture radar (SAR). Since the distribution of near-surface rainfall is related to the near-surface rainfall rate and a horizontal distribution factor, a retrieval algorithm called modified regression empirical and model-oriented statistical (M-M), based on Volterra integration theory, is proposed. Compared with the model-oriented statistical and Volterra integration (MOSVI) algorithm, the biggest difference is that the M-M algorithm uses the modified regression empirical algorithm rather than a linear regression formula to retrieve the near-surface rainfall rate. The number of empirical parameters in the weighted integration is halved, and a smaller average relative error is achieved when the rainfall rate is below 100 mm/h. Therefore, the algorithm proposed in this paper can obtain high-precision rainfall information.

  18. A systematic review of Bayesian articles in psychology: The last 25 years.

    PubMed

    van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah

    2017-06-01

    Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  19. Survival Regression Modeling Strategies in CVD Prediction.

    PubMed

    Barkhordari, Mahnaz; Padyab, Mojgan; Sardarinia, Mahsa; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza

    2016-04-01

    A fundamental part of prevention is prediction. Potential predictors are the sine qua non of prediction models. However, whether incorporating novel predictors into prediction models can be directly translated into added predictive value remains an area of dispute. The difference between the predictive power of a predictive model with (enhanced model) and without (baseline model) a certain predictor is generally regarded as an indicator of the predictive value added by that predictor. Indices such as discrimination and calibration have long been used in this regard. Recently, the use of added predictive value has been suggested when comparing the predictive performances of models with and without novel biomarkers. User-friendly statistical software capable of implementing novel statistical procedures is conspicuously lacking. This shortcoming has restricted implementation of such novel model assessment methods. We aimed to construct Stata commands to help researchers obtain the following: (1) the Nam-D'Agostino χ² goodness-of-fit test; and (2) cut point-free and cut point-based net reclassification improvement index (NRI), relative and absolute integrated discriminatory improvement index (IDI), and survival-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine if information relating to a family history of premature cardiovascular disease (CVD), waist circumference, and fasting plasma glucose can improve the predictive performance of Framingham's general CVD risk algorithm. The command is adpredsurv for survival models. Herein we have described the Stata package "adpredsurv" for calculation of the Nam-D'Agostino χ² goodness-of-fit test as well as cut point-free and cut point-based NRI, relative and absolute IDI, and survival-based regression analyses.
We hope this work encourages the use of novel methods in examining predictive capacity of the emerging plethora of novel biomarkers.

  20. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  1. MODELING FISH AND SHELLFISH DISTRIBUTIONS IN THE MOBILE BAY ESTUARY, USA

    EPA Science Inventory

    Estuaries in the Gulf of Mexico provide rich habitat for many fish and shellfish, including those that have been identified as economically and ecologically important. For the Mobile Bay estuary, we developed statistical models to relate distributions of individual species and sp...

  2. Data-driven modeling of background and mine-related acidity and metals in river basins

    USGS Publications Warehouse

    Friedel, Michael J

    2013-01-01

    A novel application of self-organizing map (SOM) and multivariate statistical techniques is used to model the nonlinear interaction among basin mineral resources, mining activity, and surface-water quality. First, the SOM is trained using sparse measurements from 228 sample sites in the Animas River Basin, Colorado. The model performance is validated by comparing stochastic predictions of basin-alteration assemblages and mining activity at 104 independent sites. The SOM correctly predicts (>98%) the predominant type of basin hydrothermal alteration and the presence (or absence) of mining activity. Second, application of the Davies–Bouldin criterion to k-means clustering of SOM neurons identified ten unique environmental groups. Median statistics of these groups define a nonlinear water-quality response along the spatiotemporal hydrothermal alteration-mining gradient. These results reveal that it is possible to differentiate along the continuum between inputs of background and mine-related acidity and metals, and they provide a basis for future research and empirical model development.
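    The cluster-count selection relies on the Davies–Bouldin index, which penalizes clusters that are diffuse relative to their separation. A minimal one-dimensional version (the study applies the criterion to multivariate SOM neurons; the points below are invented):

```python
def davies_bouldin(clusters):
    # Davies-Bouldin index for clusters of 1-D points; lower is better.
    cents = [sum(c) / len(c) for c in clusters]
    # Mean absolute deviation as the within-cluster scatter.
    scatter = [sum(abs(x - m) for x in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    # For each cluster, its worst scatter-to-separation ratio.
    worst = [max((scatter[i] + scatter[j]) / abs(cents[i] - cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k
```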

  3. A Virtual Study of Grid Resolution on Experiments of a Highly-Resolved Turbulent Plume

    NASA Astrophysics Data System (ADS)

    Maisto, Pietro M. F.; Marshall, Andre W.; Gollner, Michael J.; Fire Protection Engineering Department Collaboration

    2017-11-01

    An accurate representation of sub-grid scale turbulent mixing is critical for modeling fire plumes and smoke transport. In this study, PLIF and PIV diagnostics are used with the saltwater modeling technique to provide highly-resolved instantaneous field measurements in unconfined turbulent plumes useful for statistical analysis, physical insight, and model validation. The effect of resolution was investigated employing a virtual interrogation window (of varying size) applied to the high-resolution field measurements. Motivated by LES low-pass filtering concepts, the high-resolution experimental data in this study can be analyzed within the interrogation windows (i.e. statistics at the sub-grid scale) and on interrogation windows (i.e. statistics at the resolved scale). A dimensionless resolution threshold (L/D*) criterion was determined to achieve converged statistics on the filtered measurements. Such a criterion was then used to establish the relative importance between large and small-scale turbulence phenomena while investigating specific scales for the turbulent flow. First order data sets start to collapse at a resolution of 0.3D*, while for second and higher order statistical moments the interrogation window size drops down to 0.2D*.

  4. On the Benefits of Latent Variable Modeling for Norming Scales: The Case of the "Supports Intensity Scale-Children's Version"

    ERIC Educational Resources Information Center

    Seo, Hyojeong; Little, Todd D.; Shogren, Karrie A.; Lang, Kyle M.

    2016-01-01

    Structural equation modeling (SEM) is a powerful and flexible analytic tool to model latent constructs and their relations with observed variables and other constructs. SEM applications offer advantages over classical models in dealing with statistical assumptions and in adjusting for measurement error. So far, however, SEM has not been fully used…

  5. On the Spike Train Variability Characterized by Variance-to-Mean Power Relationship.

    PubMed

    Koyama, Shinsuke

    2015-07-01

    We propose a statistical method for modeling the non-Poisson variability of spike trains observed in a wide range of brain regions. Central to our approach is the assumption that the variance and the mean of interspike intervals are related by a power function characterized by two parameters: the scale factor and exponent. It is shown that this single assumption allows the variability of spike trains to have an arbitrary scale and various dependencies on the firing rate in the spike count statistics, as well as in the interval statistics, depending on the two parameters of the power function. We also propose a statistical model for spike trains that exhibits the variance-to-mean power relationship. Based on this, a maximum likelihood method is developed for inferring the parameters from rate-modulated spike trains. The proposed method is illustrated on simulated and experimental spike trains.
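    The two parameters of the power function can be recovered by ordinary least squares on log-log axes (a sketch of the relationship itself, not the paper's maximum likelihood estimator):

```python
import math

def fit_power_law(means, variances):
    # Fit var = scale * mean**exponent by least squares in log space.
    lx = [math.log(m) for m in means]
    ly = [math.log(v) for v in variances]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    exponent = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
                / sum((x - mx) ** 2 for x in lx))
    scale = math.exp(my - exponent * mx)
    return scale, exponent
```

Given per-condition means and variances of interspike intervals, the fitted exponent summarizes how variability scales with firing rate, which is the quantity the two-parameter model feeds into the likelihood.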

  6. A Bayesian Approach to Evaluating Consistency between Climate Model Output and Observations

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Cressie, N.; Teixeira, J.

    2010-12-01

    Like other scientific and engineering problems that involve physical modeling of complex systems, climate models can be evaluated and diagnosed by comparing their output to observations of similar quantities. Though the global remote sensing data record is relatively short by climate research standards, these data offer opportunities to evaluate model predictions in new ways. For example, remote sensing data are spatially and temporally dense enough to provide distributional information that goes beyond simple moments to allow quantification of temporal and spatial dependence structures. In this talk, we propose a new method for exploiting these rich data sets using a Bayesian paradigm. For a collection of climate models, we calculate the posterior probability that each member best represents the physical system it seeks to reproduce. The posterior probability is based on the likelihood that a chosen summary statistic, computed from observations, would be obtained when the model's output is considered as a realization from a stochastic process. By exploring how posterior probabilities change with different statistics, we may paint a more quantitative and complete picture of the strengths and weaknesses of the models relative to the observations. We demonstrate our method using model output from the CMIP archive, and observations from NASA's Atmospheric Infrared Sounder.
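    The weighting step is ordinary Bayesian model comparison: each model's posterior probability is its prior times the likelihood of the observed summary statistic under that model's output process, renormalized over the collection. A schematic (the likelihood values below are placeholders):

```python
def model_posteriors(likelihoods, priors=None):
    # Posterior probability of each model given the likelihood of the
    # observed summary statistic under that model; uniform prior by default.
    if priors is None:
        priors = [1.0 / len(likelihoods)] * len(likelihoods)
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]
```

Repeating the computation with different summary statistics shows which aspects of the observed distribution each model captures well or poorly.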

  7. Joint inversion of marine seismic AVA and CSEM data using statistical rock-physics models and Markov random fields: Stochastic inversion of AVA and CSEM data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, J.; Hoversten, G.M.

    2011-09-15

    Joint inversion of seismic AVA and CSEM data requires rock-physics relationships to link seismic attributes to electrical properties. Ideally, we can connect them through reservoir parameters (e.g., porosity and water saturation) by developing physical-based models, such as Gassmann’s equations and Archie’s law, using nearby borehole logs. This could be difficult in the exploration stage because information available is typically insufficient for choosing suitable rock-physics models and for subsequently obtaining reliable estimates of the associated parameters. The use of improper rock-physics models and the inaccuracy of the estimates of model parameters may cause misleading inversion results. Conversely, it is easy to derive statistical relationships among seismic and electrical attributes and reservoir parameters from distant borehole logs. In this study, we develop a Bayesian model to jointly invert seismic AVA and CSEM data for reservoir parameter estimation using statistical rock-physics models; the spatial dependence of geophysical and reservoir parameters is captured by lithotypes through Markov random fields. We apply the developed model to a synthetic case, which simulates a CO2 monitoring application. We derive statistical rock-physics relations from borehole logs at one location and estimate seismic P- and S-wave velocity ratio, acoustic impedance, density, electrical resistivity, lithotypes, porosity, and water saturation at three different locations by conditioning to seismic AVA and CSEM data. Comparison of the inversion results with their corresponding true values shows that the correlation-based statistical rock-physics models provide significant information for improving the joint inversion results.

  8. Cure of cancer for seven cancer sites in the Flemish Region.

    PubMed

    Silversmit, Geert; Jegou, David; Vaes, Evelien; Van Hoof, Elke; Goetghebeur, Els; Van Eycken, Liesbet

    2017-03-01

    Cumulative relative survival curves for many cancers reach a plateau several years after diagnosis, indicating that the cancer survivor group has reached "statistical" cure. Parametric mixture cure model analysis of grouped relative survival curves provides an interesting way to determine the proportion of statistically cured cases and the mean survival time of the fatal cases, in particular for population-based cancer registries. Based on the relative survival data from the Belgian Cancer Registry, parametric cure models were applied to seven cancer sites (cervix, colon, corpus uteri, skin melanoma, pancreas, stomach and oesophagus) at the Flemish regional level for the incidence period 1999-2011. Statistical cure was observed for the examined cancer sites except for oesophageal cancer. The estimated cured proportion ranged from 5.9% [5.7, 6.1] for pancreatic cancer to 80.8% [80.5, 81.2] for skin melanoma. Cure results were further stratified by gender or age group. Stratified cured proportions were higher for females than for males in colon cancer, stomach cancer, pancreas cancer and skin melanoma, which can mainly be attributed to differences in stage and age distribution between the sexes. This study demonstrates the applicability of cure rate models for the selected cancer sites after 14 years of follow-up and presents the first population-based results on the cure of cancer in Belgium. © 2016 UICC.
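    The plateau interpretation follows from the mixture form of the model: relative survival is a cured fraction plus the survival of the fatal cases, so the curve levels off at the cured proportion. A sketch with an assumed exponential fatal-case survival (the 0.808 below echoes the melanoma estimate above; the rate is invented):

```python
import math

def mixture_cure_survival(t, cured_fraction, fatal_rate):
    # Relative survival: cured cases never die of the cancer, while
    # fatal cases follow an exponential survival with the given rate.
    return cured_fraction + (1 - cured_fraction) * math.exp(-fatal_rate * t)
```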

  9. The effect of health shocks on smoking and obesity.

    PubMed

    Sundmacher, Leonie

    2012-08-01

    We investigated whether negative changes in individuals' own health (i.e., health shocks), or in that of a smoking or obese household member, lead smokers to quit smoking and obese individuals to lose weight. The study is informed by economic models ('rational addiction' and 'demand for health' models) which offer hypotheses on the relationship between health shocks and health-related behaviour. Each hypothesis was tested by applying a discrete-time hazard model with random effects using up to ten waves of the German Socioeconomic Panel (GSOEP) and statistics on cigarette, food and beverage prices provided by the Federal Statistical Office. Health shocks had a significant positive impact on the probability that smokers quit during the same year in which they experienced the health shock. Health shocks of a smoking household member between year t-2 and t-1 also motivated smoking cessation, although statistical evidence for this was weaker. Health shocks experienced by obese individuals or their household members had, on the other hand, no significant effect on weight loss, as measured by changes in Body Mass Index (BMI). The results of the study suggest that smokers are aware of the risks associated with tobacco consumption, know about effective strategies to quit smoking, and are willing to quit for health-related reasons. In contrast, there was no evidence of changes in health-related behaviour among obese individuals after a health shock.
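    In a discrete-time hazard model of this kind, each wave contributes a conditional quit probability, and the chance of still smoking after wave t is the product of the per-wave survival terms. A bare-bones illustration (the hazard values are invented):

```python
def survival_from_hazards(hazards):
    # hazards[t] = P(quit in wave t | still smoking at start of wave t).
    # Returns P(still smoking) after each successive wave.
    surv, out = 1.0, []
    for h in hazards:
        surv *= 1.0 - h
        out.append(surv)
    return out
```

Covariates such as a health shock enter by raising the hazard in the wave where the shock occurs, which is what the significant same-year effect reported above amounts to.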

  10. Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

    NASA Astrophysics Data System (ADS)

    McCray, Wilmon Wil L., Jr.

    The research was prompted by the need for a study assessing the process improvement, quality management and analytical techniques taught to students in undergraduate and graduate systems engineering and computing science (e.g., software engineering, computer science, and information technology) degree programs at U.S. colleges and universities that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI) models as process improvement frameworks to improve business process performance. High maturity process areas in the CMMI model imply the use of analytical, statistical, and quantitative management techniques, and of process performance modeling, to identify and eliminate sources of variation, continually improve process performance, reduce cost and predict future outcomes. The research study identifies and discusses in detail the gap analysis findings on process improvement and quantitative analysis techniques taught in U.S. university systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis identifying the gaps between the SEI's "healthy ingredients" of a process performance model and the courses taught in U.S. university degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models.
The research also includes a Monte Carlo simulation optimization model and dashboard that demonstrates the use of statistical methods, statistical process control, sensitivity analysis, quantitative and optimization techniques to establish a baseline and predict future customer satisfaction index scores (outcomes). The American Customer Satisfaction Index (ACSI) model and industry benchmarks were used as a framework for the simulation model.
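    A Monte Carlo baseline of this kind can be sketched as follows; the driver weights and triangular ranges are purely hypothetical, not the ACSI's published structure:

```python
import random

def simulate_acsi(n_trials=10000, seed=42):
    """Monte Carlo sketch: treat the index as a weighted sum of three
    uncertain drivers, each drawn from a triangular distribution.
    Weights and (min, mode, max) ranges are illustrative assumptions."""
    rng = random.Random(seed)
    weights = (0.5, 0.2, 0.3)
    drivers = [(70, 80, 90), (60, 75, 85), (65, 78, 88)]
    scores = []
    for _ in range(n_trials):
        s = sum(w * rng.triangular(lo, hi, mode)
                for w, (lo, mode, hi) in zip(weights, drivers))
        scores.append(s)
    scores.sort()
    mean = sum(scores) / n_trials
    return mean, scores[int(0.05 * n_trials)], scores[int(0.95 * n_trials)]

mean, p5, p95 = simulate_acsi()
print(round(mean, 1), round(p5, 1), round(p95, 1))
```

    The 5th/95th percentiles of the simulated scores give a baseline prediction interval; sensitivity analysis then varies one driver's range at a time and observes the shift in the interval.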

  11. A mathematical model for HIV and hepatitis C co-infection and its assessment from a statistical perspective.

    PubMed

    Castro Sanchez, Amparo Yovanna; Aerts, Marc; Shkedy, Ziv; Vickerman, Peter; Faggiano, Fabrizio; Salamina, Guiseppe; Hens, Niel

    2013-03-01

    The hepatitis C virus (HCV) and the human immunodeficiency virus (HIV) are a clear threat for public health, with high prevalences especially in high risk groups such as injecting drug users. People with HIV infection who are also infected by HCV suffer from a more rapid progression to HCV-related liver disease and have an increased risk for cirrhosis and liver cancer. Quantifying the impact of HIV and HCV co-infection is therefore of great importance. We propose a new joint mathematical model accounting for co-infection with the two viruses in the context of injecting drug users (IDUs). Statistical concepts and methods are used to assess the model from a statistical perspective, in order to get further insights in: (i) the comparison and selection of optional model components, (ii) the unknown values of the numerous model parameters, (iii) the parameters to which the model is most 'sensitive' and (iv) the combinations or patterns of values in the high-dimensional parameter space which are most supported by the data. Data from a longitudinal study of heroin users in Italy are used to illustrate the application of the proposed joint model and its statistical assessment. The parameters associated with contact rates (sharing syringes) and the transmission rates per syringe-sharing event are shown to play a major role. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Functional constraints on tooth morphology in carnivorous mammals

    PubMed Central

    2012-01-01

    Background The range of potential morphologies resulting from evolution is limited by complex interacting processes, ranging from development to function. Quantifying these interactions is important for understanding adaptation and convergent evolution. Using three-dimensional reconstructions of carnivoran and dasyuromorph tooth rows, we compared statistical models of the relationship between tooth row shape and the opposing tooth row, a static feature, as well as measures of mandibular motion during chewing (occlusion), which are kinetic features. This is a new approach to quantifying functional integration because we use measures of movement and displacement, such as the amount the mandible translates laterally during occlusion, as opposed to conventional morphological measures, such as mandible length and geometric landmarks. By sampling two distantly related groups of ecologically similar mammals, we study carnivorous mammals in general rather than a specific group of mammals. Results Statistical model comparisons demonstrate that the best performing models always include some measure of mandibular motion, indicating that functional and statistical models of tooth shape as purely a function of the opposing tooth row are too simple and that increased model complexity provides a better understanding of tooth form. The predictors of the best performing models always included the opposing tooth row shape and a relative linear measure of mandibular motion. Conclusions Our results provide quantitative support of long-standing hypotheses of tooth row shape as being influenced by mandibular motion in addition to the opposing tooth row. Additionally, this study illustrates the utility and necessity of including kinetic features in analyses of morphological integration. PMID:22899809

  13. Prediction of drug transport processes using simple parameters and PLS statistics. The use of ACD/logP and ACD/ChemSketch descriptors.

    PubMed

    Osterberg, T; Norinder, U

    2001-01-01

    A method of modelling and predicting biopharmaceutical properties using simple theoretically computed molecular descriptors and multivariate statistics has been investigated for several data sets related to solubility, IAM chromatography, permeability across Caco-2 cell monolayers, human intestinal perfusion, brain-blood partitioning, and P-glycoprotein ATPase activity. The molecular descriptors (e.g. molar refractivity, molar volume, index of refraction, surface tension and density) and logP were computed with ACD/ChemSketch and ACD/logP, respectively. Good statistical models were derived that permit simple computational prediction of biopharmaceutical properties. All final models derived had R(2) values ranging from 0.73 to 0.95 and Q(2) values ranging from 0.69 to 0.86. The RMSEP values for the external test sets ranged from 0.24 to 0.85 (log scale).
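    The R(2)/Q(2) pair reported above can be illustrated with ordinary least squares in place of PLS (one predictor, made-up data); Q(2) is the leave-one-out cross-validated R(2), computed from the PRESS statistic:

```python
def fit_ols(xs, ys):
    """Ordinary least squares for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def r2(xs, ys):
    """Coefficient of determination of the fitted line."""
    a, b = fit_ols(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def q2(xs, ys):
    """Leave-one-out cross-validated R^2 (Q^2): each observation is
    predicted from a model fitted without it (PRESS-based)."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / sum((y - my) ** 2 for y in ys)

# Made-up descriptor/property pairs (e.g. logP vs. a log-scale property)
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0.6, 1.1, 1.3, 2.2, 2.4, 3.1, 3.3, 4.1]
print(round(r2(xs, ys), 2), round(q2(xs, ys), 2))
```

    Q(2) is never larger than R(2); a large gap between the two flags an overfitted model.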

  14. Incorporating GIS building data and census housing statistics for sub-block-level population estimation

    USGS Publications Warehouse

    Wu, S.-S.; Wang, L.; Qiu, X.

    2008-01-01

    This article presents a deterministic model for sub-block-level population estimation based on the total building volumes derived from geographic information system (GIS) building data and three census block-level housing statistics. To assess the model, we generated artificial blocks by aggregating census block areas and calculating the respective housing statistics. We then applied the model to estimate populations for sub-artificial-block areas and assessed the estimates with census populations of the areas. Our analyses indicate that the average percent error of population estimation for sub-artificial-block areas is comparable to those for sub-census-block areas of the same size relative to associated blocks. The smaller the sub-block-level areas, the higher the population estimation errors. For example, the average percent error for residential areas is approximately 0.11 percent for 100 percent block areas and 35 percent for 5 percent block areas.
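    The volume-proportional allocation at the core of such a model can be sketched as follows (hypothetical numbers; the published model additionally incorporates the three block-level census housing statistics):

```python
def subblock_population(block_pop, volumes, target_buildings):
    """Allocate a census block's population to a sub-block area in
    proportion to its share of total residential building volume.
    Simplified sketch of a deterministic allocation model."""
    total = sum(volumes.values())
    share = sum(v for b, v in volumes.items() if b in target_buildings)
    return block_pop * share / total

# Hypothetical block: 100 residents, three buildings by volume (m^3);
# the sub-block area of interest contains buildings A and B
pop = subblock_population(100, {"A": 500.0, "B": 300.0, "C": 200.0}, {"A", "B"})
print(pop)  # → 80.0
```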

  15. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, Antoine; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. ?? 2002 Elsevier Science B.V. All rights reserved.
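    As a minimal illustration of a GLM in this setting, here is a binomial-family logistic regression of species presence on one environmental predictor, fitted by plain gradient ascent; real analyses would use R's glm()/gam() or an equivalent library, and the data below are invented:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_glm(x, y, steps=20000, lr=0.05):
    """Binomial GLM with logit link, fitted by gradient ascent on the
    log-likelihood (for illustration of the model structure only)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            resid = yi - logistic(b0 + b1 * xi)
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

x = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical environmental gradient
y = [0, 0, 1, 0, 1, 0, 1, 1]   # species presence/absence
b0, b1 = fit_logistic_glm(x, y)
```

    A GAM replaces the linear term b1*x with a smooth function of x, estimated nonparametrically.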

  16. Model averaging techniques for quantifying conceptual model uncertainty.

    PubMed

    Singh, Abhishek; Mishra, Srikanta; Ruskauff, Greg

    2010-01-01

    In recent years a growing understanding has emerged regarding the need to expand the modeling paradigm to include conceptual model uncertainty for groundwater models. Conceptual model uncertainty is typically addressed by formulating alternative model conceptualizations and assessing their relative likelihoods using statistical model averaging approaches. Several model averaging techniques and likelihood measures have been proposed in the recent literature for this purpose with two broad categories--Monte Carlo-based techniques such as Generalized Likelihood Uncertainty Estimation or GLUE (Beven and Binley 1992) and criterion-based techniques that use metrics such as the Bayesian and Kashyap Information Criteria (e.g., the Maximum Likelihood Bayesian Model Averaging or MLBMA approach proposed by Neuman 2003) and Akaike Information Criterion-based model averaging (AICMA) (Poeter and Anderson 2005). These different techniques can often lead to significantly different relative model weights and ranks because of differences in the underlying statistical assumptions about the nature of model uncertainty. This paper provides a comparative assessment of the four model averaging techniques (GLUE, MLBMA with KIC, MLBMA with BIC, and AIC-based model averaging) mentioned above for the purpose of quantifying the impacts of model uncertainty on groundwater model predictions. Pros and cons of each model averaging technique are examined from a practitioner's perspective using two groundwater modeling case studies. Recommendations are provided regarding the use of these techniques in groundwater modeling practice.
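    The criterion-based branch can be illustrated with Akaike weights, the normalized relative likelihoods used in AIC-based model averaging (AIC values below are hypothetical):

```python
import math

def akaike_weights(aics):
    """Akaike weights: each model's relative likelihood
    exp(-0.5 * delta_AIC), normalized to sum to one."""
    amin = min(aics)
    rel = [math.exp(-0.5 * (a - amin)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICs for three alternative model conceptualizations
weights = akaike_weights([210.3, 212.1, 215.8])
print([round(w, 3) for w in weights])
```

    The averaged prediction is then the weight-weighted sum of the individual models' predictions; KIC- and BIC-based averaging follow the same recipe with a different criterion.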

  17. A Statistical Multimodel Ensemble Approach to Improving Long-Range Forecasting in Pakistan

    DTIC Science & Technology

    2012-03-01

    Impact of global warming on monsoon variability in Pakistan. J. Anim. Pl. Sci., 21, no. 1, 107–110. Gillies, S., T. Murphree, and D. Meyer, 2012...are generated by multiple regression models that relate globally distributed oceanic and atmospheric predictors to local predictands. The predictands are

  18. Improved Forecasting Methods for Naval Manpower Studies

    DTIC Science & Technology

    2015-03-25

    Using monthly data is likely to improve the overall fit of the models and the accuracy of the BP test. A measure of unemployment to control for...measure of the relative goodness of fit of a statistical model. It is grounded in the concept of information entropy, in effect offering a relative...the Kullback–Leibler divergence, DKL(f,g1); similarly, the information lost from using g2 to

  19. Visualization of Spatio-Temporal Relations in Movement Event Using Multi-View

    NASA Astrophysics Data System (ADS)

    Zheng, K.; Gu, D.; Fang, F.; Wang, Y.; Liu, H.; Zhao, W.; Zhang, M.; Li, Q.

    2017-09-01

    Spatio-temporal relations among movement events extracted from temporally varying trajectory data can provide useful information about the evolution of individual or collective movers, as well as their interactions with their spatial and temporal contexts. However, the pure statistical tools commonly used by analysts pose many difficulties, due to the large number of attributes embedded in multi-scale and multi-semantic trajectory data. The need for models that operate at multiple scales to search for relations at different locations within time and space, as well as intuitively interpret what these relations mean, also presents challenges. Since analysts do not know where or when these relevant spatio-temporal relations might emerge, these models must compute statistical summaries of multiple attributes at different granularities. In this paper, we propose a multi-view approach to visualize the spatio-temporal relations among movement events. We describe a method for visualizing movement events and spatio-temporal relations that uses multiple displays. A visual interface is presented, and the user can interactively select or filter spatial and temporal extents to guide the knowledge discovery process. We also demonstrate how this approach can help analysts to derive and explain the spatio-temporal relations of movement events from taxi trajectory data.

  20. Identifying trends in climate: an application to the cenozoic

    NASA Astrophysics Data System (ADS)

    Richards, Gordon R.

    1998-05-01

    The recent literature on trending in climate has raised several issues, whether trends should be modeled as deterministic or stochastic, whether trends are nonlinear, and the relative merits of statistical models versus models based on physics. This article models trending since the late Cretaceous. This 68 million-year interval is selected because the reliability of tests for trending is critically dependent on the length of time spanned by the data. Two main hypotheses are tested, that the trend has been caused primarily by CO2 forcing, and that it reflects a variety of forcing factors which can be approximated by statistical methods. The CO2 data is obtained from model simulations. Several widely-used statistical models are found to be inadequate. ARIMA methods parameterize too much of the short-term variation, and do not identify low frequency movements. Further, the unit root in the ARIMA process does not predict the long-term path of temperature. Spectral methods also have little ability to predict temperature at long horizons. Instead, the statistical trend is estimated using a nonlinear smoothing filter. Both of these paradigms make it possible to model climate as a cointegrated process, in which temperature can wander quite far from the trend path in the intermediate term, but converges back over longer horizons. Comparing the forecasting properties of the two trend models demonstrates that the optimal forecasting model includes CO2 forcing and a parametric representation of the nonlinear variability in climate.

  1. Fragment size distribution statistics in dynamic fragmentation of laser shock-loaded tin

    NASA Astrophysics Data System (ADS)

    He, Weihua; Xin, Jianting; Zhao, Yongqiang; Chu, Genbai; Xi, Tao; Shui, Min; Lu, Feng; Gu, Yuqiu

    2017-06-01

    This work investigates a geometric statistics method to characterize the size distribution of tin fragments produced in the laser shock-loaded dynamic fragmentation process. In the shock experiments, the ejecta from a tin sample with an etched V-shaped groove in its free surface are collected by the soft-recovery technique. The produced fragments are then automatically detected with fine post-shot analysis techniques, including X-ray micro-tomography and an improved watershed method. To characterize the size distributions of the fragments, a theoretical random geometric statistics model based on Poisson mixtures is derived for the dynamic heterogeneous fragmentation problem, which yields a linear combination of exponential distributions. The experimental fragment size distributions of the laser shock-loaded tin sample are examined with the proposed theoretical model, and its fitting performance is compared with that of other state-of-the-art fragment size distribution models. The comparison shows that the proposed model provides a far more reasonable fit for the laser shock-loaded tin.

  2. Seasonal Drought Prediction: Advances, Challenges, and Future Prospects

    NASA Astrophysics Data System (ADS)

    Hao, Zengchao; Singh, Vijay P.; Xia, Youlong

    2018-03-01

    Drought prediction is of critical importance to early warning for drought management. This review provides a synthesis of drought prediction based on statistical, dynamical, and hybrid methods. Statistical drought prediction is achieved by modeling the relationship between drought indices of interest and a suite of potential predictors, including large-scale climate indices, local climate variables, and land initial conditions. Dynamical meteorological drought prediction relies on seasonal climate forecasts from general circulation models (GCMs), which can be employed to drive hydrological models for agricultural and hydrological drought prediction, with the predictability determined by both climate forcings and initial conditions. Challenges still exist in drought prediction at long lead times and under a changing environment resulting from natural and anthropogenic factors. Future research prospects to improve drought prediction include, but are not limited to, high-quality data assimilation, improved model development with key processes related to drought occurrence, optimal ensemble forecasts that select or weight ensembles, and hybrid drought prediction that merges statistical and dynamical forecasts.

  3. Systematic Mapping and Statistical Analyses of Valley Landform and Vegetation Asymmetries Across Hydroclimatic Gradients

    NASA Astrophysics Data System (ADS)

    Poulos, M. J.; Pierce, J. L.; McNamara, J. P.; Flores, A. N.; Benner, S. G.

    2015-12-01

    Terrain aspect alters the spatial distribution of insolation across topography, driving eco-pedo-hydro-geomorphic feedbacks that can alter landform evolution and result in valley asymmetries for a suite of land surface characteristics (e.g. slope length and steepness, vegetation, soil properties, and drainage development). Asymmetric valleys serve as natural laboratories for studying how landscapes respond to climate perturbation. In the semi-arid montane granodioritic terrain of the Idaho batholith, Northern Rocky Mountains, USA, prior works indicate that reduced insolation on northern (pole-facing) aspects prolongs snow pack persistence, and is associated with thicker, finer-grained soils, that retain more water, prolong the growing season, support coniferous forest rather than sagebrush steppe ecosystems, stabilize slopes at steeper angles, and produce sparser drainage networks. We hypothesize that the primary drivers of valley asymmetry development are changes in the pedon-scale water-balance that coalesce to alter catchment-scale runoff and drainage development, and ultimately cause the divide between north and south-facing land surfaces to migrate northward. We explore this conceptual framework by coupling land surface analyses with statistical modeling to assess relationships and the relative importance of land surface characteristics. Throughout the Idaho batholith, we systematically mapped and tabulated various statistical measures of landforms, land cover, and hydroclimate within discrete valley segments (n=~10,000). We developed a random forest based statistical model to predict valley slope asymmetry based upon numerous measures (n>300) of landscape asymmetries. Preliminary results suggest that drainages are tightly coupled with hillslopes throughout the region, with drainage-network slope being one of the strongest predictors of land-surface-averaged slope asymmetry. 
When slope-related statistics are excluded, due to possible autocorrelation, valley slope asymmetry is most strongly predicted by asymmetries of insolation and drainage density, which generally supports a water-balance based conceptual model of valley asymmetry development. Surprisingly, vegetation asymmetries had relatively low predictive importance.

  4. The power and robustness of maximum LOD score statistics.

    PubMed

    Yoo, Y J; Mendell, N R

    2008-07-01

    The maximum LOD score statistic is extremely powerful for gene mapping when calculated using the correct genetic parameter value. When the mode of genetic transmission is unknown, the maximum of the LOD scores obtained using several genetic parameter values is reported. This latter statistic requires a higher critical value than the maximum LOD score statistic calculated from a single genetic parameter value. In this paper, we compare the power of maximum LOD scores based on three fixed sets of genetic parameter values with the power of the LOD score obtained after maximizing over the entire range of genetic parameter values. We simulate family data under nine generating models. For generating models with non-zero phenocopy rates, LOD scores maximized over the entire range of genetic parameters yielded greater power than maximum LOD scores for fixed sets of parameter values with zero phenocopy rates. No maximum LOD score was consistently more powerful than the others for generating models with a zero phenocopy rate. The power loss of the LOD score maximized over the entire range of genetic parameters, relative to the maximum LOD score calculated using the correct genetic parameter value, appeared to be robust to the generating models.
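    A minimal sketch of a LOD score maximized over a grid of recombination fractions, for a phase-known backcross with hypothetical counts (the paper's statistic maximizes over a richer set of genetic parameters):

```python
import math

def lod_score(recomb, nonrecomb, theta):
    """LOD score: log10 likelihood ratio of recombination fraction
    theta against free recombination (theta = 0.5)."""
    ll_theta = (recomb * math.log10(theta)
                + nonrecomb * math.log10(1.0 - theta))
    ll_null = (recomb + nonrecomb) * math.log10(0.5)
    return ll_theta - ll_null

def max_lod(recomb, nonrecomb):
    """Maximize the LOD over a grid of theta values in (0, 0.5)."""
    grid = [i / 100.0 for i in range(1, 50)]
    return max((lod_score(recomb, nonrecomb, t), t) for t in grid)

lod, theta = max_lod(2, 18)   # 2 recombinants out of 20 meioses
print(round(lod, 2), theta)   # → 3.2 0.1
```

    Maximizing over the parameter grid inflates the statistic under the null, which is why the maximized statistic needs a higher critical value than a LOD computed at a single parameter value.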

  5. Statistics of Optical Coherence Tomography Data From Human Retina

    PubMed Central

    de Juan, Joaquín; Ferrone, Claudia; Giannini, Daniela; Huang, David; Koch, Giorgio; Russo, Valentina; Tan, Ou; Bruni, Carlo

    2010-01-01

    Optical coherence tomography (OCT) has recently become one of the primary methods for noninvasive probing of the human retina. The pseudoimage formed by OCT (the so-called B-scan) varies probabilistically across pixels due to complexities in the measurement technique. Hence, sensitive automatic procedures of diagnosis using OCT may exploit statistical analysis of the spatial distribution of reflectance. In this paper, we perform a statistical study of retinal OCT data. We find that the stretched exponential probability density function can model well the distribution of intensities in OCT pseudoimages. Moreover, we show a small but significant correlation between neighboring pixels when measuring OCT intensities with pixels of about 5 µm. We then develop a simple joint probability model for the OCT data consistent with known retinal features. This model fits well the stretched exponential distribution of intensities and their spatial correlation. In normal retinas, fit parameters of this model are relatively constant within retinal layers but vary across layers. However, in retinas with diabetic retinopathy, large spikes of parameter modulation interrupt the constancy within layers, exactly where pathologies are visible. We argue that these results give hope for improvement in statistical pathology-detection methods even when the disease is in its early stages. PMID:20304733
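    The stretched exponential family mentioned above, with survival function exp(-(x/scale)**beta), can be sampled by inverse-transform, which is a convenient way to test fitting code; beta = 1 recovers the plain exponential (parameter values below are arbitrary):

```python
import math
import random

def sample_stretched_exp(scale, beta, n, seed=1):
    """Inverse-transform samples from the stretched exponential
    (Weibull) family: X = scale * (-ln U)**(1/beta), U in (0, 1]."""
    rng = random.Random(seed)
    return [scale * (-math.log(1.0 - rng.random())) ** (1.0 / beta)
            for _ in range(n)]

samples = sample_stretched_exp(2.0, 1.0, 20000)   # beta = 1: exponential
mean = sum(samples) / len(samples)
frac_below_scale = sum(1 for s in samples if s < 2.0) / len(samples)
```

    For any beta, about 63% of samples fall below the scale parameter; decreasing beta below 1 stretches the upper tail, the feature this family uses to capture OCT intensity distributions.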

  6. Cognitive predictors of balance in Parkinson's disease.

    PubMed

    Fernandes, Ângela; Mendes, Andreia; Rocha, Nuno; Tavares, João Manuel R S

    2016-06-01

    Postural instability is one of the most incapacitating symptoms of Parkinson's disease (PD) and appears to be related to cognitive deficits. This study aims to determine the cognitive factors that can predict deficits in static and dynamic balance in individuals with PD. A sociodemographic questionnaire characterized 52 individuals with PD for this work. The Trail Making Test, Rule Shift Cards Test, and Digit Span Test assessed the executive functions. The static balance was assessed using a plantar pressure platform, and dynamic balance was based on the Timed Up and Go Test. The results were statistically analysed using SPSS Statistics software through linear regression analysis. The results show that a statistically significant model based on cognitive outcomes was able to explain the variance of motor variables. Also, the explanatory value of the model tended to increase with the addition of individual and clinical variables, although the resulting model was not statistically significant. The model explained 25-29% of the variability of the Timed Up and Go Test, while for the anteroposterior displacement it was 23-34%, and for the mediolateral displacement it was 24-39%. From the findings, we conclude that cognitive performance, especially the executive functions, is a predictor of balance deficit in individuals with PD.

  7. Statistical physics of vehicular traffic and some related systems

    NASA Astrophysics Data System (ADS)

    Chowdhury, Debashish; Santen, Ludger; Schadschneider, Andreas

    2000-05-01

    In the so-called “microscopic” models of vehicular traffic, attention is paid explicitly to each individual vehicle each of which is represented by a “particle”; the nature of the “interactions” among these particles is determined by the way the vehicles influence each others’ movement. Therefore, vehicular traffic, modeled as a system of interacting “particles” driven far from equilibrium, offers the possibility to study various fundamental aspects of truly nonequilibrium systems which are of current interest in statistical physics. Analytical as well as numerical techniques of statistical physics are being used to study these models to understand rich variety of physical phenomena exhibited by vehicular traffic. Some of these phenomena, observed in vehicular traffic under different circumstances, include transitions from one dynamical phase to another, criticality and self-organized criticality, metastability and hysteresis, phase-segregation, etc. In this critical review, written from the perspective of statistical physics, we explain the guiding principles behind all the main theoretical approaches. But we present detailed discussions on the results obtained mainly from the so-called “particle-hopping” models, particularly emphasizing those which have been formulated in recent years using the language of cellular automata.

  8. Treated cabin acoustic prediction using statistical energy analysis

    NASA Technical Reports Server (NTRS)

    Yoerkie, Charles A.; Ingraham, Steven T.; Moore, James A.

    1987-01-01

    The application of statistical energy analysis (SEA) to the modeling and design of helicopter cabin interior noise control treatment is demonstrated. The information presented here is obtained from work sponsored at NASA Langley for the development of analytic modeling techniques and the basic understanding of cabin noise. Utility and executive interior models are developed directly from existing S-76 aircraft designs. The relative importance of panel transmission loss (TL), acoustic leakage, and absorption to the control of cabin noise is shown using the SEA modeling parameters. It is shown that the major cabin noise improvement below 1000 Hz comes from increased panel TL, while above 1000 Hz it comes from reduced acoustic leakage and increased absorption in the cabin and overhead cavities.

  9. Relative Contributions of Agricultural Drift, Para-Occupational, and Residential Use Exposure Pathways to House Dust Pesticide Concentrations: Meta-Regression of Published Data.

    PubMed

    Deziel, Nicole C; Freeman, Laura E Beane; Graubard, Barry I; Jones, Rena R; Hoppin, Jane A; Thomas, Kent; Hines, Cynthia J; Blair, Aaron; Sandler, Dale P; Chen, Honglei; Lubin, Jay H; Andreotti, Gabriella; Alavanja, Michael C R; Friesen, Melissa C

    2017-03-01

    Increased pesticide concentrations in house dust in agricultural areas have been attributed to several exposure pathways, including agricultural drift, para-occupational, and residential use. To guide future exposure assessment efforts, we quantified relative contributions of these pathways using meta-regression models of published data on dust pesticide concentrations. From studies in North American agricultural areas published from 1995 to 2015, we abstracted dust pesticide concentrations reported as summary statistics [e.g., geometric means (GM)]. We analyzed these data using mixed-effects meta-regression models that weighted each summary statistic by its inverse variance. Dependent variables were either the log-transformed GM (drift) or the log-transformed ratio of GMs from two groups (para-occupational, residential use). For the drift pathway, predicted GMs decreased sharply and nonlinearly, with GMs 64% lower in homes 250 m versus 23 m from fields (interquartile range of published data) based on 52 statistics from seven studies. For the para-occupational pathway, GMs were 2.3 times higher [95% confidence interval (CI): 1.5, 3.3; 15 statistics, five studies] in homes of farmers who applied pesticides more recently or frequently versus less recently or frequently. For the residential use pathway, GMs were 1.3 (95% CI: 1.1, 1.4) and 1.5 (95% CI: 1.2, 1.9) times higher in treated versus untreated homes, when the probability that a pesticide was used for the pest treatment was 1-19% and ≥ 20%, respectively (88 statistics, five studies). Our quantification of the relative contributions of pesticide exposure pathways in agricultural populations could improve exposure assessments in epidemiologic studies. The meta-regression models can be updated when additional data become available. Citation: Deziel NC, Beane Freeman LE, Graubard BI, Jones RR, Hoppin JA, Thomas K, Hines CJ, Blair A, Sandler DP, Chen H, Lubin JH, Andreotti G, Alavanja MC, Friesen MC. 2017. 
Relative contributions of agricultural drift, para-occupational, and residential use exposure pathways to house dust pesticide concentrations: meta-regression of published data. Environ Health Perspect 125:296-305; http://dx.doi.org/10.1289/EHP426.
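    The inverse-variance weighting used in these meta-regression models reduces, in the simplest fixed-effect case, to the following pooling of study-level summary statistics (the log-ratio estimates and variances below are hypothetical):

```python
def inverse_variance_pooled(estimates, variances):
    """Fixed-effect pooling: weight each study-level summary statistic
    by its inverse variance; returns the pooled estimate and its
    standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se = (1.0 / sum(weights)) ** 0.5
    return pooled, se

# Hypothetical log GM-ratios from three studies with their variances
pooled, se = inverse_variance_pooled([0.8, 0.9, 0.7], [0.04, 0.01, 0.09])
print(round(pooled, 3), round(se, 3))  # → 0.865 0.086
```

    The mixed-effects models in the paper extend this by adding study-level covariates (e.g. distance to field) and a between-study variance component.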

  10. Relative Contributions of Agricultural Drift, Para-Occupational, and Residential Use Exposure Pathways to House Dust Pesticide Concentrations: Meta-Regression of Published Data

    PubMed Central

    Deziel, Nicole C.; Freeman, Laura E. Beane; Graubard, Barry I.; Jones, Rena R.; Hoppin, Jane A.; Thomas, Kent; Hines, Cynthia J.; Blair, Aaron; Sandler, Dale P.; Chen, Honglei; Lubin, Jay H.; Andreotti, Gabriella; Alavanja, Michael C. R.; Friesen, Melissa C.

    2016-01-01

    Background: Increased pesticide concentrations in house dust in agricultural areas have been attributed to several exposure pathways, including agricultural drift, para-occupational, and residential use. Objective: To guide future exposure assessment efforts, we quantified relative contributions of these pathways using meta-regression models of published data on dust pesticide concentrations. Methods: From studies in North American agricultural areas published from 1995 to 2015, we abstracted dust pesticide concentrations reported as summary statistics [e.g., geometric means (GM)]. We analyzed these data using mixed-effects meta-regression models that weighted each summary statistic by its inverse variance. Dependent variables were either the log-transformed GM (drift) or the log-transformed ratio of GMs from two groups (para-occupational, residential use). Results: For the drift pathway, predicted GMs decreased sharply and nonlinearly, with GMs 64% lower in homes 250 m versus 23 m from fields (interquartile range of published data) based on 52 statistics from seven studies. For the para-occupational pathway, GMs were 2.3 times higher [95% confidence interval (CI): 1.5, 3.3; 15 statistics, five studies] in homes of farmers who applied pesticides more recently or frequently versus less recently or frequently. For the residential use pathway, GMs were 1.3 (95% CI: 1.1, 1.4) and 1.5 (95% CI: 1.2, 1.9) times higher in treated versus untreated homes, when the probability that a pesticide was used for the pest treatment was 1–19% and ≥ 20%, respectively (88 statistics, five studies). Conclusion: Our quantification of the relative contributions of pesticide exposure pathways in agricultural populations could improve exposure assessments in epidemiologic studies. The meta-regression models can be updated when additional data become available. 
Citation: Deziel NC, Beane Freeman LE, Graubard BI, Jones RR, Hoppin JA, Thomas K, Hines CJ, Blair A, Sandler DP, Chen H, Lubin JH, Andreotti G, Alavanja MC, Friesen MC. 2017. Relative contributions of agricultural drift, para-occupational, and residential use exposure pathways to house dust pesticide concentrations: meta-regression of published data. Environ Health Perspect 125:296–305; http://dx.doi.org/10.1289/EHP426 PMID:27458779
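
    The inverse-variance weighting at the core of these meta-regression models can be sketched in a few lines. The data, the log-distance functional form, and the fixed-effects simplification below are illustrative assumptions, not the paper's actual mixed-effects models or data:

```python
import numpy as np

# Hypothetical summary statistics: log-transformed geometric means (GMs)
# of dust pesticide concentrations and each home's distance (m) from
# treated fields, with the standard error of each summary statistic.
log_gm = np.array([2.1, 1.8, 1.6, 1.2, 0.9, 0.7])
distance_m = np.array([23.0, 50.0, 100.0, 150.0, 200.0, 250.0])
se = np.array([0.20, 0.25, 0.15, 0.30, 0.25, 0.20])

# Weight each summary statistic by its inverse variance, as in the paper.
weights = 1.0 / se**2
W = np.diag(weights)

# Fixed-effects meta-regression of log GM on log distance (one way to
# capture a sharp nonlinear decline; the paper's models also include
# study-level random effects).
X = np.column_stack([np.ones(len(log_gm)), np.log(distance_m)])
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_gm)

# Predicted relative reduction in GM at 250 m versus 23 m from fields.
reduction = 1.0 - (250.0 / 23.0) ** beta[1]
print(f"slope on log distance: {beta[1]:.3f}")
print(f"predicted GM reduction, 250 m vs 23 m: {reduction:.0%}")
```

    With these hypothetical inputs the fitted slope on log distance is negative, reproducing the qualitative pattern of concentrations falling off sharply with distance from treated fields.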

  11. Exploring patient satisfaction predictors in relation to a theoretical model.

    PubMed

    Grøndahl, Vigdis Abrahamsen; Hall-Lord, Marie Louise; Karlsson, Ingela; Appelgren, Jari; Wilde-Larsson, Bodil

    2013-01-01

    The aim is to describe patients' care quality perceptions and satisfaction and to explore potential patient satisfaction predictors (person-related conditions, external objective care conditions, and patients' perception of actual care received, "PR") in relation to a theoretical model. A cross-sectional design was used. Data were collected using one questionnaire combining questions from four instruments: Quality from patients' perspective; Sense of coherence; Big Five personality traits; and the Emotional stress reaction questionnaire (ESRQ), together with questions from previous research. In total, 528 patients (83.7 per cent response rate) from eight medical, three surgical and one medical/surgical ward in five Norwegian hospitals participated. Answers from 373 respondents with complete ESRQ questionnaires were analysed. Sequential multiple regression analysis with ESRQ as the dependent variable was run in three steps: person-related conditions, external objective care conditions, and PR (p < 0.05). Step 1 (person-related conditions) explained 51.7 per cent of the ESRQ variance. Step 2 (external objective care conditions) explained an additional 2.4 per cent. Step 3 (PR) gave no significant additional explanation (0.05 per cent). Steps 1 and 2 contributed statistically significantly to the model. Patients rated both quality of care and satisfaction highly. The paper shows that the theoretical model, using an emotion-oriented approach to assess patient satisfaction, can explain 54 per cent of patient satisfaction in a statistically significant manner.
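
    The sequential (hierarchical) regression strategy described above, entering blocks of predictors one step at a time and reading off the increment in explained variance, can be sketched as follows. All data and block contents here are simulated stand-ins, not the study's variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 373  # number of analysed respondents in the study

# Simulated predictor blocks standing in for the paper's three steps.
person = rng.normal(size=(n, 2))    # step 1: person-related conditions
external = rng.normal(size=(n, 1))  # step 2: external objective care conditions
pr = rng.normal(size=(n, 1))        # step 3: perception of actual care received
y = 0.7 * person[:, 0] + 0.3 * external[:, 0] + rng.normal(size=n)

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Sequential entry: each step keeps all previous blocks in the model.
r1 = r_squared(person, y)
r2 = r_squared(np.hstack([person, external]), y)
r3 = r_squared(np.hstack([person, external, pr]), y)
print(f"step 1 R^2 = {r1:.3f}")
print(f"step 2 delta R^2 = {r2 - r1:.3f}")
print(f"step 3 delta R^2 = {r3 - r2:.3f}")
```

    Because each step only adds predictors, R^2 can never decrease; the question the study asks is whether each increment is statistically significant.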

  12. Velocity statistics of the Nagel-Schreckenberg model

    NASA Astrophysics Data System (ADS)

    Bain, Nicolas; Emig, Thorsten; Ulm, Franz-Josef; Schreckenberg, Michael

    2016-02-01

    The statistics of velocities in the cellular automaton model of Nagel and Schreckenberg for traffic are studied. From numerical simulations, we obtain the probability distribution function (PDF) for vehicle velocities and the velocity-velocity (vv) covariance function. We identify the probability to find a standing vehicle as a potential order parameter that nicely signals the transition between free and congested flow for a sufficiently large number of velocity states. Our results for the vv covariance function resemble features of a second-order phase transition. We develop a 3-body approximation that allows us to relate the PDFs for velocities and headways. Using this relation, an approximation to the velocity PDF is obtained from the headway PDF observed in simulations. We find a remarkable agreement between this approximation and the velocity PDF obtained from simulations.
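
    The Nagel-Schreckenberg update rules (acceleration, braking, random slowdown, movement) and the velocity PDF studied here can be reproduced with a minimal simulation. The ring length, vehicle density, and slowdown probability below are arbitrary illustrative choices, not the paper's parameters:

```python
import numpy as np

def nasch_step(pos, vel, L, vmax, p, rng):
    """One parallel update of the Nagel-Schreckenberg model on a ring of L cells."""
    order = np.argsort(pos)
    pos, vel = pos[order], vel[order]
    gaps = (np.roll(pos, -1) - pos) % L - 1            # empty cells to the car ahead
    vel = np.minimum(vel + 1, vmax)                    # 1. acceleration
    vel = np.minimum(vel, gaps)                        # 2. braking (no collisions)
    vel[(rng.random(len(vel)) < p) & (vel > 0)] -= 1   # 3. random slowdown
    return (pos + vel) % L, vel                        # 4. movement

rng = np.random.default_rng(1)
L, N, vmax, p = 200, 60, 5, 0.3
pos = rng.choice(L, size=N, replace=False)
vel = np.zeros(N, dtype=int)

for _ in range(500):  # let the system relax before sampling
    pos, vel = nasch_step(pos, vel, L, vmax, p, rng)

samples = []
for _ in range(500):
    pos, vel = nasch_step(pos, vel, L, vmax, p, rng)
    samples.append(vel.copy())
v = np.concatenate(samples)

pdf = np.bincount(v, minlength=vmax + 1) / len(v)
print("velocity PDF:", pdf)
print("P(standing vehicle):", pdf[0])  # the candidate order parameter
```

    The probability mass at velocity zero, printed last, is the quantity the abstract proposes as an order parameter for the free-to-congested transition.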

  13. Velocity statistics of the Nagel-Schreckenberg model.

    PubMed

    Bain, Nicolas; Emig, Thorsten; Ulm, Franz-Josef; Schreckenberg, Michael

    2016-02-01

    The statistics of velocities in the cellular automaton model of Nagel and Schreckenberg for traffic are studied. From numerical simulations, we obtain the probability distribution function (PDF) for vehicle velocities and the velocity-velocity (vv) covariance function. We identify the probability to find a standing vehicle as a potential order parameter that nicely signals the transition between free and congested flow for a sufficiently large number of velocity states. Our results for the vv covariance function resemble features of a second-order phase transition. We develop a 3-body approximation that allows us to relate the PDFs for velocities and headways. Using this relation, an approximation to the velocity PDF is obtained from the headway PDF observed in simulations. We find a remarkable agreement between this approximation and the velocity PDF obtained from simulations.

  14. Word recognition and phonetic structure acquisition: Possible relations

    NASA Astrophysics Data System (ADS)

    Morgan, James

    2002-05-01

    Several accounts of possible relations between the emergence of the mental lexicon and acquisition of native language phonological structure have been propounded. In one view, acquisition of word meanings guides infants' attention toward those contrasts that are linguistically significant in their language. In the opposing view, native language phonological categories may be acquired from statistical patterns of input speech, prior to and independent of learning at the lexical level. Here, a more interactive account will be presented, in which phonological structure is modeled as emerging consequentially from the self-organization of perceptual space underlying word recognition. A key prediction of this model is that early native language phonological categories will be highly context specific. Data bearing on this prediction will be presented which provide clues to the nature of infants' statistical analysis of input.

  15. A Prospective Test of the Stress-Buffering Model of Depression in Adolescent Girls: No Support Once Again

    ERIC Educational Resources Information Center

    Burton, Emily; Stice, Eric; Seeley, John R.

    2004-01-01

    The stress-buffering model posits that social support mitigates the relation between negative life events and onset of depression, but prospective studies have provided little support for this assertion. The authors sought to provide a more sensitive test of this model by addressing certain methodological and statistical limitations of past…

  16. Perturbation Selection and Local Influence Analysis for Nonlinear Structural Equation Model

    ERIC Educational Resources Information Center

    Chen, Fei; Zhu, Hong-Tu; Lee, Sik-Yum

    2009-01-01

    Local influence analysis is an important statistical method for studying the sensitivity of a proposed model to model inputs. One of its important issues is related to the appropriate choice of a perturbation vector. In this paper, we develop a general method to select an appropriate perturbation vector and a second-order local influence measure…

  17. Designing a Qualitative Data Collection Strategy (QDCS) for Africa - Phase 1: A Gap Analysis of Existing Models, Simulations, and Tools Relating to Africa

    DTIC Science & Technology

    2012-06-01

    generalized behavioral model characterized after the fictional Seldon equations (the one elaborated upon by Isaac Asimov in the 1951 novel, The...Foundation). Asimov described the Seldon equations as essentially statistical models with historical data of a sufficient size and variability that they

  18. Modeling forest scenic beauty: Concepts and application to ponderosa pine

    Treesearch

    Thomas C. Brown; Terry C. Daniel

    1984-01-01

    Statistical models are presented which relate near-view scenic beauty of ponderosa pine stands in the Southwest to variables describing physical characteristics. The models suggest that herbage and large ponderosa pine contribute to scenic beauty, while numbers of small and intermediate-sized pine trees and downed wood, especially as slash, detract from scenic beauty....

  19. Benefits of statistical molecular design, covariance analysis, and reference models in QSAR: a case study on acetylcholinesterase

    NASA Astrophysics Data System (ADS)

    Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna

    2015-03-01

    Scientific disciplines such as medicinal and environmental chemistry, pharmacology, and toxicology deal with questions related to the effects small organic compounds exert on biological targets and the compounds' physicochemical properties responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate the benefits of performing a statistical molecular design (SMD) and a proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and predictions by QSAR models, and leave a set of molecules less suitable for future decision-making. In our study, SAR and QSAR models could show which molecular substructures and physicochemical features were advantageous for AChE inhibition. Finally, the QSAR model was used to predict the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was assessed by statistical significance tests and by comparisons to simple but relevant reference models.

  20. Improving production efficiency through genetic selection

    USDA-ARS?s Scientific Manuscript database

    The goal of dairy cattle breeding is to increase productivity and efficiency by means of genetic selection. This is possible because related animals share some of their DNA in common, and we can use statistical models to predict the genetic merit of animals based on the performance of their relatives. ...

  1. Background Knowledge in Learning-Based Relation Extraction

    ERIC Educational Resources Information Center

    Do, Quang Xuan

    2012-01-01

    In this thesis, we study the importance of background knowledge in relation extraction systems. We not only demonstrate the benefits of leveraging background knowledge to improve the systems' performance but also propose a principled framework that allows one to effectively incorporate knowledge into statistical machine learning models for…

  2. Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function.

    PubMed

    Yang, James J; Williams, L Keoki; Buu, Anne

    2017-08-24

    A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model. The second step uses the correlation between residuals of the linear mixed model to estimate the null distribution of the Fisher combination test statistic. The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and relatedness (related or independent). The statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that applying the multivariate association test may facilitate identification of the pleiotropic genes contributing to the risk for alcohol dependence commonly expressed by four correlated phenotypes. This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient even when the number of subjects and the number of phenotypes are both very large.
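
    Fisher's combination function itself is simple: under independence, T = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom. The sketch below computes T and its p-value for hypothetical marginal p-values; note that the paper's second step replaces this independence null with a null distribution estimated from the residual correlations of the linear mixed models:

```python
import math

def fisher_combination(p_values):
    """Fisher's method: T = -2 * sum(ln p_i) ~ chi-square with 2k df under H0."""
    t = -2.0 * sum(math.log(p) for p in p_values)
    k = len(p_values)
    # The chi-square survival function with even df = 2k has a closed form:
    # P(X > t) = exp(-t/2) * sum_{i=0}^{k-1} (t/2)^i / i!
    x = t / 2.0
    term, sf = 1.0, 0.0
    for i in range(k):
        sf += term
        term *= x / (i + 1)
    return t, math.exp(-x) * sf

# Hypothetical marginal p-values for four correlated phenotypes.
p_values = [0.01, 0.04, 0.20, 0.03]
t, p_combined = fisher_combination(p_values)
print(f"T = {t:.2f}, combined p = {p_combined:.4f}")
```

    Applying the independence null directly to correlated phenotypes would inflate the type I error rate, which is exactly why the proposed method estimates the null from residual correlations instead.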

  3. Neurocognitive mechanisms of statistical-sequential learning: what do event-related potentials tell us?

    PubMed Central

    Daltrozzo, Jerome; Conway, Christopher M.

    2014-01-01

    Statistical-sequential learning (SL) is the ability to process patterns of environmental stimuli, such as spoken language, music, or one’s motor actions, that unfold in time. The underlying neurocognitive mechanisms of SL and the associated cognitive representations are still not well understood as reflected by the heterogeneity of the reviewed cognitive models. The purpose of this review is: (1) to provide a general overview of the primary models and theories of SL, (2) to describe the empirical research – with a focus on the event-related potential (ERP) literature – in support of these models while also highlighting the current limitations of this research, and (3) to present a set of new lines of ERP research to overcome these limitations. The review is articulated around three descriptive dimensions in relation to SL: the level of abstractness of the representations learned through SL, the effect of the level of attention and consciousness on SL, and the developmental trajectory of SL across the life-span. We conclude with a new tentative model that takes into account these three dimensions and also point to several promising new lines of SL research. PMID:24994975

  4. The challenging use and interpretation of circulating biomarkers of exposure to persistent organic pollutants in environmental health: Comparison of lipid adjustment approaches in a case study related to endometriosis.

    PubMed

    Cano-Sancho, German; Labrune, Léa; Ploteau, Stéphane; Marchand, Philippe; Le Bizec, Bruno; Antignac, Jean-Philippe

    2018-06-01

    The gold-standard matrix for measuring internal levels of persistent organic pollutants (POPs) is adipose tissue; however, in epidemiological studies the use of serum is preferred due to its lower cost and higher accessibility. The interpretation of serum biomarkers is tightly related to the understanding of the underlying causal structure relating the POPs, serum lipids and the disease. Considering the extended benefits of using serum biomarkers, we aimed to examine whether, through statistical modelling, we could improve the use and interpretation of serum biomarkers in the study of endometriosis. Hence, we conducted a systematic comparison of statistical approaches commonly used to lipid-adjust circulating biomarkers of POPs, based on existing methods, using data from a pilot case-control study focused on severe deep infiltrating endometriosis. The odds ratios (ORs) obtained from unconditional regression for models with serum biomarkers were further compared to those obtained from adipose tissue. The results of this exploratory study did not support the use of blood biomarkers as proxy estimates of POPs in adipose tissue in risk models for endometriosis, given the available statistical approaches to correct for lipids. The current statistical approaches commonly used to lipid-adjust circulating POPs do not fully represent the underlying biological complexity between POPs, lipids and disease (especially for compounds directly or indirectly affecting, or affected by, lipid metabolism). Hence, further investigations are warranted to improve the use and interpretation of blood biomarkers under complex scenarios of lipid dynamics. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. A statistical physics viewpoint on the dynamics of the bouncing ball

    NASA Astrophysics Data System (ADS)

    Chastaing, Jean-Yonnel; Géminard, Jean-Christophe; Bertin, Eric

    2016-06-01

    We compute, from a statistical physics perspective, the dynamics of a bouncing ball maintained in a chaotic regime thanks to collisions with a plate experiencing an aperiodic vibration. We analyze in detail the energy exchanges between the bead and the vibrating plate, and show that the coupling between the bead and the plate can be modeled in terms of both a dissipative process and an injection mechanism by an energy reservoir. An analysis of the injection statistics in terms of a fluctuation relation is also provided.

  6. [Application of statistics on chronic-diseases-relating observational research papers].

    PubMed

    Hong, Zhi-heng; Wang, Ping; Cao, Wei-hua

    2012-09-01

    To study the application of statistics in papers on chronic-disease-related observational research recently published in Chinese Medical Association journals with impact factor above 0.5. Using a self-developed criterion, two investigators independently assessed the application of statistics in these journals; differences of opinion were resolved through discussion. A total of 352 papers from 6 journals, including the Chinese Journal of Epidemiology, Chinese Journal of Oncology, Chinese Journal of Preventive Medicine, Chinese Journal of Cardiology, Chinese Journal of Internal Medicine and Chinese Journal of Endocrinology and Metabolism, were reviewed. The rates of clear statements on research objectives, target audience, sample issues, objective inclusion criteria and variable definitions were 99.43%, 98.57%, 95.43%, 92.86% and 96.87%, respectively. The rates of correct description of quantitative and qualitative data were 90.94% and 91.46%, respectively. The rates of correctly expressing the results of statistical inference methods related to quantitative data, qualitative data and modeling were 100%, 95.32% and 87.19%, respectively. 89.49% of the conclusions directly responded to the research objectives. However, 69.60% of the papers did not state the exact name of the study design used, 11.14% lacked a statement of the exclusion criteria, only 5.16% clearly explained the sample size estimation, and only 24.21% clearly described the variable value assignment. The rate of introducing the statistical software and database methods used was only 24.15%, and 18.75% of the papers did not express the statistical inference methods sufficiently. A quarter of the papers did not use 'standardization' appropriately. As for statistical inference, the rate of description of statistical testing prerequisites was only 24.12%, while 9.94% of papers did not employ the statistical inference methods that should have been used. The main deficiencies in the application of statistics in papers on chronic-disease-related observational research were as follows: lack of sample-size determination, insufficient description of variable value assignment, unclear or improper introduction of statistical methods, and lack of consideration of the prerequisites for statistical inference.

  7. Imputation approaches for animal movement modeling

    USGS Publications Warehouse

    Scharf, Henry; Hooten, Mevin B.; Johnson, Devin S.

    2017-01-01

    The analysis of telemetry data is common in animal ecological studies. While the collection of telemetry data for individual animals has improved dramatically, the methods to properly account for inherent uncertainties (e.g., measurement error, dependence, barriers to movement) have lagged behind. Still, many new statistical approaches have been developed to infer unknown quantities affecting animal movement or predict movement based on telemetry data. Hierarchical statistical models are useful to account for some of the aforementioned uncertainties, as well as provide population-level inference, but they often come with an increased computational burden. For certain types of statistical models, it is straightforward to provide inference if the latent true animal trajectory is known, but challenging otherwise. In these cases, approaches related to multiple imputation have been employed to account for the uncertainty associated with our knowledge of the latent trajectory. Despite the increasing use of imputation approaches for modeling animal movement, the general sensitivity and accuracy of these methods have not been explored in detail. We provide an introduction to animal movement modeling and describe how imputation approaches may be helpful for certain types of models. We also assess the performance of imputation approaches in two simulation studies. Our simulation studies suggest that inference for model parameters directly related to the location of an individual may be more accurate than inference for parameters associated with higher-order processes such as velocity or acceleration. Finally, we apply these methods to analyze a telemetry data set involving northern fur seals (Callorhinus ursinus) in the Bering Sea. Supplementary materials accompanying this paper appear online.

  8. Illustrating the practice of statistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamada, Christina A; Hamada, Michael S

    2009-01-01

    The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within his/her budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional information. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.

  9. Subject-enabled analytics model on measurement statistics in health risk expert system for public health informatics.

    PubMed

    Chung, Chi-Jung; Kuo, Yu-Chen; Hsieh, Yun-Yu; Li, Tsai-Chung; Lin, Cheng-Chieh; Liang, Wen-Miin; Liao, Li-Na; Li, Chia-Ing; Lin, Hsueh-Chun

    2017-11-01

    This study applied open source technology to establish a subject-enabled analytics model that can enhance measurement statistics of case studies with the public health data in cloud computing. The infrastructure of the proposed model comprises three domains: 1) the health measurement data warehouse (HMDW) for the case study repository, 2) the self-developed modules of online health risk information statistics (HRIStat) for cloud computing, and 3) the prototype of a Web-based process automation system in statistics (PASIS) for the health risk assessment of case studies with subject-enabled evaluation. The system design employed freeware including Java applications, MySQL, and R packages to drive a health risk expert system (HRES). In the design, the HRIStat modules enforce the typical analytics methods for biomedical statistics, and the PASIS interfaces enable process automation of the HRES for cloud computing. The Web-based model supports both modes, step-by-step analysis and auto-computing process, respectively for preliminary evaluation and real-time computation. The proposed model was evaluated by computing prior researches in relation to the epidemiological measurement of diseases that were caused by either heavy metal exposures in the environment or clinical complications in hospital. The simulation validity was approved by the commercial statistics software. The model was installed in a stand-alone computer and in a cloud-server workstation to verify computing performance for a data amount of more than 230K sets. Both setups reached efficiency of about 10^5 sets per second. The Web-based PASIS interface can be used for cloud computing, and the HRIStat module can be flexibly expanded with advanced subjects for measurement statistics. The analytics procedure of the HRES prototype is capable of providing assessment criteria prior to estimating the potential risk to public health. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS

    PubMed Central

    Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.

    2016-01-01

    Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine information from auxiliary information to derive a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores, even when both are genome-wide significant (p < 5×10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560

  11. Seasonal trend analysis and ARIMA modeling of relative humidity and wind speed time series around Yamula Dam

    NASA Astrophysics Data System (ADS)

    Eymen, Abdurrahman; Köylü, Ümran

    2018-02-01

    Local climate change is determined by analysis of long-term recorded meteorological data. In the statistical analysis of the meteorological data, the Mann-Kendall rank test, one of the non-parametric tests, has been used; for determining the magnitude of the trend, the Theil-Sen method has been used on the data obtained from 16 meteorological stations. The stations cover the provinces of Kayseri, Sivas, Yozgat, and Nevşehir in the Central Anatolia region of Turkey. Changes in land use affect local climate. Dams are structures that cause major changes on the land. Yamula Dam is located 25 km northwest of Kayseri. The dam has a huge water body of approximately 85 km². The mentioned tests have been used for detecting the presence of any positive or negative trend in meteorological data. The meteorological data in relation to the seasonal average, maximum, and minimum values of the relative humidity and seasonal average wind speed have been organized as time series and the tests have been conducted accordingly. As a result of these tests, the following have been identified: an increase was observed in minimum relative humidity values in the spring, summer, and autumn seasons. As for the seasonal average wind speed, a decrease was detected for nine stations in all seasons, whereas an increase was observed in four stations. After the trend analysis, pre-dam mean relative humidity time series were modeled with the Autoregressive Integrated Moving Average (ARIMA) model, a statistical modeling tool. Post-dam relative humidity values were then predicted by the ARIMA models.
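
    The Mann-Kendall test and the Theil-Sen slope used here are short to implement. The sketch below uses the no-ties variance formula and a hypothetical relative-humidity series (one seasonal mean per year), not the stations' data; real analyses should also correct the variance for ties:

```python
import numpy as np

def mann_kendall_theil_sen(x):
    """Mann-Kendall S statistic with normal approximation, plus Theil-Sen slope."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)          # all pairs with i < j
    diffs = x[j] - x[i]
    s = np.sign(diffs).sum()                # MK S statistic
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance assuming no ties
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    slope = np.median(diffs / (j - i))      # Theil-Sen slope per time step
    return s, z, slope

# Hypothetical seasonal mean relative-humidity series (% per year).
rh = np.array([61.2, 61.8, 60.9, 62.4, 62.0, 63.1, 63.5, 62.9, 64.0, 64.6])
s, z, slope = mann_kendall_theil_sen(rh)
print(f"S = {s:.0f}, Z = {z:.2f}, Theil-Sen slope = {slope:.2f} %/year")
```

    A |Z| above 1.96 indicates a trend at the 5% level, and the sign of the Theil-Sen slope gives its direction, which is how the abstract's per-station increases and decreases would be classified.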

  12. Limited-information goodness-of-fit testing of diagnostic classification item response models.

    PubMed

    Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen

    2016-11-01

    Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G² suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M² have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M² statistic to diagnostic classification models. Through a series of simulation studies, we found that M² is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M² was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M², we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic X²LD for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The X²LD statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M² and X²LD statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144). © 2016 The British Psychological Society.

  13. A new in silico classification model for ready biodegradability, based on molecular fragments.

    PubMed

    Lombardo, Anna; Pizzo, Fabiola; Benfenati, Emilio; Manganaro, Alberto; Ferrari, Thomas; Gini, Giuseppina

    2014-08-01

    Regulations such as the European REACH (Registration, Evaluation, Authorization and restriction of Chemicals) often require chemicals to be evaluated for ready biodegradability, to assess the potential risk for environmental and human health. Because not all chemicals can be tested, there is an increasing demand for tools for quick and inexpensive biodegradability screening, such as computer-based (in silico) theoretical models. We developed an in silico model starting from a dataset of 728 chemicals with ready biodegradability data (MITI test, Ministry of International Trade and Industry). We used the novel software SARpy to automatically extract, through a structural fragmentation process, a set of substructures statistically related to ready biodegradability. We then analysed these substructures in order to build some general rules. The model consists of a rule-set combining the statistically relevant fragments with the expert-based rules. The model gives good statistical performance, with 92%, 82% and 76% accuracy on the training, test and external sets, respectively. These results are comparable with other in silico models such as BIOWIN, developed by the United States Environmental Protection Agency (EPA); moreover, this new model provides an easily understandable explanation. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. On the Benefits of Latent Variable Modeling for Norming Scales: The Case of the "Supports Intensity Scale--Children's Version"

    ERIC Educational Resources Information Center

    Seo, Hyojeong; Little, Todd D.; Shogren, Karrie A.; Lang, Kyle M.

    2016-01-01

    Structural equation modeling (SEM) is a powerful and flexible analytic tool to model latent constructs and their relations with observed variables and other constructs. SEM applications offer advantages over classical models in dealing with statistical assumptions and in adjusting for measurement error. So far, however, SEM has not been fully used…

  15. Syndromic surveillance models using Web data: the case of scarlet fever in the UK.

    PubMed

    Samaras, Loukas; García-Barriocanal, Elena; Sicilia, Miguel-Angel

    2012-03-01

    Recent research has shown the potential of Web queries as a source for syndromic surveillance, and existing studies show that these queries can be used as a basis for estimation and prediction of the development of a syndromic disease, such as influenza, using log linear (logit) statistical models. Two alternative models are applied to the relationship between cases and Web queries in this paper. We examine the applicability of statistical methods relating search engine queries to scarlet fever cases in the UK, taking advantage of tools to acquire the appropriate data from Google, and using an alternative statistical method based on gamma distributions. The results show that, with logit models, the Pearson correlation between Web queries and the data obtained from the official agencies must be over 0.90; otherwise the prediction of the peak and the spread of the distributions shows significant deviations. We describe the gamma distribution model and show that gamma transformations obtain better results in all cases, especially in those with a smaller correlation factor.
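    The paper's exact gamma transformation is not reproduced in this abstract; the following is an assumed minimal sketch of fitting a gamma-shaped epidemic curve to synthetic weekly counts by the method of moments, then reading off the predicted peak week (all data and parameter values are illustrative):

```python
import numpy as np
from scipy import stats

# Hypothetical weekly scarlet fever case counts over one season.
weeks = np.arange(1, 31)
true_curve = stats.gamma.pdf(weeks, a=6.0, scale=2.0)   # underlying epidemic shape
rng = np.random.default_rng(1)
cases = rng.poisson(1000 * true_curve)                  # noisy observed counts

# Method-of-moments gamma fit, weighting week indices by case counts.
mean = np.average(weeks, weights=cases)
var = np.average((weeks - mean) ** 2, weights=cases)
shape = mean**2 / var          # k = mean^2 / variance
scale = var / mean             # theta = variance / mean

peak_week = (shape - 1) * scale   # mode of the fitted gamma curve
print(f"k={shape:.2f}, theta={scale:.2f}, predicted peak week={peak_week:.1f}")
```

    With the true curve generated from a gamma with k=6 and theta=2, the fitted parameters recover the mode (week 10) closely even with Poisson noise.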

  16. Dynamic causal modelling: a critical review of the biophysical and statistical foundations.

    PubMed

    Daunizeau, J; David, O; Stephan, K E

    2011-09-15

    The goal of dynamic causal modelling (DCM) of neuroimaging data is to study experimentally induced changes in functional integration among brain regions. This requires (i) biophysically plausible and physiologically interpretable models of neuronal network dynamics that can predict distributed brain responses to experimental stimuli and (ii) efficient statistical methods for parameter estimation and model comparison. These two key components of DCM have been the focus of more than thirty methodological articles since the seminal work of Friston and colleagues published in 2003. In this paper, we provide a critical review of the current state-of-the-art of DCM. We inspect the properties of DCM in relation to the most common neuroimaging modalities (fMRI and EEG/MEG) and the specificity of inference on neural systems that can be made from these data. We then discuss both the plausibility of the underlying biophysical models and the robustness of the statistical inversion techniques. Finally, we discuss potential extensions of the current DCM framework, such as stochastic DCMs, plastic DCMs and field DCMs. Copyright © 2009 Elsevier Inc. All rights reserved.

  17. Machine learning methods reveal the temporal pattern of dengue incidence using meteorological factors in metropolitan Manila, Philippines.

    PubMed

    Carvajal, Thaddeus M; Viacrusis, Katherine M; Hernandez, Lara Fides T; Ho, Howell T; Amalin, Divina M; Watanabe, Kozo

    2018-04-17

    Several studies have applied ecological factors such as meteorological variables to develop models and accurately predict the temporal pattern of dengue incidence or occurrence. Despite the vast number of studies investigating this premise, the modeling approaches differ across studies and each typically uses only a single statistical technique, raising the question of which technique is robust and reliable. Hence, our study aims to compare the predictive accuracy of the temporal pattern of dengue incidence in Metropolitan Manila, as influenced by meteorological factors, across four modeling techniques: (a) Generalized Additive Modeling, (b) Seasonal Autoregressive Integrated Moving Average with exogenous variables, (c) Random Forest and (d) Gradient Boosting. Dengue incidence and meteorological data (flood, precipitation, temperature, southern oscillation index, relative humidity, wind speed and direction) for Metropolitan Manila from January 1, 2009 to December 31, 2013 were obtained from the respective government agencies. Two types of datasets were used in the analysis: observed meteorological factors (MF) and their corresponding delayed or lagged effects (LG). These datasets were then subjected to the four modeling techniques, and the predictive accuracy and variable importance of each technique were calculated and evaluated. Among the statistical modeling techniques, Random Forest showed the best predictive accuracy. Moreover, the delayed or lagged effects of the meteorological variables proved to be the better dataset for this purpose. Thus, the Random Forest model with delayed meteorological effects (RF-LG) was deemed the best among all assessed models. Relative humidity was the most important meteorological factor in the best model. 
The study showed that the statistical modeling techniques generate different predictive outcomes and revealed the Random Forest model with delayed meteorological effects to be the best at predicting the temporal pattern of dengue incidence in Metropolitan Manila. It is also noteworthy that the study identified relative humidity, along with rainfall and temperature, as an important meteorological factor influencing this temporal pattern.
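    The lag-feature construction behind an RF-LG-style model can be sketched as follows; this is an assumed illustration with synthetic data (the variable names, lag depth and the driving relation are hypothetical, not taken from the study), using scikit-learn's RandomForestRegressor:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_lagged(X, lags):
    """Stack lagged copies of each column: values at t-1, ..., t-lags."""
    cols = [np.roll(X, lag, axis=0)[lags:] for lag in range(1, lags + 1)]
    return np.hstack(cols)

# Hypothetical weekly data: columns = rainfall, temperature, humidity.
rng = np.random.default_rng(7)
n = 260
met = rng.normal(size=(n, 3))
# Assume incidence responds to humidity two weeks earlier.
dengue = 10 + 3 * np.roll(met[:, 2], 2) + rng.normal(0.0, 0.5, n)

lags = 4
X_lag = make_lagged(met, lags)   # (n - lags) rows, 3 * lags features
y = dengue[lags:]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_lag, y)
# Feature 5 is humidity at lag 2; its importance should dominate.
print(rf.feature_importances_.round(2))
```

    Inspecting the fitted importances recovers the planted lag structure, mirroring how the study ranked meteorological variables by importance.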

  18. SPSS and SAS programming for the testing of mediation models.

    PubMed

    Dudley, William N; Benuzillo, Jose G; Carrico, Mineh S

    2004-01-01

    Mediation modeling can explain the nature of the relation among three or more variables. In addition, it can be used to show how a variable mediates the relation between levels of intervention and outcome. The Sobel test, developed in 1982, provides a statistical method for determining whether a mediator carries the influence of an intervention to an outcome. Although interactive Web-based and stand-alone methods exist for computing the Sobel test, SPSS and SAS programs that automatically run the required regression analyses and computations increase the accessibility of mediation modeling to nursing researchers. The aims of this article are to illustrate the utility of the Sobel test and to make this programming available to the Nursing Research audience in both SAS and SPSS. The history, logic, and technical aspects of mediation testing are introduced. The syntax files sobel.sps and sobel.sas, created to automate the computation of the regression analyses and test statistic, are available from the corresponding author. The reported programming allows users to complete mediation testing with their own data in a single step. A technical manual included with the programming provides instruction on program use and interpretation of the output. Mediation modeling is a useful tool for describing the relation among three or more variables. Programming and manuals for using this model are made available.
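    The computation the SPSS/SAS syntax automates can be sketched in Python (this is a minimal illustration with simulated data, not the authors' programs), using the standard first-order Sobel formula z = ab / sqrt(b²·SE_a² + a²·SE_b²):

```python
import numpy as np
from scipy import stats

def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z: ab / sqrt(b^2 se_a^2 + a^2 se_b^2)."""
    return (a * b) / np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

# Simulated mediation: X -> M -> Y, with a direct X -> Y path.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)                 # mediator model
y = 0.4 * m + 0.2 * x + rng.normal(size=n)       # outcome model

# Path a: regression of M on X.
res_a = stats.linregress(x, m)
a, se_a = res_a.slope, res_a.stderr

# Path b: regression of Y on M controlling for X (via least squares).
X = np.column_stack([np.ones(n), m, x])
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 3)
cov = sigma2 * np.linalg.inv(X.T @ X)
b, se_b = beta[1], np.sqrt(cov[1, 1])

z = sobel_z(a, se_a, b, se_b)
p = 2 * stats.norm.sf(abs(z))
print(f"a={a:.3f}, b={b:.3f}, Sobel z={z:.2f}, p={p:.4f}")
```

    With a genuine mediated effect built into the simulated data, the Sobel z is large and the mediated path a·b is declared significant.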

  19. Component Models for Fuzzy Data

    ERIC Educational Resources Information Center

    Coppi, Renato; Giordani, Paolo; D'Urso, Pierpaolo

    2006-01-01

    The fuzzy perspective in statistical analysis is first illustrated with reference to the "Informational Paradigm" allowing us to deal with different types of uncertainties related to the various informational ingredients (data, model, assumptions). The fuzzy empirical data are then introduced, referring to "J" LR fuzzy variables as observed on "I"…

  20. Computational modeling of driver speed control with its applications in developing intelligent transportation system to prevent speeding-related accidents.

    DOT National Transportation Integrated Search

    2013-08-01

    Speeding is the leading contributing factor in fatal accidents in NY state, according to NY State Department of Motor Vehicle Accidents Statistical Summary (2009). Understanding and modeling speeding and speed control is one of the major challenges i...

  1. Two-dimensional random surface model for asperity-contact in elastohydrodynamic lubrication

    NASA Technical Reports Server (NTRS)

    Coy, J. J.; Sidik, S. M.

    1979-01-01

    Relations for the asperity-contact time function during elastohydrodynamic lubrication of a ball bearing are presented. The analysis is based on a two-dimensional random surface model, and actual profile traces of the bearing surfaces are used as statistical sample records. The results of the analysis show that transition from 90 percent contact to 1 percent contact occurs within a dimensionless film thickness range of approximately four to five. This thickness ratio is several times larger than reported in the literature, where one-dimensional random surface models were used. It is shown that low-pass filtering of the statistical records will bring agreement between the present results and those in the literature.

  2. Statistics of voids in hierarchical universes

    NASA Technical Reports Server (NTRS)

    Fry, J. N.

    1986-01-01

    As one alternative to the N-point galaxy correlation function statistics, the distribution of holes, or the probability that a volume of given size and shape be empty of galaxies, can be considered. The probability of voids resulting from a variety of hierarchical patterns of clustering is considered, and these are compared with the results of numerical simulations and with observations. A scaling relation required by the hierarchical pattern of higher order correlation functions is seen to be obeyed in the simulations, and the numerical results show a clear difference between neutrino models and cold-particle models; voids are more likely in neutrino universes. Observational data do not yet distinguish between models but are close to being able to do so.

  3. Experience and Sentence Processing: Statistical Learning and Relative Clause Comprehension

    PubMed Central

    Wells, Justine B.; Christiansen, Morten H.; Race, David S.; Acheson, Daniel J.; MacDonald, Maryellen C.

    2009-01-01

    Many explanations of the difficulties associated with interpreting object relative clauses appeal to the demands that object relatives make on working memory. MacDonald and Christiansen (2002) pointed to variations in reading experience as a source of differences, arguing that the unique word order of object relatives makes their processing more difficult and more sensitive to the effects of previous experience than the processing of subject relatives. This hypothesis was tested in a large-scale study manipulating reading experiences of adults over several weeks. The group receiving relative clause experience increased reading speeds for object relatives more than for subject relatives, whereas a control experience group did not. The reading time data were compared to performance of a computational model given different amounts of experience. The results support claims for experience-based individual differences and an important role for statistical learning in sentence comprehension processes. PMID:18922516

  4. Statistical alignment: computational properties, homology testing and goodness-of-fit.

    PubMed

    Hein, J; Wiuf, C; Knudsen, B; Møller, M B; Wibling, G

    2000-09-08

    The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical alignment algorithms several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum likelihood estimate. In addition, the recursions originally presented by Thorne, Kishino and Felsenstein can be simplified. Two proteins, about 1500 amino acids long, can be analysed with this method in less than five seconds on a fast desktop computer, which makes this method practical for actual data analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test, that allows testing the proposed insertion-deletion (indel) process inherent to this model and find that real sequences (here globins) probably experience indels longer than one, contrary to what is assumed by the model. Copyright 2000 Academic Press.

  5. Sensation seeking and smoking behaviors among adolescents in the Republic of Korea.

    PubMed

    Hwang, Heejin; Park, Sunhee

    2015-06-01

    This study aimed to explore the relationship between the four components of sensation seeking (i.e., disinhibition, thrill and adventure seeking, experience seeking, and boredom susceptibility) and three types of smoking behavior (i.e., non-smoking, experimental smoking, and current smoking) among high school students in the Republic of Korea. Multivariate multinomial logistic regression analysis was performed using two models. In Model 1, the four subscales of sensation seeking were used as covariates, and in Model 2, other control factors (i.e., characteristics related to demographics, individuals, family, school, and friends) were added to Model 1 in order to adjust for their effects. In Model 1, the impact of disinhibition on experimental smoking and current smoking was statistically significant. In Model 2, the influence of disinhibition on both of these smoking behaviors remained statistically significant after controlling for all the other covariates. Also, the effect of thrill and adventure seeking on experimental smoking was statistically significant. The two statistically significant subscales of sensation seeking were positively associated with the risk of smoking behaviors. According to extant literature and current research, sensation seeking, particularly disinhibition, is strongly associated with smoking among youth. Therefore, sensation seeking should be measured among adolescents to identify those who are at greater risk of smoking and to develop more effective intervention strategies in order to curb the smoking epidemic among youth. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. On the Relationship between Molecular Hit Rates in High-Throughput Screening and Molecular Descriptors.

    PubMed

    Hansson, Mari; Pemberton, John; Engkvist, Ola; Feierberg, Isabella; Brive, Lars; Jarvis, Philip; Zander-Balderud, Linda; Chen, Hongming

    2014-06-01

    High-throughput screening (HTS) is widely used in the pharmaceutical industry to identify novel chemical starting points for drug discovery projects. The current study focuses on the relationship between molecular hit rate in recent in-house HTS and four common molecular descriptors: lipophilicity (ClogP), size (heavy atom count, HEV), fraction of sp³-hybridized carbons (Fsp3), and fraction of molecular framework (f(MF)). The molecular hit rate is defined as the fraction of times the molecule has been assigned as active in the HTS campaigns where it has been screened. Beta-binomial statistical models were built to model the molecular hit rate as a function of these descriptors. The advantage of the beta-binomial statistical models is that the correlation between the descriptors is taken into account. Higher-degree polynomial terms of the descriptors were also added to the beta-binomial statistical model to improve model fit. The relative influence of the different molecular descriptors on molecular hit rate was then estimated from the fitted beta-binomial models. The results show that ClogP has the largest influence on the molecular hit rate, followed by Fsp3 and HEV. f(MF) has only a minor influence besides its correlation with the other molecular descriptors. © 2013 Society for Laboratory Automation and Screening.
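    The core beta-binomial idea (each compound's hit count over its screens is overdispersed relative to a binomial) can be sketched with a maximum-likelihood fit; this is an assumed minimal illustration on synthetic data without the descriptor regression, using scipy's betabinom distribution:

```python
import numpy as np
from scipy import stats, optimize

# Hypothetical per-compound data: times screened (n_i) and times active (k_i).
rng = np.random.default_rng(3)
n_i = rng.integers(5, 40, size=2000)
p_i = rng.beta(2.0, 30.0, size=2000)        # compound-level hit propensity
k_i = rng.binomial(n_i, p_i)                # observed hit counts

def neg_loglik(params):
    a, b = np.exp(params)                   # keep alpha, beta positive
    return -stats.betabinom.logpmf(k_i, n_i, a, b).sum()

res = optimize.minimize(neg_loglik, x0=np.log([1.0, 10.0]), method="Nelder-Mead")
alpha, beta = np.exp(res.x)
print(f"alpha={alpha:.2f}, beta={beta:.2f}, mean hit rate={alpha/(alpha+beta):.3f}")
```

    In the full model of the paper the beta parameters are further linked to ClogP, HEV, Fsp3 and f(MF); here the fit simply recovers the population hit-rate distribution (true alpha=2, beta=30).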

  7. Comparison of parameterized nitric acid rainout rates using a coupled stochastic-photochemical tropospheric model

    NASA Technical Reports Server (NTRS)

    Stewart, Richard W.; Thompson, Anne M.; Owens, Melody A.; Herwehe, Jerold A.

    1989-01-01

    A major tropospheric loss of soluble species such as nitric acid results from scavenging by water droplets. Several theoretical formulations have been advanced which relate an effective time-independent loss rate for soluble species to statistical properties of precipitation such as the wet fraction and length of a precipitation cycle. In this paper, various 'effective' loss rates that have been proposed are compared with the results of detailed time-dependent model calculations carried out over a seasonal time scale. The model is a stochastic precipitation model coupled to a tropospheric photochemical model. The results of numerous time-dependent seasonal model runs are used to derive numerical values for the nitric acid residence time for several assumed sets of precipitation statistics. These values are then compared with the results obtained by utilizing theoretical 'effective' loss rates in time-independent models.

  8. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarray is a high-throughput technology to measure gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  9. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable.

    PubMed

    Austin, Peter C; Steyerberg, Ewout W

    2012-06-20

    When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression for the c-statistic was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
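    The binormal result can be checked numerically: with unequal variances the predicted c-statistic is Phi((mu1 - mu0) / sqrt(sd0² + sd1²)), which can be compared against the empirical c computed in its Mann-Whitney form (a minimal sketch with simulated data, not the authors' simulation design):

```python
import numpy as np
from scipy import stats

def c_binormal(mu0, mu1, sd0, sd1):
    """Predicted c-statistic under binormality: Phi(delta / sqrt(sd0^2 + sd1^2))."""
    return stats.norm.cdf((mu1 - mu0) / np.sqrt(sd0**2 + sd1**2))

# Simulate a marker that is normal in both groups.
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, 50_000)   # without the condition
x1 = rng.normal(1.0, 1.0, 50_000)   # with the condition

# Empirical c = P(x1 > x0): Mann-Whitney / rank form of the AUC.
order = np.argsort(np.concatenate([x0, x1]), kind="stable")
ranks = np.empty(order.size, dtype=float)
ranks[order] = np.arange(1, order.size + 1)
r1 = ranks[x0.size:].sum()
c_emp = (r1 - x1.size * (x1.size + 1) / 2) / (x0.size * x1.size)

print(f"predicted={c_binormal(0, 1, 1, 1):.4f}, empirical={c_emp:.4f}")
```

    For a standardized difference of 1 with equal unit variances, both values land at Phi(1/sqrt(2)) ≈ 0.760.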

  10. Bird-landscape relations in the Chihuahuan Desert: Coping with uncertainties about predictive models

    USGS Publications Warehouse

    Gutzwiller, K.J.; Barrow, W.C.

    2001-01-01

    During the springs of 1995-1997, we studied birds and landscapes in the Chihuahuan Desert along part of the Texas-Mexico border. Our objectives were to assess bird-landscape relations and their interannual consistency and to identify ways to cope with associated uncertainties that undermine confidence in using such relations in conservation decision processes. Bird distributions were often significantly associated with landscape features, and many bird-landscape models were valid and useful for predictive purposes. Differences in early spring rainfall appeared to influence bird abundance, but there was no evidence that annual differences in bird abundance affected model consistency. Model consistency for richness (42%) was higher than mean model consistency for 26 focal species (mean 30%, range 0-67%), suggesting that relations involving individual species are, on average, more subject to factors that cause variation than are richness-landscape relations. Consistency of bird-landscape relations may be influenced by such factors as plant succession, exotic species invasion, bird species' tolerances for environmental variation, habitat occupancy patterns, and variation in food density or weather. The low model consistency that we observed for most species indicates the high variation in bird-landscape relations that managers and other decision makers may encounter. The uncertainty of interannual variation in bird-landscape relations can be reduced by using projections of bird distributions from different annual models to determine the likely range of temporal and spatial variation in a species' distribution. 
Stochastic simulation models can be used to incorporate the uncertainty of random environmental variation into predictions of bird distributions based on bird-landscape relations and to provide probabilistic projections with which managers can weigh the costs and benefits of various decisions. Uncertainty about the true structure of bird-landscape relations (structural uncertainty) can be reduced by ensuring that models meet important statistical assumptions, designing studies with sufficient statistical power, validating the predictive ability of models, and improving model accuracy through continued field sampling and model fitting. Uncertainty associated with sampling variation (partial observability) can be reduced by ensuring that sample sizes are large enough to provide precise estimates of both bird and landscape parameters. By decreasing the uncertainty due to partial observability, managers will improve their ability to reduce structural uncertainty.

  11. The statistical evaluation and comparison of ADMS-Urban model for the prediction of nitrogen dioxide with air quality monitoring network.

    PubMed

    Dėdelė, Audrius; Miškinytė, Auksė

    2015-09-01

    In many countries, road traffic is one of the main sources of air pollution associated with adverse effects on human health and the environment. Nitrogen dioxide (NO2) is considered to be a measure of traffic-related air pollution, with concentrations tending to be higher near highways, along busy roads, and in city centers, and exceedances are mainly observed at measurement stations located close to traffic. In order to assess the air quality in the city and the air pollution impact on public health, air quality models are used. However, before a model can be used for these purposes, it is important to evaluate the accuracy of dispersion modelling, one of the most widely used methods. Monitoring and dispersion modelling are the two components of an air quality monitoring system (AQMS) that were statistically compared in this research. The evaluation of the Atmospheric Dispersion Modelling System (ADMS-Urban) was made by comparing monthly modelled NO2 concentrations with the data of continuous air quality monitoring stations in Kaunas city. The statistical measures of model performance were calculated for annual and monthly concentrations of NO2 for each monitoring station site. The spatial analysis was made using geographic information systems (GIS). The calculation of statistical parameters indicated a good ADMS-Urban model performance for the prediction of NO2. The results of this study showed that the agreement of modelled values and observations was better for traffic monitoring stations compared to the background and residential stations.
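    The abstract does not list which statistical measures were used; a typical set for dispersion-model evaluation (fractional bias, normalised mean square error, fraction within a factor of two, and Pearson correlation) can be sketched as follows, with illustrative NO2 values that are assumptions, not the study's data:

```python
import numpy as np

def performance_stats(obs, mod):
    """Common model-evaluation measures for paired observed/modelled values."""
    fb = 2 * (obs.mean() - mod.mean()) / (obs.mean() + mod.mean())   # fractional bias
    nmse = ((obs - mod) ** 2).mean() / (obs.mean() * mod.mean())     # normalised MSE
    fac2 = np.mean((mod / obs >= 0.5) & (mod / obs <= 2.0))          # fraction within 2x
    r = np.corrcoef(obs, mod)[0, 1]                                  # Pearson correlation
    return {"FB": fb, "NMSE": nmse, "FAC2": fac2, "r": r}

# Hypothetical monthly NO2 concentrations (ug/m3) at one monitoring station.
obs = np.array([28.1, 31.4, 25.0, 19.8, 16.2, 14.9, 15.5, 17.0, 21.3, 26.7, 30.2, 33.5])
mod = np.array([25.6, 29.9, 23.8, 21.1, 15.0, 13.7, 16.2, 18.4, 20.1, 24.9, 28.8, 30.6])

for name, value in performance_stats(obs, mod).items():
    print(f"{name}: {value:.3f}")
```

    A well-performing model typically shows |FB| near 0, small NMSE, FAC2 near 1, and high correlation.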

  12. Students' Emergent Articulations of Statistical Models and Modeling in Making Informal Statistical Inferences

    ERIC Educational Resources Information Center

    Braham, Hana Manor; Ben-Zvi, Dani

    2017-01-01

    A fundamental aspect of statistical inference is representation of real-world data using statistical models. This article analyzes students' articulations of statistical models and modeling during their first steps in making informal statistical inferences. An integrated modeling approach (IMA) was designed and implemented to help students…

  13. Incorporating GIS and remote sensing for census population disaggregation

    NASA Astrophysics Data System (ADS)

    Wu, Shuo-Sheng 'Derek'

    Census data are the primary source of demographic data for a variety of research and applications. For confidentiality and administrative purposes, census data are usually released to the public by aggregated areal units. In the United States, the smallest census unit is the census block. Due to data aggregation, users of census data may have problems in visualizing population distribution within census blocks and estimating population counts for areas not coinciding with census block boundaries. The main purpose of this study is to develop methodology for estimating sub-block areal populations and assessing the estimation errors. The City of Austin, Texas was used as a case study area. Based on tax parcel boundaries and parcel attributes derived from ancillary GIS and remote sensing data, detailed urban land use classes were first classified using a per-field approach. After that, statistical models by land use class were built to infer population density from other predictor variables, including four census demographic statistics (the Hispanic percentage, the married percentage, the unemployment rate, and per capita income) and three physical variables derived from remote sensing images and building footprints vector data (a landscape heterogeneity statistic, a building pattern statistic, and a building volume statistic). In addition to statistical models, deterministic models were proposed to directly infer populations from building volumes and three housing statistics: the average space per housing unit, the housing unit occupancy rate, and the average household size. After population models were derived or proposed, how well the models predict populations for another set of sample blocks was assessed. The results show that deterministic models were more accurate than statistical models. 
Further, by simulating base units for modeling through aggregating blocks, I assessed how well the deterministic models estimate sub-unit-level populations. I also assessed the aggregation effects and the rescaling effects on sub-unit estimates. Lastly, from another set of mixed-land-use sample blocks, a mixed-land-use model was derived and compared with a residential-land-use model. The results of the per-field land use classification are satisfactory, with a Kappa statistic of 0.747. Model assessments by land use show that population estimates for multi-family land use areas have higher errors than those for single-family land use areas, and population estimates for mixed land use areas have higher errors than those for residential land use areas. The assessments of sub-unit estimates using a simulation approach indicate that smaller areas show higher estimation errors, estimation errors do not relate to the base unit size, and rescaling improves all levels of sub-unit estimates.
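    The deterministic model described above combines building volume with the three housing statistics; a minimal sketch (the parcel values below are hypothetical, not from the Austin case study):

```python
def deterministic_population(building_volume_m3, space_per_unit_m3,
                             occupancy_rate, household_size):
    """Population = housing units x occupancy rate x persons per household."""
    units = building_volume_m3 / space_per_unit_m3   # implied housing units
    return units * occupancy_rate * household_size

# Hypothetical parcel: 12,000 m3 of residential building volume,
# 400 m3 per housing unit, 93% occupancy, 2.4 persons per household.
est = deterministic_population(12_000, 400.0, 0.93, 2.4)
print(f"estimated population: {est:.0f}")
```

    Summing such parcel-level estimates over any polygon yields the sub-block population estimates the study evaluates.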

  14. Relating triggering processes in lab experiments with earthquakes.

    NASA Astrophysics Data System (ADS)

    Baro Urbea, J.; Davidsen, J.; Kwiatek, G.; Charalampidou, E. M.; Goebel, T.; Stanchits, S. A.; Vives, E.; Dresen, G.

    2016-12-01

    Statistical relations such as Gutenberg-Richter's, Omori-Utsu's and the productivity of aftershocks were first observed in seismology, but are also common to other physical phenomena exhibiting avalanche dynamics such as solar flares, rock fracture, structural phase transitions and even stock market transactions. All these examples exhibit spatio-temporal correlations that can be explained as triggering processes: instead of being activated as a response to external driving or fluctuations, some events are a consequence of previous activity. Although different plausible explanations have been suggested in each system, the origin of the ubiquity of such statistical laws remains unknown. However, the case of rock fracture may exhibit a physical connection with seismology. It has been suggested that some features of seismology have a microscopic origin and are reproducible over a vast range of scales. This hypothesis has motivated mechanical experiments to generate artificial catalogues of earthquakes at a laboratory scale, so-called 'labquakes', under controlled conditions. Microscopic fractures in lab tests release elastic waves that are recorded as ultrasonic (kHz-MHz) acoustic emission (AE) events by means of piezoelectric transducers. Here, we analyse the statistics of labquakes recorded during the failure of small samples of natural rocks and artificial porous materials under different controlled compression regimes. Temporal and spatio-temporal correlations are identified in certain cases. Specifically, we distinguish between background and triggered events, revealing some differences in their statistical properties. We fit the data to statistical models of seismicity. As a particular case, we explore the branching process approach simplified in the Epidemic Type Aftershock Sequence (ETAS) model. We evaluate the empirical spatio-temporal kernel of the model and investigate the physical origins of triggering. 
Our analysis of the focal mechanisms implies that the occurrence of the empirical laws extends well beyond purely frictional sliding events, in contrast to what is often assumed.

  15. OPR-PPR, a Computer Program for Assessing Data Importance to Model Predictions Using Linear Statistics

    USGS Publications Warehouse

    Tonkin, Matthew J.; Tiedeman, Claire; Ely, D. Matthew; Hill, Mary C.

    2007-01-01

    The OPR-PPR program calculates the Observation-Prediction (OPR) and Parameter-Prediction (PPR) statistics that can be used to evaluate the relative importance of various kinds of data to simulated predictions. The data considered fall into three categories: (1) existing observations, (2) potential observations, and (3) potential information about parameters. The first two are addressed by the OPR statistic; the third is addressed by the PPR statistic. The statistics are based on linear theory and measure the leverage of the data, which depends on the location, the type, and possibly the time of the data being considered. For example, in a ground-water system the type of data might be a head measurement at a particular location and time. As a measure of leverage, the statistics do not take into account the value of the measurement. As linear measures, the OPR and PPR statistics require minimal computational effort once sensitivities have been calculated. Sensitivities need to be calculated for only one set of parameter values; commonly these are the values estimated through model calibration. OPR-PPR can calculate the OPR and PPR statistics for any mathematical model that produces the necessary OPR-PPR input files. In this report, OPR-PPR capabilities are presented in the context of using the ground-water model MODFLOW-2000 and the universal inverse program UCODE_2005. The method used to calculate the OPR and PPR statistics is based on the linear equation for prediction standard deviation. 
Using sensitivities and other information, OPR-PPR calculates (a) the percent increase in the prediction standard deviation that results when one or more existing observations are omitted from the calibration data set; (b) the percent decrease in the prediction standard deviation that results when one or more potential observations are added to the calibration data set; or (c) the percent decrease in the prediction standard deviation that results when potential information on one or more parameters is added.
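The OPR calculation described above can be illustrated with a toy linear model. The sketch below is not the OPR-PPR program: the two-parameter setup, unit weights, and the sensitivity values are illustrative assumptions, but the recipe is the same (prediction standard deviation from linear theory, recomputed with one observation omitted).

```python
def xtx(X):
    """X^T X for an n-by-2 sensitivity matrix (unit weights assumed)."""
    a = sum(row[0] * row[0] for row in X)
    b = sum(row[0] * row[1] for row in X)
    d = sum(row[1] * row[1] for row in X)
    return [[a, b], [b, d]]

def inv2(M):
    """Inverse of a 2x2 matrix via the cofactor formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def pred_sd(X, z):
    """Linear-theory prediction standard deviation sqrt(z C z^T), where
    C = (X^T X)^(-1) is the parameter covariance (unit error variance)."""
    C = inv2(xtx(X))
    var = sum(z[i] * C[i][j] * z[j] for i in range(2) for j in range(2))
    return var ** 0.5

def opr_percent(X, z, i):
    """Percent increase in prediction SD when observation i is omitted."""
    return 100.0 * (pred_sd(X[:i] + X[i + 1:], z) / pred_sd(X, z) - 1.0)
```

Because omitting a row can only remove information from X^T X, the percent change is never negative; the observation with the largest OPR value is the one whose loss would most degrade the prediction.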

  16. ENSO's non-stationary and non-Gaussian character: the role of climate shifts

    NASA Astrophysics Data System (ADS)

    Boucharel, J.; Dewitte, B.; Garel, B.; Du Penhoat, Y.

    2009-07-01

El Niño Southern Oscillation (ENSO) is the dominant mode of climate variability in the Pacific, having socio-economic impacts on surrounding regions. ENSO exhibits significant modulation on decadal to inter-decadal time scales which is related to changes in its characteristics (onset, amplitude, frequency, propagation, and predictability). Some of these characteristics tend to be overlooked in ENSO studies, such as its asymmetry (the number and amplitude of warm and cold events are not equal) and the deviation of its statistics from those of the Gaussian distribution. These properties could be related to the ability of the current generation of coupled models to predict ENSO and its modulation. Here, ENSO's non-Gaussian nature and asymmetry are diagnosed from in situ data and a variety of models (from intermediate complexity models to full-physics coupled general circulation models (CGCMs)) using robust statistical tools initially designed for financial mathematics studies. In particular, α-stable laws are used as theoretical background material to measure (and quantify) the non-Gaussian character of ENSO time series and to estimate the skill of ``naïve'' statistical models in producing deviation from Gaussian laws and asymmetry. The latter are based on non-stationary processes dominated by abrupt changes in mean state and empirical variance. It is shown that the α-stable character of ENSO may result from the presence of climate shifts in the time series. Also, cool (warm) periods are associated with ENSO statistics having a stronger (weaker) tendency towards Gaussianity and lower (greater) asymmetry. This supports the hypothesis of ENSO being rectified by changes in mean state through nonlinear processes. The relationship between changes in mean state and nonlinearity (skewness) is further investigated both in the Zebiak and Cane (1987) model and the models of the Intergovernmental Panel on Climate Change (IPCC).
Whereas there is a clear relationship in all models between ENSO asymmetry (as measured by skewness or nonlinear advection) and changes in mean state, they exhibit a variety of behaviour with regard to α-stability. This suggests that the dynamics associated with climate shifts and the occurrence of extreme events involve higher-order statistical moments that cannot be accounted for solely by nonlinear advection.

  17. The MAX Statistic is Less Powerful for Genome Wide Association Studies Under Most Alternative Hypotheses.

    PubMed

    Shifflett, Benjamin; Huang, Rong; Edland, Steven D

    2017-01-01

Genotypic association studies are prone to inflated type I error rates if multiple hypothesis testing is performed, e.g., sequentially testing for recessive, multiplicative, and dominant risk. Alternatives to multiple hypothesis testing include the model-independent genotypic χ2 test, the efficiency robust MAX statistic, which corrects for multiple comparisons but with some loss of power, or a single Armitage test for multiplicative trend, which has optimal power when the multiplicative model holds but with some loss of power when dominant or recessive models underlie the genetic association. We used Monte Carlo simulations to describe the relative performance of these three approaches under a range of scenarios. All three approaches maintained their nominal type I error rates. The genotypic χ2 and MAX statistics were more powerful when testing a strictly recessive genetic effect or when testing a dominant effect when the allele frequency was high. The Armitage test for multiplicative trend was most powerful for the broad range of scenarios where heterozygote risk is intermediate between recessive and dominant risk. Moreover, all tests had limited power to detect recessive genetic risk unless the sample size was large, and conversely all tests were relatively well powered to detect dominant risk. Taken together, these results suggest the general utility of the multiplicative trend test when the underlying genetic model is unknown.
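For illustration, the Armitage test for multiplicative trend discussed above can be sketched in a few lines. The genotype counts below are made up, and this sketch omits small-sample corrections, so it is an approximation rather than the exact procedure used in the study.

```python
import math

def armitage_trend_z(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend Z statistic for per-genotype case/control
    counts; scores 0/1/2 give the multiplicative (additive-in-dose) trend."""
    n = [a + b for a, b in zip(cases, controls)]   # per-genotype totals
    N = sum(n)
    R = sum(cases)                                 # total cases
    # Score-weighted excess of observed cases over their expected counts
    U = sum(s * (r - ni * R / N) for s, r, ni in zip(scores, cases, n))
    p = R / N
    var = p * (1 - p) * (sum(s * s * ni for s, ni in zip(scores, n))
                         - sum(s * ni for s, ni in zip(scores, n)) ** 2 / N)
    return U / math.sqrt(var)
```

Squaring Z gives a 1-df chi-square trend statistic; the MAX statistic instead takes the maximum of such Z values over recessive, additive, and dominant score sets, with a null distribution adjusted for the triple comparison.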

  18. Modeling stimulus variation in three common implicit attitude tasks.

    PubMed

    Wolsiefer, Katie; Westfall, Jacob; Judd, Charles M

    2017-08-01

We explored the consequences of ignoring the sampling variation due to stimuli in the domain of implicit attitudes. A large literature in psycholinguistics has examined the statistical treatment of random stimulus materials, but the recommendations from this literature have not been applied to the social psychological literature on implicit attitudes. This is partly because of inherent complications in applying crossed random-effect models to some of the most common implicit attitude tasks, and partly because no work to date has demonstrated that random stimulus variation is in fact consequential in implicit attitude measurement. We addressed this problem by laying out statistically appropriate and practically feasible crossed random-effect models for three of the most commonly used implicit attitude measures (the Implicit Association Test, the affect misattribution procedure, and the evaluative priming task) and then applying these models to large datasets (average N = 3,206) that assess participants' implicit attitudes toward race, politics, and self-esteem. We showed that the test statistics from the traditional analyses are substantially (about 60%) inflated relative to the more appropriate analyses that incorporate stimulus variation. Because all three tasks used the same stimulus words and faces, we could also meaningfully compare the relative contributions of stimulus variation across the tasks. In an appendix, we give syntax in R, SAS, and SPSS for fitting the recommended crossed random-effects models to data from all three tasks, as well as instructions on how to structure the data file.

  19. Modelling nitrate pollution pressure using a multivariate statistical approach: the case of Kinshasa groundwater body, Democratic Republic of Congo

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik

    2016-03-01

    A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.

  20. Effects of Heterogeneity on Spatial Pattern Analysis of Wild Pistachio Trees in Zagros Woodlands, Iran

    NASA Astrophysics Data System (ADS)

    Erfanifard, Y.; Rezayan, F.

    2014-10-01

    Vegetation heterogeneity biases second-order summary statistics, e.g., Ripley's K-function, applied for spatial pattern analysis in ecology. Second-order investigation based on Ripley's K-function and related statistics (i.e., the L- and pair correlation function g) is widely used in ecology to develop hypotheses on underlying processes by characterizing spatial patterns of vegetation. The aim of this study was to demonstrate the effects of underlying heterogeneity of wild pistachio (Pistacia atlantica Desf.) trees on the second-order summary statistics of point pattern analysis in a part of the Zagros woodlands, Iran. The spatial distribution of 431 wild pistachio trees was accurately mapped in a 40 ha stand in the Wild Pistachio & Almond Research Site, Fars province, Iran. Three commonly used second-order summary statistics (i.e., the K-, L-, and g-functions) were applied to analyse their spatial pattern. The two-sample Kolmogorov-Smirnov goodness-of-fit test showed that the observed pattern significantly followed an inhomogeneous Poisson process null model in the study region. The results also showed that the heterogeneous pattern of the wild pistachio trees biased the homogeneous forms of the K-, L-, and g-functions, indicating a stronger aggregation of the trees at scales of 0-50 m than actually existed, and an apparent aggregation at scales of 150-200 m where the trees were in fact regularly distributed. Consequently, we showed that heterogeneity of point patterns may bias the results of homogeneous second-order summary statistics, and we suggest applying inhomogeneous summary statistics with related null models for the spatial pattern analysis of heterogeneous vegetation.
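As a rough sketch of the homogeneous second-order statistic discussed above, Ripley's K can be estimated from mapped point coordinates as follows. This toy version deliberately omits the edge corrections and the inhomogeneous reweighting (by local intensity) that a real analysis such as this one would use; the three-point pattern in the test is purely illustrative.

```python
def ripley_k(points, r, area):
    """Naive Ripley's K estimate (no edge correction):
    K(r) = A / (n(n-1)) * #{ordered pairs (i, j), i != j, with d_ij <= r}.
    Under complete spatial randomness, K(r) is approximately pi * r**2."""
    n = len(points)
    pairs = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i == j:
                continue
            xj, yj = points[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= r * r:
                pairs += 1
    return area * pairs / (n * (n - 1))
```

Values of K(r) above pi * r**2 suggest aggregation at scale r and values below suggest regularity, which is exactly the comparison that an underlying intensity gradient can distort.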

  1. An Empirical Comparison of Selected Two-Sample Hypothesis Testing Procedures Which Are Locally Most Powerful Under Certain Conditions.

    ERIC Educational Resources Information Center

    Hoover, H. D.; Plake, Barbara

    The relative power of the Mann-Whitney statistic, the t-statistic, the median test, a test based on exceedances (A,B), and two special cases of (A,B), the Tukey quick test and the revised Tukey quick test, was investigated via a Monte Carlo experiment. These procedures were compared across four population probability models: uniform, beta, normal,…

  2. Detection of crossover time scales in multifractal detrended fluctuation analysis

    NASA Astrophysics Data System (ADS)

    Ge, Erjia; Leung, Yee

    2013-04-01

    Fractal analysis is employed in this paper as a scale-based method for identifying the scaling behavior of time series. Many spatial and temporal processes exhibiting complex multi(mono)-scaling behaviors are fractals. One of the important concepts in fractals is the crossover time scale(s) that separates distinct regimes having different fractal scaling behaviors. A common method of analysis is multifractal detrended fluctuation analysis (MF-DFA). The detection of crossover time scale(s) is, however, relatively subjective, since it has been made without rigorous statistical procedures and has generally been determined by eyeballing or subjective observation. Crossover time scales so determined may be spurious and problematic, and may not reflect the genuine underlying scaling behavior of a time series. The purpose of this paper is to propose a statistical procedure to model complex fractal scaling behaviors and reliably identify the crossover time scales under MF-DFA. The scaling-identification regression model, grounded on a solid statistical foundation, is first proposed to describe the multi-scaling behaviors of fractals. Through regression analysis and statistical inference, we can (1) identify crossover time scales that cannot be detected by eyeballing, (2) determine the number and locations of the genuine crossover time scales, (3) give confidence intervals for the crossover time scales, and (4) establish a statistically significant regression model depicting the underlying scaling behavior of a time series. To substantiate our argument, the regression model is applied to analyse the multi-scaling behaviors of avian-influenza outbreaks, water consumption, daily mean temperature, and rainfall in Hong Kong. Through the proposed model, we can gain a deeper understanding of fractals in general and a statistical approach to identifying multi-scaling behavior under MF-DFA in particular.
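A minimal sketch of replacing eyeballing with a fitted breakpoint, in the spirit of the scaling-identification regression described above: fit two straight lines to the log-log fluctuation curve and grid-search the breakpoint that minimizes the total squared error. This is an illustrative simplification, not the authors' exact model (which also supplies inference and confidence intervals).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, SSE)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def find_crossover(log_s, log_f, min_pts=3):
    """Grid-search the index splitting log F(s) vs log s into two linear
    scaling regimes; returns (split index, total SSE of the two fits)."""
    best = None
    for k in range(min_pts, len(log_s) - min_pts + 1):
        sse = (fit_line(log_s[:k], log_f[:k])[2]
               + fit_line(log_s[k:], log_f[k:])[2])
        if best is None or sse < best[1]:
            best = (k, sse)
    return best
```

On real fluctuation functions one would compare the two-segment fit against a single-line fit with an F-test or information criterion before declaring a crossover genuine.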

  3. Weather extremes in very large, high-resolution ensembles: the weatherathome experiment

    NASA Astrophysics Data System (ADS)

    Allen, M. R.; Rosier, S.; Massey, N.; Rye, C.; Bowery, A.; Miller, J.; Otto, F.; Jones, R.; Wilson, S.; Mote, P.; Stone, D. A.; Yamazaki, Y. H.; Carrington, D.

    2011-12-01

    Resolution and ensemble size are often seen as alternatives in climate modelling. Models with sufficient resolution to simulate many classes of extreme weather cannot normally be run often enough to assess the statistics of rare events, still less how these statistics may be changing. As a result, assessments of the impact of external forcing on regional climate extremes must be based either on statistical downscaling from relatively coarse-resolution models, or on statistical extrapolation from 10-year to 100-year events. Under the weatherathome experiment, part of the climateprediction.net initiative, we have compiled the Met Office Regional Climate Model HadRM3P to run at 25 and 50 km resolution on personal computers volunteered by the general public, embedded within the HadAM3P global atmosphere model. With a global network of about 50,000 volunteers, this allows us to run time-slice ensembles of essentially unlimited size, exploring the statistics of extreme weather under a range of scenarios for surface forcing and atmospheric composition, allowing for uncertainty in both boundary conditions and model parameters. Current experiments, developed with the support of Microsoft Research, focus on three regions: the Western USA, Europe and Southern Africa. We initially simulate the period 1959-2010 to establish which variables are realistically simulated by the model and on what scales. Our next experiments are focussing on the Event Attribution problem, exploring how the probability of various types of extreme weather would have been different over the recent past in a world unaffected by human influence, following the design of Pall et al (2011), but extended to a longer period and higher spatial resolution. We will present the first results of this unique, global, participatory experiment and discuss the implications for the attribution of recent weather events to anthropogenic influence on climate.

  4. Forging a link between mentoring and collaboration: a new training model for implementation science.

    PubMed

    Luke, Douglas A; Baumann, Ana A; Carothers, Bobbi J; Landsverk, John; Proctor, Enola K

    2016-10-13

    Training investigators for the rapidly developing field of implementation science requires both mentoring and scientific collaboration. Using social network descriptive analyses, visualization, and modeling, this paper presents results of an evaluation of the mentoring and collaborations fostered over time through the National Institute of Mental Health (NIMH)-supported Implementation Research Institute (IRI). Data comprised IRI participants' self-reported collaborations and mentoring relationships, measured in three annual surveys from 2012 to 2014. Network descriptive statistics, visualizations, and network statistical modeling were conducted to examine patterns of mentoring and collaboration among IRI participants and to model the relationship between mentoring and subsequent collaboration. Findings suggest that IRI is successful in forming mentoring relationships among its participants, and that these mentoring relationships are related to future scientific collaborations. Exponential random graph network models demonstrated that mentoring received in 2012 was positively and significantly related to the likelihood of having a scientific collaboration 2 years later in 2014 (p = 0.001). More specifically, mentoring was significantly related to future collaborations focusing on new research (p = 0.009), grant submissions (p = 0.003), and publications (p = 0.017). Predictions based on the network model suggest that for every additional mentoring relationship established in 2012, the likelihood of a scientific collaboration 2 years later is increased by almost 7%. These results support the importance of mentoring in implementation science specifically and team science more generally. Mentoring relationships were established quickly and early by the IRI core faculty.
IRI fellows reported increasing scientific collaboration of all types over time, including starting new research, submitting new grants, presenting research results, and publishing peer-reviewed papers. Statistical network models demonstrated that mentoring was strongly and significantly related to subsequent scientific collaboration, which supported a core design principle of the IRI. Future work should establish the link between mentoring and scientific productivity. These results may be of interest to team science, as they suggest the importance of mentoring for future team collaborations, as well as illustrate the utility of network analysis for studying team characteristics and activities.

  5. Secular Extragalactic Parallax and Geometric Distances with Gaia Proper Motions

    NASA Astrophysics Data System (ADS)

    Paine, Jennie; Darling, Jeremiah K.

    2018-06-01

    The motion of the Solar System with respect to the cosmic microwave background (CMB) rest frame creates a well measured dipole in the CMB, which corresponds to a linear solar velocity of about 78 AU/yr. This motion causes relatively nearby extragalactic objects to appear to move compared to more distant objects, an effect that can be measured in the proper motions of nearby galaxies. An object at 1 Mpc and perpendicular to the CMB apex will exhibit a secular parallax, observed as a proper motion, of 78 µas/yr. The relatively large peculiar motions of galaxies make the detection of secular parallax challenging for individual objects. Instead, a statistical parallax measurement can be made for a sample of objects with proper motions, where the global parallax signal is modeled as an E-mode dipole that diminishes inversely with distance. We present preliminary results of applying this model to a sample of nearby galaxies with Gaia proper motions to detect the statistical secular parallax signal. The statistical measurement can be used to calibrate the canonical cosmological “distance ladder.”
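The numbers quoted above follow from simple unit arithmetic: since 1 AU subtends 1 arcsec at 1 pc, a baseline growing by ~78 AU/yr produces ~78 µas/yr of apparent motion at 1 Mpc. A sketch of the conversion (the 370 km/s CMB dipole speed is an assumed round value):

```python
# Solar velocity with respect to the CMB rest frame, converted to AU/yr.
V_SUN_KM_S = 370.0          # km/s, approximate CMB dipole speed
SEC_PER_YR = 3.156e7        # seconds per year
KM_PER_AU = 1.496e8         # kilometres per astronomical unit

au_per_yr = V_SUN_KM_S * SEC_PER_YR / KM_PER_AU   # ~78 AU/yr

def secular_parallax_uas_per_yr(d_mpc):
    """Peak secular-parallax proper motion, in µas/yr, for an object d_mpc
    megaparsecs away and perpendicular to the CMB apex. Uses the small-angle
    fact that 1 AU subtends 1 arcsec at 1 pc, hence 1 µas at 1 Mpc."""
    return au_per_yr / d_mpc
```

The 1/d scaling is what lets a statistical fit over many galaxies turn proper motions into geometric distances.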

  6. Avalanches and generalized memory associativity in a network model for conscious and unconscious mental functioning

    NASA Astrophysics Data System (ADS)

    Siddiqui, Maheen; Wedemann, Roseli S.; Jensen, Henrik Jeldtoft

    2018-01-01

    We explore statistical characteristics of avalanches associated with the dynamics of a complex-network model, where two modules corresponding to sensorial and symbolic memories interact, representing unconscious and conscious mental processes. The model illustrates Freud's ideas regarding the neuroses and the view that consciousness is related to symbolic and linguistic memory activity in the brain. It incorporates the Stariolo-Tsallis generalization of the Boltzmann Machine in order to model memory retrieval and associativity. In the present work, we define and measure avalanche size distributions during memory retrieval, in order to gain insight regarding basic aspects of the functioning of these complex networks. The avalanche sizes defined for our model should be related to the time consumed and also to the size of the neuronal region which is activated during memory retrieval. This allows a qualitative comparison of the behaviour of the distribution of cluster sizes, obtained during fMRI measurements of the propagation of signals in the brain, with the distribution of avalanche sizes obtained in our simulation experiments. This comparison corroborates the indication that the Nonextensive Statistical Mechanics formalism may indeed be better suited to model the complex networks which constitute brain and mental structure.

  7. On the Land-Ocean Contrast of Tropical Convection and Microphysics Statistics Derived from TRMM Satellite Signals and Global Storm-Resolving Models

    NASA Technical Reports Server (NTRS)

    Matsui, Toshihisa; Chern, Jiun-Dar; Tao, Wei-Kuo; Lang, Stephen E.; Satoh, Masaki; Hashino, Tempei; Kubota, Takuji

    2016-01-01

    A 14-year climatology of Tropical Rainfall Measuring Mission (TRMM) collocated multi-sensor signal statistics reveals a distinct land-ocean contrast as well as geographical variability of precipitation type, intensity, and microphysics. Microphysics information inferred from the TRMM precipitation radar and Microwave Imager (TMI) shows a large land-ocean contrast for the deep category, suggesting continental convective vigor. Over land, TRMM shows higher echo-top heights and larger maximum echoes, suggesting taller storms and more intense precipitation, as well as larger microwave scattering, suggesting the presence of more and larger frozen convective hydrometeors. This strong land-ocean contrast in deep convection is invariant over seasonal and multi-year time-scales. Consequently, relatively short-term simulations from two global storm-resolving models can be evaluated in terms of their land-ocean statistics using the TRMM Triple-sensor Three-step Evaluation via a satellite simulator. The models evaluated are the NASA Multi-scale Modeling Framework (MMF) and the Non-hydrostatic Icosahedral Cloud Atmospheric Model (NICAM). While both simulations can represent convective land-ocean contrasts in warm precipitation to some extent, near-surface conditions over land are relatively moister in NICAM than in MMF, which appears to be the key driver of the divergent warm precipitation results between the two models. Both the MMF and NICAM produced similar frequencies of large CAPE between land and ocean. The dry MMF boundary layer enhanced microwave scattering signals over land, but only NICAM had an enhanced deep convection frequency over land. Neither model could reproduce a realistic land-ocean contrast in deep convective precipitation microphysics. A realistic contrast between land and ocean remains an issue in global storm-resolving modeling.

  8. Rainfall runoff modelling of the Upper Ganga and Brahmaputra basins using PERSiST.

    PubMed

    Futter, M N; Whitehead, P G; Sarkar, S; Rodda, H; Crossman, J

    2015-06-01

    There are ongoing discussions about the appropriate level of complexity and sources of uncertainty in rainfall runoff models. Simulations for operational hydrology, flood forecasting or nutrient transport all warrant different levels of complexity in the modelling approach. More complex model structures are appropriate for simulations of land-cover dependent nutrient transport, while more parsimonious model structures may be adequate for runoff simulation. The appropriate level of complexity is also dependent on data availability. Here, we use PERSiST, a simple, semi-distributed dynamic rainfall-runoff modelling toolkit, to simulate flows in the Upper Ganges and Brahmaputra rivers. We present two sets of simulations driven by single time series of daily precipitation and temperature using simple (A) and complex (B) model structures based on uniform and hydrochemically relevant land covers, respectively. Models were compared based on ensembles of Bayesian Information Criterion (BIC) statistics. Equifinality was observed for parameters but not for model structures. Model performance was better for the more complex (B) structural representations than for parsimonious model structures. The results show that structural uncertainty is more important than parameter uncertainty. The ensembles of BIC statistics suggested that neither structural representation was preferable in a statistical sense. Simulations presented here confirm that relatively simple models with limited data requirements can be used to credibly simulate flows and water balance components needed for nutrient flux modelling in large, data-poor basins.
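The BIC comparison used above can be sketched for least-squares models, where BIC reduces (up to a constant shared by models fit to the same data) to a fit term plus a complexity penalty. The residual sums of squares and parameter counts below are made-up illustrative values, not PERSiST results.

```python
import math

def bic_gaussian(rss, n, k):
    """BIC for a model with k parameters fit by least squares to n points,
    assuming i.i.d. Gaussian errors: BIC = n * ln(RSS / n) + k * ln(n)
    (dropping constants common to all models on the same data)."""
    return n * math.log(rss / n) + k * math.log(n)

# Illustrative comparison of a parsimonious structure (A) and a more
# complex, land-cover-resolving structure (B) on the same record.
bic_a = bic_gaussian(120.0, 1000, 4)    # simple structure, 4 parameters
bic_b = bic_gaussian(100.0, 1000, 12)   # complex structure, 12 parameters
```

The lower BIC wins; the complex structure is preferred only when its fit improvement outweighs the k * ln(n) penalty, which is the sense in which neither structure may be "preferable statistically" even when one fits better.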

  9. Optimal filtering and Bayesian detection for friction-based diagnostics in machines.

    PubMed

    Ray, L R; Townsend, J R; Ramasubramanian, A

    2001-01-01

    Non-model-based diagnostic methods typically rely on measured signals that must be empirically related to process behavior or incipient faults. The difficulty in interpreting a signal that is indirectly related to the fundamental process behavior is significant. This paper presents an integrated non-model and model-based approach to detecting when process behavior varies from a proposed model. The method, which is based on nonlinear filtering combined with maximum likelihood hypothesis testing, is applicable to dynamic systems whose constitutive model is well known, and whose process inputs are poorly known. Here, the method is applied to friction estimation and diagnosis during motion control in a rotating machine. A nonlinear observer estimates friction torque in a machine from shaft angular position measurements and the known input voltage to the motor. The resulting friction torque estimate can be analyzed directly for statistical abnormalities, or it can be directly compared to friction torque outputs of an applicable friction process model in order to diagnose faults or model variations. Nonlinear estimation of friction torque provides a variable on which to apply diagnostic methods that is directly related to model variations or faults. The method is evaluated experimentally by its ability to detect normal load variations in a closed-loop controlled motor driven inertia with bearing friction and an artificially-induced external line contact. Results show an ability to detect statistically significant changes in friction characteristics induced by normal load variations over a wide range of underlying friction behaviors.

  10. Addressing economic development goals through innovative teaching of university statistics: a case study of statistical modelling in Nigeria

    NASA Astrophysics Data System (ADS)

    Oseloka Ezepue, Patrick; Ojo, Adegbola

    2012-12-01

    A challenging problem in some developing countries such as Nigeria is inadequate training of students in effective problem solving using the core concepts of their disciplines. Related to this is a disconnection between their learning and socio-economic development agenda of a country. These problems are more vivid in statistical education which is dominated by textbook examples and unbalanced assessment 'for' and 'of' learning within traditional curricula. The problems impede the achievement of socio-economic development objectives such as those stated in the Nigerian Vision 2020 blueprint and United Nations Millennium Development Goals. They also impoverish the ability of (statistics) graduates to creatively use their knowledge in relevant business and industry sectors, thereby exacerbating mass graduate unemployment in Nigeria and similar developing countries. This article uses a case study in statistical modelling to discuss the nature of innovations in statistics education vital to producing new kinds of graduates who can link their learning to national economic development goals, create wealth and alleviate poverty through (self) employment. Wider implications of the innovations for repositioning mathematical sciences education globally are explored in this article.

  11. A statistical approach to quasi-extinction forecasting.

    PubMed

    Holmes, Elizabeth Eli; Sabo, John L; Viscido, Steven Vincent; Fagan, William Fredric

    2007-12-01

    Forecasting population decline to a certain critical threshold (the quasi-extinction risk) is one of the central objectives of population viability analysis (PVA), and such predictions figure prominently in the decisions of major conservation organizations. In this paper, we argue that accurate forecasting of a population's quasi-extinction risk does not necessarily require knowledge of the underlying biological mechanisms. Because of the stochastic and multiplicative nature of population growth, the ensemble behaviour of population trajectories converges to common statistical forms across a wide variety of stochastic population processes. This paper provides a theoretical basis for this argument. We show that the quasi-extinction surfaces of a variety of complex stochastic population processes (including age-structured, density-dependent and spatially structured populations) can be modelled by a simple stochastic approximation: the stochastic exponential growth process overlaid with Gaussian errors. Using simulated and real data, we show that this model can be estimated with 20-30 years of data and can provide relatively unbiased quasi-extinction risk estimates with confidence intervals considerably smaller than (0,1). This was found to be true even for simulated data derived from some of the noisiest population processes (density-dependent feedback, species interactions and strong age-structure cycling). A key advantage of statistical models is that their parameters and the uncertainty of those parameters can be estimated from time series data using standard statistical methods. In contrast, for most species of conservation concern, biologically realistic models must often be specified rather than estimated because of the limited data available for all the various parameters.
Biologically realistic models will always have a prominent place in PVA for evaluating specific management options which affect a single segment of a population, a single demographic rate, or different geographic areas. However, for forecasting quasi-extinction risk, statistical models that are based on the convergent statistical properties of population processes offer many advantages over biologically realistic models.
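    The stochastic exponential growth approximation at the heart of this approach is compact enough to sketch. The snippet below is an illustrative implementation of the diffusion approximation, not the authors' code; the function names are hypothetical, and the hitting probability uses the standard form for this model.

```python
import math

def fit_exponential_growth(counts):
    """Estimate the drift (mu) and variance (sigma2) of the stochastic
    exponential growth model from a series of abundance counts."""
    diffs = [math.log(b / a) for a, b in zip(counts, counts[1:])]
    mu = sum(diffs) / len(diffs)
    sigma2 = sum((d - mu) ** 2 for d in diffs) / (len(diffs) - 1)
    return mu, sigma2

def quasi_extinction_prob(n0, threshold, mu, sigma2):
    """Probability that the population ever declines from n0 to the
    quasi-extinction threshold, under the diffusion approximation."""
    xd = math.log(n0 / threshold)  # log-distance to the threshold
    if mu <= 0:
        return 1.0                 # a non-growing population hits it surely
    return math.exp(-2.0 * mu * xd / sigma2)
```

For example, a declining population (mu <= 0) is certain to reach the threshold eventually, while a growing one reaches it only with probability exp(-2 mu x_d / sigma2).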

  12. Markov modulated Poisson process models incorporating covariates for rainfall intensity.

    PubMed

    Thayakaran, R; Ramesh, N I

    2013-01-01

    Time series of rainfall bucket tip times at the Beaufort Park station, Bracknell, in the UK are modelled by a class of Markov modulated Poisson processes (MMPP) which may be thought of as a generalization of the Poisson process. Our main focus in this paper is to investigate the effects of including covariate information into the MMPP model framework on statistical properties. In particular, we look at three types of time-varying covariates namely temperature, sea level pressure, and relative humidity that are thought to be affecting the rainfall arrival process. Maximum likelihood estimation is used to obtain the parameter estimates, and likelihood ratio tests are employed in model comparison. Simulated data from the fitted model are used to make statistical inferences about the accumulated rainfall in the discrete time interval. Variability of the daily Poisson arrival rates is studied.
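    As an illustration of the model class (not the fitted Beaufort Park model, whose parameters are not given here), a two-state MMPP can be simulated by alternating exponential dwell times in each hidden state and generating Poisson arrivals at that state's rate:

```python
import random

def simulate_mmpp(rates, switch, horizon, rng):
    """Simulate a two-state Markov modulated Poisson process.

    rates  -- Poisson arrival rate in each hidden state (events per hour)
    switch -- rate of leaving each hidden state (per hour)
    Returns the sorted list of event times in [0, horizon).
    """
    t, state, events = 0.0, 0, []
    while t < horizon:
        # Time spent in the current hidden state before switching.
        end = min(t + rng.expovariate(switch[state]), horizon)
        # Homogeneous Poisson arrivals while the state is constant.
        u = t
        while True:
            u += rng.expovariate(rates[state])
            if u >= end:
                break
            events.append(u)
        t, state = end, 1 - state
    return events

# Illustrative parameters: a mostly-dry state and a bursty rain state.
rng = random.Random(42)
events = simulate_mmpp(rates=[0.2, 5.0], switch=[0.1, 0.5],
                       horizon=1000.0, rng=rng)
```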

  13. Statistical fluctuations of an ocean surface inferred from shoes and ships

    NASA Astrophysics Data System (ADS)

    Lerche, Ian; Maubeuge, Frédéric

    1995-12-01

    This paper shows that it is possible to roughly estimate some ocean properties using simple time-dependent statistical models of ocean fluctuations. Based on a real incident, the loss overboard of a container of Nike shoes in the North Pacific Ocean, a statistical model was tested on data sets consisting of the Nike shoes found by beachcombers a few months later. This statistical treatment of the shoes' motion allows one to infer velocity trends of the Pacific Ocean, together with their fluctuation strengths. The idea is to suppose that there is a mean bulk flow speed that can depend on location on the ocean surface and time. The fluctuations of the surface flow speed are then treated as statistically random. The distribution of shoes is described in space and time using Markov probability processes related to the mean and fluctuating ocean properties. The aim of the exercise is to provide some of the properties of the Pacific Ocean that are otherwise calculated using a sophisticated numerical model, OSCURS, for which numerous data are needed. Relevant quantities are estimated sharply, which can be useful to (1) constrain output results from OSCURS computations, and (2) elucidate the behavior patterns of ocean flow characteristics on long time scales.

  14. Impact of cleaning and other interventions on the reduction of hospital-acquired Clostridium difficile infections in two hospitals in England assessed using a breakpoint model.

    PubMed

    Hughes, G J; Nickerson, E; Enoch, D A; Ahluwalia, J; Wilkinson, C; Ayers, R; Brown, N M

    2013-07-01

    Clostridium difficile infection remains a major challenge for hospitals. Although targeted infection control initiatives have been shown to be effective in reducing the incidence of hospital-acquired C. difficile infection, there is little evidence available to assess the effectiveness of specific interventions. To use statistical modelling to detect substantial reductions in the incidence of C. difficile from time series data from two hospitals in England, and relate these time points to infection control interventions. A statistical breakpoints model was fitted to likely hospital-acquired C. difficile infection incidence data from a teaching hospital (2002-2009) and a district general hospital (2005-2009) in England. Models with increasing complexity (i.e. increasing the number of breakpoints) were tested for an improved fit to the data. Partitions estimated from breakpoint models were tested for individual stability using statistical process control charts. Major infection control interventions from both hospitals during this time were grouped according to their primary target (antibiotics, cleaning, isolation, other) and mapped to the model-suggested breakpoints. For both hospitals, breakpoints coincided with enhancements to cleaning protocols. Statistical models enabled formal assessment of the impact of different interventions, and showed that enhancements to deep cleaning programmes are the interventions that have most likely led to substantial reductions in hospital-acquired C. difficile infections at the two hospitals studied. Copyright © 2013 The Healthcare Infection Society. Published by Elsevier Ltd. All rights reserved.
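    The breakpoint idea can be sketched as follows: candidate partitions are scored by their residual error, and a model with a breakpoint is preferred only if an information criterion improves. This is a minimal single-breakpoint sketch with a Gaussian BIC, not the authors' multi-breakpoint incidence model:

```python
import math

def sse(xs):
    """Sum of squared deviations from the segment mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_breakpoint(series):
    """Return (index, total SSE) of the single breakpoint that best splits
    the series into two constant-mean segments."""
    best = None
    for k in range(1, len(series)):
        total = sse(series[:k]) + sse(series[k:])
        if best is None or total < best[1]:
            best = (k, total)
    return best

def bic(n, rss, n_params):
    """Gaussian BIC for a piecewise-constant fit with n_params parameters.
    A small constant guards against log(0) on a perfect fit."""
    return n * math.log(rss / n + 1e-12) + n_params * math.log(n)
```

Comparing bic(n, rss, 1) for the no-breakpoint fit against bic(n, rss, 3) for the two-segment fit (two means plus one breakpoint) decides whether the extra complexity is warranted.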

  15. Forecasting daily source air quality using multivariate statistical analysis and radial basis function networks.

    PubMed

    Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A

    2008-12-01

    It is vital to forecast gas and particle matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relatively important model inputs using statistical methods. It can be further demonstrated that only two factors, the environment factor and the animal factor, were capable of explaining more than 94% of the total variability after performing principal component analysis. The introduction of fewer uncorrelated variables to the neural network would result in the reduction of the model structure complexity, minimize computation cost, and eliminate model overfitting problems. The obtained results of RBF network prediction were in good agreement with the actual measurements, with values of the correlation coefficient between 0.741 and 0.995 and very low values of systemic performance indexes for all the models. The good results indicated that the RBF network could be trained to model these highly nonlinear relationships. Thus, the RBF neural network technology combined with multivariate statistical methods is a promising tool for air pollutant emissions modeling.
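    A minimal sketch of a Gaussian RBF network, fitted by exact interpolation on a single input (the study used several PCA-reduced inputs and a trained network; the class name and the pure-Python solver here are illustrative):

```python
import math

def solve(a, b):
    """Solve the square linear system a.w = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c]
                              for c in range(r + 1, n))) / m[r][r]
    return w

class RBFNetwork:
    """Gaussian radial basis function network with centres at the
    training points and weights fitted by exact interpolation."""
    def __init__(self, width):
        self.width = width
    def _phi(self, x, c):
        return math.exp(-((x - c) ** 2) / (2.0 * self.width ** 2))
    def fit(self, xs, ys):
        self.centres = list(xs)
        phi = [[self._phi(x, c) for c in self.centres] for x in xs]
        self.weights = solve(phi, ys)
        return self
    def predict(self, x):
        return sum(w * self._phi(x, c)
                   for w, c in zip(self.weights, self.centres))
```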

  16. NPS national transit inventory, 2013

    DOT National Transportation Integrated Search

    2014-07-31

    This document summarizes key highlights from the National Park Service (NPS) 2013 National Transit Inventory, and presents data for NPS transit systems system-wide. The document discusses statistics related to ridership, business models, fleet charac...

  17. Mathematical Modelling for Patient Selection in Proton Therapy.

    PubMed

    Mee, T; Kirkby, N F; Kirkby, K J

    2018-05-01

    Proton beam therapy (PBT) is still relatively new in cancer treatment and the clinical evidence base is relatively sparse. Mathematical modelling offers assistance when selecting patients for PBT and predicting the demand for service. Discrete event simulation, normal tissue complication probability, quality-adjusted life-years and Markov Chain models are all mathematical and statistical modelling techniques currently used but none is dominant. As new evidence and outcome data become available from PBT, comprehensive models will emerge that are less dependent on the specific technologies of radiotherapy planning and delivery. Copyright © 2018 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.

  18. The effects of sampling frequency on the climate statistics of the European Centre for Medium-Range Weather Forecasts

    NASA Astrophysics Data System (ADS)

    Phillips, Thomas J.; Gates, W. Lawrence; Arpe, Klaus

    1992-12-01

    The effects of sampling frequency on the first- and second-moment statistics of selected European Centre for Medium-Range Weather Forecasts (ECMWF) model variables are investigated in a simulation of "perpetual July" with a diurnal cycle included and with surface and atmospheric fields saved at hourly intervals. The shortest characteristic time scales (as determined by the e-folding time of lagged autocorrelation functions) are those of ground heat fluxes and temperatures, precipitation and runoff, convective processes, cloud properties, and atmospheric vertical motion, while the longest time scales are exhibited by soil temperature and moisture, surface pressure, and atmospheric specific humidity, temperature, and wind. The time scales of surface heat and momentum fluxes and of convective processes are substantially shorter over land than over oceans. An appropriate sampling frequency for each model variable is obtained by comparing the estimates of first- and second-moment statistics determined at intervals ranging from 2 to 24 hours with the "best" estimates obtained from hourly sampling. Relatively accurate estimation of first- and second-moment climate statistics (10% errors in means, 20% errors in variances) can be achieved by sampling a model variable at intervals that usually are longer than the bandwidth of its time series but that often are shorter than its characteristic time scale. For the surface variables, sampling at intervals that are nonintegral divisors of a 24-hour day yields relatively more accurate time-mean statistics because of a reduction in errors associated with aliasing of the diurnal cycle and higher-frequency harmonics. The superior estimates of first-moment statistics are accompanied by inferior estimates of the variance of the daily means due to the presence of systematic biases, but these probably can be avoided by defining a different measure of low-frequency variability. 
Estimates of the intradiurnal variance of accumulated precipitation and surface runoff also are strongly impacted by the length of the storage interval. In light of these results, several alternative strategies for storage of the ECMWF model variables are recommended.
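    The aliasing effect described for the surface variables can be reproduced with a toy diurnal signal: sampling at an interval that evenly divides 24 hours locks onto fixed phases of the diurnal and semidiurnal harmonics, while a nonintegral-divisor interval rotates through the phases and averages them out. The amplitudes below are illustrative, not taken from the ECMWF simulation.

```python
import math

def sampled_mean(interval_h, days, f):
    """Mean of f(t) sampled every interval_h hours over the given days."""
    n = int(days * 24 / interval_h)
    return sum(f(k * interval_h) for k in range(n)) / n

def signal(t_hours):
    """True mean 10 with diurnal (24 h) and semidiurnal (12 h) harmonics."""
    return (10.0
            + 3.0 * math.cos(2 * math.pi * t_hours / 24.0)
            + 1.0 * math.cos(2 * math.pi * t_hours / 12.0))

# Absolute error of the estimated time mean for three storage intervals.
errors = {dt: abs(sampled_mean(dt, 300, signal) - 10.0)
          for dt in (24, 12, 5)}
```

Sampling every 24 h aliases both harmonics (error 4, the sum of the amplitudes); every 12 h cancels the diurnal but aliases the semidiurnal (error 1); every 5 h, a nonintegral divisor of the day, leaves essentially no bias.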

  19. Estimating current and future streamflow characteristics at ungaged sites, central and eastern Montana, with application to evaluating effects of climate change on fish populations

    USGS Publications Warehouse

    Sando, Roy; Chase, Katherine J.

    2017-03-23

    A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods. Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.
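    The ensemble idea behind random forest regression can be illustrated with a toy bootstrap-aggregation model. This bags single-split regression stumps rather than full randomized trees, so it is a sketch of the method's core mechanism, not the USGS implementation:

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold regression stump by squared error."""
    best = None
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    for j in range(1, len(order)):
        left = [ys[i] for i in order[:j]]
        right = [ys[i] for i in order[j:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or err < best[0]:
            thr = (xs[order[j - 1]] + xs[order[j]]) / 2.0
            best = (err, thr, ml, mr)
    _, thr, ml, mr = best
    return lambda x: ml if x < thr else mr

def fit_bagged_stumps(xs, ys, n_trees, rng):
    """Bootstrap-aggregated stumps: a toy stand-in for a random forest."""
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx],
                                [ys[i] for i in idx]))
    return lambda x: sum(s(x) for s in stumps) / n_trees
```

Averaging over bootstrap resamples is what makes the method nonparametric: no distributional form is assumed for the streamflow-basin relationship.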

  20. Intensity changes in future extreme precipitation: A statistical event-based approach.

    NASA Astrophysics Data System (ADS)

    Manola, Iris; van den Hurk, Bart; de Moel, Hans; Aerts, Jeroen

    2017-04-01

    Short-lived precipitation extremes are often responsible for hazards in urban and rural environments with economic and environmental consequences. The precipitation intensity is expected to increase about 7% per degree of warming, according to the Clausius-Clapeyron (CC) relation. However, the observations often show a much stronger increase in the sub-daily values. In particular, the behavior of the hourly summer precipitation from radar observations with the dew point temperature (the Pi-Td relation) for the Netherlands suggests that for moderate to warm days the intensification of the precipitation can be even higher than 21% per degree of warming, that is 3 times higher than the expected CC relation. The rate of change depends on the initial precipitation intensity, as low percentiles increase with a rate below CC, the medium percentiles with 2CC and the moderate-high and high percentiles with 3CC. This non-linear statistical Pi-Td relation is suggested to be used as a delta-transformation to project how a historic extreme precipitation event would intensify under future, warmer conditions. Here, the Pi-Td relation is applied over a selected historic extreme precipitation event to 'up-scale' its intensity to warmer conditions. Additionally, the selected historic event is simulated in the high-resolution, convective-permitting weather model Harmonie. The initial and boundary conditions are alternated to represent future conditions. The comparison between the statistical and the numerical method of projecting the historic event to future conditions showed comparable intensity changes, which depending on the initial percentile intensity, range from below CC to a 3CC rate of change per degree of warming. The model tends to overestimate the future intensities for the low- and the very high percentiles and the clouds are somewhat displaced, due to small wind and convection changes. 
The total spatial cloud coverage remains unchanged in the model, as in the statistical method. The advantages of the suggested Pi-Td method of projecting future precipitation events from historic events are that it is simple to use and less expensive in time, computation, and resources than a numerical model. The outcome can be used directly for hydrological and climatological studies and for impact analyses such as flood risk assessments.
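    The percentile-dependent delta-transformation can be sketched as below. The percentile thresholds and the exact CC/2CC/3CC banding are illustrative assumptions; the paper derives the rates from the observed Pi-Td relation rather than from fixed bands.

```python
def scale_intensity(intensity_mm_h, percentile, delta_td):
    """Delta-transform a historic precipitation intensity to warmer
    conditions with a percentile-dependent Clausius-Clapeyron rate:
    ~7 %/K (CC) at low percentiles, 2CC at medium, 3CC at high.
    The band edges (50th, 90th) are illustrative choices."""
    cc = 0.07
    if percentile < 50:
        rate = cc
    elif percentile < 90:
        rate = 2 * cc
    else:
        rate = 3 * cc
    return intensity_mm_h * (1.0 + rate) ** delta_td
```

For a 1 K rise in dew point, a 95th-percentile intensity of 10 mm/h scales to 10 x 1.21 = 12.1 mm/h under the 3CC band.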

  1. A biological compression model and its applications.

    PubMed

    Cao, Minh Duc; Dix, Trevor I; Allison, Lloyd

    2011-01-01

    A biological compression model, the expert model, is presented which is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence alignment and phylogenetic analysis. We demonstrate that the model can handle statistically biased sequences and distantly related sequences where conventional knowledge discovery tools often fail.

  2. The value of model averaging and dynamical climate model predictions for improving statistical seasonal streamflow forecasts over Australia

    NASA Astrophysics Data System (ADS)

    Pokhrel, Prafulla; Wang, Q. J.; Robertson, David E.

    2013-10-01

    Seasonal streamflow forecasts are valuable for planning and allocation of water resources. In Australia, the Bureau of Meteorology employs a statistical method to forecast seasonal streamflows. The method uses predictors that are related to catchment wetness at the start of a forecast period and to climate during the forecast period. For the latter, a predictor is selected among a number of lagged climate indices as candidates to give the "best" model in terms of model performance in cross validation. This study investigates two strategies for further improvement in seasonal streamflow forecasts. The first is to combine, through Bayesian model averaging, multiple candidate models with different lagged climate indices as predictors, to take advantage of different predictive strengths of the multiple models. The second strategy is to introduce additional candidate models, using rainfall and sea surface temperature predictions from a global climate model as predictors. This is to take advantage of the direct simulations of various dynamic processes. The results show that combining forecasts from multiple statistical models generally yields more skillful forecasts than using only the best model and appears to moderate the worst forecast errors. The use of rainfall predictions from the dynamical climate model marginally improves the streamflow forecasts when viewed over all the study catchments and seasons, but the use of sea surface temperature predictions provides little additional benefit.
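    A minimal sketch of the averaging step: candidate-model weights derived from an information criterion and applied to the individual forecasts. The BIC-based weighting is an illustrative stand-in for the Bayesian model averaging actually used, and it assumes all candidates have the same number of parameters (so the complexity penalty cancels).

```python
import math

def bma_weights(sse_list, n_obs):
    """Approximate posterior model weights from (cross-validation) sums of
    squared errors via a Gaussian BIC: w_i proportional to exp(-BIC_i/2)."""
    bics = [n_obs * math.log(s / n_obs) for s in sse_list]
    ref = min(bics)  # subtract the best BIC for numerical stability
    raw = [math.exp(-(b - ref) / 2.0) for b in bics]
    tot = sum(raw)
    return [r / tot for r in raw]

def bma_forecast(forecasts, weights):
    """Weighted average of the candidate-model forecasts."""
    return sum(f * w for f, w in zip(forecasts, weights))
```

A candidate with lower cross-validation error receives a larger weight, so the combined forecast leans toward the better model without discarding the others.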

  3. Statistical shape (ASM) and appearance (AAM) models for the segmentation of the cerebellum in fetal ultrasound

    NASA Astrophysics Data System (ADS)

    Reyes López, Misael; Arámbula Cosío, Fernando

    2017-11-01

    The cerebellum is an important structure to determine the gestational age of the fetus; moreover, most of the abnormalities it presents are related to growth disorders. In this work, we present the results of the segmentation of the fetal cerebellum applying statistical shape and appearance models. Both models were tested on ultrasound images of the fetal brain taken from 23 pregnant women, between 18 and 24 gestational weeks. The accuracy results obtained on 11 ultrasound images show a mean Hausdorff distance of 6.08 mm between the manual segmentation and the segmentation using active shape model, and a mean Hausdorff distance of 7.54 mm between the manual segmentation and the segmentation using active appearance model. The reported results demonstrate that the active shape model is more robust in the segmentation of the fetal cerebellum in ultrasound images.
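    The accuracy metric used above is straightforward to compute. A minimal Hausdorff distance between two segmented contours represented as point sets (a sketch of the metric only, not of the segmentation models):

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (contours)."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

In practice each contour is the ordered boundary of a segmentation; the paper additionally reports the mean (rather than maximum) of such distances.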

  4. Statistical Mechanics Model of Solids with Defects

    NASA Astrophysics Data System (ADS)

    Kaufman, M.; Walters, P. A.; Ferrante, J.

    1997-03-01

    Previously (M. Kaufman, J. Ferrante, NASA Tech. Memor., 1996), we examined the phase diagram for the failure of a solid under isotropic expansion and compression as a function of stress and temperature, with the "springs" modelled by the universal binding energy relation (UBER) (J. H. Rose, J. R. Smith, F. Guinea, J. Ferrante, Phys. Rev. B 29, 2963 (1984)). In the previous calculation we assumed that the "springs" failed independently and that the strain is uniform. In the present work, we have extended this statistical model of mechanical failure by allowing for correlations between "springs" and for thermal fluctuations in strains. The springs are now modelled in the harmonic approximation with a failure threshold energy E0, as an intermediate step in future studies to reinclude the full non-linear dependence of the UBER for modelling the interactions. We use the Migdal-Kadanoff renormalization-group method to determine the phase diagram of the model and to compute the free energy.

  5. Likelihoods for fixed rank nomination networks

    PubMed Central

    HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE

    2014-01-01

    Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586

  6. Spatial diffusion of influenza outbreak-related climate factors in Chiang Mai Province, Thailand.

    PubMed

    Nakapan, Supachai; Tripathi, Nitin Kumar; Tipdecho, Taravudh; Souris, Marc

    2012-10-24

    Influenza is one of the leading causes of respiratory illness in tropical South East Asian countries such as Thailand. In this study the climate factors associated with influenza incidence in Chiang Mai Province, Northern Thailand, were investigated. Identification of factors responsible for influenza outbreaks and the mapping of potential risk areas in Chiang Mai are long overdue. This work examines the association between yearly climate patterns between 2001 and 2008 and influenza outbreaks in the Chiang Mai Province. The climatic factors included the amount of rainfall, percent of rainy days, relative humidity, maximum and minimum temperatures, and temperature difference. The study develops a statistical analysis to quantitatively assess the relationship between climate and influenza outbreaks and then evaluate its suitability for predicting influenza outbreaks. A multiple linear regression technique was used to fit the statistical model. The Inverse Distance Weighted (IDW) interpolation and Geographic Information System (GIS) techniques were used in mapping the spatial diffusion of influenza risk zones. The results show that there is a significant correlation between influenza outbreaks and climate factors for the majority of the studied area. A statistical analysis was conducted to assess the validity of the model comparing model outputs and actual outbreaks.
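    The IDW interpolation step can be sketched directly from its definition: each sample contributes with weight 1/d^p, and a query exactly at a sample location returns that sample's value. This is a generic sketch of the technique, not the study's GIS workflow.

```python
def idw(point, samples, power=2.0):
    """Inverse Distance Weighted estimate at `point` from
    (location, value) samples: z = sum(w_i z_i) / sum(w_i), w_i = 1/d_i^p."""
    num = den = 0.0
    for (x, y), v in samples:
        d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2
        if d2 == 0.0:
            return v  # exact hit on a sample location
        w = d2 ** (-power / 2.0)
        num += w * v
        den += w
    return num / den
```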

  7. Analysis of variance to assess statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes.

    PubMed

    Makeyev, Oleksandr; Joe, Cody; Lee, Colin; Besio, Walter G

    2017-07-01

    Concentric ring electrodes have shown promise in non-invasive electrophysiological measurement demonstrating their superiority to conventional disc electrodes, in particular, in accuracy of Laplacian estimation. Recently, we have proposed novel variable inter-ring distances concentric ring electrodes. Analytic and finite element method modeling results for linearly increasing distances electrode configurations suggested they may decrease the truncation error resulting in more accurate Laplacian estimates compared to currently used constant inter-ring distances configurations. This study assesses statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes. Full factorial design of analysis of variance was used with one categorical and two numerical factors: the inter-ring distances, the electrode diameter, and the number of concentric rings in the electrode. The response variables were the Relative Error and the Maximum Error of Laplacian estimation computed using a finite element method model for each of the combinations of levels of three factors. Effects of the main factors and their interactions on Relative Error and Maximum Error were assessed and the obtained results suggest that all three factors have statistically significant effects in the model confirming the potential of using inter-ring distances as a means of improving accuracy of Laplacian estimation.
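    The F statistic underlying the analysis is easy to state. This is a one-way sketch of the core computation (the study used a full factorial, multi-factor design with interactions):

```python
def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square over
    within-group mean square."""
    all_vals = [x for g in groups for x in g]
    grand = sum(all_vals) / len(all_vals)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2
                    for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

A large F indicates that the factor (here, e.g. inter-ring distances) explains far more variation than the residual noise, which is what "statistically significant effect" formalizes once F is referred to its distribution.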

  8. Prediction of local concentration statistics in variably saturated soils: Influence of observation scale and comparison with field data

    NASA Astrophysics Data System (ADS)

    Graham, Wendy; Destouni, Georgia; Demmy, George; Foussereau, Xavier

    1998-07-01

    The methodology developed in Destouni and Graham [Destouni, G., Graham, W.D., 1997. The influence of observation method on local concentration statistics in the subsurface. Water Resour. Res. 33 (4) 663-676.] for predicting locally measured concentration statistics for solute transport in heterogeneous porous media under saturated flow conditions is applied to the prediction of conservative nonreactive solute transport in the vadose zone where observations are obtained by soil coring. Exact analytical solutions are developed for both the mean and variance of solute concentrations measured in discrete soil cores using a simplified physical model for vadose-zone flow and solute transport. Theoretical results show that while the ensemble mean concentration is relatively insensitive to the length-scale of the measurement, predictions of the concentration variance are significantly impacted by the sampling interval. Results also show that accounting for vertical heterogeneity in the soil profile results in significantly less spreading in the mean and variance of the measured solute breakthrough curves, indicating that it is important to account for vertical heterogeneity even for relatively small travel distances. Model predictions for both the mean and variance of locally measured solute concentration, based on independently estimated model parameters, agree well with data from a field tracer test conducted in Manatee County, Florida.

  9. Comparisons of lung tumour mortality risk in the Japanese A-bomb survivors and in the Colorado Plateau uranium miners: support for the ICRP lung model.

    PubMed

    Little, M P

    2002-03-01

    To estimate the ratio of risks for exposure to radon progeny relative to low-LET radiation based on human lung cancer data, taking account of possible time and age variations in radiation-induced lung cancer risk. Fitting two sorts of time- and age-adjusted relative risk models to a case-control dataset nested within the Colorado Plateau uranium miner cohort and to the Japanese atomic (A)-bomb survivor mortality data. If all A-bomb survivors are compared with the Colorado data, there are statistically significant (two-sided p < 0.05) differences between the two datasets in the pattern of the variation of relative risk with time after exposure, age at exposure and attained age. The excess relative risk decreases much faster with time, age at exposure and attained age in the Colorado uranium miners than in the Japanese A-bomb survivors. If only male A-bomb survivors are compared with the Colorado data, there are no longer statistically significant differences between the two datasets in the pattern of variation of relative risk with time after exposure, age at exposure or attained age. There are no statistically significant differences between the male and female A-bomb survivors in the speed of reduction of relative risk with time after exposure, age at exposure or attained age, although there are indications of rather faster reduction of relative risk with time and age among male survivors than among female survivors. The implicit risk conversion factor for exposure to radon progeny relative to the A-bomb radiation in the male survivors is 1.8 × 10⁻² Sv WLM⁻¹ (95% CI 6.1 × 10⁻³, 1.1 × 10⁻¹) using a model with exponential adjustments for the effects of radiation for time since exposure and age at exposure, and 1.9 × 10⁻² Sv WLM⁻¹ (95% CI 6.2 × 10⁻³, 1.6 × 10⁻¹) using a model with adjustments for the effects of radiation proportional to powers of time since exposure and attained age. 
Estimates of the risk conversion factor calculated using variant assumptions as to the definition of lung cancer in the Colorado data, or by excluding miners for whom exposure estimates may be less reliable, are very similar. The absence of information on cigarette smoking in the Japanese A-bomb survivors, and the possibility that this may confound the time trends in radiation-induced lung cancer risk in that cohort, imply that these findings should be interpreted with caution. There are no statistically significant differences between the male A-bomb survivor data and the Colorado miner data in the pattern of variation of relative risk with time after exposure and age at exposure. The risk conversion factor is very close to the value suggested by the latest ICRP lung model, albeit with substantial uncertainties.

  10. On the Statistical Errors of RADAR Location Sensor Networks with Built-In Wi-Fi Gaussian Linear Fingerprints

    PubMed Central

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis. PMID:22737027
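    A minimal sketch of the fingerprinting setup analysed above: reference-point fingerprints generated from a logarithmic attenuation model and matched by nearest neighbour in signal space. The path-loss parameters (a = -40 dBm at 1 m, exponent n = 3) and the geometry are illustrative assumptions, not the paper's analytical error expressions.

```python
import math

def rss(distance_m, a=-40.0, n=3.0):
    """Log-distance attenuation model: RSS (dBm) is `a` at 1 m,
    falling by 10*n dB per decade of distance."""
    return a - 10.0 * n * math.log10(distance_m)

def locate(observed, fingerprints):
    """Nearest-neighbour match of an observed RSS vector against the
    reference-point (RP) fingerprint database, RADAR-style."""
    return min(fingerprints,
               key=lambda rp: sum((o - f) ** 2
                                  for o, f in zip(observed, fingerprints[rp])))

# Two access points and nine RPs on a line one metre away.
aps = [(0.0, 0.0), (10.0, 0.0)]
fingerprints = {}
for i in range(1, 10):
    rp = (float(i), 1.0)
    fingerprints[rp] = [rss(math.dist(rp, ap)) for ap in aps]
```

With noise-free observations the matcher recovers the true RP exactly; the statistical errors studied in the paper arise once Gaussian RSS variation is added at the test point.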

  11. On the statistical errors of RADAR location sensor networks with built-in Wi-Fi Gaussian linear fingerprints.

    PubMed

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis.

  12. Comment on “Two statistics for evaluating parameter identifiability and error reduction” by John Doherty and Randall J. Hunt

    USGS Publications Warehouse

    Hill, Mary C.

    2010-01-01

    Doherty and Hunt (2009) present important ideas for first-order second-moment sensitivity analysis, but five issues are discussed in this comment. First, considering the composite-scaled sensitivity (CSS) jointly with parameter correlation coefficients (PCC) in a CSS/PCC analysis addresses the difficulties with CSS mentioned in the introduction. Second, their new parameter identifiability statistic is actually likely to do a poor job of evaluating parameter identifiability in common situations. The statistic instead performs the very useful role of showing how model parameters are included in the estimated singular value decomposition (SVD) parameters; its close relation to CSS is shown. Third, the idea from p. 125 that a suitable truncation point for SVD parameters can be identified using the prediction variance is challenged using results from Moore and Doherty (2005). Fourth, the relative error reduction statistic of Doherty and Hunt is shown to belong to an emerging set of statistics here named perturbed calculated variance statistics. Finally, the perturbed calculated variance statistics OPR and PPR mentioned on p. 121 are shown to explicitly include the parameter null-space component of uncertainty. Indeed, OPR and PPR results that account for null-space uncertainty have appeared in the literature since 2000.

  13. Low-Level Contrast Statistics of Natural Images Can Modulate the Frequency of Event-Related Potentials (ERP) in Humans.

    PubMed

    Ghodrati, Masoud; Ghodousi, Mahrad; Yoonessi, Ali

    2016-01-01

    Humans are fast and accurate in categorizing complex natural images. It is, however, unclear which features of visual information are exploited by the brain to perceive images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of the amplitude of event-related potentials (ERP) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of the ERPs best among the image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and the ERPs' power within the theta frequency band (~3-7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by low-level contrast statistics of natural images, and highlight their potential role in scene perception.

  14. Low-Level Contrast Statistics of Natural Images Can Modulate the Frequency of Event-Related Potentials (ERP) in Humans

    PubMed Central

    Ghodrati, Masoud; Ghodousi, Mahrad; Yoonessi, Ali

    2016-01-01

    Humans are fast and accurate in categorizing complex natural images. It is, however, unclear which features of visual information are exploited by the brain to perceive images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of the amplitude of event-related potentials (ERP) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of the ERPs best among the image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and the ERPs' power within the theta frequency band (~3–7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by low-level contrast statistics of natural images, and highlight their potential role in scene perception. PMID:28018197
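The theta-band power measure central to this record can be illustrated with a naive DFT on a synthetic signal. Everything below (sampling rate, component frequencies, amplitudes) is invented; the sketch only shows how power inside a ~3-7 Hz band is computed and compared against another band.

```python
import math

FS = 250                 # assumed sampling rate, Hz
N = FS * 2               # two seconds of signal

# Synthetic "ERP": a 5 Hz (theta) component plus a weaker 20 Hz component.
sig = [1.0 * math.sin(2 * math.pi * 5 * t / FS) +
       0.3 * math.sin(2 * math.pi * 20 * t / FS) for t in range(N)]

def band_power(x, f_lo, f_hi, fs):
    """Naive DFT power summed over frequency bins inside [f_lo, f_hi]."""
    n = len(x)
    total = 0.0
    for k in range(n // 2):
        f = k * fs / n
        if f_lo <= f <= f_hi:
            re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = -sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            total += (re * re + im * im) / n
    return total

theta = band_power(sig, 3, 7, FS)
beta  = band_power(sig, 15, 25, FS)
print(theta > beta)   # the 5 Hz component dominates the theta band
```

In practice an FFT with a proper window (e.g. via a wavelet or multitaper time-frequency transform, as ERP studies typically use) replaces this O(n²) loop; the band-summing logic is the same.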

  15. Predicting long-term catchment nutrient export: the use of nonlinear time series models

    NASA Astrophysics Data System (ADS)

    Valent, Peter; Howden, Nicholas J. K.; Szolgay, Jan; Komornikova, Magda

    2010-05-01

    After the Second World War, nitrate concentrations in European water bodies changed significantly as a result of increased nitrogen fertilizer use and changes in land use. In recent decades, however, as a consequence of the implementation of nitrate-reducing measures in Europe, nitrate concentrations in water bodies have slowly decreased. Consequently, the mean and variance of the observed time series also change with time (nonstationarity and heteroscedasticity). In order to detect changes and properly describe the behaviour of such time series, linear models (such as autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models) are no longer suitable. Time series with sudden changes in statistical characteristics can cause various problems in the calibration of traditional water quality models and thus give biased predictions. Proper statistical analysis of these non-stationary and heteroscedastic time series, with the aim of detecting and subsequently explaining the variations in their statistical characteristics, requires the use of nonlinear time series models. This information can then be used to improve the building and calibration of conceptual water quality models, or to select the right calibration periods in order to produce reliable predictions. The objective of this contribution is to analyze two long time series of nitrate concentrations of the rivers Ouse and Stour with advanced nonlinear statistical modelling techniques and to compare their performance with traditional linear models of the ARMA class in order to identify changes in the time series characteristics. The time series were analysed with nonlinear multiple-regime models, represented by self-exciting threshold autoregressive (SETAR) and Markov-switching (MSW) models.
The analysis showed that, based on the residual sum of squares (RSS), SETAR and MSW models described both time series better than models of the ARMA class. In most cases the relative improvement of SETAR models over first-order AR models was low, ranging between 1% and 4%, with the exception of the three-regime model for the River Stour time series, where the improvement was 48.9%. In comparison, the relative improvement of MSW models was between 44.6% and 52.5% for two-regime models and from 60.4% to 75% for three-regime models. However, visual assessment of the models plotted against the original datasets showed that, despite a higher RSS, some ARMA models could describe the analyzed time series better than AR, MA and SETAR models with lower RSS values. In both datasets, MSW models provided a very good visual fit, describing most of the extreme values.
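The RSS comparison between a single AR(1) model and a two-regime SETAR(1) model can be sketched in a few lines. The series below is synthetic with invented regime parameters (not the Ouse or Stour records). Because the AR(1) model is nested in the SETAR(1) model, fitting each regime separately can never raise the total RSS.

```python
import random

random.seed(1)

# Synthetic series that crosses the regime threshold often
# (invented parameters, not fitted to any nitrate record).
x = [4.2]
for _ in range(400):
    a, b = (0.7, 0.8) if x[-1] > 4.0 else (0.9, 0.8)
    x.append(a + b * x[-1] + random.gauss(0, 0.2))

def ls_rss(pairs):
    """RSS of a least-squares AR(1) fit x_t = a + b*x_{t-1} on (x_{t-1}, x_t) pairs."""
    n = len(pairs)
    mx = sum(p for p, _ in pairs) / n
    my = sum(q for _, q in pairs) / n
    b = sum((p - mx) * (q - my) for p, q in pairs) / \
        sum((p - mx) ** 2 for p, _ in pairs)
    a = my - b * mx
    return sum((q - (a + b * p)) ** 2 for p, q in pairs)

pairs = list(zip(x[:-1], x[1:]))
rss_ar = ls_rss(pairs)                             # single AR(1) on all pairs

threshold = 4.0                                    # two-regime SETAR(1)
rss_setar = (ls_rss([pq for pq in pairs if pq[0] <= threshold]) +
             ls_rss([pq for pq in pairs if pq[0] > threshold]))

print(rss_setar <= rss_ar)   # True: per-regime fits cannot raise the RSS
```

A real SETAR fit would also search over the threshold and delay; here the threshold is fixed at the value used to generate the data.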

  16. Conjoint Analysis: A Study of the Effects of Using Person Variables.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…

  17. Conceptualizations of Personality Disorders with the Five Factor Model-Count and Empathy Traits

    ERIC Educational Resources Information Center

    Kajonius, Petri J.; Dåderman, Anna M.

    2017-01-01

    Previous research has long advocated that emotional and behavioral disorders are related to general personality traits, such as the Five Factor Model (FFM). The addition of section III in the latest "Diagnostic and Statistical Manual of Mental Disorders" (DSM) recommends that extremity in personality traits together with maladaptive…

  18. Progress in the improved lattice calculation of direct CP-violation in the Standard Model

    NASA Astrophysics Data System (ADS)

    Kelly, Christopher

    2018-03-01

    We discuss the ongoing effort by the RBC & UKQCD collaborations to improve our lattice calculation of the measure of Standard Model direct CP violation, ɛ', with physical kinematics. We present our progress in decreasing the (dominant) statistical error and discuss other related activities aimed at reducing the systematic errors.

  19. Multiscale Modeling of Gene-Behavior Associations in an Artificial Neural Network Model of Cognitive Development

    ERIC Educational Resources Information Center

    Thomas, Michael S. C.; Forrester, Neil A.; Ronald, Angelica

    2016-01-01

    In the multidisciplinary field of developmental cognitive neuroscience, statistical associations between levels of description play an increasingly important role. One example of such associations is the observation of correlations between relatively common gene variants and individual differences in behavior. It is perhaps surprising that such…

  20. Testing the Self-Efficacy-Performance Linkage of Social-Cognitive Theory.

    ERIC Educational Resources Information Center

    Harrison, Allison W.; Rainer, R. Kelly, Jr.; Hochwarter, Wayne A.; Thompson, Kenneth R.

    1997-01-01

    Briefly reviews Albert Bandura's Self-Efficacy Performance Model (ability to perform a task is influenced by an individual's belief in their capability). Tests this model with a sample of 776 university employees and computer-related knowledge and skills. Results supported Bandura's thesis. Includes statistical tables and a discussion of related…

  1. A Voice from the Trenches: A Reaction to Ivey and Ivey (1998).

    ERIC Educational Resources Information Center

    Hinkle, J. Scott

    1999-01-01

    Offers reaction to Ivey and Ivey's article regarding the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders. Discusses the medical model versus the developmental model in relation to counselor education and training, diagnostic bias, and the future identity of professional counselors. Concludes that defining a theoretical…

  2. Assessment of six dissimilarity metrics for climate analogues

    NASA Astrophysics Data System (ADS)

    Grenier, Patrick; Parent, Annie-Claude; Huard, David; Anctil, François; Chaumont, Diane

    2013-04-01

    Spatial analogue techniques consist in identifying locations whose recent-past climate is similar in some aspects to the future climate anticipated at a reference location. When identifying analogues, one key step is the quantification of the dissimilarity between two climates separated in time and space, which involves the choice of a metric. In this communication, spatial analogues and their usefulness are briefly discussed. Next, six metrics are presented (the standardized Euclidean distance, the Kolmogorov-Smirnov statistic, the nearest-neighbor distance, the Zech-Aslan energy statistic, the Friedman-Rafsky runs statistic and the Kullback-Leibler divergence), along with a set of criteria used for their assessment. The related case study involves the use of numerical simulations performed with the Canadian Regional Climate Model (CRCM-v4.2.3), from which three annual indicators (total precipitation, heating degree-days and cooling degree-days) are calculated over 30-year periods (1971-2000 and 2041-2070). Results indicate that the six metrics identify comparable analogue regions at a relatively large scale, but the best analogues may differ substantially. For best analogues, it is also shown that the uncertainty stemming from the metric choice generally does not exceed that stemming from the simulation or model choice. A synthesis of the advantages and drawbacks of each metric is finally presented, in which the Zech-Aslan energy statistic stands out as the most recommended metric for analogue studies, whereas the Friedman-Rafsky runs statistic is the least recommended, based on this case study.
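One of the six metrics, the two-sample Kolmogorov-Smirnov statistic, is simple enough to sketch directly: it is the largest gap between the two empirical CDFs. The precipitation samples below are invented, not CRCM output.

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap

# Invented annual total-precipitation samples (mm): the reference
# location's projected future climate vs. a candidate analogue's
# recent past. Smaller statistic = more similar climates.
future_ref = [900, 960, 925, 985, 940, 955]
candidate  = [850, 910, 870, 930, 890, 905]
print(ks_statistic(future_ref, candidate))
```

In an analogue search, this dissimilarity would be evaluated against every candidate location and the smallest values retained.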

  3. Trauma-related dissociation and altered states of consciousness: a call for clinical, treatment, and neuroscience research

    PubMed Central

    Lanius, Ruth A.

    2015-01-01

    The primary aim of this commentary is to describe trauma-related dissociation and altered states of consciousness in the context of a four-dimensional model that has recently been proposed (Frewen & Lanius, 2015). This model categorizes symptoms of trauma-related psychopathology into (1) those that occur within normal waking consciousness and (2) those that are dissociative and are associated with trauma-related altered states of consciousness (TRASC) along four dimensions: (1) time; (2) thought; (3) body; and (4) emotion. Clinical applications and future research directions relevant to each dimension are discussed. Conceptualizing TRASC across the dimensions of time, thought, body, and emotion has transdiagnostic implications for trauma-related disorders described in both the Diagnostic Statistical Manual and the International Classifications of Diseases. The four-dimensional model provides a framework, guided by existing models of dissociation, for future research examining the phenomenological, neurobiological, and physiological underpinnings of trauma-related dissociation. PMID:25994026

  4. Pitfalls in statistical landslide susceptibility modelling

    NASA Astrophysics Data System (ADS)

    Schröder, Boris; Vorpahl, Peter; Märker, Michael; Elsenbeer, Helmut

    2010-05-01

    The use of statistical methods is a well-established approach to predict landslide occurrence probabilities and to assess landslide susceptibility. This is achieved by applying statistical methods relating historical landslide inventories to topographic indices as predictor variables. In our contribution, we compare several new and powerful methods developed in machine learning and well-established in landscape ecology and macroecology for predicting the distribution of shallow landslides in tropical mountain rainforests in southern Ecuador (among others: boosted regression trees, multivariate adaptive regression splines, maximum entropy). Although these methods are powerful, we think it is necessary to follow a basic set of guidelines to avoid some pitfalls regarding data sampling, predictor selection, and model quality assessment, especially if a comparison of different models is contemplated. We therefore suggest applying a novel toolbox to evaluate approaches to the statistical modelling of landslide susceptibility. Additionally, we propose some methods to open the "black box" inherent in machine learning methods in order to achieve further explanatory insights into the preparatory factors that control landslides. Sampling of training data should be guided by hypotheses regarding the processes that lead to slope failure, taking into account their respective spatial scales. This approach leads to the selection of a set of candidate predictor variables considered on adequate spatial scales. This set should be checked for multicollinearity in order to facilitate the interpretation of model response curves. Model quality assessment measures how well a model reproduces independent observations of its response variable. This includes criteria to evaluate different aspects of model performance, i.e. model discrimination, model calibration, and model refinement.
In order to assess a possible violation of the assumption of independence in the training samples, or a possible lack of explanatory information in the chosen set of predictor variables, the model residuals need to be checked for spatial autocorrelation. Therefore, we calculate spline correlograms. In addition, we investigate partial dependency plots and bivariate interaction plots, considering possible interactions between predictors, to improve model interpretation. Aiming at presenting this toolbox for model quality assessment, we investigate the influence of strategies for constructing training datasets on the quality of statistical models.
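Of the model-quality criteria named above (discrimination, calibration, refinement), discrimination is commonly summarized by the area under the ROC curve. A minimal rank-based AUC, with invented susceptibility scores standing in for model output on landslide and stable cells:

```python
def auc(pos, neg):
    """Rank-based AUC: the probability that a randomly chosen landslide
    cell scores higher than a randomly chosen stable cell (ties count 1/2)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented predicted susceptibilities for observed landslide cells (pos)
# and stable cells (neg); 0.5 = no discrimination, 1.0 = perfect.
pos = [0.90, 0.80, 0.75, 0.60, 0.55]
neg = [0.70, 0.50, 0.40, 0.35, 0.20, 0.10]
print(auc(pos, neg))
```

This pairwise form is equivalent to the Mann-Whitney U statistic divided by the number of pairs, which is how AUC is usually computed for susceptibility maps.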

  5. Translating statistical species-habitat models to interactive decision support tools

    USGS Publications Warehouse

    Wszola, Lyndsie S.; Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.

  6. Translating statistical species-habitat models to interactive decision support tools.

    PubMed

    Wszola, Lyndsie S; Simonsen, Victoria L; Stuber, Erica F; Gillespie, Caitlyn R; Messinger, Lindsey N; Decker, Karie L; Lusk, Jeffrey J; Jorgensen, Christopher F; Bishop, Andrew A; Fontaine, Joseph J

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.

  7. Translating statistical species-habitat models to interactive decision support tools

    PubMed Central

    Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences. PMID:29236707

  8. Adolescent Family Experiences and Educational Attainment during Early Adulthood

    PubMed Central

    Melby, Janet N.; Conger, Rand D.; Fang, Shu-Ann; Wickrama, K. A. S.; Conger, Katherine J.

    2009-01-01

    This study investigated the degree to which a family investment model would help account for the association between family of origin socioeconomic characteristics and the later educational attainment of 451 young adults (age 26) from two-parent families. Parents’ educational level, occupational prestige, and family income in 1989 each had a statistically significant direct relationship with youths’ educational attainment in 2002. Consistent with the theoretical model guiding the study, parents’ educational level and family income also demonstrated statistically significant indirect effects on later educational attainment through their associations with growth trajectories for supportive parenting, sibling relations, and adolescent academic engagement. Supportive parenting and sibling relations were linked to later educational attainment through their association with adolescent academic engagement. Academic engagement during adolescence was associated with educational attainment in young adulthood. These basic processes operated similarly regardless of youths’ gender, target youths’ age relative to a near-age sibling, gender composition of the sibling dyad, or gender of parent. PMID:18999319

  9. Hydrologic consistency as a basis for assessing complexity of monthly water balance models for the continental United States

    NASA Astrophysics Data System (ADS)

    Martinez, Guillermo F.; Gupta, Hoshin V.

    2011-12-01

    Methods to select parsimonious and hydrologically consistent model structures are useful for evaluating dominance of hydrologic processes and representativeness of data. While information criteria (appropriately constrained to obey underlying statistical assumptions) can provide a basis for evaluating appropriate model complexity, it is not sufficient to rely upon the principle of maximum likelihood (ML) alone. We suggest that one must also call upon a "principle of hydrologic consistency," meaning that selected ML structures and parameter estimates must be constrained (as well as possible) to reproduce desired hydrological characteristics of the processes under investigation. This argument is demonstrated in the context of evaluating the suitability of candidate model structures for lumped water balance modeling across the continental United States, using data from 307 snow-free catchments. The models are constrained to satisfy several tests of hydrologic consistency, a flow space transformation is used to ensure better consistency with underlying statistical assumptions, and information criteria are used to evaluate model complexity relative to the data. The results clearly demonstrate that the principle of consistency provides a sensible basis for guiding selection of model structures and indicate strong spatial persistence of certain model structures across the continental United States. Further work to untangle reasons for model structure predominance can help to relate conceptual model structures to physical characteristics of the catchments, facilitating the task of prediction in ungaged basins.
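The role an information criterion plays in penalizing complexity can be shown with AIC for least-squares fits. The RSS values and sample size below are invented, chosen so that a small RSS gain does not justify two extra parameters; the hydrologic-consistency constraints discussed in the abstract are outside this sketch.

```python
import math

def aic(rss, n, k):
    """AIC for a least-squares fit, up to an additive constant:
    n*ln(RSS/n) + 2k. Lower is better."""
    return n * math.log(rss / n) + 2 * k

# Invented example: a 4-parameter water-balance structure shaves the RSS
# only slightly relative to a 2-parameter one, so the criterion prefers
# the simpler structure.
n_months = 360
aic_simple  = aic(rss=52.0, n=n_months, k=2)
aic_complex = aic(rss=51.9, n=n_months, k=4)
print(aic_simple < aic_complex)   # True: the extra parameters aren't worth it
```

Applying the criterion in a transformed flow space, as the study does, only changes how the RSS is computed, not this comparison logic.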

  10. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE PAGES

    Ipsen, Andreas

    2017-02-03

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.
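The convergence result can be illustrated with a toy Monte Carlo: a simple Poisson ion-counting model (not the paper's ADC/electron-multiplier model; the peak shape and m/z grid are invented) shows the spread of the intensity-weighted centroid shrinking as signal strength grows.

```python
import math, random

random.seed(3)

def poisson(lam):
    """Knuth's Poisson sampler (adequate for the moderate lambdas used here)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Invented peak shape on an invented m/z grid (not instrument data).
mz_grid = [500.08, 500.09, 500.10, 500.11, 500.12]
shape   = [0.05, 0.20, 0.50, 0.22, 0.06]

def centroid_sd(mean_ions, trials=1000):
    """Monte Carlo spread of the intensity-weighted centroid when each
    bin's count is Poisson(mean_ions * shape) -- a crude ion-counting
    stand-in for the paper's ADC/electron-multiplier model."""
    cents = []
    for _ in range(trials):
        counts = [poisson(mean_ions * s) for s in shape]
        tot = sum(counts)
        if tot:
            cents.append(sum(m * c for m, c in zip(mz_grid, counts)) / tot)
    mu = sum(cents) / len(cents)
    return math.sqrt(sum((c - mu) ** 2 for c in cents) / len(cents))

# Stronger signal -> tighter centroid distribution.
print(centroid_sd(40) > centroid_sd(400))   # True
```

The paper's contribution is precisely to replace this kind of ad hoc counting model with a distribution derived from the instrument physics, but the qualitative variance-vs-signal-strength behavior is the same.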

  11. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ipsen, Andreas

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.

  12. Zero-state Markov switching count-data models: an empirical assessment.

    PubMed

    Malyshkina, Nataliya V; Mannering, Fred L

    2010-01-01

    In this study, a two-state Markov switching count-data model is proposed as an alternative to zero-inflated models to account for the preponderance of zeros sometimes observed in transportation count data, such as the number of accidents occurring on a roadway segment over some period of time. For this accident-frequency case, zero-inflated models assume the existence of two states: one of the states is a zero-accident count state, which has accident probabilities that are so low that they cannot be statistically distinguished from zero, and the other state is a normal-count state, in which counts can be non-negative integers that are generated by some counting process, for example, a Poisson or negative binomial. While zero-inflated models have come under some criticism with regard to accident-frequency applications, one fact is undeniable: in many applications they provide a statistically superior fit to the data. The Markov switching approach we propose seeks to overcome some of the criticism associated with the zero-accident state of the zero-inflated model by allowing individual roadway segments to switch between zero and normal-count states over time. An important advantage of this Markov switching approach is that it allows for the direct statistical estimation of the specific roadway-segment state (i.e., zero-accident or normal-count state), whereas traditional zero-inflated models do not. To demonstrate the applicability of this approach, a two-state Markov switching negative binomial model (estimated with Bayesian inference) and standard zero-inflated negative binomial models are estimated using five-year accident frequencies on Indiana interstate highway segments. It is shown that the Markov switching model is a viable alternative and results in a superior statistical fit relative to the zero-inflated models.
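The appeal of a zero state can be seen in a toy comparison of a plain Poisson fit with a zero-inflated Poisson, a simpler relative of the negative binomial models in the study. The counts below are invented zero-heavy data (not the Indiana records), and the ZIP parameters are found by a crude grid search rather than proper maximum likelihood.

```python
import math

# Invented zero-heavy counts: 60% zeros, far more than Poisson predicts.
counts = [0] * 60 + [1] * 10 + [2] * 12 + [3] * 8 + [4] * 6 + [5] * 4

def poisson_loglik(data, lam):
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in data)

def zip_loglik(data, pi, lam):
    """Zero-inflated Poisson: a zero comes either from the zero state
    (probability pi) or from the Poisson; positive counts come only
    from the Poisson."""
    ll = 0.0
    for k in data:
        if k == 0:
            ll += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            ll += math.log(1 - pi) + k * math.log(lam) - lam - math.lgamma(k + 1)
    return ll

lam_mle = sum(counts) / len(counts)               # plain-Poisson MLE
ll_pois = poisson_loglik(counts, lam_mle)

# Crude grid search over the two ZIP parameters.
ll_zip = max(zip_loglik(counts, p / 100, l / 10)
             for p in range(1, 100) for l in range(1, 80))

print(ll_zip > ll_pois)   # True: the zero state absorbs the excess zeros
```

The Markov switching model in the record goes further by letting each segment move between the two states over time instead of fixing a time-invariant mixture.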

  13. Microgravity experiments on vibrated granular gases in a dilute regime: non-classical statistics

    NASA Astrophysics Data System (ADS)

    Leconte, M.; Garrabos, Y.; Falcon, E.; Lecoutre-Chabot, C.; Palencia, F.; Évesque, P.; Beysens, D.

    2006-07-01

    We report on an experimental study of a dilute gas of steel spheres colliding inelastically, excited by a piston performing sinusoidal vibration in low gravity. Using an improved experimental apparatus, we present results concerning the collision statistics of particles on a wall of the container. We also propose a simple model in which the non-classical statistics obtained from our data are attributed to the boundary condition playing the role of a 'velostat' instead of a thermostat. The significant differences from the kinetic theory of ordinary gases are related to the inelasticity of the collisions.

  14. The writer independent online handwriting recognition system frog on hand and cluster generative statistical dynamic time warping.

    PubMed

    Bahlmann, Claus; Burkhardt, Hans

    2004-03-01

    In this paper, we give a comprehensive description of our writer-independent online handwriting recognition system frog on hand. The focus of this work concerns the presentation of the classification/training approach, which we call cluster generative statistical dynamic time warping (CSDTW). CSDTW is a general, scalable, HMM-based method for variable-sized, sequential data that holistically combines cluster analysis and statistical sequence modeling. It can handle general classification problems that rely on this sequential type of data, e.g., speech recognition, genome processing, robotics, etc. Contrary to previous attempts, clustering and statistical sequence modeling are embedded in a single feature space and use a closely related distance measure. We show character recognition experiments of frog on hand using CSDTW on the UNIPEN online handwriting database. The recognition accuracy is significantly higher than reported results of other handwriting recognition systems. Finally, we describe the real-time implementation of frog on hand on a Linux Compaq iPAQ embedded device.
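CSDTW builds on dynamic time warping; the plain DTW distance it generalizes can be sketched directly. The two toy sequences below stand in for pen-trajectory features that differ mainly in timing.

```python
def dtw(a, b):
    """Classic dynamic-time-warping distance between two sequences with
    an absolute-difference local cost (the plain DTW that CSDTW extends
    with cluster-specific statistical models)."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# Same stroke shape traced at different speeds: warping absorbs the
# timing difference entirely, so the distance is zero.
u = [0, 1, 2, 3, 2, 1, 0]
v = [0, 0, 1, 2, 3, 3, 2, 1, 0]
print(dtw(u, v))   # 0.0
```

CSDTW replaces the fixed local cost with log-likelihoods under per-cluster statistical models, but the alignment recursion is this same dynamic program.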

  15. Dissolution curve comparisons through the F(2) parameter, a Bayesian extension of the f(2) statistic.

    PubMed

    Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan

    2015-01-01

    Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
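
    The classical f2 statistic that the F2 parameter extends is a simple function of the mean squared difference between the reference and test profiles. A minimal sketch, assuming the profiles are percent dissolved at matched time points:

```python
import numpy as np

def f2_similarity(ref, test):
    """Classical f2 similarity factor between two dissolution profiles
    (percent dissolved at matched time points)."""
    ref, test = np.asarray(ref, float), np.asarray(test, float)
    msd = np.mean((ref - test) ** 2)  # mean squared difference
    return 50.0 * np.log10(100.0 / np.sqrt(1.0 + msd))
```

    Identical profiles give f2 = 100, and f2 >= 50 (roughly a 10% average difference) is the conventional similarity criterion; the F2 parameter of the paper replaces this point estimate with a Bayesian treatment.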

  16. The Potential for Differential Findings among Invariance Testing Strategies for Multisample Measured Variable Path Models

    ERIC Educational Resources Information Center

    Mann, Heather M.; Rutstein, Daisy W.; Hancock, Gregory R.

    2009-01-01

    Multisample measured variable path analysis is used to test whether causal/structural relations among measured variables differ across populations. Several invariance testing approaches are available for assessing cross-group equality of such relations, but the associated test statistics may vary considerably across methods. This study is a…

  17. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations and communities

    USGS Publications Warehouse

    Royle, J. Andrew; Dorazio, Robert M.

    2008-01-01

    A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including:
    - occurrence or occupancy models for estimating species distribution
    - abundance models based on many sampling protocols, including distance sampling
    - capture-recapture models with individual effects
    - spatial capture-recapture models based on camera trapping and related methods
    - population and metapopulation dynamic models
    - models of biodiversity, community structure and dynamics.

  18. Simulating statistics of lightning-induced and man-made fires

    NASA Astrophysics Data System (ADS)

    Krenn, R.; Hergarten, S.

    2009-04-01

    The frequency-area distributions of forest fires show power-law behavior with scaling exponents α in a quite narrow range, relating wildfire research to the theoretical framework of self-organized criticality. Examples of self-organized critical behavior can be found in computer simulations of simple cellular automata. The established self-organized critical Drossel-Schwabl forest fire model (DS-FFM) is one of the most widespread models in this context. Despite its qualitative agreement with event-size statistics from nature, its applicability is still questioned. Apart from general concerns that the DS-FFM apparently oversimplifies the complex nature of forest dynamics, it significantly overestimates the frequency of large fires. We present a straightforward modification of the model rules that increases the scaling exponent α by approximately 1/3 and brings the simulated event-size statistics close to those observed in nature. In addition, combined simulations of both the original and the modified model predict a dependence of the overall distribution on the ratio of lightning-induced and man-made fires as well as a difference between their respective event-size statistics. The increase of the scaling exponent with decreasing lightning probability as well as the splitting of the partial distributions are confirmed by analysis of the Canadian Large Fire Database. As a consequence, lightning-induced and man-made forest fires cannot be treated separately in wildfire modeling, hazard assessment and forest management.
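
    The fire events whose size statistics the DS-FFM produces come from a cluster-burning rule: a lightning strike removes the entire connected cluster of trees containing the struck site. A minimal sketch of that rule in the standard model (not the authors' modification):

```python
import numpy as np
from collections import deque

def burn_cluster(grid, i, j):
    """Burn (set to 0) the connected cluster of trees (1s) containing (i, j),
    using 4-neighbour connectivity as in the Drossel-Schwabl model.
    Mutates grid in place and returns the number of trees burned (fire size)."""
    if grid[i, j] != 1:
        return 0
    n, m = grid.shape
    queue, burned = deque([(i, j)]), 0
    grid[i, j] = 0
    while queue:
        x, y = queue.popleft()
        burned += 1
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < n and 0 <= ny < m and grid[nx, ny] == 1:
                grid[nx, ny] = 0  # mark burned before enqueueing
                queue.append((nx, ny))
    return burned
```

    A full DS-FFM run would alternate random tree growth with rare lightning strikes applying this rule; the event-size histogram of `burned` is what exhibits the power law.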

  19. Logical reasoning versus information processing in the dual-strategy model of reasoning.

    PubMed

    Markovits, Henry; Brisson, Janie; de Chantal, Pier-Luc

    2017-01-01

    One of the major debates concerning the nature of inferential reasoning is between counterexample-based strategies such as mental model theory and the statistical strategies underlying probabilistic models. The dual-strategy model proposed by Verschueren, Schaeken, & d'Ydewalle (2005a, 2005b), which suggests that people may have access to both kinds of strategy, has been supported by several recent studies. These have shown that statistical reasoners make inferences by using information about the premises to generate a likelihood estimate of conclusion probability. However, while results concerning counterexample reasoners are consistent with a counterexample detection model, those results could equally be interpreted as indicating a greater sensitivity to logical form. In order to distinguish these two interpretations, in Studies 1 and 2 we presented reasoners with modus ponens (MP) inferences carrying statistical information about premise strength, and in Studies 3 and 4, naturalistic MP inferences with premises having many disabling conditions. Statistical reasoners accepted the MP inference more often than counterexample reasoners in Studies 1 and 2, while the opposite pattern was observed in Studies 3 and 4. The results show that these strategies must be defined in terms of information processing, with no clear relation to "logical" reasoning. These results have additional implications for the underlying debate about the nature of human reasoning.

  20. A comparison of hydrologic models for ecological flows and water availability

    USGS Publications Warehouse

    Caldwell, Peter V; Kennen, Jonathan G.; Sun, Ge; Kiang, Julie E.; Butcher, John B; Eddy, Michelle C; Hay, Lauren E.; LaFontaine, Jacob H.; Hain, Ernie F.; Nelson, Stacy C; McNulty, Steve G

    2015-01-01

    Robust hydrologic models are needed to help manage water resources for healthy aquatic ecosystems and reliable water supplies for people, but there is a lack of comprehensive model comparison studies that quantify differences in streamflow predictions among model applications developed to answer management questions. We assessed differences in daily streamflow predictions by four fine-scale models and two regional-scale monthly time step models by comparing model fit statistics and bias in ecologically relevant flow statistics (ERFSs) at five sites in the Southeastern USA. Models were calibrated to different extents, including uncalibrated (level A), calibrated to a downstream site (level B), calibrated specifically for the site (level C) and calibrated for the site with adjusted precipitation and temperature inputs (level D). All models generally captured the magnitude and variability of observed streamflows at the five study sites, and increasing the level of model calibration generally improved performance. All models had at least 1 of 14 ERFSs falling outside a ±30% range of hydrologic uncertainty at every site, and ERFSs related to low flows were frequently over-predicted. Our results do not indicate that any specific hydrologic model is superior to the others evaluated at all sites and for all measures of model performance. Instead, we provide evidence that (1) model performance is as likely to be related to calibration strategy as it is to model structure and (2) simple, regional-scale models have comparable performance to the more complex, fine-scale models at a monthly time step.

  1. Online Statistical Modeling (Regression Analysis) for Independent Responses

    NASA Astrophysics Data System (ADS)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed for analyzing quantitative data, especially for modelling the relationship between response and explanatory variables. Statistical models have been developed in various directions to handle diverse types of data and complex relationships. A rich variety of advanced and recent statistical modelling techniques is available, mostly in open-source software (one of them being R). However, these advanced statistical models are not very friendly to novice R users, since they are driven by programming scripts or a command-line interface. Our research aims to develop a web interface (based on R and Shiny) so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We previously built interfaces in the form of e-tutorials for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unify them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible in our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes models easier to compare in order to find the one most appropriate for the data.

  2. Defect-phase-dynamics approach to statistical domain-growth problem of clock models

    NASA Technical Reports Server (NTRS)

    Kawasaki, K.

    1985-01-01

    The growth of statistical domains in quenched Ising-like p-state clock models with p = 3 or more is investigated theoretically, reformulating the analysis of Ohta et al. (1982) in terms of a phase variable and studying the dynamics of defects introduced into the phase field when the phase variable becomes multivalued. The resulting defect/phase domain-growth equation is applied to the interpretation of Monte Carlo simulations in two dimensions (Kaski and Gunton, 1983; Grest and Srolovitz, 1984), and problems encountered in the analysis of related Potts models are discussed. In the two-dimensional case, the problem is essentially that of a purely dissipative Coulomb gas, with a √t growth law complicated by vertex-pinning effects at small t.

  3. Modeling Statistics of Fish Patchiness and Predicting Associated Influence on Statistics of Acoustic Echoes

    DTIC Science & Technology

    2015-09-30

    part from data analyzed in this project. This work involved informal collaborations with Chris Wilson of NOAA Alaska Fisheries , whose team collected...characteristics of animal groups such as schools, swarms and flocks arise from individuals’ immediate responses to the relative positions and velocities of...infrastructure to extract cognitive behavior and other parameters from the NOAA Alaska Fisheries Science Center acoustic/trawl walleye pollock survey data

  4. Statistical Modeling of Retinal Optical Coherence Tomography.

    PubMed

    Amini, Zahra; Rabbani, Hossein

    2016-06-01

    In this paper, a new model for retinal Optical Coherence Tomography (OCT) images is proposed. This statistical model is based on introducing a nonlinear Gaussianization transform to convert the probability distribution function (pdf) of each OCT intra-retinal layer to a Gaussian distribution. The retina is a layered structure and in OCT each of these layers has a specific pdf which is corrupted by speckle noise, therefore a mixture model for statistical modeling of OCT images is proposed. A Normal-Laplace distribution, which is a convolution of a Laplace pdf and Gaussian noise, is proposed as the distribution of each component of this model. The reason for choosing Laplace pdf is the monotonically decaying behavior of OCT intensities in each layer for healthy cases. After fitting a mixture model to the data, each component is gaussianized and all of them are combined by Averaged Maximum A Posterior (AMAP) method. To demonstrate the ability of this method, a new contrast enhancement method based on this statistical model is proposed and tested on thirteen healthy 3D OCTs taken by the Topcon 3D OCT and five 3D OCTs from Age-related Macular Degeneration (AMD) patients, taken by Zeiss Cirrus HD-OCT. Comparing the results with two contending techniques, the prominence of the proposed method is demonstrated both visually and numerically. Furthermore, to prove the efficacy of the proposed method for a more direct and specific purpose, an improvement in the segmentation of intra-retinal layers using the proposed contrast enhancement method as a preprocessing step, is demonstrated.
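
    The Gaussianization step can be illustrated generically: any sample with a continuous distribution can be pushed to a standard normal through its CDF. The sketch below uses a rank-based (empirical) transform rather than the paper's fitted Normal-Laplace components, so it is a simplification of the method, not a reimplementation:

```python
import numpy as np
from scipy import stats

def gaussianize(x):
    """Map samples to a standard normal via the (empirical) probability
    integral transform: x -> Phi^{-1}(F_n(x))."""
    x = np.asarray(x, float)
    ranks = stats.rankdata(x)       # 1..n, average rank for ties
    u = (ranks - 0.5) / len(x)      # empirical CDF values in (0, 1)
    return stats.norm.ppf(u)

# Heavy-tailed (Laplace) input becomes approximately standard normal:
rng = np.random.default_rng(0)
z = gaussianize(rng.laplace(size=10_000))
```

    In the paper, the per-layer transform is parametric (derived from the fitted Normal-Laplace mixture component), which is what allows the gaussianized layers to be recombined via the AMAP step.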

  5. Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping

    NASA Astrophysics Data System (ADS)

    Kubica, Aleksander; Beverland, Michael E.; Brandão, Fernando; Preskill, John; Svore, Krysta M.

    2018-05-01

    Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the threshold for 1D stringlike and 2D sheetlike logical operators to be p_3DCC^(1) ≃ 1.9% and p_3DCC^(2) ≃ 27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.

  6. Statistical ensembles for money and debt

    NASA Astrophysics Data System (ADS)

    Viaggiu, Stefano; Lionetto, Andrea; Bargigli, Leonardo; Longo, Michele

    2012-10-01

    We build a statistical ensemble representation of two economic models describing respectively, in simplified terms, a payment system and a credit market. To this purpose we adopt the Boltzmann-Gibbs distribution where the role of the Hamiltonian is taken by the total money supply (i.e. including money created from debt) of a set of interacting economic agents. As a result, we can read the main thermodynamic quantities in terms of monetary ones. In particular, we define for the credit market model a work term which is related to the impact of monetary policy on credit creation. Furthermore, with our formalism we recover and extend some results concerning the temperature of an economic system, previously presented in the literature by considering only the monetary base as a conserved quantity. Finally, we study the statistical ensemble for the Pareto distribution.
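
    The Boltzmann-Gibbs distribution at the heart of such ensembles emerges even in the simplest conserved-money toy model, in which random pairs of agents repeatedly re-split their combined money. This sketch is a generic kinetic-exchange illustration, not the authors' payment or credit model:

```python
import numpy as np

def simulate_exchange(n_agents=1000, avg_money=10.0, steps=200_000, seed=1):
    """Kinetic money-exchange toy model: random pairs split their combined
    money uniformly. Total money is conserved; the stationary distribution
    approaches the exponential (Boltzmann-Gibbs) law whose 'temperature'
    is the average money per agent."""
    rng = np.random.default_rng(seed)
    m = np.full(n_agents, avg_money)
    for _ in range(steps):
        i, j = rng.integers(n_agents, size=2)
        if i == j:
            continue
        total = m[i] + m[j]
        eps = rng.random()
        m[i], m[j] = eps * total, (1 - eps) * total
    return m

m = simulate_exchange()
```

    The conserved total here plays the role the authors assign to the total money supply in their Boltzmann-Gibbs construction; their models add debt and policy terms on top of this basic ensemble picture.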

  7. Standard and reduced radiation dose liver CT images: adaptive statistical iterative reconstruction versus model-based iterative reconstruction-comparison of findings and image quality.

    PubMed

    Shuman, William P; Chan, Keith T; Busey, Janet M; Mitsumori, Lee M; Choi, Eunice; Koprowicz, Kent M; Kanal, Kalpana M

    2014-12-01

    To investigate whether reduced radiation dose liver computed tomography (CT) images reconstructed with model-based iterative reconstruction (MBIR) might compromise depiction of clinically relevant findings or might have decreased image quality when compared with clinical standard radiation dose CT images reconstructed with adaptive statistical iterative reconstruction (ASIR). With institutional review board approval, informed consent, and HIPAA compliance, 50 patients (39 men, 11 women) who underwent liver CT were prospectively included. After a portal venous pass with ASIR images, a 60% reduced radiation dose pass was added with MBIR images. One reviewer scored ASIR image quality and marked findings. Two additional independent reviewers noted whether marked findings were present on MBIR images and assigned scores for relative conspicuity, spatial resolution, image noise, and image quality. Liver and aorta Hounsfield units and image noise were measured. Volume CT dose index and size-specific dose estimate (SSDE) were recorded. Qualitative reviewer scores were summarized. Formal statistical inference for signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), volume CT dose index, and SSDE was made (paired t tests), with Bonferroni adjustment. The two independent reviewers identified all 136 ASIR image findings (n = 272) on MBIR images, scoring them as equal or better for conspicuity, spatial resolution, and image noise in 94.1% (256 of 272), 96.7% (263 of 272), and 99.3% (270 of 272), respectively. In 50 image sets, two reviewers (n = 100) scored overall image quality as sufficient or good with MBIR in 99% (99 of 100). Liver SNR was significantly greater for MBIR (10.8 ± 2.5 [standard deviation] vs 7.7 ± 1.4, P < .001); there was no difference for CNR (2.5 ± 1.4 vs 2.4 ± 1.4, P = .45). For ASIR and MBIR, respectively, volume CT dose index was 15.2 mGy ± 7.6 versus 6.2 mGy ± 3.6; SSDE was 16.4 mGy ± 6.6 versus 6.7 mGy ± 3.1 (P < .001). Liver CT images reconstructed with MBIR may allow up to 59% radiation dose reduction compared with the dose with ASIR, without compromising depiction of findings or image quality.
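
    The SNR and CNR figures of merit used above are simple region-of-interest (ROI) statistics. A hedged sketch, assuming SNR is defined as the ROI mean over its standard deviation and CNR as the tissue contrast over a reference noise estimate (exact definitions vary between studies):

```python
import numpy as np

def roi_snr(roi):
    """Signal-to-noise ratio of a region of interest: mean HU / SD."""
    roi = np.asarray(roi, float)
    return roi.mean() / roi.std(ddof=1)

def roi_cnr(roi_a, roi_b, noise_roi):
    """Contrast-to-noise ratio between two tissues, using the SD of a
    homogeneous reference region as the noise estimate."""
    return abs(np.mean(roi_a) - np.mean(roi_b)) / np.std(noise_roi, ddof=1)
```

    In practice the ROIs would be pixel arrays sampled from the liver, the aorta, and a homogeneous background region of the CT image.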

  8. Spatial Statistical Data Fusion (SSDF)

    NASA Technical Reports Server (NTRS)

    Braverman, Amy J.; Nguyen, Hai M.; Cressie, Noel

    2013-01-01

    As remote sensing for scientific purposes has transitioned from an experimental technology to an operational one, the selection of instruments has become more coordinated, so that the scientific community can exploit complementary measurements. However, technological and scientific heterogeneity across devices means that the statistical characteristics of the data they collect are different. The challenge addressed here is how to combine heterogeneous remote sensing data sets in a way that yields optimal statistical estimates of the underlying geophysical field, and provides rigorous uncertainty measures for those estimates. Different remote sensing data sets may have different spatial resolutions, different measurement error biases and variances, and other disparate characteristics. A state-of-the-art spatial statistical model was used to relate the true, but not directly observed, geophysical field to noisy, spatial aggregates observed by remote sensing instruments. The spatial covariances of the true field and the covariances of the true field with the observations were modeled. The observations are spatial averages of the true field values, over pixels, with different measurement noise superimposed. A kriging framework is used to infer optimal (minimum mean squared error and unbiased) estimates of the true field at point locations from pixel-level, noisy observations. A key feature of the spatial statistical model is the spatial mixed effects model that underlies it. The approach models the spatial covariance function of the underlying field using linear combinations of basis functions of fixed size. Approaches based on kriging require the inversion of very large spatial covariance matrices, and this is usually done by making simplifying assumptions about spatial covariance structure that simply do not hold for geophysical variables. In contrast, this method does not require these assumptions, and is also computationally much faster.
This method is fundamentally different than other approaches to data fusion for remote sensing data because it is inferential rather than merely descriptive. All approaches combine data in a way that minimizes some specified loss function. Most of these are more or less ad hoc criteria based on what looks good to the eye, or some criteria that relate only to the data at hand.
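
    The kriging predictor that SSDF builds on can be sketched compactly. The fixed-rank (spatial mixed effects) machinery that makes it scale is omitted here, so this is only the generic 1-D form, with an assumed exponential covariance and a nugget term standing in for instrument noise:

```python
import numpy as np

def simple_kriging(x_obs, y_obs, x0, cov, noise_var=0.0):
    """Simple (zero-mean) kriging predictor at x0: the weights w solve
    (C + noise_var * I) w = c0, and the prediction is w . y_obs."""
    x_obs = np.asarray(x_obs, float)
    C = cov(np.abs(x_obs[:, None] - x_obs[None, :]))  # obs-obs covariance
    C[np.diag_indices_from(C)] += noise_var           # measurement noise
    c0 = cov(np.abs(x_obs - x0))                      # obs-target covariance
    w = np.linalg.solve(C, c0)
    return w @ np.asarray(y_obs, float)

# Assumed exponential covariance with unit sill and range 2.
expcov = lambda h, sill=1.0, rng_=2.0: sill * np.exp(-h / rng_)
```

    With a zero nugget the predictor interpolates the data exactly; the noise_var term plays the role of the per-instrument measurement-error variances, and the fixed-rank basis-function model replaces the dense solve above with a low-rank one.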

  9. 77 FR 41986 - Division of Nursing, Public Health Nursing Community Based Model of PHN Case Management Services

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-17

    ... scientific literature related to epidemiology, statistics, surveillance, Healthy People 2020 Objectives, and... eligible projects or activities. This requirement applies whether the delinquency is attributable to the...

  10. Eutrophication risk assessment in coastal embayments using simple statistical models.

    PubMed

    Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G

    2003-09-01

    A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) to the concentration of the limiting nutrient (usually nitrogen) and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea, and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.
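
    A regression of Chl on the limiting nutrient and the renewal rate can be sketched with ordinary least squares. The numbers below are fabricated purely for illustration (the paper's data are from the Gulf of Gera), generated from an assumed linear relation plus small noise:

```python
import numpy as np

# Fabricated illustrative data: chlorophyll a (Chl), total dissolved
# nitrogen (TDN) and renewal rate. Not the paper's measurements.
tdn     = np.array([5.0, 8.0, 12.0, 15.0, 20.0, 25.0])
renewal = np.array([0.30, 0.10, 0.25, 0.05, 0.20, 0.15])
noise   = np.array([0.05, -0.03, 0.02, -0.04, 0.03, -0.01])
chl     = 1.0 + 0.2 * tdn - 5.0 * renewal + noise  # assumed true relation

# Ordinary least squares fit of Chl on TDN and renewal rate.
X = np.column_stack([np.ones_like(tdn), tdn, renewal])
beta, *_ = np.linalg.lstsq(X, chl, rcond=None)
pred = X @ beta
r2 = 1.0 - np.sum((chl - pred) ** 2) / np.sum((chl - chl.mean()) ** 2)
```

    The fitted signs mirror the paper's mechanism: Chl rises with the limiting nutrient and falls as the renewal rate flushes the embayment.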

  11. A new method to compare statistical tree growth curves: the PL-GMANOVA model and its application with dendrochronological data.

    PubMed

    Ricker, Martin; Peña Ramírez, Víctor M; von Rosen, Dietrich

    2014-01-01

    Growth curves are monotonically increasing functions obtained by measuring the same subjects repeatedly over time. The classical growth curve model in the statistical literature is the Generalized Multivariate Analysis of Variance (GMANOVA) model. In order to model the tree trunk radius (r) over time (t) of trees on different sites, GMANOVA is combined here with the adapted PL regression model Q = A · T + E, where for b ≠ 0: Q = Ei[-b · r] - Ei[-b · r1], and for b = 0: Q = Ln[r/r1]; A = initial relative growth to be estimated, T = t - t1, and E is an error term for each tree and time point. Furthermore, Ei[-b · r] = ∫(Exp[-b · r]/r)dr and b = -1/TPR, with TPR being the turning point radius in a sigmoid curve, and r1 at t1 is an estimated calibrating time-radius point. Advantages of the approach are that growth rates can be compared among growth curves with different turning point radii and different starting points, hidden outliers are easily detectable, the method is statistically robust, and heteroscedasticity of the residuals among time points is allowed. The model was implemented with dendrochronological data of 235 Pinus montezumae trees on ten Mexican volcano sites to calculate comparison intervals for the estimated initial relative growth A. One site (at the Popocatépetl volcano) stood out, with A being 3.9 times the value of the site with the slowest-growing trees. Calculating variance components for the initial relative growth, 34% of the growth variation was found among sites, 31% among trees, and 35% over time. Without the Popocatépetl site, the numbers changed to 7%, 42%, and 51%. Further explanation of differences in growth would need to focus on factors that vary within sites and over time.
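
    The Q transform above is directly computable with the exponential integral Ei, available as scipy.special.expi. A small sketch, including the b = 0 limiting branch:

```python
import numpy as np
from scipy.special import expi  # exponential integral Ei(x)

def pl_q(r, r1, b):
    """PL transform Q of trunk radius r relative to the calibration radius r1:
    Q = Ei(-b r) - Ei(-b r1) for b != 0, and Q = ln(r / r1) for b = 0.
    (Exact float comparison with 0 is fine for this sketch.)"""
    if b == 0:
        return np.log(r / r1)
    return expi(-b * r) - expi(-b * r1)
```

    As b → 0 the two branches join continuously, since Ei(x) ≈ γ + ln|x| for small |x|, so Ei(-b·r) - Ei(-b·r1) → ln(r/r1).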

  12. A Comparison of Four Estimators of a Population Measure of Model Fit in Covariance Structure Analysis

    ERIC Educational Resources Information Center

    Zhang, Wei

    2008-01-01

    A major issue in the utilization of covariance structure analysis is model fit evaluation. Recent years have witnessed increasing interest in various test statistics and so-called fit indexes, most of which are actually based on or closely related to F[subscript 0], a measure of model fit in the population. This study aims to provide a systematic…

  13. Downscaling of Global Climate Change Estimates to Regional Scales: An Application to Iberian Rainfall in Wintertime.

    NASA Astrophysics Data System (ADS)

    von Storch, Hans; Zorita, Eduardo; Cubasch, Ulrich

    1993-06-01

    A statistical strategy to deduce regional-scale features from climate general circulation model (GCM) simulations has been designed and tested. The main idea is to interrelate the characteristic patterns of observed simultaneous variations of regional climate parameters and of the large-scale atmospheric flow using the canonical correlation technique. The large-scale North Atlantic sea level pressure (SLP) is related to the regional variable, winter (DJF) mean Iberian Peninsula rainfall. The skill of the resulting statistical model is shown by reproducing, to a good approximation, the winter mean Iberian rainfall from 1900 to the present from the observed North Atlantic mean SLP distributions. It is shown that the observed relationship between these two variables is not well reproduced in the output of a general circulation model. The implications for Iberian rainfall changes as the response to increasing atmospheric greenhouse-gas concentrations simulated by two GCM experiments are examined with the proposed statistical model. In an instantaneous '2 × CO2' doubling experiment, using the simulated change of the mean North Atlantic SLP field to predict Iberian rainfall yields an insignificant increase of area-averaged rainfall of 1 mm/month, with maximum values of 4 mm/month in the northwest of the peninsula. In contrast, for the four GCM grid points representing the Iberian Peninsula, the directly simulated change is 10 mm/month, with a minimum of 19 mm/month in the southwest. In the second experiment, with the IPCC scenario A ("business as usual") increase of CO2, the statistical-model results partially differ from the directly simulated rainfall changes: over the experimental range of 100 years, the area-averaged rainfall decreases by 7 mm/month (statistical model) and by 9 mm/month (GCM); at the same time the amplitude of the interdecadal variability is quite different.

  14. Reversibility in Quantum Models of Stochastic Processes

    NASA Astrophysics Data System (ADS)

    Gier, David; Crutchfield, James; Mahoney, John; James, Ryan

    Natural phenomena such as time series of neural firing, orientation of layers in crystal stacking and successive measurements in spin-systems are inherently probabilistic. The provably minimal classical models of such stochastic processes are ɛ-machines, which consist of internal states, transition probabilities between states and output values. The topological properties of the ɛ-machine for a given process characterize the structure, memory and patterns of that process. However ɛ-machines are often not ideal because their statistical complexity (Cμ) is demonstrably greater than the excess entropy (E) of the processes they represent. Quantum models (q-machines) of the same processes can do better in that their statistical complexity (Cq) obeys the relation Cμ >= Cq >= E. q-machines can be constructed to consider longer lengths of strings, resulting in greater compression. With code-words of sufficiently long length, the statistical complexity becomes time-symmetric - a feature apparently novel to this quantum representation. This result has ramifications for compression of classical information in quantum computing and quantum communication technology.

  15. Statistical Forecasting of Current and Future Circum-Arctic Ground Temperatures and Active Layer Thickness

    NASA Astrophysics Data System (ADS)

    Aalto, J.; Karjalainen, O.; Hjort, J.; Luoto, M.

    2018-05-01

    Mean annual ground temperature (MAGT) and active layer thickness (ALT) are key to understanding the evolution of the ground thermal state across the Arctic under climate change. Here a statistical modeling approach is presented to forecast current and future circum-Arctic MAGT and ALT in relation to climatic and local environmental factors, at spatial scales unreachable with contemporary transient modeling. After deploying an ensemble of multiple statistical techniques, distance-blocked cross validation between observations and predictions suggested excellent and reasonable transferability of the MAGT and ALT models, respectively. The MAGT forecasts indicated currently suitable conditions for permafrost to prevail over an area of 15.1 ± 2.8 × 10^6 km². This extent is likely to contract dramatically in the future, as the results showed consistent, but region-specific, changes in the ground thermal regime due to climate change. The forecasts provide new opportunities to assess future Arctic changes in ground thermal state and biogeochemical feedbacks.

  16. A contact mechanics model for ankle implants with inclusion of surface roughness effects

    NASA Astrophysics Data System (ADS)

    Hodaei, M.; Farhang, K.; Maani, N.

    2014-02-01

    Total ankle replacement is recognized as one of the best procedures to treat painful arthritic ankles. Even though this method can relieve patients from pain and reproduce the physiological functions of the ankle, an improper design can cause an excessive amount of metal debris due to wear, causing toxicity in implant recipient. This paper develops a contact model to treat the interaction of tibia and talus implants in an ankle joint. The contact model describes the interaction of implant rough surfaces including both elastic and plastic deformations. In the model, the tibia and the talus surfaces are viewed as macroscopically conforming cylinders or conforming multi-cylinders containing micrometre-scale roughness. The derived equations relate contact force on the implant and the minimum mean surface separation of the rough surfaces. The force is expressed as a statistical integral function of asperity heights over the possible region of interaction of the roughness of the tibia and the talus implant surfaces. A closed-form approximate equation relating contact force and minimum separation is used to obtain energy loss per cycle in a load-unload sequence applied to the implant. In this way implant surface statistics are related to energy loss in the implant that is responsible for internal void formation and subsequent wear and its harmful toxicity to the implant recipient.

  17. Climate change or climate cycles? Snowpack trends in the Olympic and Cascade Mountains, Washington, USA.

    PubMed

    Barry, Dwight; McDonald, Shea

    2013-01-01

    Climate change could significantly influence seasonal streamflow and water availability in the snowpack-fed watersheds of Washington, USA. Descriptions of snowpack decline often use linear ordinary least squares (OLS) models to quantify this change. However, the region's precipitation is known to be related to climate cycles. If snowpack decline is more closely related to these cycles, an OLS model cannot account for this effect, and thus both descriptions of trends and estimates of decline could be inaccurate. We used intervention analysis to determine whether snow water equivalent (SWE) records from 25 long-term snow courses within the Olympic and Cascade Mountains are more accurately described by OLS (to represent gradual change), stationary (to represent no change), or step-stationary (to represent climate cycling) models. We used Bayesian information-theoretic methods to determine these models' relative likelihood, and we found 90 models that could plausibly describe the statistical structure of the 25 snow courses' time series. Posterior model probabilities of the 29 "most plausible" models ranged from 0.33 to 0.91 (mean = 0.58, s = 0.15). The majority of these time series (55%) were best represented as step-stationary models with a single breakpoint at 1976/77, coinciding with a major shift in the Pacific Decadal Oscillation. However, estimates of SWE decline differed by as much as 35% between statistically plausible models of a single time series. This ambiguity is a critical problem for water management policy. Approaches such as intervention analysis should become part of the basic analytical toolkit for snowpack or other climatic time series data.
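
The model comparison described above can be illustrated with a small sketch: fit stationary, OLS-trend, and step-stationary models to a series and turn BIC differences into approximate posterior model probabilities. The simulated series, breakpoint, and parameter counts are assumptions for illustration only; the study's actual intervention analysis is more involved:

```python
import numpy as np

def bic(rss, n, k):
    # Gaussian BIC up to an additive constant shared by all models
    return n * np.log(rss / n) + k * np.log(n)

def compare_models(y, breakpoint):
    n = len(y)
    t = np.arange(n, dtype=float)
    # stationary: one constant mean
    rss_stat = np.sum((y - y.mean()) ** 2)
    # OLS: gradual linear trend
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_ols = np.sum((y - X @ beta) ** 2)
    # step-stationary: separate means before/after the breakpoint
    pre, post = y[:breakpoint], y[breakpoint:]
    rss_step = np.sum((pre - pre.mean()) ** 2) + np.sum((post - post.mean()) ** 2)
    b = np.array([bic(rss_stat, n, 1), bic(rss_ols, n, 2), bic(rss_step, n, 2)])
    w = np.exp(-0.5 * (b - b.min()))  # BIC weights ~ posterior model probabilities
    return dict(zip(["stationary", "ols", "step"], w / w.sum()))

rng = np.random.default_rng(1)
swe = np.concatenate([rng.normal(100, 5, 30), rng.normal(80, 5, 30)])  # abrupt shift
probs = compare_models(swe, breakpoint=30)
```

On a series with a genuine abrupt shift, the step-stationary model should collect most of the probability mass, mirroring the paper's finding for series that track the 1976/77 regime shift.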

  18. Cosmological Constraints from Fourier Phase Statistics

    NASA Astrophysics Data System (ADS)

    Ali, Kamran; Obreschkow, Danail; Howlett, Cullan; Bonvin, Camille; Llinares, Claudio; Oliveira Franco, Felipe; Power, Chris

    2018-06-01

    Most statistical inference from cosmic large-scale structure relies on two-point statistics, i.e. on the galaxy-galaxy correlation function (2PCF) or the power spectrum. These statistics capture the full information encoded in the Fourier amplitudes of the galaxy density field but do not describe the Fourier phases of the field. Here, we quantify the information contained in the line correlation function (LCF), a three-point Fourier phase correlation function. Using cosmological simulations, we estimate the Fisher information (at redshift z = 0) of the 2PCF, LCF and their combination, regarding the cosmological parameters of the standard ΛCDM model, as well as a Warm Dark Matter (WDM) model and the f(R) and Symmetron modified gravity models. The galaxy bias is accounted for at the level of a linear bias. The relative information of the 2PCF and the LCF depends on the survey volume, sampling density (shot noise) and the bias uncertainty. For a volume of 1 h⁻³ Gpc³, sampled with points of mean density n̄ = 2 × 10⁻³ h³ Mpc⁻³ and a bias uncertainty of 13%, the LCF improves the parameter constraints by about 20% in the ΛCDM cosmology and potentially even more in alternative models. Finally, since a linear bias only affects the Fourier amplitudes (2PCF), but not the phases (LCF), the combination of the 2PCF and the LCF can be used to break the degeneracy between the linear bias and σ₈, present in two-point statistics.

  19. Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

    PubMed

    Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D

    2017-01-01

    If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
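
The recommended transformations can be sketched as follows: pool hypothetical C-statistics on the logit scale with a DerSimonian-Laird-style random-effects mean (delta-method variances), then back-transform. All study values below are invented:

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

def random_effects_mean(theta, var):
    """DerSimonian-Laird pooled mean on the (transformed) scale."""
    w = 1 / var
    theta_fixed = np.sum(w * theta) / np.sum(w)
    q = np.sum(w * (theta - theta_fixed) ** 2)          # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(theta) - 1)) / c)          # between-study variance
    w_re = 1 / (var + tau2)
    return np.sum(w_re * theta) / np.sum(w_re)

# hypothetical C-statistics and their variances from 5 validation studies
c_stats = np.array([0.72, 0.75, 0.70, 0.80, 0.74])
var_c = np.array([0.02, 0.015, 0.03, 0.01, 0.02])

# pool on the logit scale (delta-method variance), then back-transform
logit_c = logit(c_stats)
var_logit = var_c / (c_stats * (1 - c_stats)) ** 2
pooled_c = inv_logit(random_effects_mean(logit_c, var_logit))
```

The same pattern applies to E/O with a log transformation; the calibration slope and calibration-in-the-large can, per the abstract, usually be pooled untransformed.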

  20. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., there is heterogeneity. Thus, meta-analysis combining data from multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  1. Segmented regression analysis of interrupted time series data to assess outcomes of a South American road traffic alcohol policy change.

    PubMed

    Nistal-Nuño, Beatriz

    2017-09-01

    In Chile, a new law introduced in March 2012 decreased the legal blood alcohol concentration (BAC) limit for driving while impaired from 1 to 0.8 g/l and the legal BAC limit for driving under the influence of alcohol from 0.5 to 0.3 g/l. The goal is to assess the impact of this new law on mortality and morbidity outcomes in Chile. A review of national databases in Chile was conducted from January 2003 to December 2014. Segmented regression analysis of interrupted time series was used for analyzing the data. In a series of multivariable linear regression models, the change in intercept and slope in the monthly incidence rate of traffic deaths and injuries and association with alcohol per 100,000 inhabitants was estimated from pre-intervention to postintervention, while controlling for secular changes. In nested regression models, potential confounding seasonal effects were accounted for. All analyses were performed at a two-sided significance level of 0.05. Immediate level drops in all the monthly rates were observed after the law, relative to the end of the prelaw period, in the majority of models and in all the de-seasonalized models, although statistical significance was reached only in the model for injuries related to alcohol. After the law, the estimated monthly rate dropped abruptly by -0.869 for injuries related to alcohol and by -0.859 adjusting for seasonality (P < 0.001). Regarding the postlaw long-term trends, a steeper decreasing trend after the law was evident in the models for deaths related to alcohol, although these differences were not statistically significant. Strong evidence of a reduction in traffic injuries related to alcohol was found following the law in Chile. Although the beneficial effects seen on deaths and overall injuries did not reach statistical significance, potential clinically important effects cannot be ruled out. Copyright © 2017 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
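
The segmented regression design described above (pre-existing trend, immediate level change, and post-intervention slope change) can be sketched in a few lines; the simulated monthly rates and effect sizes are invented for illustration:

```python
import numpy as np

def its_fit(y, intervention):
    """Segmented regression for an interrupted time series.

    Design matrix columns: intercept, pre-existing trend, level change
    at the intervention, and slope change after the intervention.
    """
    t = np.arange(len(y), dtype=float)
    post = (t >= intervention).astype(float)
    t_since = np.where(post == 1, t - intervention, 0.0)
    X = np.column_stack([np.ones_like(t), t, post, t_since])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["intercept", "trend", "level_change", "slope_change"], beta))

# hypothetical monthly rates: flat at 10, immediate drop of 2 at month 60
rng = np.random.default_rng(2)
t = np.arange(120)
y = 10.0 - 2.0 * (t >= 60) + rng.normal(0, 0.1, 120)
fit = its_fit(y, intervention=60)
```

The coefficient on the step term estimates the immediate level drop (analogous to the -0.869 reported for alcohol-related injuries), while the coefficient on the post-intervention time term estimates the change in long-term trend.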

  2. Generating survival times to simulate Cox proportional hazards models with time-varying covariates.

    PubMed

    Austin, Peter C

    2012-12-20

    Simulations and Monte Carlo methods serve an important role in modern statistical research. They allow for an examination of the performance of statistical procedures in settings in which analytic and mathematical derivations may not be feasible. A key element in any statistical simulation is the existence of an appropriate data-generating process: one must be able to simulate data from a specified statistical model. We describe data-generating processes for the Cox proportional hazards model with time-varying covariates when event times follow an exponential, Weibull, or Gompertz distribution. We consider three types of time-varying covariates: first, a dichotomous time-varying covariate that can change at most once from untreated to treated (e.g., organ transplant); second, a continuous time-varying covariate such as cumulative exposure at a constant dose to radiation or to a pharmaceutical agent used for a chronic condition; third, a dichotomous time-varying covariate with a subject being able to move repeatedly between treatment states (e.g., current compliance or use of a medication). In each setting, we derive closed-form expressions that allow one to simulate survival times so that survival times are related to a vector of fixed or time-invariant covariates and to a single time-varying covariate. We illustrate the utility of our closed-form expressions for simulating event times by using Monte Carlo simulations to estimate the statistical power to detect as statistically significant the effect of different types of binary time-varying covariates. This is compared with the statistical power to detect as statistically significant a binary time-invariant covariate. Copyright © 2012 John Wiley & Sons, Ltd.
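
For the simplest case the abstract covers, an exponential baseline hazard with a dichotomous covariate that switches once from untreated to treated at a known time t0, the cumulative-hazard inversion can be sketched as follows (the parameter values are invented; the paper also derives Weibull and Gompertz analogues):

```python
import numpy as np

def sim_event_time(lam, beta, t0, rng):
    """Simulate one event time under hazard lam * exp(beta * x(t)),
    where x(t) = 0 before t0 and 1 afterwards (e.g., organ transplant).

    Inverts the cumulative hazard: H(t) = lam*t for t < t0, and
    H(t) = lam*t0 + lam*exp(beta)*(t - t0) for t >= t0.
    """
    e = rng.exponential(1.0)  # -log(U), a unit-exponential draw
    if e < lam * t0:
        return e / lam                          # event before the switch
    return t0 + (e - lam * t0) / (lam * np.exp(beta))

rng = np.random.default_rng(3)
times = np.array([sim_event_time(0.1, np.log(2), 5.0, rng) for _ in range(20000)])
```

Before t0 the simulated times follow the baseline exponential law; afterwards the hazard doubles (beta = log 2), which is exactly the behaviour a Cox model with this time-varying covariate should recover.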

  3. Structural Similarities between Brain and Linguistic Data Provide Evidence of Semantic Relations in the Brain

    PubMed Central

    Crangle, Colleen E.; Perreau-Guimaraes, Marcos; Suppes, Patrick

    2013-01-01

    This paper presents a new method of analysis by which structural similarities between brain data and linguistic data can be assessed at the semantic level. It shows how to measure the strength of these structural similarities and so determine the relatively better fit of the brain data with one semantic model over another. The first model is derived from WordNet, a lexical database of English compiled by language experts. The second is given by the corpus-based statistical technique of latent semantic analysis (LSA), which detects relations between words that are latent or hidden in text. The brain data are drawn from experiments in which statements about the geography of Europe were presented auditorily to participants who were asked to determine their truth or falsity while electroencephalographic (EEG) recordings were made. The theoretical framework for the analysis of the brain and semantic data derives from axiomatizations of theories such as the theory of differences in utility preference. Using brain-data samples from individual trials time-locked to the presentation of each word, ordinal relations of similarity differences are computed for the brain data and for the linguistic data. In each case those relations that are invariant with respect to the brain and linguistic data, and are correlated with sufficient statistical strength, amount to structural similarities between the brain and linguistic data. Results show that many more statistically significant structural similarities can be found between the brain data and the WordNet-derived data than the LSA-derived data. The work reported here is placed within the context of other recent studies of semantics and the brain. The main contribution of this paper is the new method it presents for the study of semantics and the brain and the focus it permits on networks of relations detected in brain data and represented by a semantic model. PMID:23799009

  4. Multiscale Modeling of Intergranular Fracture in Aluminum: Constitutive Relation For Interface Debonding

    NASA Technical Reports Server (NTRS)

    Yamakov, V.; Saether, E.; Glaessgen, E. H.

    2008-01-01

    Intergranular fracture is a dominant mode of failure in ultrafine grained materials. In the present study, the atomistic mechanisms of grain-boundary debonding during intergranular fracture in aluminum are modeled using a coupled molecular dynamics finite element simulation. Using a statistical mechanics approach, a cohesive-zone law in the form of a traction-displacement constitutive relationship, characterizing the load transfer across the plane of a growing edge crack, is extracted from atomistic simulations and then recast in a form suitable for inclusion within a continuum finite element model. The cohesive-zone law derived by the presented technique is free of finite size effects and is statistically representative for describing the interfacial debonding of a grain boundary (GB) interface examined at atomic length scales. By incorporating the cohesive-zone law in cohesive-zone finite elements, the debonding of a GB interface can be simulated in a coupled continuum-atomistic model, in which a crack starts in the continuum environment, smoothly penetrates the continuum-atomistic interface, and continues its propagation in the atomistic environment. This study is a step towards relating atomistically derived decohesion laws to macroscopic predictions of fracture and constructing multiscale models for nanocrystalline and ultrafine grained materials.

  5. An introduction to Bayesian statistics in health psychology.

    PubMed

    Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

    2017-09-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.

  6. Human-modified temperatures induce species changes: Joint attribution.

    PubMed

    Root, Terry L; MacMynowski, Dena P; Mastrandrea, Michael D; Schneider, Stephen H

    2005-05-24

    Average global surface-air temperature is increasing. Contention exists over relative contributions by natural and anthropogenic forcings. Ecological studies attribute plant and animal changes to observed warming. Until now, temperature-species connections have not been statistically attributed directly to anthropogenic climatic change. Using modeled climatic variables and observed species data, which are independent of thermometer records and paleoclimatic proxies, we demonstrate statistically significant "joint attribution," a two-step linkage: human activities contribute significantly to temperature changes and human-changed temperatures are associated with discernible changes in plant and animal traits. Additionally, our analyses provide independent testing of grid-box-scale temperature projections from a general circulation model (HadCM3).

  7. Sample size, confidence, and contingency judgement.

    PubMed

    Clément, Mélanie; Mercier, Pierre; Pastò, Luigi

    2002-06-01

    According to statistical models, the acquisition function of contingency judgement is due to confidence increasing with sample size. According to associative models, the function reflects the accumulation of associative strength on which the judgement is based. Which view is right? Thirty university students assessed the relation between a fictitious medication and a symptom of skin discoloration in conditions that varied sample size (4, 6, 8 or 40 trials) and contingency (delta P = .20, .40, .60 or .80). Confidence was also collected. Contingency judgement was lower for smaller samples, while confidence level correlated inversely with sample size. This dissociation between contingency judgement and confidence contradicts the statistical perspective.
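
The contingency ΔP manipulated in such experiments is simply the difference between the outcome probability with and without the cue; a minimal illustration with invented trial counts:

```python
def delta_p(a, b, c, d):
    """Contingency ΔP from a 2x2 table of trial counts:
    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent,  outcome present    d: cue absent,  outcome absent
    """
    return a / (a + b) - c / (c + d)

# hypothetical 40-trial sample with ΔP = .40  (.80 - .40)
dp = delta_p(a=16, b=4, c=8, d=12)
```

Statistical accounts predict that judgements of dp should sharpen with sample size as confidence grows; the dissociation reported above is evidence against that reading.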

  8. Direction dependence analysis: A framework to test the direction of effects in linear models with an implementation in SPSS.

    PubMed

    Wiedermann, Wolfgang; Li, Xintong

    2018-04-16

    In nonexperimental data, at least three possible explanations exist for the association of two variables x and y: (1) x is the cause of y, (2) y is the cause of x, or (3) an unmeasured confounder is present. Statistical tests that identify which of the three explanatory models fits best would be a useful adjunct to the use of theory alone. The present article introduces one such statistical method, direction dependence analysis (DDA), which assesses the relative plausibility of the three explanatory models on the basis of higher-moment information about the variables (i.e., skewness and kurtosis). DDA involves the evaluation of three properties of the data: (1) the observed distributions of the variables, (2) the residual distributions of the competing models, and (3) the independence properties of the predictors and residuals of the competing models. When the observed variables are nonnormally distributed, we show that DDA components can be used to uniquely identify each explanatory model. Statistical inference methods for model selection are presented, and macros to implement DDA in SPSS are provided. An empirical example is given to illustrate the approach. Conceptual and empirical considerations are discussed for best-practice applications in psychological data, and sample size recommendations based on previous simulation studies are provided.
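
One of the three DDA components, the residual distributions of the competing models, can be sketched on simulated data: when the true cause is nonnormal and errors are normal, residuals of the correctly directed regression are closer to normal than residuals of the reverse regression. This toy check (invented data, skewness only; full DDA also uses kurtosis and independence tests, and the SPSS macros) is:

```python
import numpy as np

def skew(v):
    v = v - v.mean()
    return (v ** 3).mean() / (v ** 2).mean() ** 1.5

def residual_skewness(x, y):
    """Return |skewness| of residuals for the x->y and y->x regressions."""
    def resid(u, v):  # residuals of v regressed on u (simple OLS)
        b = np.cov(u, v, bias=True)[0, 1] / u.var()
        a = v.mean() - b * u.mean()
        return v - (a + b * u)
    return abs(skew(resid(x, y))), abs(skew(resid(y, x)))

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 5000)          # skewed cause
y = 0.8 * x + rng.normal(0, 0.5, 5000)  # true direction: x -> y
s_xy, s_yx = residual_skewness(x, y)    # expect s_xy < s_yx
```

The correctly specified direction leaves residuals that are essentially the normal error term, while the reverse regression's residuals inherit part of the skewed cause.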

  9. Using spatiotemporal statistical models to estimate animal abundance and infer ecological dynamics from survey counts

    USGS Publications Warehouse

    Conn, Paul B.; Johnson, Devin S.; Ver Hoef, Jay M.; Hooten, Mevin B.; London, Joshua M.; Boveng, Peter L.

    2015-01-01

    Ecologists often fit models to survey data to estimate and explain variation in animal abundance. Such models typically require that animal density remains constant across the landscape where sampling is being conducted, a potentially problematic assumption for animals inhabiting dynamic landscapes or otherwise exhibiting considerable spatiotemporal variation in density. We review several concepts from the burgeoning literature on spatiotemporal statistical models, including the nature of the temporal structure (i.e., descriptive or dynamical) and strategies for dimension reduction to promote computational tractability. We also review several features as they specifically relate to abundance estimation, including boundary conditions, population closure, choice of link function, and extrapolation of predicted relationships to unsampled areas. We then compare a suite of novel and existing spatiotemporal hierarchical models for animal count data that permit animal density to vary over space and time, including formulations motivated by resource selection and allowing for closed populations. We gauge the relative performance (bias, precision, computational demands) of alternative spatiotemporal models when confronted with simulated and real data sets from dynamic animal populations. For the latter, we analyze spotted seal (Phoca largha) counts from an aerial survey of the Bering Sea where the quantity and quality of suitable habitat (sea ice) changed dramatically while surveys were being conducted. Simulation analyses suggested that multiple types of spatiotemporal models provide reasonable inference (low positive bias, high precision) about animal abundance, but have potential for overestimating precision. Analysis of spotted seal data indicated that several model formulations, including those based on a log-Gaussian Cox process, had a tendency to overestimate abundance. By contrast, a model that included a population closure assumption and a scale prior on total abundance produced estimates that largely conformed to our a priori expectation. Although care must be taken to tailor models to match the study population and survey data available, we argue that hierarchical spatiotemporal statistical models represent a powerful way forward for estimating abundance and explaining variation in the distribution of dynamical populations.

  10. Markov switching multinomial logit model: An application to accident-injury severities.

    PubMed

    Malyshkina, Nataliya V; Mannering, Fred L

    2009-07-01

    In this study, two-state Markov switching multinomial logit models are proposed for statistical modeling of accident-injury severities. These models assume Markov switching over time between two unobserved states of roadway safety as a means of accounting for potential unobserved heterogeneity. The states are distinct in the sense that in different states accident-severity outcomes are generated by separate multinomial logit processes. To demonstrate the applicability of the approach, two-state Markov switching multinomial logit models are estimated for severity outcomes of accidents occurring on Indiana roads over a four-year time period. Bayesian inference methods and Markov Chain Monte Carlo (MCMC) simulations are used for model estimation. The estimated Markov switching models result in a superior statistical fit relative to the standard (single-state) multinomial logit models for a number of roadway classes and accident types. It is found that the more frequent state of roadway safety is correlated with better weather conditions and that the less frequent state is correlated with adverse weather conditions.
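
The data-generating process behind such a model can be sketched by simulating a hidden two-state Markov chain whose current state selects the multinomial logit utilities; the transition matrix and utility values below are invented for illustration:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def simulate(n, P, utilities, rng):
    """Simulate severities from a two-state Markov switching multinomial
    logit: an unobserved state follows a Markov chain with transition
    matrix P, and each state generates outcomes from its own logit."""
    states, outcomes = np.empty(n, int), np.empty(n, int)
    s = 0
    for t in range(n):
        s = rng.choice(2, p=P[s])
        states[t] = s
        outcomes[t] = rng.choice(len(utilities[s]), p=softmax(utilities[s]))
    return states, outcomes

P = np.array([[0.9, 0.1], [0.3, 0.7]])    # state 0 is the more frequent state
utilities = [np.array([2.0, 1.0, 0.0]),   # state 0: milder severity outcomes
             np.array([0.0, 1.0, 2.0])]   # state 1: more severe outcomes
rng = np.random.default_rng(5)
states, outcomes = simulate(20000, P, utilities, rng)
```

With this P the chain spends about 75% of its time in state 0, echoing the paper's finding that the more frequent state corresponds to better (here, milder-severity) conditions; estimation of such a model from outcomes alone requires MCMC, as in the study.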

  11. The Hanford Thyroid Disease Study: an alternative view of the findings.

    PubMed

    Hoffman, F Owen; Ruttenber, A James; Apostoaei, A Iulian; Carroll, Raymond J; Greenland, Sander

    2007-02-01

    The Hanford Thyroid Disease Study (HTDS) is one of the largest and most complex epidemiologic studies of the relation between environmental exposures to ¹³¹I and thyroid disease. The study detected no dose-response relation using a 0.05 level for statistical significance. The results for thyroid cancer appear inconsistent with those from other studies of populations with similar exposures, and either reflect inadequate statistical power, bias, or unique relations between exposure and disease risk. In this paper, we explore these possibilities, and present evidence that the HTDS statistical power was inadequate due to complex uncertainties associated with the mathematical models and assumptions used to reconstruct individual doses. We conclude that, at the very least, the confidence intervals reported by the HTDS for thyroid cancer and other thyroid diseases are too narrow because they fail to reflect key uncertainties in the measurement-error structure. We recommend that the HTDS results be interpreted as inconclusive rather than as evidence for little or no disease risk from Hanford exposures.

  12. Analysis and Comparison with DNS of a Stochastic Model for the Relative Motion of High-Stokes-Number Particles in Isotropic Turbulence

    NASA Astrophysics Data System (ADS)

    Dhariwal, Rohit; Rani, Sarma; Koch, Donald

    2015-11-01

    In an earlier work, Rani, Dhariwal, and Koch (JFM, Vol. 756, 2014) developed an analytical closure for the diffusion current in the PDF transport equation describing the relative motion of high-Stokes-number particle pairs in isotropic turbulence. In this study, an improved closure was developed for the diffusion coefficient, such that the motion of the particle-pair center of mass is taken into account. Using the earlier and the new analytical closures, Langevin simulations of pair relative motion were performed for four particle Stokes numbers, Stη = 10, 20, 40, 80, and at two Taylor-microscale Reynolds numbers, Reλ = 76, 131. Detailed comparisons of the analytical model predictions with those of DNS were undertaken. It is seen that the pair relative motion statistics obtained from the improved theory show excellent agreement with the DNS statistics. The radial distribution functions (RDFs) and relative velocity PDFs obtained from the improved-closure-based Langevin simulations are found to be in very good agreement with those from DNS. It was found that the RDFs and relative velocity RMS increased with Reλ for all Stη. The collision kernel also increased strongly with Reλ, since it depended on the RDF and the radial relative velocities.

  13. Leave taking and overtime behavior as related to demographic, health, and job variables

    NASA Technical Reports Server (NTRS)

    Arnoldi, L. B.; Townsend, J. C.

    1969-01-01

    An intra-installation model is formulated that correlates demographic, health, and job-related variables with the various types and amounts of leave and overtime taking behavior of employees. Statistical comparison of composite health ratings assigned to subjects, based upon clinical criteria and bio-statistical data, shows that those employees who take the most annual leave as well as sick leave are the ones that have the poorest health ratings; employees who put in the most overtime also have the poorest health records. Stress effects of peak activity periods increase use of sick leave immediately after peak activity but not the use of annual leave.

  14. Synthetic Earthquake Statistics From Physical Fault Models for the Lower Rhine Embayment

    NASA Astrophysics Data System (ADS)

    Brietzke, G. B.; Hainzl, S.; Zöller, G.

    2012-04-01

    As of today, seismic risk and hazard estimates mostly use purely empirical, stochastic models of earthquake fault systems tuned specifically to the vulnerable areas of interest. Although such models allow for reasonable risk estimates, they fail to provide a link between the observed seismicity and the underlying physical processes. Solving a state-of-the-art, fully dynamic description of all relevant physical processes related to earthquake fault systems is likely not useful, since it comes with a large number of degrees of freedom, poor constraints on its model parameters, and a huge computational effort. Here, quasi-static and quasi-dynamic physical fault simulators provide a compromise between physical completeness and computational affordability and aim at providing a link between basic physical concepts and statistics of seismicity. Within the framework of quasi-static and quasi-dynamic earthquake simulators we investigate a model of the Lower Rhine Embayment (LRE) that is based upon seismological and geological data. We present and discuss statistics of the spatio-temporal behavior of generated synthetic earthquake catalogs with respect to simplification (e.g. simple two-fault cases) as well as to complication (e.g. hidden faults, geometric complexity, heterogeneities of constitutive parameters).

  15. Encoding Dissimilarity Data for Statistical Model Building.

    PubMed

    Wahba, Grace

    2010-12-01

    We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba (2005), A framework for kernel regularization with application to protein clustering, Proceedings of the National Academy of Sciences 102, 12332-1233; G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar (2009), Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models, Proceedings of the National Academy of Sciences 106, 8128-8133; and F. Lu, Y. Lin and G. Wahba, Robust manifold unfolding with kernel regularization, TR 1008, Department of Statistics, University of Wisconsin-Madison.

  16. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA

    PubMed Central

    Salter-Townshend, Michael; McCormick, Tyler H.

    2018-01-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)]. PMID:29721127

  17. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA.

    PubMed

    Salter-Townshend, Michael; McCormick, Tyler H

    2017-09-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090-1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].
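    The single-view latent space building block can be sketched directly: edges are conditionally independent given pairwise distances in the latent space. The positions and intercept below are made-up illustrations; the paper's multivariate Bernoulli extension across views is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_probs(Z, alpha):
    """P(edge i~j) = logistic(alpha - ||z_i - z_j||): edges are
    conditionally independent given latent positions Z (n x d)."""
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    return 1.0 / (1.0 + np.exp(-(alpha - D)))

Z = rng.normal(size=(5, 2))   # hypothetical latent positions for 5 actors
P = edge_probs(Z, alpha=1.0)
# closer pairs get higher edge probability; P is symmetric
```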

  18. Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective

    USGS Publications Warehouse

    Barker, Richard J.; Link, William A.

    2015-01-01

    Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self-consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
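    The model weights at issue are computed from AIC differences; a minimal sketch of the standard Akaike-weight calculation (illustrative AIC values):

```python
import math

def akaike_weights(aics):
    """Convert AIC values to model weights: w_i proportional to
    exp(-0.5 * (AIC_i - AIC_min)), normalized to sum to 1. Under the
    Bayesian reading discussed above, these act as model probabilities
    conditional on the model set."""
    amin = min(aics)
    rel = [math.exp(-0.5 * (a - amin)) for a in aics]
    s = sum(rel)
    return [r / s for r in rel]

w = akaike_weights([100.0, 102.0, 110.0])
# the lowest-AIC model gets the largest weight; weights sum to 1
```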

  19. Long-term observations minus background monitoring of ground-based brightness temperatures from a microwave radiometer network

    NASA Astrophysics Data System (ADS)

    De Angelis, Francesco; Cimini, Domenico; Löhnert, Ulrich; Caumont, Olivier; Haefele, Alexander; Pospichal, Bernhard; Martinet, Pauline; Navas-Guzmán, Francisco; Klein-Baltink, Henk; Dupont, Jean-Charles; Hocking, James

    2017-10-01

    Ground-based microwave radiometers (MWRs) offer the capability to provide continuous, high-temporal-resolution observations of the atmospheric thermodynamic state in the planetary boundary layer (PBL) with low maintenance. This makes the MWR an ideal instrument to supplement radiosonde and satellite observations when initializing numerical weather prediction (NWP) models through data assimilation (DA). State-of-the-art DA systems (e.g. variational schemes) require an accurate representation of the differences between model (background) and observations, which are then weighted by their respective errors to provide the best analysis of the true atmospheric state. In this context, one source of information is contained in the statistics of the differences between observations and their background counterparts (O-B). Monitoring of O-B statistics is crucial to detect and remove systematic errors coming from the measurements, the observation operator, and/or the NWP model. This work illustrates a 1-year O-B analysis for MWR observations in clear-sky conditions for a European-wide network of six MWRs. Observations include MWR brightness temperatures (TB) measured by the two most common types of MWR instruments. Background profiles are extracted from the French convective-scale model AROME-France before being converted into TB. The observation operator used to map atmospheric profiles into TB is the fast radiative transfer model RTTOV-gb. It is shown that O-B monitoring can effectively detect instrument malfunctions. O-B statistics (bias, standard deviation, and root mean square) for water vapour channels (22.24-30.0 GHz) are quite consistent across all the instrumental sites, decreasing from the 22.24 GHz line centre (~2-2.5 K) towards the high-frequency wing (~0.8-1.3 K). Statistics for zenith and lower-elevation observations show a similar trend, though values increase with increasing air mass.
O-B statistics for temperature channels show different behaviour for relatively transparent (51-53 GHz) and opaque channels (54-58 GHz). Opaque channels show lower uncertainties (< 0.8-0.9 K) and little variation with elevation angle. Transparent channels show larger biases (~2-3 K) with relatively low standard deviations (~1-1.5 K). The observations-minus-analysis TB statistics are similar to the O-B statistics, suggesting a possible improvement to be expected from assimilating MWR TB into NWP models. Lastly, the O-B TB differences have been evaluated to verify the normal-distribution hypothesis underlying variational and ensemble Kalman filter-based DA systems. Absolute values of excess kurtosis and skewness are generally within 1 and 0.5, respectively, for all instrumental sites, demonstrating O-B normal distribution for most of the channels and elevation angles.
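    The O-B diagnostics named above (bias, standard deviation, RMS, and the skewness/excess-kurtosis normality screen) are all simple moments of the departure series; a minimal sketch with made-up departure values:

```python
import numpy as np

def ob_stats(obs, bkg):
    """Bias, standard deviation, RMS, skewness and excess kurtosis of
    observation-minus-background (O-B) departures."""
    d = np.asarray(obs, float) - np.asarray(bkg, float)
    bias = d.mean()
    std = d.std(ddof=1)
    rms = np.sqrt((d ** 2).mean())
    z = (d - bias) / d.std()        # population std for moment ratios
    skew = (z ** 3).mean()
    exkurt = (z ** 4).mean() - 3.0  # 0 for a Gaussian
    return bias, std, rms, skew, exkurt
```

    A screening rule like |skewness| < 0.5 and |excess kurtosis| < 1 then flags channels whose departures are too non-Gaussian for the DA assumptions.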

  20. A detailed heterogeneous agent model for a single asset financial market with trading via an order book.

    PubMed

    Mota Navarro, Roberto; Larralde, Hernán

    2017-01-01

    We present an agent-based model of a single asset financial market that is capable of replicating most of the non-trivial statistical properties observed in real financial markets, generically referred to as stylized facts. In our model agents employ strategies inspired by those used in real markets, and a realistic trade mechanism based on a double auction order book. We study the roles of the distinct types of traders in the return statistics: specifically, correlation properties (or lack thereof), volatility clustering, heavy tails, and the degree to which the distribution can be described by a log-normal. Further, by introducing the practice of "profit taking", our model is also capable of replicating the stylized fact related to an asymmetry in the distribution of losses and gains.
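    Two of the stylized facts mentioned, absence of raw-return autocorrelation alongside volatility clustering, can be checked on any simulated return series with a few lines; the series below is a toy illustration, not model output:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, float)
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

def stylized_facts(returns, lag=1):
    """Quick diagnostics for comparing model output with real data:
    raw returns should be nearly uncorrelated, while absolute returns
    stay positively autocorrelated (volatility clustering)."""
    return autocorr(returns, lag), autocorr(np.abs(returns), lag)

# alternating signs (no return persistence) with clustered magnitudes
ac_raw, ac_abs = stylized_facts([1., -1., 1., -1., 5., -5., 5., -5.])
# ac_raw is negative, ac_abs is positive
```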

  1. Nowcasting of Low-Visibility Procedure States with Ordered Logistic Regression at Vienna International Airport

    NASA Astrophysics Data System (ADS)

    Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim

    2017-04-01

    Low-visibility conditions have a large impact on aviation safety and the economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low-visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30-minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
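    An ordered logistic regression turns one linear predictor plus ordered cutpoints into class probabilities. The sketch below hand-codes that mapping; the linear-predictor value and cutpoints are invented for illustration, not fitted to the Vienna data:

```python
import math

def ordered_logit_probs(eta, cutpoints):
    """Class probabilities of an ordered logistic regression:
    P(y <= k) = logistic(c_k - eta) for increasing cutpoints
    c_1 < ... < c_{K-1}; class probabilities are the differences."""
    logistic = lambda z: 1.0 / (1.0 + math.exp(-z))
    cum = [logistic(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# hypothetical linear predictor from visibility and humidity terms,
# three cutpoints -> four low-visibility classes
p = ordered_logit_probs(eta=0.3, cutpoints=[-1.0, 0.5, 2.0])
# four probabilities that sum to 1
```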

  2. A detailed heterogeneous agent model for a single asset financial market with trading via an order book

    PubMed Central

    2017-01-01

    We present an agent-based model of a single asset financial market that is capable of replicating most of the non-trivial statistical properties observed in real financial markets, generically referred to as stylized facts. In our model agents employ strategies inspired by those used in real markets, and a realistic trade mechanism based on a double auction order book. We study the roles of the distinct types of traders in the return statistics: specifically, correlation properties (or lack thereof), volatility clustering, heavy tails, and the degree to which the distribution can be described by a log-normal. Further, by introducing the practice of “profit taking”, our model is also capable of replicating the stylized fact related to an asymmetry in the distribution of losses and gains. PMID:28245251

  3. Study on elevated-temperature flow behavior of Ni-Cr-Mo-B ultra-heavy-plate steel via experiment and modelling

    NASA Astrophysics Data System (ADS)

    Gao, Zhi-yu; Kang, Yu; Li, Yan-shuai; Meng, Chao; Pan, Tao

    2018-04-01

    Elevated-temperature flow behavior of a novel Ni-Cr-Mo-B ultra-heavy-plate steel was investigated by conducting hot compressive deformation tests on a Gleeble-3800 thermo-mechanical simulator at a temperature range of 1123 K–1423 K with a strain rate range from 0.01 s‑1 to 10 s‑1 and a height reduction of 70%. Based on the experimental results, a classic strain-compensated Arrhenius-type, a new revised strain-compensated Arrhenius-type and a classic modified Johnson-Cook constitutive model were developed for predicting the high-temperature deformation behavior of the steel. The predictability of these models was comparatively evaluated in terms of statistical parameters including the correlation coefficient (R), average absolute relative error (AARE), average root mean square error (RMSE), normalized mean bias error (NMBE) and relative error. The statistical results indicate that the new revised strain-compensated Arrhenius-type model predicts the elevated-temperature flow stress of the steel accurately over the entire range of process conditions. The predictions of the classic modified Johnson-Cook model, however, did not agree well with the experimental values, while the classic strain-compensated Arrhenius-type model tracked the deformation behavior more accurately than the modified Johnson-Cook model but less accurately than the new revised strain-compensated Arrhenius-type model. In addition, the reasons for the differences in predictability of these models are discussed in detail.
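    Three of the ranking metrics used above (R, AARE and RMSE) have standard definitions and are easy to compute for any experimental/predicted stress pair; NMBE is omitted from this sketch, and the data are illustrative:

```python
import numpy as np

def fit_metrics(exp, pred):
    """Correlation coefficient R, average absolute relative error
    (AARE, %) and RMSE between experimental and predicted flow stresses.
    Assumes all experimental values are nonzero (AARE divides by them)."""
    exp, pred = np.asarray(exp, float), np.asarray(pred, float)
    R = np.corrcoef(exp, pred)[0, 1]
    aare = 100.0 * np.mean(np.abs((exp - pred) / exp))
    rmse = np.sqrt(np.mean((exp - pred) ** 2))
    return R, aare, rmse

# illustrative flow stresses (MPa): experimental vs model-predicted
R, aare, rmse = fit_metrics([100., 200., 300.], [110., 190., 300.])
```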

  4. An R2 statistic for fixed effects in the linear mixed model.

    PubMed

    Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver

    2008-12-20

    Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R² statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R² statistic for the linear mixed model by using only a single model. The proposed R² statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R² statistic arises as a one-to-one function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R² statistic leads immediately to a natural definition of a partial R² statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R², a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.
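    The one-to-one mapping from the fixed-effects F statistic to R² can be sketched as below; this assumes the usual form with numerator degrees of freedom ndf and denominator degrees of freedom ddf, while the paper's specific denominator-df convention is not reproduced here:

```python
def r2_from_f(F, ndf, ddf):
    """Model R^2 as a one-to-one, increasing function of the F statistic
    for all fixed effects: R^2 = (ndf*F/ddf) / (1 + ndf*F/ddf)."""
    r = (ndf * F) / ddf
    return r / (1.0 + r)

# F = 0 (no fixed-effect signal) gives R^2 = 0; larger F gives larger R^2
```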

  5. A reliability study on brain activation during active and passive arm movements supported by an MRI-compatible robot.

    PubMed

    Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros

    2014-11-01

    In neurorehabilitation, longitudinal assessment of arm-movement-related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied to individual and group activation maps and to summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including the additional force and range-of-motion parameters of model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm-movement-related brain activation in longitudinal studies and may contribute to studies evaluating therapies and brain plasticity following injury to the nervous system.

  6. The Meyer-Neldel rule and the statistical shift of the Fermi level in amorphous semiconductors

    NASA Astrophysics Data System (ADS)

    Kikuchi, Minoru

    1988-11-01

    The statistical model is used to study the origin of the Meyer-Neldel (MN) rule [σ0∝exp(AEσ)] in a tetrahedral amorphous system. It is shown that a deep minimum in the gap density of states spectrum can lead to the linearity of the Fermi energy F(T) in the derivative (dF/dkT), as required by the rule. An expression is derived which relates the constant A in the rule to the gap density of states spectrum. The dispersion ranges of σ0 and Eσ are found to be related to the constant A. Model calculations show a magnitude of A and a wide dispersion of σ0 and Eσ in fair agreement with the experimental observations. The extent to which the MN rule depends on the gap density of states spectrum is discussed.
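    The MN rule σ0 ∝ exp(A·Eσ) says that ln(σ0) is linear in Eσ, so the constant A can be estimated as a regression slope. The data below are synthetic points constructed to obey the rule exactly, purely to illustrate the fit:

```python
import numpy as np

def meyer_neldel_slope(E_sigma, sigma0):
    """Estimate the MN constant A (and prefactor C) from paired
    (E_sigma, sigma0) samples, via a linear fit of ln(sigma0) on E_sigma:
    sigma0 = C * exp(A * E_sigma)."""
    A, lnC = np.polyfit(E_sigma, np.log(sigma0), 1)
    return A, np.exp(lnC)

E = np.array([0.5, 0.7, 0.9])     # activation energies (eV), synthetic
s0 = 2.0 * np.exp(20.0 * E)       # prefactors obeying the rule exactly
A, C = meyer_neldel_slope(E, s0)  # recovers A = 20, C = 2
```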

  7. P values are only an index to evidence: 20th- vs. 21st-century statistical science.

    PubMed

    Burnham, K P; Anderson, D R

    2014-03-01

    Early statistical methods focused on pre-data probability statements (i.e., data as random variables) such as P values; these are not really inferences, nor are P values evidential. Statistical science clung to these principles throughout much of the 20th century as a wide variety of methods were developed for special cases. Looking back, it is clear that the underlying paradigm (i.e., testing and P values) was weak. As Kuhn (1970) suggests, new paradigms have taken the place of earlier ones: this is a goal of good science. New methods have been developed and older methods extended, and these allow proper measures of strength of evidence and multimodel inference. It is time to move forward with sound theory and practice for the difficult practical problems that lie ahead. Given data, the useful foundation shifts to post-data probability statements such as model probabilities (Akaike weights) or related quantities such as odds ratios and likelihood intervals. These new methods allow formal inference from multiple models in the a priori set. These quantities are properly evidential. The past century was aimed at finding the "best" model and making inferences from it. The goal in the 21st century is to base inference on all the models weighted by their model probabilities (model averaging). Estimates of precision can include model selection uncertainty, leading to variances conditional on the model set. The 21st century will be about the quantification of information, proper measures of evidence, and multi-model inference. Nelder (1999:261) concludes, "The most important task before us in developing statistical science is to demolish the P-value culture, which has taken root to a frightening extent in many areas of both pure and applied science and technology".

  8. The relationship between procrastination, learning strategies and statistics anxiety among Iranian college students: a canonical correlation analysis.

    PubMed

    Vahedi, Shahrum; Farrokhi, Farahman; Gahramani, Farahnaz; Issazadegan, Ali

    2012-01-01

    Approximately 66-80% of graduate students experience statistics anxiety, and some researchers propose that many students identify statistics courses as the most anxiety-inducing courses in their academic curriculums. As such, it is likely that statistics anxiety is, in part, responsible for many students delaying enrollment in these courses for as long as possible. This paper proposes a canonical model treating academic procrastination (AP) and learning strategies (LS) as predictor variables and statistics anxiety (SA) as the explained variable. A questionnaire survey was used for data collection, and 246 female college students participated in this study. To examine the mutually independent relations between procrastination, learning strategies and statistics anxiety variables, a canonical correlation analysis was computed. Findings show that two canonical functions were statistically significant. The set of variables (metacognitive self-regulation, source management, preparing homework, preparing for tests and preparing term papers) helped predict changes in statistics anxiety with respect to fearful behavior, attitude toward math and class, and performance, but not anxiety. These findings could be used in educational and psychological interventions in the context of statistics anxiety reduction.
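    Canonical correlation analysis finds maximally correlated linear combinations of two variable sets; the first canonical correlation can be computed compactly as the leading singular value of the product of orthonormal bases of the two centered sets. The data below are made up so that the two sets share one exact direction:

```python
import numpy as np

def first_canonical_corr(X, Y):
    """First canonical correlation between variable sets X (n x p) and
    Y (n x q): the largest singular value of Qx' Qy, where Qx, Qy are
    orthonormal bases (via QR) of the centered data. Illustrative sketch."""
    qx, _ = np.linalg.qr(X - X.mean(0))
    qy, _ = np.linalg.qr(Y - Y.mean(0))
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return min(1.0, s[0])             # guard against rounding above 1

# Y's first column equals X's first column, so the sets share a
# direction and the first canonical correlation is exactly 1
X = np.array([[1., 0.], [2., 1.], [3., 0.], [4., 1.]])
Y = np.array([[1., 5.], [2., 4.], [3., 7.], [4., 2.]])
cc = first_canonical_corr(X, Y)
```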

  9. The contribution of executive functions to emergent mathematic skills in preschool children.

    PubMed

    Espy, Kimberly Andrews; McDiarmid, Melanie M; Cwik, Mary F; Stalets, Melissa Meade; Hamby, Arlena; Senn, Theresa E

    2004-01-01

    Mathematical ability is related both to activation of the prefrontal cortex in neuroimaging studies of adults and to executive functions in school-age children. The purpose of this study was to determine whether executive functions were related to emergent mathematical proficiency in preschool children. Preschool children (N = 96) were administered an executive function battery that was reduced empirically to working memory (WM), inhibitory control (IC), and shifting abilities by calculating composite scores derived from principal component analysis. Both WM and IC predicted early arithmetic competency, with the observed relations robust after controlling statistically for child age, maternal education, and child vocabulary. Only IC accounted for unique variance in mathematical skills after the contribution of the other executive functions was controlled statistically as well. Specific executive functions are related to emergent mathematical proficiency in this age range. Longitudinal studies using structural equation modeling are necessary to better characterize these ontogenetic relations.

  10. On some stochastic formulations and related statistical moments of pharmacokinetic models.

    PubMed

    Matis, J H; Wehrly, T E; Metzler, C M

    1983-02-01

    This paper presents the deterministic and stochastic model for a linear compartment system with constant coefficients, and it develops expressions for the mean residence times (MRT) and the variances of the residence times (VRT) for the stochastic model. The expressions are relatively simple computationally, involving primarily matrix inversion, and they are elegant mathematically, in avoiding eigenvalue analysis and the complex domain. The MRT and VRT provide a set of new meaningful response measures for pharmacokinetic analysis and they give added insight into the system kinetics. The new analysis is illustrated with an example involving the cholesterol turnover in rats.
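    For a linear compartment system dx/dt = Ax with constant coefficients, the mean residence times do indeed reduce to a matrix inversion: the MRT matrix is −A⁻¹. A minimal sketch (the paper's variance expressions are not reproduced here):

```python
import numpy as np

def mean_residence_times(A):
    """Mean residence time matrix T = -inv(A) for a linear compartment
    system dx/dt = A x; T[i, j] is the expected time spent in
    compartment i by material starting in compartment j."""
    return -np.linalg.inv(A)

# one-compartment elimination at rate k = 0.5 per hour: MRT = 1/k = 2 h
A = np.array([[-0.5]])
T = mean_residence_times(A)
```

    No eigenvalue analysis is needed, which is exactly the computational simplification the abstract highlights.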

  11. Inter-model Diversity of ENSO simulation and its relation to basic states

    NASA Astrophysics Data System (ADS)

    Kug, J. S.; Ham, Y. G.

    2016-12-01

    In this study, a new methodology is developed to improve the climate simulation of state-of-the-art coupled global climate models (GCMs) by a postprocessing based on intermodel diversity. Building on the close connection between interannual variability and climatological states, a distinctive relation is found between the intermodel diversity of the interannual variability and that of the basic state. Based on this relation, the simulated interannual variabilities can be improved by correcting their climatological bias. To test this methodology, the dominant intermodel difference in precipitation responses during El Niño-Southern Oscillation (ENSO) is investigated, together with its relationship to the climatological state. It is found that the dominant intermodel diversity of the ENSO precipitation in phase 5 of the Coupled Model Intercomparison Project (CMIP5) is associated with the zonal shift of the positive precipitation center during El Niño. This dominant intermodel difference is significantly correlated with the basic states: models with a wetter (dryer) climatology than the multimodel ensemble (MME) climatology over the central Pacific tend to shift positive ENSO precipitation anomalies to the east (west). Consistent with the models' systematic errors in atmospheric ENSO response and bias, models with a better climatological state tend to simulate more realistic atmospheric ENSO responses. Therefore, the statistical method to correct the ENSO response mostly improves it: after the statistical correction, the simulation quality of the MME ENSO precipitation is distinctly improved. These results suggest that the present methodology can also be applied to improving climate projections and seasonal climate prediction.

  12. Components of spatial information management in wildlife ecology: Software for statistical and modeling analysis [Chapter 14

    Treesearch

    Hawthorne L. Beyer; Jeff Jenness; Samuel A. Cushman

    2010-01-01

    Spatial information systems (SIS) is a term that describes a wide diversity of concepts, techniques, and technologies related to the capture, management, display and analysis of spatial information. It encompasses technologies such as geographic information systems (GIS), global positioning systems (GPS), remote sensing, and relational database management systems (...

  13. Arts Education Advocacy: The Relative Effects of School-Level Influences on Resources for Arts Education

    ERIC Educational Resources Information Center

    Miksza, Peter

    2013-01-01

    The purpose of this study was to investigate advocacy influences that may impact school arts programs using data from the 2009-10 National Center for Education Statistics elementary and secondary school surveys on arts education. Regression models were employed to assess the relative effectiveness of variables representing community support,…

  14. Contrasting effects of feature-based statistics on the categorisation and identification of visual objects

    PubMed Central

    Taylor, Kirsten I.; Devereux, Barry J.; Acres, Kadia; Randall, Billi; Tyler, Lorraine K.

    2013-01-01

    Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. PMID:22137770

  15. Relative Contributions of Agricultural Drift, Para-Occupational ...

    EPA Pesticide Factsheets

    Background: Increased pesticide concentrations in house dust in agricultural areas have been attributed to several exposure pathways, including agricultural drift, para-occupational, and residential use. Objective: To guide future exposure assessment efforts, we quantified relative contributions of these pathways using meta-regression models of published data on dust pesticide concentrations. Methods: From studies in North American agricultural areas published from 1995-2015, we abstracted dust pesticide concentrations reported as summary statistics (e.g., geometric means (GM)). We analyzed these data using mixed-effects meta-regression models that weighted each summary statistic by its inverse variance. Dependent variables were either the log-transformed GM (drift) or the log-transformed ratio of GMs from two groups (para-occupational, residential use). Results: For the drift pathway, predicted GMs decreased sharply and nonlinearly, with GMs 64% lower in homes 250 m versus 23 m from fields (inter-quartile range of published data) based on 52 statistics from 7 studies. For the para-occupational pathway, GMs were 2.3 times higher (95% confidence interval [CI]: 1.5-3.3; 15 statistics, 5 studies) in homes of farmers who applied pesticides more versus less recently or frequently. For the residential use pathway, GMs were 1.3 (95%CI: 1.1-1.4) and 1.5 (95%CI: 1.2-1.9) times higher in treated versus untreated homes, when the probability that a pesticide was used for
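    The weighting scheme described, each summary statistic weighted by its inverse variance, is the core of meta-analytic pooling. A minimal fixed-effect sketch with made-up log-GM estimates (the paper's mixed-effects meta-regression with moderators is not shown):

```python
import math

def iv_weighted_mean(estimates, variances):
    """Fixed-effect inverse-variance pooling: each summary statistic
    (e.g. a log-transformed GM) gets weight 1/variance; the pooled
    standard error is sqrt(1 / sum of weights)."""
    ws = [1.0 / v for v in variances]
    est = sum(w * e for w, e in zip(ws, estimates)) / sum(ws)
    se = math.sqrt(1.0 / sum(ws))
    return est, se

# equal variances reduce to a simple average
est, se = iv_weighted_mean([1.0, 2.0], [1.0, 1.0])
```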

  16. Localized Smart-Interpretation

    NASA Astrophysics Data System (ADS)

    Lundh Gulbrandsen, Mats; Mejer Hansen, Thomas; Bach, Torben; Pallesen, Tom

    2014-05-01

    The complex task of setting up a geological model consists not only of combining available geological information into a conceptually plausible model, but also requires consistency with available data, e.g. geophysical data. However, in many cases the direct geological information, e.g. borehole samples, is very sparse, so in order to create a geological model the geologist needs to rely on the geophysical data. The problem, however, is that the amount of geophysical data is in many cases so vast that it is practically impossible to integrate all of it in the manual interpretation process. This means that much of the information available from the geophysical surveys is unexploited, which is a problem: the resulting geological model does not fulfill its full potential and hence is less trustworthy. We suggest an approach to geological modeling that 1. allows all geophysical data to be considered when building the geological model, 2. is fast, and 3. allows quantification of the geological modeling. The method is constructed to build a statistical model, f(d,m), describing the relation between what the geologist interprets, d, and what the geologist knows, m. The parameter m reflects any available information that can be quantified, such as geophysical data, the result of a geophysical inversion, elevation maps, etc... The parameter d reflects an actual interpretation, such as for example the depth to the base of a ground water reservoir. First we infer a statistical model f(d,m) by examining sets of actual interpretations made by a geological expert, [d1, d2, ...], and the information used to perform the interpretation, [m1, m2, ...]. This makes it possible to quantify how the geological expert performs interpretation through f(d,m). As the geological expert proceeds interpreting, the number of interpreted data points from which the statistical model is inferred increases, and therefore the accuracy of the statistical model increases.
When a model f(d,m) has successfully been inferred, we are able to simulate how the geological expert would perform an interpretation given some external information m, through f(d|m). We will demonstrate this method applied to geological interpretation and densely sampled airborne electromagnetic data. In short, our goal is to build a statistical model describing how a geological expert performs geological interpretation given some geophysical data. We then wish to use this statistical model to perform semi-automatic interpretation, wherever such geophysical data exist, in a manner consistent with the choices made by a geological expert. The benefits of such a statistical model are that 1. it provides a quantification of how a geological expert performs interpretation based on available diverse data, 2. all available geophysical information can be used, and 3. it allows much faster interpretation of large data sets.

  17. Regional analyses of labor markets and demography: a model based Norwegian example.

    PubMed

    Stambol, L S; Stolen, N M; Avitsland, T

    1998-01-01

    The authors discuss the regional REGARD model, developed by Statistics Norway to analyze the regional implications of macroeconomic development of employment, labor force, and unemployment. "In building the model, empirical analyses of regional producer behavior in manufacturing industries have been performed, and the relation between labor market development and regional migration has been investigated. Apart from providing a short description of the REGARD model, this article demonstrates the functioning of the model, and presents some results of an application." excerpt

  18. Fall 2014 SEI Research Review Probabilistic Analysis of Time Sensitive Systems

    DTIC Science & Technology

    2014-10-28

    Osmosis SMC Tool: Osmosis is a tool for Statistical Model Checking (SMC) with Semantic Importance Sampling. The input model is written in a subset of C; ASSERT() statements in the model indicate conditions that must hold. Input probability distributions are defined by the user. Osmosis returns the … on: a target relative error, or a set number of simulations. Osmosis Main Algorithm. http://dreal.cs.cmu.edu/

  19. Comparing and combining process-based crop models and statistical models with some implications for climate change

    NASA Astrophysics Data System (ADS)

    Roberts, Michael J.; Braun, Noah O.; Sinclair, Thomas R.; Lobell, David B.; Schlenker, Wolfram

    2017-09-01

    We compare predictions of a simple process-based crop model (Soltani and Sinclair 2012), a simple statistical model (Schlenker and Roberts 2009), and a combination of both models to actual maize yields on a large, representative sample of farmer-managed fields in the Corn Belt region of the United States. After statistical post-model calibration, the process model (Simple Simulation Model, or SSM) predicts actual outcomes slightly better than the statistical model, but the combined model performs significantly better than either model. The SSM, statistical model and combined model all show similar relationships with precipitation, while the SSM better accounts for temporal patterns of precipitation, vapor pressure deficit and solar radiation. The statistical and combined models show a more negative impact associated with extreme heat for which the process model does not account. Due to the extreme heat effect, predicted impacts under uniform climate change scenarios are considerably more severe for the statistical and combined models than for the process-based model.
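The "statistical post-model calibration" and the model combination can be sketched as ordinary least squares on synthetic data: calibrate the process-model prediction against actual yields, then add a statistical covariate (here a hypothetical extreme-heat term) to form the combined model. All data and coefficients below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic stand-ins: process-model yield predictions and an extreme-heat
# covariate (e.g. exposure above a damaging temperature) used statistically.
y_process = rng.normal(10.0, 1.5, size=n)     # SSM-style prediction
heat = rng.gamma(2.0, 10.0, size=n)           # extreme-heat exposure
y_actual = 0.9 * y_process - 0.02 * heat + rng.normal(0.0, 0.5, size=n)

def fit_ols(X, y):
    X1 = np.column_stack([X, np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ beta, beta

# Post-model calibration of the process model alone, then a combined model
# that adds the extreme-heat term the process model does not account for.
yhat_cal, _ = fit_ols(y_process[:, None], y_actual)
yhat_comb, _ = fit_ols(np.column_stack([y_process, heat]), y_actual)

rmse = lambda e: float(np.sqrt(np.mean(e ** 2)))
print(rmse(y_actual - yhat_cal), rmse(y_actual - yhat_comb))
```

On the training sample the combined fit can never do worse than the calibrated process model alone, since its regressors are a superset; the abstract's finding is the stronger out-of-sample version of this.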

  20. Generating action descriptions from statistically integrated representations of human motions and sentences.

    PubMed

    Takano, Wataru; Kusajima, Ikuo; Nakamura, Yoshihiko

    2016-08-01

    It is desirable for robots to be able to linguistically understand human actions during human-robot interactions. Previous research has developed frameworks for encoding human full body motion into model parameters and for classifying motion into specific categories. For full understanding, the motion categories need to be connected to the natural language such that the robots can interpret human motions as linguistic expressions. This paper proposes a novel framework for integrating observation of human motion with that of natural language. This framework consists of two models; the first model statistically learns the relations between motions and their relevant words, and the second statistically learns sentence structures as word n-grams. Integration of these two models allows robots to generate sentences from human motions by searching for words relevant to the motion using the first model and then arranging these words in appropriate order using the second model. This allows making sentences that are the most likely to be generated from the motion. The proposed framework was tested on human full body motion measured by an optical motion capture system. In this, descriptive sentences were manually attached to the motions, and the validity of the system was demonstrated. Copyright © 2016 Elsevier Ltd. All rights reserved.
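The two-stage generation, retrieving motion-relevant words and then arranging them with a word n-gram model, can be sketched with a toy bigram table. The vocabulary and probabilities are invented, not the authors' trained models.

```python
import itertools

# Toy bigram log-probabilities learned from sentences (the second model);
# "<s>" and "</s>" mark sentence boundaries.
logp = {
    ("<s>", "a"): -0.5, ("a", "person"): -0.3, ("person", "walks"): -0.4,
    ("walks", "</s>"): -0.2, ("<s>", "person"): -2.0, ("person", "a"): -3.0,
    ("a", "walks"): -2.5, ("walks", "a"): -2.5, ("walks", "person"): -2.5,
    ("person", "</s>"): -2.0, ("a", "</s>"): -3.0, ("<s>", "walks"): -2.2,
}

def score(words):
    """Log-probability of a word sequence under the bigram model."""
    seq = ["<s>", *words, "</s>"]
    return sum(logp.get(bg, -10.0) for bg in zip(seq, seq[1:]))

# Words judged relevant to an observed motion (the first model), unordered.
relevant = ["walks", "a", "person"]

# Arrange the words in the most likely order under the bigram model.
best = max(itertools.permutations(relevant), key=score)
print(" ".join(best))  # -> a person walks
```

Exhaustive permutation search is fine for a toy example; for realistic sentence lengths a beam search over partial orderings would replace it.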

  1. Variety and volatility in financial markets

    NASA Astrophysics Data System (ADS)

    Lillo, Fabrizio; Mantegna, Rosario N.

    2000-11-01

    We study the price dynamics of stocks traded in a financial market by considering the statistical properties of both a single time series and an ensemble of stocks traded simultaneously. We use the n stocks traded on the New York Stock Exchange to form a statistical ensemble of daily stock returns. For each trading day of our database, we study the ensemble return distribution. We find that a typical ensemble return distribution exists in most of the trading days with the exception of crash and rally days and of the days following these extreme events. We analyze each ensemble return distribution by extracting its first two central moments. We observe that these moments fluctuate in time and are stochastic processes, themselves. We characterize the statistical properties of ensemble return distribution central moments by investigating their probability density functions and temporal correlation properties. In general, time-averaged and portfolio-averaged price returns have different statistical properties. We infer from these differences information about the relative strength of correlation between stocks and between different trading days. Last, we compare our empirical results with those predicted by the single-index model and we conclude that this simple model cannot explain the statistical properties of the second moment of the ensemble return distribution.
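The per-day ensemble statistics described here reduce to computing cross-sectional moments of a returns matrix. A minimal sketch with synthetic returns standing in for the NYSE data; the factor structure and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n_days, n_stocks = 250, 500

# Synthetic daily returns (rows: trading days, cols: stocks), with a common
# market factor so that days differ in cross-sectional mean and dispersion.
market = rng.normal(0.0, 0.01, size=(n_days, 1))
returns = market + rng.normal(0.0, 0.02, size=(n_days, n_stocks))

# For each trading day, the first two central moments of the ensemble return
# distribution: the cross-sectional mean and the cross-sectional std.
ens_mean = returns.mean(axis=1)
variety = returns.std(axis=1)

# These moments are themselves time series (stochastic processes).
print(ens_mean.shape, variety.shape)   # -> (250,) (250,)
print(variety.mean())                  # near the idiosyncratic 0.02
```

In the paper's terminology the cross-sectional dispersion is the "variety"; its time series can then be studied for fat tails and temporal correlation just like a price series.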

  2. An Assessment of Land Surface and Lightning Characteristics Associated with Lightning-Initiated Wildfires

    NASA Technical Reports Server (NTRS)

    Coy, James; Schultz, Christopher J.; Case, Jonathan L.

    2017-01-01

    Can we use modeled information about the land surface, together with lightning characteristics beyond flash occurrence, to improve the identification and prediction of wildfires? The approach combines observed cloud-to-ground (CG) flashes with real-time land surface model output and compares them with areas where lightning did not start a wildfire, to determine which land surface conditions and lightning characteristics were responsible for causing wildfires. Statistical differences between suspected fire-starters and non-fire-starters were peak-current dependent. Comparisons of 0-10 cm volumetric and relative soil moisture were statistically significant to at least the p = 0.05 level for both flash polarities, and suspected fire-starters typically occurred in areas of lower soil moisture than non-fire-starters. GVF value comparisons were found to be statistically significant only for -CG flashes; however, random sampling of the -CG non-fire-starter dataset revealed that this relationship may not always hold.

  3. Statistical text classifier to detect specific type of medical incidents.

    PubMed

    Wong, Zoie Shui-Yee; Akiyama, Masanori

    2013-01-01

    WHO Patient Safety has focused on increasing the coherence and expressiveness of patient safety classification through the foundation of the International Classification for Patient Safety (ICPS). Text classification and statistical approaches have been shown to be successful in identifying safety problems in the aviation industry using incident text information. It has been challenging to comprehend the taxonomy of medical incidents in a structured manner. Independent reporting mechanisms for patient safety incidents have been established in the UK, Canada, Australia, Japan, Hong Kong, etc. This research demonstrates the potential to construct statistical text classifiers to detect specific types of medical incidents using incident text data. An illustrative example of classifying look-alike sound-alike (LASA) medication incidents using structured text from 227 advisories related to medication errors from Global Patient Safety Alerts (GPSA) is shown in this poster presentation. The classifier was built using a logistic regression model. The ROC curve and the AUC value indicated that this is a satisfactory model.
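A logistic-regression text classifier of this kind can be sketched end-to-end with bag-of-words features and a plain gradient-descent fit. The incident texts below are invented, not GPSA data; in practice a library such as scikit-learn would replace the hand-rolled pieces.

```python
import numpy as np

# Toy incident reports and LASA labels (1 = look-alike/sound-alike error).
docs = ["hydroxyzine dispensed instead of hydralazine",
        "patient fall near bed rail",
        "celebrex confused with celexa at pharmacy",
        "iv pump alarm ignored overnight",
        "zantac given in place of zyrtec",
        "wrong-site surgery checklist missed"] * 10
labels = np.array([1, 0, 1, 0, 1, 0] * 10)

# Bag-of-words features plus a bias column.
vocab = sorted({w for d in docs for w in d.split()})
X = np.array([[d.split().count(w) for w in vocab] for d in docs], float)
X = np.column_stack([X, np.ones(len(docs))])

# Logistic regression fitted by plain gradient descent.
w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - labels) / len(labels)

scores = 1.0 / (1.0 + np.exp(-X @ w))

# AUC = probability a random positive scores above a random negative.
pos, neg = scores[labels == 1], scores[labels == 0]
auc = float(np.mean(pos[:, None] > neg[None, :]))
print(auc)  # high on this separable toy training set
```

The AUC here is a training-set number on a tiny toy corpus; a real evaluation would hold out data, as implied by the poster's ROC analysis.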

  4. Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping.

    PubMed

    Kubica, Aleksander; Beverland, Michael E; Brandão, Fernando; Preskill, John; Svore, Krysta M

    2018-05-04

    Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the threshold for 1D stringlike and 2D sheetlike logical operators to be p_{3DCC}^{(1)}≃1.9% and p_{3DCC}^{(2)}≃27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.

  5. Comparison of U-spatial statistics and C-A fractal models for delineating anomaly patterns of porphyry-type Cu geochemical signatures in the Varzaghan district, NW Iran

    NASA Astrophysics Data System (ADS)

    Ghezelbash, Reza; Maghsoudi, Abbas

    2018-05-01

    The delineation of populations of stream sediment geochemical data is a crucial task in regional exploration surveys. In this contribution, uni-element stream sediment geochemical data of Cu, Au, Mo, and Bi have been subjected to two reliable anomaly-background separation methods, namely, the concentration-area (C-A) fractal and the U-spatial statistics methods to separate geochemical anomalies related to porphyry-type Cu mineralization in northwest Iran. The quantitative comparison of the delineated geochemical populations using the modified success-rate curves revealed the superiority of the U-spatial statistics method over the fractal model. Moreover, geochemical maps of investigated elements revealed strongly positive correlations between strong anomalies and Oligocene-Miocene intrusions in the study area. Therefore, follow-up exploration programs should focus on these areas.

  6. Empirical-statistical downscaling of reanalysis data to high-resolution air temperature and specific humidity above a glacier surface (Cordillera Blanca, Peru)

    NASA Astrophysics Data System (ADS)

    Hofer, Marlis; MöLg, Thomas; Marzeion, Ben; Kaser, Georg

    2010-06-01

    Recently initiated observation networks in the Cordillera Blanca (Peru) provide temporally high-resolution, yet short-term, atmospheric data. The aim of this study is to extend the existing time series into the past. We present an empirical-statistical downscaling (ESD) model that links 6-hourly National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) reanalysis data to air temperature and specific humidity, measured at the tropical glacier Artesonraju (northern Cordillera Blanca). The ESD modeling procedure includes combined empirical orthogonal function and multiple regression analyses and a double cross-validation scheme for model evaluation. Apart from the selection of predictor fields, the modeling procedure is automated and does not include subjective choices. We assess the ESD model sensitivity to the predictor choice using both single-field and mixed-field predictors. Statistical transfer functions are derived individually for different months and times of day. The forecast skill largely depends on month and time of day, ranging from 0 to 0.8. The mixed-field predictors perform better than the single-field predictors. The ESD model shows added value, at all time scales, against simpler reference models (e.g., the direct use of reanalysis grid point values). The ESD model forecast 1960-2008 clearly reflects interannual variability related to the El Niño/Southern Oscillation but is sensitive to the chosen predictor type.
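The core of the ESD procedure, EOF analysis of the predictor field followed by multiple regression onto the local variable, can be sketched with synthetic data. The double cross-validation and the month-by-month/time-of-day stratification are omitted; all data are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n_time, n_grid = 400, 120

# Synthetic 6-hourly reanalysis predictor field (time x grid points) with
# one dominant large-scale pattern plus noise.
pattern = rng.normal(size=n_grid)
signal = rng.normal(size=n_time)
field = np.outer(signal, pattern) + 0.5 * rng.normal(size=(n_time, n_grid))

# Local target: e.g. air temperature at the glacier station, linked to the
# large-scale signal.
t_local = 2.0 * signal + rng.normal(0.0, 0.5, size=n_time)

# EOF analysis: SVD of the anomaly field; keep the leading PCs.
anom = field - field.mean(axis=0)
U, S, Vt = np.linalg.svd(anom, full_matrices=False)
pcs = U[:, :3] * S[:3]                      # leading 3 PC time series

# Multiple regression of the local variable on the PCs (the transfer function).
A = np.column_stack([pcs, np.ones(n_time)])
beta, *_ = np.linalg.lstsq(A, t_local, rcond=None)
pred = A @ beta

skill = np.corrcoef(pred, t_local)[0, 1] ** 2   # in-sample explained variance
print(skill)
```

In the actual model the skill is assessed by double cross-validation rather than in-sample, which is why the reported forecast skill varies from 0 to 0.8 by month and time of day.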

  7. Capturing the DSM-5 Alternative Personality Disorder Model Traits in the Five-Factor Model's Nomological Net.

    PubMed

    Suzuki, Takakuni; Griffin, Sarah A; Samuel, Douglas B

    2017-04-01

    Several studies have shown structural and statistical similarities between the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) alternative personality disorder model and the Five-Factor Model (FFM). However, no study to date has evaluated the nomological network similarities between the two models. The relations of the Revised NEO Personality Inventory (NEO PI-R) and the Personality Inventory for DSM-5 (PID-5) with relevant criterion variables were examined in a sample of 336 undergraduate students (M age  = 19.4; 59.8% female). The resulting profiles for each instrument were statistically compared for similarity. Four of the five domains of the two models have highly similar nomological networks, with the exception being FFM Openness to Experience and PID-5 Psychoticism. Further probing of that pair suggested that the NEO PI-R domain scores obscured meaningful similarity between PID-5 Psychoticism and specific aspects and lower-order facets of Openness. The results support the notion that the DSM-5 alternative personality disorder model trait domains represent variants of the FFM domains. Similarities of Openness and Psychoticism domains were supported when the lower-order aspects and facets of Openness domain were considered. The findings support the view that the DSM-5 trait model represents an instantiation of the FFM. © 2015 Wiley Periodicals, Inc.

  8. Biophysical model for assessment of risk of acute exposures in combination with low level chronic irradiation

    NASA Astrophysics Data System (ADS)

    Smirnova, O. A.

    A biophysical model is developed which describes the mortality dynamics in mammalian populations unexposed and exposed to radiation. The model relates statistical biometric functions (mortality rate, life span probability density, and life span probability) with statistical characteristics and dynamics of a critical body system in the individuals composing the population. A model describing the dynamics of thrombocytopoiesis in nonirradiated and irradiated mammals is also developed, this hematopoietic line being considered as the critical body system under the exposures in question. The mortality model constructed in the framework of the proposed approach was identified to reproduce the irradiation effects on populations of mice. Most parameters of the thrombocytopoiesis model were determined from data available in the literature on hematology and radiobiology; the remaining parameters were evaluated by fitting experimental data on the dynamics of this system in acutely irradiated mice. Successful verification of the thrombocytopoiesis model was fulfilled by quantitative juxtaposition of the modeling predictions with experimental data on the dynamics of this system in mice exposed to either acute or chronic irradiation over wide ranges of doses and dose rates. It is important that only experimental data on the mortality rate in the nonirradiated population and the relevant statistical characteristics of the thrombocytopoiesis system in mice, which are also available in the literature on radiobiology, are needed for the final identification of …

  9. Accurately Characterizing the Importance of Wave-Particle Interactions in Radiation Belt Dynamics: The Pitfalls of Statistical Wave Representations

    NASA Technical Reports Server (NTRS)

    Murphy, Kyle R.; Mann, Ian R.; Rae, I. Jonathan; Sibeck, David G.; Watt, Clare E. J.

    2016-01-01

    Wave-particle interactions play a crucial role in energetic particle dynamics in the Earth's radiation belts. However, the relative importance of different wave modes in these dynamics is poorly understood. Typically, this is assessed during geomagnetic storms using statistically averaged empirical wave models, specified as a function of geomagnetic activity, in advanced radiation belt simulations. However, statistical averages poorly characterize extreme events such as geomagnetic storms: storm-time ultralow-frequency wave power is typically larger than that derived over a solar cycle, and Kp is a poor proxy for storm-time wave power.

  10. Ballistic and diffusive dynamics in a two-dimensional ideal gas of macroscopic chaotic Faraday waves.

    PubMed

    Welch, Kyle J; Hastings-Hauss, Isaac; Parthasarathy, Raghuveer; Corwin, Eric I

    2014-04-01

    We have constructed a macroscopic driven system of chaotic Faraday waves whose statistical mechanics, we find, are surprisingly simple, mimicking those of a thermal gas. We use real-time tracking of a single floating probe, energy equipartition, and the Stokes-Einstein relation to define and measure a pseudotemperature and diffusion constant and then self-consistently determine a coefficient of viscous friction for a test particle in this pseudothermal gas. Because of its simplicity, this system can serve as a model for direct experimental investigation of nonequilibrium statistical mechanics, much as the ideal gas epitomizes equilibrium statistical mechanics.

  11. Comparative Research Productivity Measures for Economic Departments.

    ERIC Educational Resources Information Center

    Huettner, David A.; Clark, William

    1997-01-01

    Develops a simple theoretical model to evaluate interdisciplinary differences in research productivity between economics departments and related subjects. Compares the research publishing statistics of economics, finance, psychology, geology, physics, oceanography, chemistry, and geophysics. Considers a number of factors including journal…

  12. Statistical prediction of September Arctic Sea Ice minimum based on stable teleconnections with global climate and oceanic patterns

    NASA Astrophysics Data System (ADS)

    Ionita, M.; Grosfeld, K.; Scholz, P.; Lohmann, G.

    2016-12-01

    Sea ice in both polar regions is an important indicator of global climate change and its polar amplification. Consequently, there is broad interest in information on sea ice: its coverage, variability, and long-term change. Knowledge of sea ice requires high-quality data on ice extent, thickness, and dynamics, but its predictability depends on various climate parameters and conditions. In order to provide insights into the potential development of a monthly/seasonal signal, we developed a robust statistical model based on ocean heat content, sea surface temperature, and atmospheric variables to estimate the September minimum sea ice extent for every year. Although previous statistical attempts at monthly/seasonal forecasts of the September sea ice minimum show relatively reduced skill, here it is shown that more than 97% (r = 0.98) of the September sea ice extent can be predicted three months in advance by using the previous months' conditions via a multiple linear regression model based on global sea surface temperature (SST), mean sea level pressure (SLP), air temperature at 850 hPa (TT850), surface winds, and sea ice extent persistence. The statistical model is based on the identification of regions with stable teleconnections between the predictors (climatological parameters) and the predictand (here, sea ice extent). The results based on our statistical model contribute to the sea ice prediction network for the Sea Ice Outlook report (https://www.arcus.org/sipn) and could provide a tool for identifying relevant regions and climate parameters that are important for sea ice development in the Arctic and for detecting sensitive and critical regions in global coupled climate models with a focus on sea ice formation.
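A multiple linear regression forecast of this kind reduces to a least-squares fit of September extent on earlier-season predictors. A sketch on synthetic data; the predictor names echo the abstract, but all values and coefficients are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
n_years = 37  # e.g. a satellite-era record length

# Stand-in June predictors from "stable teleconnection" regions, plus
# persistence; the predictand is September sea ice extent (10^6 km^2).
sst = rng.normal(0.0, 1.0, n_years)
slp = rng.normal(0.0, 1.0, n_years)
ice_june = rng.normal(11.0, 1.0, n_years)
ice_sep = 0.8 * ice_june - 0.5 * sst + 0.2 * slp + rng.normal(0.0, 0.2, n_years)

# Multiple linear regression, three months ahead.
X = np.column_stack([sst, slp, ice_june, np.ones(n_years)])
beta, *_ = np.linalg.lstsq(X, ice_sep, rcond=None)
pred = X @ beta

r = float(np.corrcoef(pred, ice_sep)[0, 1])
print(r)  # high skill here purely by construction of the synthetic data
```

The step this sketch leaves out is the one the paper emphasizes: screening predictor fields for regions whose teleconnection with the predictand is stable over time before admitting them to the regression.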

  13. Estimation of social value of statistical life using willingness-to-pay method in Nanjing, China.

    PubMed

    Yang, Zhao; Liu, Pan; Xu, Xin

    2016-10-01

    Rational decision making regarding the safety related investment programs greatly depends on the economic valuation of traffic crashes. The primary objective of this study was to estimate the social value of statistical life in the city of Nanjing in China. A stated preference survey was conducted to investigate travelers' willingness to pay for traffic risk reduction. Face-to-face interviews were conducted at stations, shopping centers, schools, and parks in different districts in the urban area of Nanjing. The respondents were categorized into two groups, including motorists and non-motorists. Both the binary logit model and mixed logit model were developed for the two groups of people. The results revealed that the mixed logit model is superior to the fixed coefficient binary logit model. The factors that significantly affect people's willingness to pay for risk reduction include income, education, gender, age, drive age (for motorists), occupation, whether the charged fees were used to improve private vehicle equipment (for motorists), reduction in fatality rate, and change in travel cost. The Monte Carlo simulation method was used to generate the distribution of value of statistical life (VSL). Based on the mixed logit model, the VSL had a mean value of 3,729,493 RMB ($586,610) with a standard deviation of 2,181,592 RMB ($343,142) for motorists; and a mean of 3,281,283 RMB ($505,318) with a standard deviation of 2,376,975 RMB ($366,054) for non-motorists. Using the tax system to illustrate the contribution of different income groups to social funds, the social value of statistical life was estimated. The average social value of statistical life was found to be 7,184,406 RMB ($1,130,032). Copyright © 2016 Elsevier Ltd. All rights reserved.
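The Monte Carlo step, propagating the mixed-logit coefficient distribution into a VSL distribution, can be sketched as follows. In a logit framework the VSL is the marginal rate of substitution between risk reduction and cost; the coefficient values below are hypothetical, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical mixed-logit estimates: the coefficient on fatality-risk
# reduction varies randomly across travelers; the cost coefficient is fixed.
beta_risk_mean, beta_risk_sd = 3.0, 1.2   # per unit of risk reduction
beta_cost = -0.8e-6                       # per RMB of travel cost

# Monte Carlo draws of the random coefficient, mapped to VSL = b_risk / -b_cost.
draws = rng.normal(beta_risk_mean, beta_risk_sd, size=100_000)
vsl = draws / -beta_cost                  # RMB per statistical life

print(f"mean VSL: {vsl.mean():,.0f} RMB, sd: {vsl.std():,.0f} RMB")
```

The paper's further step, weighting individual VSLs by tax-system contributions to obtain a social value of statistical life, would reweight these draws rather than average them uniformly.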

  14. Statistical Downscaling of General Circulation Model Outputs to Precipitation Accounting for Non-Stationarities in Predictor-Predictand Relationships

    PubMed Central

    Sachindra, D. A.; Perera, B. J. C.

    2016-01-01

    This paper presents a novel approach to incorporate the non-stationarities characterised in the GCM outputs into the Predictor-Predictand Relationships (PPRs) in statistical downscaling models. In this approach, a series of 42 PPRs based on the multi-linear regression (MLR) technique were determined for each calendar month using a 20-year moving window moved at a 1-year time step on the predictor data obtained from the NCEP/NCAR reanalysis data archive and observations of precipitation at 3 stations located in Victoria, Australia, for the period 1950–2010. Then the relationships between the constants and coefficients in the PPRs and the statistics of reanalysis data of predictors were determined for the period 1950–2010, for each calendar month. Thereafter, using these relationships with the statistics of the past data of HadCM3 GCM pertaining to the predictors, new PPRs were derived for the periods 1950–69, 1970–89 and 1990–99 for each station. This process yielded a non-stationary downscaling model consisting of a PPR per calendar month for each of the above three periods for each station. The non-stationarities in the climate are characterised by the long-term changes in the statistics of the climate variables, and the above process enabled relating the non-stationarities in the climate to the PPRs. These new PPRs were then used with the past data of HadCM3, to reproduce the observed precipitation. It was found that the non-stationary MLR based downscaling model was able to produce more accurate simulations of observed precipitation more often than conventional stationary downscaling models developed with MLR and Genetic Programming (GP). PMID:27997609

  15. Statistical Downscaling of General Circulation Model Outputs to Precipitation Accounting for Non-Stationarities in Predictor-Predictand Relationships.

    PubMed

    Sachindra, D A; Perera, B J C

    2016-01-01

    This paper presents a novel approach to incorporate the non-stationarities characterised in the GCM outputs into the Predictor-Predictand Relationships (PPRs) in statistical downscaling models. In this approach, a series of 42 PPRs based on the multi-linear regression (MLR) technique were determined for each calendar month using a 20-year moving window moved at a 1-year time step on the predictor data obtained from the NCEP/NCAR reanalysis data archive and observations of precipitation at 3 stations located in Victoria, Australia, for the period 1950-2010. Then the relationships between the constants and coefficients in the PPRs and the statistics of reanalysis data of predictors were determined for the period 1950-2010, for each calendar month. Thereafter, using these relationships with the statistics of the past data of HadCM3 GCM pertaining to the predictors, new PPRs were derived for the periods 1950-69, 1970-89 and 1990-99 for each station. This process yielded a non-stationary downscaling model consisting of a PPR per calendar month for each of the above three periods for each station. The non-stationarities in the climate are characterised by the long-term changes in the statistics of the climate variables, and the above process enabled relating the non-stationarities in the climate to the PPRs. These new PPRs were then used with the past data of HadCM3, to reproduce the observed precipitation. It was found that the non-stationary MLR based downscaling model was able to produce more accurate simulations of observed precipitation more often than conventional stationary downscaling models developed with MLR and Genetic Programming (GP).
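The moving-window construction is easy to make concrete: a 61-year record (1950-2010) scanned with a 20-year window at a 1-year step yields exactly the 42 PPRs mentioned above. A sketch with a synthetic, deliberately drifting predictor-predictand relationship; all data are invented.

```python
import numpy as np

rng = np.random.default_rng(6)
years = np.arange(1950, 2011)
n = len(years)

# Synthetic predictor and precipitation with a slowly drifting relationship,
# the kind of non-stationarity the PPR approach is meant to capture.
x = rng.normal(size=n)
slope_true = 1.0 + 0.01 * (years - 1950)        # drifts over the record
precip = slope_true * x + rng.normal(0.0, 0.3, size=n)

# One PPR per 20-year window moved at a 1-year step: a series of
# (window start, coefficient, constant) triples rather than one fixed fit.
window = 20
pprs = []
for start in range(n - window + 1):
    sl = slice(start, start + window)
    A = np.column_stack([x[sl], np.ones(window)])
    coef, *_ = np.linalg.lstsq(A, precip[sl], rcond=None)
    pprs.append((int(years[start]), float(coef[0]), float(coef[1])))

print(len(pprs), pprs[0][1], pprs[-1][1])  # 42 windows; coefficient drifts up
```

Relating the window-by-window constants and coefficients back to the predictor statistics, as the paper does, is what lets the fitted drift be transferred to GCM data.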

  16. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    NASA Astrophysics Data System (ADS)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, whereas dust events (i.e. dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response at the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields at the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has produced a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently recover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food-producing regions.

  17. Alternative approaches to predicting methane emissions from dairy cows.

    PubMed

    Mills, J A N; Kebreab, E; Yates, C M; Crompton, L A; Cammell, S B; Dhanoa, M S; Agnew, R E; France, J

    2003-12-01

    Previous attempts to apply statistical models, which correlate nutrient intake with methane production, have been of limited value where predictions are obtained for nutrient intakes and diet types outside those used in model construction. Dynamic mechanistic models have proved more suitable for extrapolation, but they remain computationally expensive and are not applied easily in practical situations. The first objective of this research focused on employing conventional techniques to generate statistical models of methane production appropriate to United Kingdom dairy systems. The second objective was to evaluate these models and a model published previously using both United Kingdom and North American data sets. Thirdly, nonlinear models were considered as alternatives to the conventional linear regressions. The United Kingdom calorimetry data used to construct the linear models also were used to develop the three nonlinear alternatives that were all of modified Mitscherlich (monomolecular) form. Of the linear models tested, an equation from the literature proved most reliable across the full range of evaluation data (root mean square prediction error = 21.3%). However, the Mitscherlich models demonstrated the greatest degree of adaptability across diet types and intake level. The most successful model for simulating the independent data was a modified Mitscherlich equation with the steepness parameter set to represent dietary starch-to-ADF ratio (root mean square prediction error = 20.6%). However, when such data were unavailable, simpler Mitscherlich forms relating dry matter or metabolizable energy intake to methane production remained better alternatives relative to their linear counterparts.
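The monomolecular (Mitscherlich) form discussed here can be written y = a(1 - exp(-b*x)); below is a sketch of fitting it to synthetic intake-methane data by a crude grid search. Parameter values are invented, and the published models additionally modify the parameters with diet terms such as the starch-to-ADF ratio; scipy.optimize.curve_fit would be the usual fitting tool, but the grid search keeps the sketch dependency-free.

```python
import numpy as np

rng = np.random.default_rng(8)

# Synthetic data: methane output (MJ/d) rising with dry matter intake (kg/d)
# and saturating, following y = a * (1 - exp(-b*x)).
a_true, b_true = 12.0, 0.12
dmi = rng.uniform(5.0, 25.0, 60)
ch4 = a_true * (1.0 - np.exp(-b_true * dmi)) + rng.normal(0.0, 0.3, 60)

def predict(a, b, x):
    return a * (1.0 - np.exp(-b * x))

# Crude least-squares grid search over (a, b).
grid_a = np.linspace(8.0, 16.0, 81)
grid_b = np.linspace(0.05, 0.25, 81)
sse = [((ch4 - predict(a, b, dmi)) ** 2).sum()
       for a in grid_a for b in grid_b]
k = int(np.argmin(sse))
a_fit, b_fit = float(grid_a[k // 81]), float(grid_b[k % 81])

print(a_fit, b_fit)  # should land near a = 12.0, b = 0.12
```

The saturating shape is what lets the Mitscherlich models extrapolate across intake levels better than their linear counterparts, as the abstract reports.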

  18. Using statistical and machine learning to help institutions detect suspicious access to electronic health records.

    PubMed

    Boxwala, Aziz A; Kim, Jihoon; Grillo, Janice M; Ohno-Machado, Lucila

    2011-01-01

    To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs. From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers, and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors evaluated the sensitivity of final models on an external set of 58 events that were identified as truly inappropriate and investigated independently from this study using standard operating procedures. The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR, and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which were determined as truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM. The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, our approach for constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set due to the effort required for its construction. The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs.

  19. Using statistical and machine learning to help institutions detect suspicious access to electronic health records

    PubMed Central

    Kim, Jihoon; Grillo, Janice M; Ohno-Machado, Lucila

    2011-01-01

    Objective To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs. Methods From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers, and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors evaluated the sensitivity of final models on an external set of 58 events that were identified as truly inappropriate and investigated independently from this study using standard operating procedures. Results The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR, and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which were determined as truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM. Limitations The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, our approach for constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set due to the effort required for its construction. Conclusion The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs. PMID:21672912

  20. Evaluation of different models to estimate the global solar radiation on inclined surface

    NASA Astrophysics Data System (ADS)

    Demain, C.; Journée, M.; Bertrand, C.

    2012-04-01

    Global and diffuse solar radiation intensities are, in general, measured on horizontal surfaces, whereas stationary solar conversion systems (both flat-plate solar collectors and solar photovoltaics) are mounted on inclined surfaces to maximize the amount of solar radiation incident on the collector. Consequently, the solar radiation incident on a tilted surface has to be determined by converting measurements from the horizontal to the tilted surface of interest. This study evaluates the performance of 14 models transposing 10-min, hourly, and daily diffuse solar irradiation from horizontal to inclined surfaces. Solar radiation data from 8 months (April to November 2011), covering diverse atmospheric conditions and solar altitudes, measured on the roof of the radiation tower of the Royal Meteorological Institute of Belgium in Uccle (longitude 4.35°, latitude 50.79°), were used for validation. Individual model performance is assessed by comparing calculated and measured global solar radiation on the south-oriented surface tilted at 50.79°, using statistical methods. The relative performance of the different models under different sky conditions has been studied. Comparison of the statistical errors of the different radiation models as a function of the clearness index shows that some models perform better under one type of sky condition. Combining different models for different sky conditions can reduce the statistical error between measured and estimated global solar radiation. As the models described in this paper were developed for hourly data inputs, statistical error indexes are lowest for hourly data and increase for 10-min and daily data.
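A sketch of the simplest such transposition scheme, the isotropic (Liu-Jordan) model, assuming the beam geometric factor r_b is supplied and using an illustrative ground albedo; the more elaborate models evaluated in the study differ mainly in how they treat the diffuse term:

```python
import math

def isotropic_tilt(beam_h, diff_h, tilt_deg, r_b, albedo=0.2):
    """Isotropic-sky transposition of global irradiance from horizontal to a
    tilted plane, from horizontal beam (beam_h) and diffuse (diff_h)
    components. r_b is the beam geometric factor cos(theta_i)/cos(theta_z);
    the albedo value is illustrative."""
    b = math.radians(tilt_deg)
    sky_view = (1 + math.cos(b)) / 2      # fraction of sky seen by the plane
    ground_view = (1 - math.cos(b)) / 2   # fraction of ground seen by the plane
    return beam_h * r_b + diff_h * sky_view + (beam_h + diff_h) * albedo * ground_view
```

At zero tilt the sky-view factor is 1 and the ground term vanishes, so the result reduces to the horizontal global irradiance, a useful sanity check.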

  1. Stochastic Model for Phonemes Uncovers an Author-Dependency of Their Usage.

    PubMed

    Deng, Weibing; Allahverdyan, Armen E

    2016-01-01

    We study rank-frequency relations for phonemes, the minimal units that still relate to linguistic meaning. We show that these relations can be described by the Dirichlet distribution, a direct analogue of the ideal-gas model in statistical mechanics. This description allows us to demonstrate that the rank-frequency relations for the phonemes of a text do depend on its author. The author-dependency effect is not caused by the author's vocabulary (common words used in different texts), and is confirmed by several alternative means, suggesting that it can be directly related to phonemes. These features contrast with rank-frequency relations for words, which are both author- and text-independent and are governed by Zipf's law.

  2. An Alternative Approach to Analyze Ipsative Data. Revisiting Experiential Learning Theory.

    PubMed

    Batista-Foguet, Joan M; Ferrer-Rosell, Berta; Serlavós, Ricard; Coenders, Germà; Boyatzis, Richard E

    2015-01-01

    The ritualistic use of statistical models regardless of the type of data actually available is a common practice across disciplines, which we dare to call type zero error. Statistical models involve a series of assumptions whose existence is often neglected altogether; this is especially the case with ipsative data. This paper illustrates the consequences of this ritualistic practice within Kolb's Experiential Learning Theory (ELT) operationalized through its Learning Style Inventory (KLSI). We show how KLSI data can be properly analyzed using a methodology well known in other disciplines: compositional data analysis (CODA) and log-ratio transformations. In addition, the method has theoretical implications: a third dimension of the KLSI is unveiled, providing room for future research. This third dimension describes an individual's relative preference for learning by prehension rather than by transformation. Using a sample of international MBA students, we relate this dimension to another self-assessment instrument, the Philosophical Orientation Questionnaire (POQ), and to an observer-assessed instrument, the Emotional and Social Competency Inventory (ESCI-U). Both show plausible statistical relationships. An intellectual operating philosophy (IOP) is linked to a preference for prehension, whereas a pragmatic operating philosophy (POP) is linked to transformation. Self-management and social awareness competencies are linked to a learning preference for transforming knowledge, whereas relationship management and cognitive competencies are more related to approaching learning by prehension.

  3. An Alternative Approach to Analyze Ipsative Data. Revisiting Experiential Learning Theory

    PubMed Central

    Batista-Foguet, Joan M.; Ferrer-Rosell, Berta; Serlavós, Ricard; Coenders, Germà; Boyatzis, Richard E.

    2015-01-01

    The ritualistic use of statistical models regardless of the type of data actually available is a common practice across disciplines, which we dare to call type zero error. Statistical models involve a series of assumptions whose existence is often neglected altogether; this is especially the case with ipsative data. This paper illustrates the consequences of this ritualistic practice within Kolb's Experiential Learning Theory (ELT) operationalized through its Learning Style Inventory (KLSI). We show how KLSI data can be properly analyzed using a methodology well known in other disciplines: compositional data analysis (CODA) and log-ratio transformations. In addition, the method has theoretical implications: a third dimension of the KLSI is unveiled, providing room for future research. This third dimension describes an individual's relative preference for learning by prehension rather than by transformation. Using a sample of international MBA students, we relate this dimension to another self-assessment instrument, the Philosophical Orientation Questionnaire (POQ), and to an observer-assessed instrument, the Emotional and Social Competency Inventory (ESCI-U). Both show plausible statistical relationships. An intellectual operating philosophy (IOP) is linked to a preference for prehension, whereas a pragmatic operating philosophy (POP) is linked to transformation. Self-management and social awareness competencies are linked to a learning preference for transforming knowledge, whereas relationship management and cognitive competencies are more related to approaching learning by prehension. PMID:26617561
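The log-ratio machinery that the paper imports from CODA can be sketched in a few lines; the centered log-ratio (clr) transform below is the standard textbook version, shown on made-up scores rather than actual KLSI data:

```python
import math

def clr(parts):
    """Centered log-ratio transform of a composition (e.g. ipsative scores
    that sum to a fixed total). Components must be strictly positive; the
    transformed coordinates sum to zero, removing the unit-sum constraint."""
    logs = [math.log(p) for p in parts]
    g = sum(logs) / len(logs)   # log of the geometric mean
    return [l - g for l in logs]
```

After the transform, standard unconstrained multivariate methods (regression, SEM) can be applied to ipsative data without violating their assumptions, which is the paper's central methodological point.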

  4. Modeling of the dielectric permittivity of porous soil media with water using statistical-physical models

    NASA Astrophysics Data System (ADS)

    Usowicz, Boguslaw; Marczewski, Wojciech; Usowicz, Jerzy B.; Łukowski, Mateusz; Lipiec, Jerzy; Stankiewicz, Krystyna

    2013-04-01

    Radiometric observations with SMOS rely on radiative transfer equations (RTE) that determine the brightness temperature (BT) in two linear polarization components (H, V), satisfying the Fresnel principle of propagation in horizontally layered target media on the ground. The RTE link variables expressed in electromagnetic (EM) terms, such as the intensity BT, to non-EM physical quantities (soil moisture (SM), vegetation indexes, fractional coverage, and boundary conditions such as optical thickness, layer definitions, and roughness) by means of the so-called tau-omega method. This method allows many different models to be combined, including specific empirical estimates of physical properties as functions of volumetric water content. The RTE are expressed through propagation, reflection, and losses (attenuation) along the considered propagation path; electromagnetic propagation is captured by the propagation constant, and for target media on the ground the dielectric constant dominates the propagation effects. Despite the many physical parameters involved, one must therefore rely primarily on the dielectric constant, treated as a complex quantity: its real part represents the apparent shortening of the propagation path and refraction, while its imaginary part accounts for attenuation and losses. This work applies statistical-physical modeling of soil properties, treating the medium as a statistical mixture of solid grains, gas- or liquid-filled pores, and contact bridges between components. The approach characterizes porosity by general statistical means and is applicable to various physical properties (thermal and electrical conductivity and dielectric properties) that depend on the composition of the medium. The method was developed independently of SMOS, but the two meet in the RTE through the dielectric constant. SMOS retrieves the dielectric constant without regard to other properties such as soil porosity, and without a direct relation to the thermal properties of soils, even though the relations between soil thermal properties and water content are very consistent. We therefore introduce the effects of soil porosity and thermal properties into the complex representation of the dielectric constant, gaining new means of capturing porosity effects in SMOS observations. At present we can show several relations between thermal properties and soil moisture content, using examples from the Biebrza and Polesie wetlands in Poland, and search for correlations between SMOS-derived SM and ground-measured moisture content. The correlations are poor for SMOS L2 data processed with the Dobson retrieval model (version 501), but we expect better correlation with the Mironov model (version 551). If this supposition is confirmed, it will encourage the use of statistical-physical modeling of the dielectric constant and thermal properties within the RTE and the tau-omega method. Treating soil porosity directly as a research target is less well motivated than using its effects on the SM observable in SMOS.

  5. Mixed Effects Models for Resampled Network Statistics Improves Statistical Power to Find Differences in Multi-Subject Functional Connectivity

    PubMed Central

    Narayan, Manjari; Allen, Genevera I.

    2016-01-01

    Many complex brain disorders, such as autism spectrum disorders, exhibit a wide range of symptoms and disability. To understand how brain communication is impaired in such conditions, functional connectivity studies seek to understand individual differences in brain network structure in terms of covariates that measure symptom severity. In practice, however, functional connectivity is not observed but estimated from complex and noisy neural activity measurements. Imperfect subject network estimates can compromise subsequent efforts to detect covariate effects on network structure. We address this problem in the case of Gaussian graphical models of functional connectivity, by proposing novel two-level models that treat both subject level networks and population level covariate effects as unknown parameters. To account for imperfectly estimated subject level networks when fitting these models, we propose two related approaches: R2, based on resampling and random-effects test statistics, and R3, which additionally employs random adaptive penalization. Simulation studies using realistic graph structures reveal that R2 and R3 have superior statistical power to detect covariate effects compared to existing approaches, particularly when the number of within-subject observations is comparable to the size of subject networks. Using our novel models and methods to study parts of the ABIDE dataset, we find evidence of hypoconnectivity associated with symptom severity in autism spectrum disorders, in frontoparietal and limbic systems as well as in anterior and posterior cingulate cortices. PMID:27147940

  6. Applying the compound Poisson process model to the reporting of injury-related mortality rates.

    PubMed

    Kegler, Scott R

    2007-02-16

    Injury-related mortality rate estimates are often analyzed under the assumption that case counts follow a Poisson distribution. Certain types of injury incidents occasionally involve multiple fatalities, however, resulting in dependencies between cases that are not reflected in the simple Poisson model and which can affect even basic statistical analyses. This paper explores the compound Poisson process model as an alternative, emphasizing adjustments to some commonly used interval estimators for population-based rates and rate ratios. The adjusted estimators involve relatively simple closed-form computations, which in the absence of multiple-case incidents reduce to familiar estimators based on the simpler Poisson model. Summary data from the National Violent Death Reporting System are referenced in several examples demonstrating application of the proposed methodology.
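A sketch of the kind of adjustment the paper describes, under the assumption that incident sizes are observed: the compound Poisson variance of a death count is estimated by the sum of squared incident sizes, which collapses to the usual Poisson estimate when every incident involves exactly one death. The interval form and the numbers are illustrative, not the paper's exact estimators:

```python
import math

def rate_ci(incident_sizes, population, z=1.96):
    """Approximate CI for a mortality rate when incidents can involve multiple
    deaths. Under a compound Poisson model, the variance of the death count is
    estimated by the sum of squared incident sizes rather than by the count."""
    deaths = sum(incident_sizes)
    var = sum(n * n for n in incident_sizes)  # reduces to `deaths` when all sizes are 1
    rate = deaths / population
    half = z * math.sqrt(var) / population
    return rate - half, rate + half
```

For the same total death count, clustering deaths into multi-fatality incidents widens the interval, which is exactly the dependence the simple Poisson model misses.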

  7. Machine Learning Biogeographic Processes from Biotic Patterns: A New Trait-Dependent Dispersal and Diversification Model with Model Choice By Simulation-Trained Discriminant Analysis.

    PubMed

    Sukumaran, Jeet; Economo, Evan P; Lacey Knowles, L

    2016-05-01

    Current statistical biogeographical analysis methods are limited in the ways ecology can be related to the processes of diversification and geographical range evolution, requiring conflation of geography and ecology and/or assuming ecologies that are uniform across all lineages and invariant in time. This precludes studying a broad class of macroevolutionary biogeographical theories that relate geographical and species histories through lineage-specific ecological and evolutionary dynamics, such as taxon cycle theory. Here we present a new model that generates phylogenies under a complex of superpositioned geographical range evolution, trait evolution, and diversification processes that can communicate with each other. We present a likelihood-free method of inference under our model using discriminant analysis of principal components of summary statistics calculated on phylogenies, with the discriminant functions trained on data generated by simulations under our model. This approach of model selection by classification of empirical data with respect to data generated under training models is shown to be efficient and robust, and it performs well over a broad range of parameter space defined by the relative rates of dispersal, trait evolution, and diversification processes. We apply our method to a case study of the taxon cycle, that is, testing for habitat and trophic-level constraints in the dispersal regimes of the Wallacean avifaunal radiation.

  8. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable

    PubMed Central

    2012-01-01

    Background When outcomes are binary, the c-statistic (equivalent to the area under the receiver operating characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. Methods An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal, or uniform distribution in the combined sample. Results Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, or uniform in the entire sample of those with and without the condition. Conclusions The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population. PMID:22716998
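The equal-variance result can be checked numerically in a few lines: under binormality with a common standard deviation, the c-statistic equals Phi of the standardized mean difference divided by sqrt(2), and equivalently the probability that a random case outscores a random non-case. A quick Monte Carlo sketch (stdlib only, illustrative parameter values):

```python
import random
import statistics

def c_binormal(mu0, mu1, sd):
    """Closed-form c-statistic under binormality with equal variances:
    c = Phi((mu1 - mu0) / (sd * sqrt(2)))."""
    return statistics.NormalDist().cdf((mu1 - mu0) / (sd * 2 ** 0.5))

def c_empirical(mu0, mu1, sd, n=20000, seed=1):
    """Monte Carlo estimate of P(X1 > X0) for independent draws from the
    case (mu1) and non-case (mu0) distributions."""
    rng = random.Random(seed)
    wins = sum(rng.gauss(mu1, sd) > rng.gauss(mu0, sd) for _ in range(n))
    return wins / n
```

With mu1 - mu0 = sd (a standardized difference of 1), both routes give c close to 0.76, illustrating how discrimination depends on the effect size relative to population heterogeneity, not on the odds ratio alone.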

  9. Preventive Effect of Phosphodiesterase Inhibitor Pentoxifylline Against Medication-Related Osteonecrosis of the Jaw: An Animal Study.

    PubMed

    Yalcin-Ulker, Gül Merve; Cumbul, Alev; Duygu-Capar, Gonca; Uslu, Ünal; Sencift, Kemal

    2017-11-01

    The aim of this experimental study was to investigate the prophylactic effect of pentoxifylline (PTX) on medication-related osteonecrosis of the jaw (MRONJ). Female Sprague-Dawley rats (n = 33) received zoledronic acid (ZA) for 8 weeks to create an osteonecrosis model. The left mandibular second molars were extracted and the recovery period lasted 8 weeks before sacrifice. PTX was administered intraperitoneally to prevent MRONJ. The specimens were evaluated histopathologically and histomorphometrically. Histomorphometrically, there was no statistically significant difference between the control and ZA groups in total bone volume (P = .999), but there was a statistically significant difference in bone ratio in the extraction sockets (P < .001). A comparison of the bone ratio of the ZA group with the ZA/PTX group (PTX administered after extraction) showed no statistically significant difference (P = .69), but there was a statistically significant difference with the ZA/PTX/PTX group (PTX administered before and after extraction; P = .008). Histopathologically, between the control and ZA groups, there were statistically significant differences in inflammation (P = .013), vascularization (P = .022), hemorrhage (P = .025), and regeneration (P = .008). Between the ZA and ZA/PTX groups, there were no statistically significant differences in inflammation (P = .536), vascularization (P = .642), hemorrhage (P = .765), or regeneration (P = .127). Between the ZA and ZA/PTX/PTX groups, there were statistically significant differences in inflammation (P = .017), vascularization (P = .04), hemorrhage (P = .044), and regeneration (P = .04). In this experimental model of MRONJ, PTX given after tooth extraction improved new bone formation and thus bone healing, but was not prophylactic; PTX given before tooth extraction was prophylactic. PTX might therefore promote healing by optimizing the inflammatory response.

  10. A semiempirical linear model of indirect, flat-panel x-ray detectors.

    PubMed

    Huang, Shih-Ying; Yang, Kai; Abbey, Craig K; Boone, John M

    2012-04-01

    It is important to understand signal and noise transfer in indirect, flat-panel x-ray detectors when developing and optimizing imaging systems. For optimization tasks that require simulating images, this study introduces a semiempirical model to simulate projection images with user-defined x-ray fluence interaction. Signal and noise transfer in indirect, flat-panel x-ray detectors is characterized by statistics consistent with energy integration of x-ray photons. For an incident x-ray spectrum, x-ray photons are attenuated and absorbed in the scintillator to produce light photons, which are coupled to photodiodes for signal readout. The signal mean and variance are linearly related to the energy-integrated x-ray spectrum by empirically determined factors. With the first- and second-order statistics known, images can be simulated by incorporating multipixel signal statistics and the modulation transfer function of the imaging system. To estimate the semiempirical input to this model, 500 projection images (using an indirect, flat-panel x-ray detector in the breast CT system) were acquired with 50-100 kilovolt (kV) x-ray spectra filtered with 0.1-mm tin (Sn), 0.2-mm copper (Cu), 1.5-mm aluminum (Al), or 0.05-mm silver (Ag). The signal mean and variance of each detector element and the noise power spectra (NPS) were calculated and incorporated into the model. Additionally, the modulation transfer function of the detector system was physically measured and incorporated in the image simulation steps. For validation, simulated and measured projection images of air scans were compared using 40 kV/0.1-mm Sn, 65 kV/0.2-mm Cu, 85 kV/1.5-mm Al, and 95 kV/0.05-mm Ag. The linear relationship between the measured signal statistics and the energy-integrated x-ray spectrum was confirmed and incorporated into the model. The signal mean and variance factors were linearly related to kV for each filter material (r(2) of signal mean to kV: 0.91, 0.93, 0.86, and 0.99 for 0.1-mm Sn, 0.2-mm Cu, 1.5-mm Al, and 0.05-mm Ag, respectively; r(2) of signal variance to kV: 0.99 for all four filters). Comparison of the signal and noise (mean, variance, and NPS) between simulated and measured air-scan images, using absolute percent error, suggested that the model predicts the signal statistics of air-scan images reasonably well. Overall, the model accurately estimates signal statistics and spatial correlation between detector elements for images acquired with indirect, flat-panel x-ray detectors, and is a useful tool for understanding signal and noise transfer within such systems.
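The model's core assumption, that signal mean and variance are linear in the energy-integrated spectrum, can be sketched as follows; the gain and variance factors stand in for the empirically fitted values and are not taken from the paper:

```python
def detector_signal(spectrum, gain, var_factor):
    """Energy-integrating detector sketch: signal mean and variance are modeled
    as linear in the energy-integrated fluence q = sum(energy * fluence).
    spectrum: list of (energy_keV, fluence) bins; gain and var_factor are
    illustrative stand-ins for the empirically determined factors."""
    q = sum(e * f for e, f in spectrum)
    return gain * q, var_factor * q  # (signal mean, signal variance)
```

Linearity means that doubling the incident fluence doubles both the mean and the variance, the behavior the study confirms empirically per kV/filter combination.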

  11. Statistical Physics on the Eve of the 21st Century: in Honour of J B McGuire on the Occasion of His 65th Birthday

    NASA Astrophysics Data System (ADS)

    Batchelor, Murray T.; Wille, Luc T.

    The Table of Contents for the book is as follows: * Preface * Modelling the Immune System - An Example of the Simulation of Complex Biological Systems * Brief Overview of Quantum Computation * Quantal Information in Statistical Physics * Modeling Economic Randomness: Statistical Mechanics of Market Phenomena * Essentially Singular Solutions of Feigenbaum-Type Functional Equations * Spatiotemporal Chaotic Dynamics in Coupled Map Lattices * Approach to Equilibrium of Chaotic Systems * From Level to Level in Brain and Behavior * Linear and Entropic Transformations of the Hydrophobic Free Energy Sequence Help Characterize a Novel Brain Polyprotein: CART's Protein * Dynamical Systems Response to Pulsed High-Frequency Fields * Bose-Einstein Condensates in the Light of Nonlinear Physics * Markov Superposition Expansion for the Entropy and Correlation Functions in Two and Three Dimensions * Calculation of Wave Center Deflection and Multifractal Analysis of Directed Waves Through the Study of su(1,1) Ferromagnets * Spectral Properties and Phases in Hierarchical Master Equations * Universality of the Distribution Functions of Random Matrix Theory * The Universal Chiral Partition Function for Exclusion Statistics * Continuous Space-Time Symmetries in a Lattice Field Theory * Some Limiting Cases of the One-Dimensional N-Body Problem [Quelques Cas Limites du Problème à N Corps Unidimensionnel] * Integrable Models of Correlated Electrons * On the Riemann Surface of the Three-State Chiral Potts Model * Two Exactly Soluble Lattice Models in Three Dimensions * Competition of Ferromagnetic and Antiferromagnetic Order in the Spin-1/2 XXZ Chain at Finite Temperature * Extended Vertex Operator Algebras and Monomial Bases * Parity and Charge Conjugation Symmetries and S Matrix of the XXZ Chain * An Exactly Solvable Constrained XXZ Chain * Integrable Mixed Vertex Models From the Braid-Monoid Algebra * From Yang-Baxter Equations to Dynamical Zeta Functions for Birational Transformations * Hexagonal Lattice Directed Site Animals * Direction in the Star-Triangle Relations * A Self-Avoiding Walk Through Exactly Solved Lattice Models in Statistical Mechanics

  12. Nowcasting sunshine number using logistic modeling

    NASA Astrophysics Data System (ADS)

    Brabec, Marek; Badescu, Viorel; Paulescu, Marius

    2013-04-01

    In this paper, we present a formalized approach to statistical modeling of the sunshine number, a binary indicator of whether the Sun is covered by clouds, introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on a Markov chain and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss the general structure of the model and its advantages, demonstrate its performance on real data, and compare its results to a classical ARIMA approach as a competitor. Since the model parameters have a clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook on future developments oriented toward models allowing for a practically desirable smooth transition between data observed at different frequencies, and with a short discussion of the technical problems that such a goal brings.
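A minimal sketch of the model's building block: a first-order Markov chain for the binary sunshine number with a logistic link. The coefficients here are invented for illustration; in the paper they are estimated from the observed series:

```python
import math

def ssn_forecast(prev_state, a=-1.2, b=2.9):
    """One-step-ahead probability that the Sun is uncovered (ssn = 1), given
    the previous sunshine number, under a first-order Markov logistic model
    P(ssn_t = 1 | ssn_{t-1}) = sigmoid(a + b * ssn_{t-1}).
    Coefficients a and b are illustrative, not fitted values."""
    return 1 / (1 + math.exp(-(a + b * prev_state)))
```

A positive b encodes persistence: sunny intervals tend to be followed by sunny intervals, which is the Markov structure the paper exploits for nowcasting.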

  13. Statistical Mechanics of the US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward D.; Broedersz, Chase P.; Bialek, William

    2015-07-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The maximum entropy model consistent with the observed pairwise correlations among justices' votes, an Ising spin glass, agrees quantitatively with the data. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering the intuition that ideologically opposite justices negatively influence one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, organizing the voting patterns in a relatively simple "energy landscape." Besides unanimity, other energy minima in this landscape, or maxima in probability, correspond to prototypical voting states, such as the ideological split or a tightly correlated, conservative core. The model correctly predicts the correlation of justices with the majority and gives us a measure of their influence on the majority decision. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context.
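For a small bench, the pairwise maximum-entropy (Ising) model is easy to write down exactly by enumerating all vote configurations; the fields h and couplings J below are illustrative, not the values fitted to Supreme Court data:

```python
import math
from itertools import product

def ising_probs(h, J):
    """Exact state probabilities for a pairwise maximum-entropy (Ising) model:
    P(s) ~ exp(sum_i h[i]*s[i] + sum_{i<j} J[i][j]*s[i]*s[j]),
    with each vote s[i] in {-1, +1}."""
    n = len(h)
    def log_weight(s):
        e = sum(h[i] * s[i] for i in range(n))
        e += sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
        return e
    states = list(product([-1, 1], repeat=n))
    w = [math.exp(log_weight(s)) for s in states]
    z = sum(w)  # partition function
    return {s: wi / z for s, wi in zip(states, w)}

# Three "justices", no fields, uniformly positive couplings:
p = ising_probs([0.0] * 3, [[0, 0.5, 0.5], [0, 0, 0.5], [0, 0, 0]])
```

With positive couplings the two unanimous states carry the most probability, a toy version of the strong tendency toward unanimity the paper reports.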

  14. POOLMS: A computer program for fitting and model selection for two level factorial replication-free experiments

    NASA Technical Reports Server (NTRS)

    Amling, G. E.; Holms, A. G.

    1973-01-01

    A computer program is described that performs a statistical multiple-decision procedure called chain pooling. The procedure assigns a number of mean squares to the error variance, conditioned on the relative magnitudes of the mean squares. Model selection is done according to user-specified levels of type 1 or type 2 error probabilities.

  15. A re-evaluation of a case-control model with contaminated controls for resource selection studies

    Treesearch

    Christopher T. Rota; Joshua J. Millspaugh; Dylan C. Kesler; Chad P. Lehman; Mark A. Rumble; Catherine M. B. Jachowski

    2013-01-01

    A common sampling design in resource selection studies involves measuring resource attributes at sample units used by an animal and at sample units considered available for use. Few models can estimate the absolute probability of using a sample unit from such data, but such approaches are generally preferred over statistical methods that estimate a relative probability...

  16. A state-space modeling approach to estimating canopy conductance and associated uncertainties from sap flux density data

    Treesearch

    David M. Bell; Eric J. Ward; A. Christopher Oishi; Ram Oren; Paul G. Flikkema; James S. Clark; David Whitehead

    2015-01-01

    Uncertainties in ecophysiological responses to environment, such as the impact of atmospheric and soil moisture conditions on plant water regulation, limit our ability to estimate key inputs for ecosystem models. Advanced statistical frameworks provide coherent methodologies for relating observed data, such as stem sap flux density, to unobserved processes, such as...

  17. Exact Local Correlations and Full Counting Statistics for Arbitrary States of the One-Dimensional Interacting Bose Gas

    NASA Astrophysics Data System (ADS)

    Bastianello, Alvise; Piroli, Lorenzo; Calabrese, Pasquale

    2018-05-01

    We derive exact analytic expressions for the n-body local correlations in the one-dimensional Bose gas with contact repulsive interactions (Lieb-Liniger model) in the thermodynamic limit. Our results are valid for arbitrary states of the model, including ground and thermal states, stationary states after a quantum quench, and nonequilibrium steady states arising in transport settings. Calculations for these states are explicitly presented and physical consequences are critically discussed. We also show that the n-body local correlations are directly related to the full counting statistics for the particle-number fluctuations in a short interval, for which we provide an explicit analytic result.

  18. Polymer models of interphase chromosomes

    PubMed Central

    Vasquez, Paula A; Bloom, Kerry

    2014-01-01

    Clear organizational patterns of the genome have emerged from the statistics of population studies of fixed cells. However, how these results translate into the dynamics of individual living cells remains unexplored. We use statistical mechanics models derived from polymer physics to inquire into the effects that chromosome properties and dynamics have on the temporal and spatial behavior of the genome. Overall, changes in the properties of individual chains affect the behavior of all other chains in the domain. We explore two modifications of chain behavior: single-chain motion and chain-chain interactions. We show that there is no direct relation between these effects, as an increase in motion does not necessarily translate into an increase in chain interactions. PMID:25482191

  19. Binary recursive partitioning: background, methods, and application to psychology.

    PubMed

    Merkle, Edgar C; Shaffer, Victoria A

    2011-02-01

    Binary recursive partitioning (BRP) is a computationally intensive statistical method that can be used in situations where linear models are often used. Instead of imposing many assumptions to arrive at a tractable statistical model, BRP simply seeks to accurately predict a response variable based on values of predictor variables. The method outputs a decision tree depicting the predictor variables that were related to the response variable, along with the nature of the variables' relationships. No significance tests are involved, and the tree's 'goodness' is judged based on its predictive accuracy. In this paper, we describe BRP methods in a detailed manner and illustrate their use in psychological research. We also provide R code for carrying out the methods.
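The elementary operation of BRP, choosing a single binary split of a predictor that best separates the response, can be sketched in plain Python (the paper's own examples use R); applying it recursively to each side of the chosen split yields the decision tree:

```python
def best_split(x, y):
    """Find the single binary split x <= t that minimizes misclassification
    error, the elementary step that binary recursive partitioning applies
    recursively to grow a tree. Returns (threshold, errors)."""
    best = (None, len(y))
    for t in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        err = 0
        for side in (left, right):
            if side:
                maj = max(set(side), key=side.count)  # majority class on this side
                err += sum(yi != maj for yi in side)
        if err < best[1]:
            best = (t, err)
    return best
```

No significance test is involved, matching the paper's point: the split is judged purely by predictive accuracy, here the raw misclassification count.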

  20. GPU-computing in econophysics and statistical physics

    NASA Astrophysics Data System (ADS)

    Preis, T.

    2011-03-01

    A recent trend in computer science and related fields is general-purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction to the field of GPU computing and includes examples. In particular, computationally expensive analyses employed in a financial-market context are coded on a graphics-card architecture, which leads to a significant reduction in computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics, the Ising model, is ported to a graphics-card architecture as well, resulting in large speedup values.
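
    The Ising benchmark mentioned here is easy to state on a CPU. A minimal sketch of the Metropolis algorithm for the 2D Ising model follows; lattice size, temperature, and sweep count are arbitrary illustration choices, and the serial loop is exactly the part a GPU port would parallelise (e.g. with checkerboard updates).

```python
# Minimal CPU Metropolis sweep for the 2D Ising model (illustrative only).
import random, math

def metropolis(L=16, T=1.0, sweeps=200, seed=1):
    rng = random.Random(seed)
    s = [[1 for _ in range(L)] for _ in range(L)]   # cold start, all spins up
    for _ in range(sweeps * L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        # Sum of the four nearest neighbours (periodic boundaries)
        nb = s[(i + 1) % L][j] + s[(i - 1) % L][j] + s[i][(j + 1) % L] + s[i][(j - 1) % L]
        dE = 2 * s[i][j] * nb                       # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            s[i][j] = -s[i][j]
        # (A GPU version runs many such updates in parallel on a checkerboard.)
    return sum(sum(row) for row in s) / (L * L)     # magnetisation per spin

m = metropolis()  # T = 1.0 is well below the critical temperature ~2.27
```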

  1. Development of a statistical model for cervical cancer cell death with irreversible electroporation in vitro.

    PubMed

    Yang, Yongji; Moser, Michael A J; Zhang, Edwin; Zhang, Wenjun; Zhang, Bing

    2018-01-01

    The aim of this study was to develop a statistical model for cell death by irreversible electroporation (IRE) and to show that the statistical model is more accurate than the electric field threshold model in the literature, using cervical cancer cells in vitro. The HeLa cell line was cultured and treated with different IRE protocols in order to obtain data for modeling the statistical relationship between cell death and pulse-setting parameters. In total, 340 in vitro experiments were performed with a commercial IRE pulse system, including a pulse generator and an electric cuvette. The trypan blue staining technique was used to evaluate cell death after 4 hours of incubation following IRE treatment. The Peleg-Fermi model was used in the study to build the statistical relationship using the cell viability data obtained from the in vitro experiments. A finite element model of IRE for the electric field distribution was also built. Comparison of ablation zones between the statistical model and the electric threshold model (drawn from the finite element model) was used to show the accuracy of the proposed statistical model in the description of the ablation zone and its applicability to different pulse-setting parameters. The statistical models describing the relationships between HeLa cell death and pulse length and the number of pulses, respectively, were built. The values of the curve-fitting parameters were obtained using the Peleg-Fermi model for the treatment of cervical cancer with IRE. The difference in the ablation zone between the statistical model and the electric threshold model was also illustrated to show the accuracy of the proposed statistical model in the representation of the ablation zone in IRE. This study concluded that: (1) the proposed statistical model accurately described the ablation zone of IRE with cervical cancer cells, and was more accurate than the electric field model; (2) the proposed statistical model was able to estimate the value of the electric field threshold for the computer simulation of IRE in the treatment of cervical cancer; and (3) the proposed statistical model was able to express the change in ablation zone with the change in pulse-setting parameters.
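
    The Peleg-Fermi survival curve referenced above has the form S(E) = 1 / (1 + exp((E - Ec)/A)), with Ec a critical field strength and A the width of the transition. The sketch below fits those two parameters to hypothetical viability data by coarse grid search, a stand-in for the paper's actual curve fitting; all numbers are invented for illustration.

```python
# Hedged sketch of Peleg-Fermi viability fitting; data are hypothetical.
import math

def peleg_fermi(E, Ec, A):
    """Fraction of cells surviving at field strength E."""
    return 1.0 / (1.0 + math.exp((E - Ec) / A))

def fit_grid(fields, viability):
    """Least-squares fit of (Ec, A) on a coarse grid."""
    best = None
    for Ec in range(100, 2001, 25):
        for A in range(25, 501, 25):
            sse = sum((peleg_fermi(E, Ec, A) - v) ** 2
                      for E, v in zip(fields, viability))
            if best is None or sse < best[0]:
                best = (sse, Ec, A)
    return best[1], best[2]

# Hypothetical viability measurements at increasing field strengths (V/cm)
fields = [200, 400, 600, 800, 1000, 1200, 1400]
viab = [0.99, 0.95, 0.80, 0.50, 0.20, 0.05, 0.01]
Ec, A = fit_grid(fields, viab)
```

    With fitted parameters in hand, the ablation zone is read off a simulated field map as the region where viability drops below a chosen cutoff, which is how such a statistical model replaces a single fixed field threshold.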

  2. Quantum Treatment of Two Coupled Oscillators in Interaction with a Two-Level Atom:

    NASA Astrophysics Data System (ADS)

    Khalil, E. M.; Abdalla, M. Sebawe; Obada, A. S.-F.

    In this communication we study a modified model representing the interaction between a two-level atom and two modes of the electromagnetic field in a cavity. The interaction between the modes is assumed to be of the parametric-amplifier type. The model consists of two different systems: one represents the Jaynes-Cummings model (atom-field interaction) and the other the two-mode parametric amplifier model (field-field interaction). After some canonical transformations, the constants of the motion are obtained and used to derive the time evolution operator. The wave function in the Schrödinger picture is constructed and employed to discuss some statistical properties related to the system. Further discussion of the statistical properties of some physical quantities is given, taking into account an initially correlated pair-coherent state for the modes. We concentrate in our examination on the system behavior arising from variation of the parametric-amplifier coupling parameter as well as the detuning parameter. It is shown that the parametric-amplifier interaction term increases the revival period and consequently yields a longer period of strong interaction between the atom and the fields.

  3. Trends and associated uncertainty in the global mean temperature record

    NASA Astrophysics Data System (ADS)

    Poppick, A. N.; Moyer, E. J.; Stein, M.

    2016-12-01

    Physical models suggest that the Earth's mean temperature warms in response to changing CO2 concentrations (and hence increased radiative forcing); given physical uncertainties in this relationship, the historical temperature record is a source of empirical information about global warming. A persistent thread in many analyses of the historical temperature record, however, is the reliance on methods that appear to deemphasize both physical and statistical assumptions. Examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for natural variability in nonparametric rather than parametric ways. We show here that methods that deemphasize assumptions can limit the scope of analysis and can lead to misleading inferences, particularly in the setting considered where the data record is relatively short and the scale of temporal correlation is relatively long. A proposed model that is simple but physically informed provides a more reliable estimate of trends and allows a broader array of questions to be addressed. In accounting for uncertainty, we also illustrate how parametric statistical models that are attuned to the important characteristics of natural variability can be more reliable than ostensibly more flexible approaches.
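
    The covariate-choice point can be made concrete with a toy ordinary least squares fit: regress simulated temperature anomalies on radiative forcing rather than on time. Everything below is synthetic (an invented forcing series, an assumed sensitivity of 0.5 K per W m^-2, and Gaussian noise for internal variability), purely to illustrate the regression the authors advocate.

```python
# Toy OLS with radiative forcing (not time) as the covariate; synthetic data.
import math, random

def ols_slope(x, y):
    """Closed-form simple-regression slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx                      # units: K per (W m^-2)

rng = random.Random(0)
# Invented forcing series: slow trend plus a wiggle (W m^-2)
forcing = [0.02 * t + 0.3 * math.sin(t / 5) for t in range(100)]
# "True" sensitivity of 0.5 K per W m^-2 plus internal variability
temps = [0.5 * f + rng.gauss(0, 0.05) for f in forcing]
sens = ols_slope(forcing, temps)          # recovers ~0.5
```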

  4. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts.

    PubMed

    Preisser, John S; Long, D Leann; Stamm, John W

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts.
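
    The distinction the authors draw can be sketched numerically: in a zero-inflated Poisson model with excess-zero probability pi and Poisson mean mu, the marginal mean is (1 - pi) * mu, and a marginalized model parameterises treatment effects on that quantity directly. The simulation below uses invented parameters to show the marginal-mean incidence rate ratio; it is an illustration of the concept, not the authors' analysis.

```python
# Simulate zero-inflated Poisson counts and compute a marginal-mean IRR.
import math, random

def simulate_zip(n, pi, mu, rng):
    """Draw n counts from a zero-inflated Poisson (ZIP) distribution."""
    out = []
    for _ in range(n):
        if rng.random() < pi:
            out.append(0)                 # structural (excess) zero
            continue
        # Knuth's Poisson sampler (fine for small mu)
        L, k, p = math.exp(-mu), 0, 1.0
        while True:
            p *= rng.random()
            if p <= L:
                break
            k += 1
        out.append(k)
    return out

rng = random.Random(42)
control = simulate_zip(20000, pi=0.4, mu=2.0, rng=rng)  # marginal mean = 0.6*2.0 = 1.2
treated = simulate_zip(20000, pi=0.4, mu=1.0, rng=rng)  # marginal mean = 0.6*1.0 = 0.6
# The overall-effect IRR a marginalized model targets directly:
irr = (sum(treated) / len(treated)) / (sum(control) / len(control))
```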

  5. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts

    PubMed Central

    Preisser, John S.; Long, D. Leann; Stamm, John W.

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two datasets, one consisting of fictional dmft counts in two groups and the other on DMFS among schoolchildren from a randomized clinical trial (RCT) comparing three toothpaste formulations to prevent incident dental caries, are analysed with negative binomial hurdle (NBH), zero-inflated negative binomial (ZINB), and marginalized zero-inflated negative binomial (MZINB) models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the RCT were similar despite their distinctive interpretations. Choice of statistical model class should match the study’s purpose, while accounting for the broad decline in children’s caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. PMID:28291962

  6. Statistical and dynamical forecast of regional precipitation after mature phase of ENSO

    NASA Astrophysics Data System (ADS)

    Sohn, S.; Min, Y.; Lee, J.; Tam, C.; Ahn, J.

    2010-12-01

    While the seasonal predictability of general circulation models (GCMs) has improved, current model atmospheres in the mid-latitudes do not respond correctly to external forcing such as tropical sea surface temperature (SST), particularly over the East Asian and western North Pacific summer monsoon regions. In addition, the time scale of the prediction scope is considerably limited, and model forecast skill is still very poor beyond two weeks. Although recent studies indicate that coupled-model-based multi-model ensemble (MME) forecasts perform better, long-lead forecasts exceeding 9 months still show a dramatic decrease in seasonal predictability. This study aims at diagnosing dynamical MME forecasts comprising state-of-the-art 1-tier models, as well as comparing them with statistical model forecasts, focusing on East Asian summer precipitation predictions after the mature phase of ENSO. The lagged impact of El Nino, as a major climate contributor, on the summer monsoon in model environments is also evaluated in the sense of conditional probabilities. To evaluate probability forecast skill, the reliability (attributes) diagram and the relative operating characteristics, following the recommendations of the World Meteorological Organization (WMO) Standardized Verification System for Long-Range Forecasts, are used in this study. The results should shed light on the prediction skill of both the dynamical and the statistical model in forecasting East Asian summer monsoon rainfall at long lead times.

  7. Assessing differential gene expression with small sample sizes in oligonucleotide arrays using a mean-variance model.

    PubMed

    Hu, Jianhua; Wright, Fred A

    2007-03-01

    The identification of the genes that are differentially expressed in two-sample microarray experiments remains a difficult problem when the number of arrays is very small. We discuss the implications of using ordinary t-statistics and examine other commonly used variants. For oligonucleotide arrays with multiple probes per gene, we introduce a simple model relating the mean and variance of expression, possibly with gene-specific random effects. Parameter estimates from the model have natural shrinkage properties that guard against inappropriately small variance estimates, and the model is used to obtain a differential expression statistic. A limiting value to the positive false discovery rate (pFDR) for ordinary t-tests provides motivation for our use of the data structure to improve variance estimates. Our approach performs well compared to other proposed approaches in terms of the false discovery rate.
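
    The shrinkage idea described here can be illustrated simply: adding a pooled "prior" variance s0^2 to each gene's variance estimate prevents genes with accidentally tiny variances from dominating the ranking. The constant s0^2 and the toy data below are hypothetical, a sketch of the guarding-against-small-variances effect rather than the authors' mean-variance model.

```python
# Illustrative moderated t-statistic with a variance floor; toy data.
import math

def moderated_t(m1, m2, var, n1, n2, s0_sq):
    """Two-sample t with the variance estimate inflated by s0_sq."""
    return (m1 - m2) / math.sqrt((var + s0_sq) * (1 / n1 + 1 / n2))

# gene: (mean in group 1, mean in group 2, pooled variance); n = 3 arrays per group
genes = {"A": (8.0, 6.0, 1.0),        # large, real difference
         "B": (5.1, 5.0, 0.0001)}     # tiny difference, suspiciously small variance
s0_sq = 0.5                           # pooled across all genes in a real analysis

# Ordinary t ranks gene B first purely because its variance estimate is tiny...
plain = {g: (m1 - m2) / math.sqrt((v + 1e-12) * (2 / 3))
         for g, (m1, m2, v) in genes.items()}
# ...while the moderated statistic restores the sensible ordering.
mod = {g: moderated_t(m1, m2, v, 3, 3, s0_sq)
       for g, (m1, m2, v) in genes.items()}
```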

  8. A statistical investigation of the mass discrepancy-acceleration relation

    NASA Astrophysics Data System (ADS)

    Desmond, Harry

    2017-02-01

    We use the mass discrepancy-acceleration relation (the correlation between the ratio of total-to-visible mass and acceleration in galaxies; MDAR) to test the galaxy-halo connection. We analyse the MDAR using a set of 16 statistics that quantify its four most important features: shape, scatter, the presence of a `characteristic acceleration scale', and the correlation of its residuals with other galaxy properties. We construct an empirical framework for the galaxy-halo connection in LCDM to generate predictions for these statistics, starting with conventional correlations (halo abundance matching; AM) and introducing more where required. Comparing to the SPARC data, we find that: (1) the approximate shape of the MDAR is readily reproduced by AM, and there is no evidence that the acceleration at which dark matter becomes negligible has less spread in the data than in AM mocks; (2) even under conservative assumptions, AM significantly overpredicts the scatter in the relation and its normalization at low acceleration, and furthermore positions dark matter too close to galaxies' centres on average; (3) the MDAR affords 2σ evidence for an anticorrelation of galaxy size and Hubble type with halo mass or concentration at fixed stellar mass. Our analysis lays the groundwork for a bottom-up determination of the galaxy-halo connection from relations such as the MDAR, provides concrete statistical tests for specific galaxy formation models, and brings into sharper focus the relative evidence accorded by galaxy kinematics to LCDM and modified gravity alternatives.

  9. A statistical investigation of the mass discrepancy–acceleration relation

    DOE PAGES

    Desmond, Harry

    2016-10-08

    We use the mass discrepancy–acceleration relation (the correlation between the ratio of total-to-visible mass and acceleration in galaxies; MDAR) to test the galaxy–halo connection. Here, we analyse the MDAR using a set of 16 statistics that quantify its four most important features: shape, scatter, the presence of a ‘characteristic acceleration scale’, and the correlation of its residuals with other galaxy properties. We construct an empirical framework for the galaxy–halo connection in LCDM to generate predictions for these statistics, starting with conventional correlations (halo abundance matching; AM) and introducing more where required. Comparing to the SPARC data, we find that: (1) the approximate shape of the MDAR is readily reproduced by AM, and there is no evidence that the acceleration at which dark matter becomes negligible has less spread in the data than in AM mocks; (2) even under conservative assumptions, AM significantly overpredicts the scatter in the relation and its normalization at low acceleration, and furthermore positions dark matter too close to galaxies’ centres on average; (3) the MDAR affords 2σ evidence for an anticorrelation of galaxy size and Hubble type with halo mass or concentration at fixed stellar mass. Lastly, our analysis lays the groundwork for a bottom-up determination of the galaxy–halo connection from relations such as the MDAR, provides concrete statistical tests for specific galaxy formation models, and brings into sharper focus the relative evidence accorded by galaxy kinematics to LCDM and modified gravity alternatives.

  10. Statistical classification of drug incidents due to look-alike sound-alike mix-ups.

    PubMed

    Wong, Zoie Shui Yee

    2016-06-01

    It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were performed. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels and decision tree were trained and tested. The models developed achieved an average accuracy of above 0.8 across all the model settings. The receiver operating characteristic curves indicated the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts.
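
    One feature family such a classifier might draw on is orthographic similarity between drug names, sketched below with Python's standard-library sequence matcher. The names, threshold, and flag rule are invented for illustration; the study's actual classifiers were term-based models trained on the advisory texts.

```python
# Flag look-alike drug-name pairs by string similarity (illustrative only).
from difflib import SequenceMatcher

def lookalike_score(a, b):
    """Similarity ratio in [0, 1] between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def flag_pair(a, b, threshold=0.7):
    """Hypothetical rule: flag the pair as look-alike risk above a threshold."""
    return lookalike_score(a, b) >= threshold

pairs = [("hydroxyzine", "hydralazine"),   # a classic look-alike pair
         ("metformin", "warfarin")]        # dissimilar names
flags = [flag_pair(a, b) for a, b in pairs]
```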

  11. Distinguishing synchronous and time-varying synergies using point process interval statistics: motor primitives in frog and rat

    PubMed Central

    Hart, Corey B.; Giszter, Simon F.

    2013-01-01

    We present and apply a method that uses point process statistics to discriminate the forms of synergies in motor pattern data, prior to explicit synergy extraction. The method uses electromyogram (EMG) pulse peak timing or onset timing. Peak timing is preferable in complex patterns where pulse onsets may be overlapping. An interval statistic derived from the point processes of EMG peak timings distinguishes time-varying synergies from synchronous synergies (SS). Model data show that the statistic is robust under most conditions. Its application to both frog hindlimb EMG and rat locomotion hindlimb EMG shows that data from these preparations are clearly most consistent with synchronous synergy models (p < 0.001). Additional direct tests of pulse and interval relations in the frog data further bolster the support for synchronous synergy mechanisms in these data. Our method and analyses support separated control of rhythm and pattern of motor primitives, with the low-level execution primitives comprising pulsed SS in both frog and rat, and in both episodic and rhythmic behaviors. PMID:23675341

  12. Statistics, Adjusted Statistics, and Maladjusted Statistics.

    PubMed

    Kaufman, Jay S

    2017-05-01

    Statistical adjustment is a ubiquitous practice in all quantitative fields that is meant to correct for improprieties or limitations in observed data, to remove the influence of nuisance variables or to turn observed correlations into causal inferences. These adjustments proceed by reporting not what was observed in the real world, but instead modeling what would have been observed in an imaginary world in which specific nuisances and improprieties are absent. These techniques are powerful and useful inferential tools, but their application can be hazardous or deleterious if consumers of the adjusted results mistake the imaginary world of models for the real world of data. Adjustments require decisions about which factors are of primary interest and which are imagined away, and yet many adjusted results are presented without any explanation or justification for these decisions. Adjustments can be harmful if poorly motivated, and are frequently misinterpreted in the media's reporting of scientific studies. Adjustment procedures have become so routinized that many scientists and readers lose the habit of relating the reported findings back to the real world in which we live.

  13. Match statistics related to winning in the group stage of 2014 Brazil FIFA World Cup.

    PubMed

    Liu, Hongyou; Gomez, Miguel-Ángel; Lago-Peñas, Carlos; Sampaio, Jaime

    2015-01-01

    Identifying match statistics that strongly contribute to winning football matches is an important step towards a more predictive and prescriptive performance analysis. The current study aimed to determine relationships between 24 match statistics and the match outcome (win, loss and draw) in all games and close games of the group stage of the FIFA World Cup (2014, Brazil) by employing the generalised linear model. Cumulative logistic regression was run in the model, taking the value of each match statistic as the independent variable to predict the logarithm of the odds of winning. Relationships were assessed as effects of a two-standard-deviation increase in the value of each variable on the change in the probability of a team winning a match. Non-clinical magnitude-based inferences were employed and were evaluated using the smallest worthwhile change. Results showed that for all games, nine match statistics had clearly positive effects on the probability of winning (Shot, Shot on Target, Shot from Counter Attack, Shot from Inside Area, Ball Possession, Short Pass, Average Pass Streak, Aerial Advantage and Tackle), four had clearly negative effects (Shot Blocked, Cross, Dribble and Red Card), and the other 12 statistics had either trivial or unclear effects. For the close games, however, the effect of Aerial Advantage became trivial and that of Yellow Card became clearly negative. Information from this tactical modelling can provide coaches and performance analysts with a more thorough and objective match understanding for evaluating post-match performances and for scouting upcoming opponents.
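
    The "two-standard-deviation increase" effect used above has a simple arithmetic reading: with a fitted logistic-regression coefficient beta (log-odds of winning per unit of a match statistic), a 2-SD increase multiplies the odds of winning by exp(2 * sd * beta). The coefficient, SD, and baseline below are invented numbers, not values from the study.

```python
# Effect of a two-SD covariate increase in a logistic model (toy numbers).
import math

def win_prob(logit):
    """Inverse-logit: probability from log-odds."""
    return 1.0 / (1.0 + math.exp(-logit))

beta = 0.15        # hypothetical log-odds per extra shot on target
sd = 2.0           # hypothetical SD of shots on target per match
base_logit = -0.5  # hypothetical baseline log-odds of winning

odds_ratio = math.exp(2 * sd * beta)               # odds multiplier for +2 SD
p_before = win_prob(base_logit)
p_after = win_prob(base_logit + 2 * sd * beta)     # shifted win probability
```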

  14. School Collective Efficacy and Bullying Behaviour: A Multilevel Study.

    PubMed

    Olsson, Gabriella; Låftman, Sara Brolin; Modin, Bitte

    2017-12-20

    As with other forms of violent behaviour, bullying is the result of multiple influences acting on different societal levels. Yet the majority of studies on bullying focus primarily on the characteristics of individual bullies and bullied. Fewer studies have explored how the characteristics of central contexts in young people's lives are related to bullying behaviour over and above the influence of individual-level characteristics. This study explores how teacher-rated school collective efficacy is related to student-reported bullying behaviour (traditional and cyberbullying victimization and perpetration). A central focus is to explore if school collective efficacy is related similarly to both traditional bullying and cyberbullying. Analyses are based on combined information from two independent data collections conducted in 2016 among 11th grade students ( n = 6067) and teachers ( n = 1251) in 58 upper secondary schools in Stockholm. The statistical method used is multilevel modelling, estimating two-level binary logistic regression models. The results demonstrate statistically significant between-school differences in all outcomes, except traditional bullying perpetration. Strong school collective efficacy is related to less traditional bullying perpetration and less cyberbullying victimization and perpetration, indicating that collective norm regulation and school social cohesion may contribute to reducing the occurrence of bullying.

  15. School Collective Efficacy and Bullying Behaviour: A Multilevel Study

    PubMed Central

    Olsson, Gabriella; Låftman, Sara Brolin; Modin, Bitte

    2017-01-01

    As with other forms of violent behaviour, bullying is the result of multiple influences acting on different societal levels. Yet the majority of studies on bullying focus primarily on the characteristics of individual bullies and bullied. Fewer studies have explored how the characteristics of central contexts in young people’s lives are related to bullying behaviour over and above the influence of individual-level characteristics. This study explores how teacher-rated school collective efficacy is related to student-reported bullying behaviour (traditional and cyberbullying victimization and perpetration). A central focus is to explore if school collective efficacy is related similarly to both traditional bullying and cyberbullying. Analyses are based on combined information from two independent data collections conducted in 2016 among 11th grade students (n = 6067) and teachers (n = 1251) in 58 upper secondary schools in Stockholm. The statistical method used is multilevel modelling, estimating two-level binary logistic regression models. The results demonstrate statistically significant between-school differences in all outcomes, except traditional bullying perpetration. Strong school collective efficacy is related to less traditional bullying perpetration and less cyberbullying victimization and perpetration, indicating that collective norm regulation and school social cohesion may contribute to reducing the occurrence of bullying. PMID:29261114

  16. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  17. Risk prediction models of breast cancer: a systematic review of model performances.

    PubMed

    Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin

    2012-05-01

    An increasing number of risk prediction models have been developed for estimating the risk of breast cancer in individual women. However, the performance of these models is questionable. We therefore conducted a study to systematically review previous risk prediction models. The results of this review help to identify the most reliable model and indicate the strengths and weaknesses of each model, guiding future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies that constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models, and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail model and the Rosner and Colditz model were the significant models that were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be due to a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive to improvements in discrimination. Therefore, newer methods such as the net reclassification index should be considered to evaluate the improvement in performance of a newly developed model.
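
    The concordance statistic (c-statistic) quoted in these ranges is straightforward to compute from scratch: among all case/non-case pairs, it is the fraction in which the case received the higher predicted risk, with ties counting one half. The toy scores below are invented purely to exercise the definition.

```python
# Concordance statistic (c-statistic / AUC) from first principles.
def c_statistic(scores, labels):
    """Fraction of case-vs-control pairs ranked correctly (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0
               for c in cases for d in controls)
    return wins / (len(cases) * len(controls))

scores = [0.9, 0.7, 0.6, 0.4, 0.2]   # predicted risks (toy)
labels = [1,   1,   0,   1,   0]     # 1 = developed cancer, 0 = did not
auc = c_statistic(scores, labels)    # 5 of 6 pairs concordant
```

    A value of 0.5 means the model ranks cases no better than chance, which is why the 0.53-0.66 range above reads as poor-to-fair discrimination.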

  18. Linking Mechanics and Statistics in Epidermal Tissues

    NASA Astrophysics Data System (ADS)

    Kim, Sangwoo; Hilgenfeldt, Sascha

    2015-03-01

    Disordered cellular structures, such as foams, polycrystals, or living tissues, can be characterized by quantitative measurements of domain size and topology. In recent work, we showed that correlations between size and topology in 2D systems are sensitive to the shape (eccentricity) of the individual domains: From a local model of neighbor relations, we derived an analytical justification for the famous empirical Lewis law, confirming the theory with experimental data from cucumber epidermal tissue. Here, we go beyond this purely geometrical model and identify mechanical properties of the tissue as the root cause for the domain eccentricity and thus the statistics of tissue structure. The simple model approach is based on the minimization of an interfacial energy functional. Simulations with Surface Evolver show that the domain statistics depend on a single mechanical parameter, while parameter fluctuations from cell to cell play an important role in simultaneously explaining the shape distribution of cells. The simulations are in excellent agreement with experiments and analytical theory, and establish a general link between the mechanical properties of a tissue and its structure. The model is relevant to diagnostic applications in a variety of animal and plant tissues.

  19. Universal avalanche statistics and triggering close to failure in a mean-field model of rheological fracture

    NASA Astrophysics Data System (ADS)

    Baró, Jordi; Davidsen, Jörn

    2018-03-01

    The hypothesis of critical failure relates the presence of an ultimate stability point in the structural constitutive equation of materials to a divergence of characteristic scales in the microscopic dynamics responsible for deformation. Avalanche models involving critical failure have determined common universality classes for stick-slip processes and fracture. However, not all empirical failure processes exhibit the trademarks of criticality. The rheological properties of materials introduce dissipation, usually reproduced in conceptual models as a hardening of the coarse grained elements of the system. Here, we investigate the effects of transient hardening on (i) the activity rate and (ii) the statistical properties of avalanches. We find the explicit representation of transient hardening in the presence of generalized viscoelasticity and solve the corresponding mean-field model of fracture. In the quasistatic limit, the accelerated energy release is invariant with respect to rheology and the avalanche propagation can be reinterpreted in terms of a stochastic counting process. A single universality class can be defined from such analogy, and all statistical properties depend only on the distance to criticality. We also prove that interevent correlations emerge due to the hardening—even in the quasistatic limit—that can be interpreted as "aftershocks" and "foreshocks."

  20. The effects of neuron morphology on graph theoretic measures of network connectivity: the analysis of a two-level statistical model.

    PubMed

    Aćimović, Jugoslava; Mäki-Marttunen, Tuomo; Linne, Marja-Leena

    2015-01-01

    We developed a two-level statistical model that addresses the question of how the properties of neurite morphology shape large-scale network connectivity. We adopted a low-dimensional statistical description of neurites and from it derived the expected number of synapses, the node degree, and the effective radius, that is, the maximal distance at which two neurons are expected to form at least one synapse. We related these quantities to the network connectivity described using standard measures from graph theory, such as motif counts, the clustering coefficient, the minimal path length, and the small-world coefficient. These measures are used in a neuroscience context to study phenomena ranging from synaptic connectivity in small neuronal networks to large-scale functional connectivity in the cortex. For these measures we provide analytical solutions that clearly relate the different model properties. Neurites that sparsely cover space lead to a small effective radius. If the effective radius is small compared to the overall neuron size, the obtained networks resemble uniform random networks, as each neuron connects to a small number of distant neurons. Large neurites with densely packed branches lead to a large effective radius. If this effective radius is large compared to the neuron size, the obtained networks have many local connections. In between these extremes, the networks maximize the variability of connection repertoires. The presented approach connects the properties of neuron morphology with large-scale network properties without requiring heavy simulations with many model parameters. The two-step procedure makes the role of each modeled parameter easier to interpret. The model is flexible, and each of its components can be further expanded. We identified a range of model parameters that maximizes variability in network connectivity, a property that may affect the network's capacity to exhibit different dynamical regimes.
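    The second level of the model, deriving graph-theoretic measures from an effective-radius connection rule, can be sketched as follows. Only the connection rule is taken from the abstract; the neuron count, radius, and uniform soma placement are illustrative assumptions, and the clustering coefficient stands in for the fuller set of measures (motifs, path length, small-world coefficient).

```python
# Build a network by connecting neurons whose somata lie within an assumed
# effective radius, then compute the average clustering coefficient, one of
# the standard graph measures named in the abstract.
import random

random.seed(2)

def build_network(num_neurons=300, effective_radius=0.15):
    """Adjacency sets for neurons placed uniformly in the unit square."""
    pos = [(random.random(), random.random()) for _ in range(num_neurons)]
    adj = [set() for _ in range(num_neurons)]
    for i in range(num_neurons):
        for j in range(i + 1, num_neurons):
            dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
            if dx * dx + dy * dy < effective_radius ** 2:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def clustering_coefficient(adj):
    """Average local clustering: fraction of each node's neighbor pairs linked."""
    coeffs = []
    for neighbors in adj:
        k = len(neighbors)
        if k < 2:
            continue
        links = sum(1 for u in neighbors for v in neighbors
                    if u < v and v in adj[u])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

adj = build_network()
C = clustering_coefficient(adj)
print(f"average clustering coefficient = {C:.3f}")
```

    A large effective radius relative to the spatial extent yields many local connections and high clustering; shrinking the radius toward zero moves the network toward the sparse, random-like regime described in the abstract.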
