Sample records for alternative statistical models

  1. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability.

    PubMed

    Coman, Emil N; Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J; Suggs, Suzanne; Barbour, Russell

    2014-05-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared to test the effect of the intervention on one key attitudinal outcome in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative Effectiveness Research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power.

  2. Right-Sizing Statistical Models for Longitudinal Data

    PubMed Central

    Wood, Phillip K.; Steinley, Douglas; Jackson, Kristina M.

    2015-01-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to “right-size” the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting overly parsimonious models to more complex better fitting alternatives, and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically under-identified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A three-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation/covariation patterns. The orthogonal, free-curve slope-intercept (FCSI) growth model is considered as a general model which includes, as special cases, many models including the Factor Mean model (FM, McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, Hierarchical Linear Models (HLM), Repeated Measures MANOVA, and the Linear Slope Intercept (LinearSI) Growth Model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparison of several candidate parametric growth and chronometric models in a Monte Carlo study. PMID:26237507

  3. Right-sizing statistical models for longitudinal data.

    PubMed

    Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M

    2015-12-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study. (c) 2015 APA, all rights reserved).

  4. Statistical Cost Estimation in Higher Education: Some Alternatives.

    ERIC Educational Resources Information Center

    Brinkman, Paul T.; Niwa, Shelley

    Recent developments in econometrics that are relevant to the task of estimating costs in higher education are reviewed. The relative effectiveness of alternative statistical procedures for estimating costs is also tested. Statistical cost estimation involves three basic parts: a model, a data set, and an estimation procedure. Actual data are used…

  5. Alignment-free sequence comparison (II): theoretical power of comparison statistics.

    PubMed

    Wan, Lin; Reinert, Gesine; Sun, Fengzhu; Waterman, Michael S

    2010-11-01

    Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy-to-use program. The power is assessed under two alternative hidden Markov models; the first one assumes that the two sequences share a common motif, whereas the second model is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and they are independent. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and the alternative models, we find that generally, asymptotically D2S has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2* approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer, the power for all three count statistics does not increase with sequence length when the sequence is sufficiently long, and hence none of the three statistics under consideration can be recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs with length at most 10 from JASPAR CORE (October 12, 2009 version), verifying that D2* is generally more powerful than D2. The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
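
    A minimal sketch of the three count statistics described in this record, written against the usual alignment-free conventions (word counts, counts centralized by their expected values under an i.i.d. letter model); the exact normalizations used by Wan et al. may differ in detail.

      # Sketch of the D2, D2* and D2S k-tuple statistics for two sequences.
      from collections import Counter
      from itertools import product
      from math import sqrt

      def word_counts(seq, k):
          """Count all overlapping k-tuples in seq."""
          return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

      def letter_probs(seq):
          """i.i.d. letter frequencies used for the null model."""
          c = Counter(seq)
          n = sum(c.values())
          return {a: c[a] / n for a in c}

      def d2_statistics(seq_a, seq_b, k=5, alphabet="ACGT"):
          xa, xb = word_counts(seq_a, k), word_counts(seq_b, k)
          pa, pb = letter_probs(seq_a), letter_probs(seq_b)
          na, nb = len(seq_a) - k + 1, len(seq_b) - k + 1
          d2 = d2_star = d2_s = 0.0
          for w in map("".join, product(alphabet, repeat=k)):
              # Word probabilities under the i.i.d. null model.
              pw_a = pw_b = 1.0
              for ch in w:
                  pw_a *= pa.get(ch, 0.0)
                  pw_b *= pb.get(ch, 0.0)
              x_t = xa.get(w, 0) - na * pw_a          # centralized counts
              y_t = xb.get(w, 0) - nb * pw_b
              d2 += xa.get(w, 0) * xb.get(w, 0)       # raw matching k-tuple count
              if pw_a > 0 and pw_b > 0:
                  d2_star += x_t * y_t / sqrt(na * pw_a * nb * pw_b)
              denom = sqrt(x_t ** 2 + y_t ** 2)
              if denom > 0:
                  d2_s += x_t * y_t / denom           # self-standardized version
          return d2, d2_star, d2_s

      print(d2_statistics("ACGTACGTTGCA" * 20, "ACGTTTGCAACG" * 20, k=3))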

  6. Alternative Statistical Frameworks for Student Growth Percentile Estimation

    ERIC Educational Resources Information Center

    Lockwood, J. R.; Castellano, Katherine E.

    2015-01-01

    This article suggests two alternative statistical approaches for estimating student growth percentiles (SGP). The first is to estimate percentile ranks of current test scores conditional on past test scores directly, by modeling the conditional cumulative distribution functions, rather than indirectly through quantile regressions. This would…

  7. Modified Likelihood-Based Item Fit Statistics for the Generalized Graded Unfolding Model

    ERIC Educational Resources Information Center

    Roberts, James S.

    2008-01-01

    Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X². This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X² are developed for the generalized graded unfolding model (GGUM). The GGUM is a…

  8. Testing alternative ground water models using cross-validation and other methods

    USGS Publications Warehouse

    Foglia, L.; Mehl, S.W.; Hill, M.C.; Perona, P.; Burlando, P.

    2007-01-01

    Many methods can be used to test alternative ground water models. Of concern in this work are methods able to (1) rank alternative models (also called model discrimination) and (2) identify observations important to parameter estimates and predictions (equivalent to the purpose served by some types of sensitivity analysis). Some of the measures investigated are computationally efficient; others are computationally demanding. The latter are generally needed to account for model nonlinearity. The efficient model discrimination methods investigated include the information criteria: the corrected Akaike information criterion, Bayesian information criterion, and generalized cross-validation. The efficient sensitivity analysis measures used are dimensionless scaled sensitivity (DSS), composite scaled sensitivity, and parameter correlation coefficient (PCC); the other statistics are DFBETAS, Cook's D, and observation-prediction statistic. Acronyms are explained in the introduction. Cross-validation (CV) is a computationally intensive nonlinear method that is used for both model discrimination and sensitivity analysis. The methods are tested using up to five alternative parsimoniously constructed models of the ground water system of the Maggia Valley in southern Switzerland. The alternative models differ in their representation of hydraulic conductivity. A new method for graphically representing CV and sensitivity analysis results for complex models is presented and used to evaluate the utility of the efficient statistics. The results indicate that for model selection, the information criteria produce similar results at much smaller computational cost than CV. For identifying important observations, the only obviously inferior linear measure is DSS; the poor performance was expected because DSS does not include the effects of parameter correlation and PCC reveals large parameter correlations. © 2007 National Ground Water Association.
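
    For the efficient model-discrimination criteria mentioned above, a minimal sketch of how AICc and BIC might be computed from each calibrated model's weighted least-squares fit; the formulas are standard, but the example model names and numbers are invented and do not reproduce the study's implementation.

      import numpy as np

      def aicc_bic(sse_weighted, n_obs, n_params):
          """Information criteria from a weighted sum of squared errors,
          assuming normally distributed residuals (illustrative only)."""
          k = n_params + 1                      # +1 for the error variance
          log_lik = -0.5 * n_obs * (np.log(2 * np.pi * sse_weighted / n_obs) + 1)
          aic = -2 * log_lik + 2 * k
          aicc = aic + 2 * k * (k + 1) / (n_obs - k - 1)   # small-sample correction
          bic = -2 * log_lik + k * np.log(n_obs)
          return aicc, bic

      # Rank alternative models (e.g., different hydraulic-conductivity zonations)
      # by their criteria: lower values indicate better support from the data.
      models = {"uniform_K": (12.4, 80, 2), "two_zones": (9.1, 80, 4), "five_zones": (8.7, 80, 8)}
      for name, (sse, n, p) in models.items():
          print(name, aicc_bic(sse, n, p))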

  9. Testing Transitivity of Preferences on Two-Alternative Forced Choice Data

    PubMed Central

    Regenwetter, Michel; Dana, Jason; Davis-Stober, Clintin P.

    2010-01-01

    As Duncan Luce and other prominent scholars have pointed out on several occasions, testing algebraic models against empirical data raises difficult conceptual, mathematical, and statistical challenges. Empirical data often result from statistical sampling processes, whereas algebraic theories are nonprobabilistic. Many probabilistic specifications lead to statistical boundary problems and are subject to nontrivial order constrained statistical inference. The present paper discusses Luce's challenge for a particularly prominent axiom: Transitivity. The axiom of transitivity is a central component in many algebraic theories of preference and choice. We offer the currently most complete solution to the challenge in the case of transitivity of binary preference on the theory side and two-alternative forced choice on the empirical side, explicitly for up to five, and implicitly for up to seven, choice alternatives. We also discuss the relationship between our proposed solution and weak stochastic transitivity. We recommend abandoning the latter as a model of transitive individual preferences. PMID:21833217

  10. A simulations approach for meta-analysis of genetic association studies based on additive genetic model.

    PubMed

    John, Majnu; Lencz, Todd; Malhotra, Anil K; Correll, Christoph U; Zhang, Jian-Ping

    2018-06-01

    Meta-analysis of genetic association studies is being increasingly used to assess phenotypic differences between genotype groups. When the underlying genetic model is assumed to be dominant or recessive, assessing the phenotype differences based on summary statistics, reported for individual studies in a meta-analysis, is a valid strategy. However, when the genetic model is additive, a similar strategy based on summary statistics will lead to biased results. This fact about the additive model is one of the things that we establish in this paper, using simulations. The main goal of this paper is to present an alternate strategy for the additive model based on simulating data for the individual studies. We show that the alternate strategy is far superior to the strategy based on summary statistics.
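
    A hedged sketch of the kind of simulation-based strategy this abstract describes: individual-level data are simulated per study under an additive genetic model (genotype coded 0/1/2), the genotype effect is re-estimated in each simulated study, and the estimates are pooled by inverse-variance weighting. The allele frequency, effect size, and study sizes below are made-up illustration values, not taken from the paper.

      import numpy as np

      rng = np.random.default_rng(0)

      def simulate_study(n, maf=0.3, beta=0.2, sd=1.0):
          """One study under an additive model: phenotype = beta * genotype + noise."""
          g = rng.binomial(2, maf, size=n)          # additive coding 0/1/2 (HWE)
          y = beta * g + rng.normal(0.0, sd, size=n)
          # Per-study slope and its standard error from simple linear regression.
          slope, intercept = np.polyfit(g, y, 1)
          resid = y - (slope * g + intercept)
          se = np.sqrt(resid.var(ddof=2) / ((g - g.mean()) ** 2).sum())
          return slope, se

      estimates, ses = zip(*(simulate_study(n) for n in [200, 350, 500, 800]))
      w = 1.0 / np.array(ses) ** 2                   # inverse-variance weights
      pooled = np.sum(w * np.array(estimates)) / w.sum()
      pooled_se = np.sqrt(1.0 / w.sum())
      print(f"pooled additive effect = {pooled:.3f} (SE {pooled_se:.3f})")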

  11. Fast, Statistical Model of Surface Roughness for Ion-Solid Interaction Simulations and Efficient Code Coupling

    NASA Astrophysics Data System (ADS)

    Drobny, Jon; Curreli, Davide; Ruzic, David; Lasa, Ane; Green, David; Canik, John; Younkin, Tim; Blondel, Sophie; Wirth, Brian

    2017-10-01

    Surface roughness greatly impacts material erosion, and thus plays an important role in Plasma-Surface Interactions. Developing strategies for efficiently introducing rough surfaces into ion-solid interaction codes will be an important step towards whole-device modeling of plasma devices and future fusion reactors such as ITER. Fractal TRIDYN (F-TRIDYN) is an upgraded version of the Monte Carlo, BCA program TRIDYN developed for this purpose that includes an explicit fractal model of surface roughness and extended input and output options for file-based code coupling. Code coupling with both plasma and material codes has been achieved and allows for multi-scale, whole-device modeling of plasma experiments. These code coupling results will be presented. F-TRIDYN has been further upgraded with an alternative, statistical model of surface roughness. The statistical model is significantly faster than and compares favorably to the fractal model. Additionally, the statistical model compares well to alternative computational surface roughness models and experiments. Theoretical links between the fractal and statistical models are made, and further connections to experimental measurements of surface roughness are explored. This work was supported by the PSI-SciDAC Project funded by the U.S. Department of Energy through contract DOE-DE-SC0008658.

  12. Congruence analysis of geodetic networks - hypothesis tests versus model selection by information criteria

    NASA Astrophysics Data System (ADS)

    Lehmann, Rüdiger; Lösler, Michael

    2017-12-01

    Geodetic deformation analysis can be interpreted as a model selection problem. The null model indicates that no deformation has occurred. It is opposed to a number of alternative models, which stipulate different deformation patterns. A common way to select the right model is the use of a statistical hypothesis test. However, since we have to test a series of deformation patterns, this must be a multiple test. As an alternative solution for the test problem, we propose the p-value approach. Another approach arises from information theory. Here, the Akaike information criterion (AIC) or some alternative is used to select an appropriate model for a given set of observations. Both approaches are discussed and applied to two test scenarios: a synthetic levelling network and the Delft test data set. It is demonstrated that they work but behave differently, sometimes even producing different results. Hypothesis tests are well-established in geodesy, but may suffer from an unfavourable choice of the decision error rates. The multiple test also suffers from statistical dependencies between the test statistics, which are neglected. Both problems are overcome by applying information criteria such as the AIC.

  13. Power Enhancement in High Dimensional Cross-Sectional Tests

    PubMed Central

    Fan, Jianqing; Liao, Yuan; Yao, Jiawei

    2016-01-01

    We propose a novel technique to boost the power of testing a high-dimensional vector H₀: θ = 0 against sparse alternatives where the null hypothesis is violated only by a couple of components. Existing tests based on quadratic forms such as the Wald statistic often suffer from low powers due to the accumulation of errors in estimating high-dimensional parameters. More powerful tests for sparse alternatives such as thresholding and extreme-value tests, on the other hand, require either stringent conditions or bootstrap to derive the null distribution and often suffer from size distortions due to the slow convergence. Based on a screening technique, we introduce a “power enhancement component”, which is zero under the null hypothesis with high probability, but diverges quickly under sparse alternatives. The proposed test statistic combines the power enhancement component with an asymptotically pivotal statistic, and strengthens the power under sparse alternatives. The null distribution does not require stringent regularity conditions, and is completely determined by that of the pivotal statistic. As specific applications, the proposed methods are applied to testing the factor pricing models and validating the cross-sectional independence in panel data models. PMID:26778846
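
    A rough sketch, working directly from the abstract's description, of combining an asymptotically pivotal quadratic-form statistic with a screening-based power enhancement term; the standardization and the slowly growing threshold below are illustrative choices, not the authors' exact construction.

      import numpy as np

      def power_enhanced_statistic(theta_hat, se, n):
          """Screening-based power enhancement for testing H0: theta = 0.
          Illustrative only; see Fan, Liao & Yao for the exact construction."""
          p = theta_hat.size
          t = theta_hat / se                              # componentwise t-statistics
          # Standardized Wald-type statistic, asymptotically pivotal under H0
          # when components are (approximately) independent.
          j1 = (np.sum(t ** 2) - p) / np.sqrt(2 * p)
          # Power enhancement: near zero under H0, diverging under sparse
          # alternatives; components are screened with a slowly growing threshold.
          delta = np.sqrt(2 * np.log(p) * np.log(np.log(n)))
          screened = np.abs(t) > delta
          j0 = np.sqrt(p) * np.sum(t[screened] ** 2)
          return j0 + j1

      rng = np.random.default_rng(1)
      p, n = 500, 200
      theta_hat = rng.normal(0, 1 / np.sqrt(n), size=p)
      theta_hat[:3] += 0.5                                # a sparse violation of H0
      print(power_enhanced_statistic(theta_hat, np.full(p, 1 / np.sqrt(n)), n))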

  14. The Impact of Social Desirability on Organizational Behavior Research Results: An Empirical Investigation of Alternative Models,

    DTIC Science & Technology

    1982-02-01

    THE IMPACT OF SOCIAL DESIRABILITY ON ORGANIZATIONAL BEHAVIOR RESEARCH RESULTS: AN EMPIRICAL INVESTIGATION OF ALTERNATIVE MODELS. Daniel C. Ganster, Harry W. Hennessey, and Fred Luthans, University of Nebraska-Lincoln. Abstract: Three conceptual and statistical models are developed for the effects of social desirability (SD…

  15. Classification image analysis: estimation and statistical inference for two-alternative forced-choice experiments

    NASA Technical Reports Server (NTRS)

    Abbey, Craig K.; Eckstein, Miguel P.

    2002-01-01

    We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.

  16. Alternating Renewal Process Models for Behavioral Observation: Simulation Methods, Software, and Validity Illustrations

    ERIC Educational Resources Information Center

    Pustejovsky, James E.; Runyon, Christopher

    2014-01-01

    Direct observation recording procedures produce reductive summary measurements of an underlying stream of behavior. Previous methodological studies of these recording procedures have employed simulation methods for generating random behavior streams, many of which amount to special cases of a statistical model known as the alternating renewal…
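
    A minimal sketch of simulating a behavior stream from an alternating renewal process and scoring it with a partial-interval recording rule; the gamma/exponential duration distributions, session length, and interval length are illustrative assumptions, not those used in the article.

      import numpy as np

      rng = np.random.default_rng(42)

      def simulate_behavior_stream(session_len, mean_event=4.0, mean_gap=20.0):
          """Alternate event and inter-event durations until the session ends."""
          events, t, in_event = [], 0.0, False
          while t < session_len:
              dur = rng.gamma(2.0, mean_event / 2.0) if in_event else rng.exponential(mean_gap)
              if in_event:
                  events.append((t, min(t + dur, session_len)))
              t += dur
              in_event = not in_event
          return events

      def partial_interval_score(events, session_len, interval=10.0):
          """Fraction of intervals in which the behavior occurs at all."""
          edges = np.arange(0.0, session_len, interval)
          hit = [any(s < e + interval and f > e for s, f in events) for e in edges]
          return np.mean(hit)

      stream = simulate_behavior_stream(600.0)            # a 10-minute session, in seconds
      print("true prevalence:", sum(f - s for s, f in stream) / 600.0)
      print("partial-interval estimate:", partial_interval_score(stream, 600.0))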

  17. Statistics for characterizing data on the periphery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Theiler, James P; Hush, Donald R

    2010-01-01

    We introduce a class of statistics for characterizing the periphery of a distribution, and show that these statistics are particularly valuable for problems in target detection. Because so many detection algorithms are rooted in Gaussian statistics, we concentrate on ellipsoidal models of high-dimensional data distributions (that is to say: covariance matrices), but we recommend several alternatives to the sample covariance matrix that more efficiently model the periphery of a distribution, and can more effectively detect anomalous data samples.

  18. Statistical framework for evaluation of climate model simulations by use of climate proxy data from the last millennium - Part 1: Theory

    NASA Astrophysics Data System (ADS)

    Sundberg, R.; Moberg, A.; Hind, A.

    2012-08-01

    A statistical framework for comparing the output of ensemble simulations from global climate models with networks of climate proxy and instrumental records has been developed, focusing on near-surface temperatures for the last millennium. This framework includes the formulation of a joint statistical model for proxy data, instrumental data and simulation data, which is used to optimize a quadratic distance measure for ranking climate model simulations. An essential underlying assumption is that the simulations and the proxy/instrumental series have a shared component of variability that is due to temporal changes in external forcing, such as volcanic aerosol load, solar irradiance or greenhouse gas concentrations. Two statistical tests have been formulated. Firstly, a preliminary test establishes whether a significant temporal correlation exists between instrumental/proxy and simulation data. Secondly, the distance measure is expressed in the form of a test statistic of whether a forced simulation is closer to the instrumental/proxy series than unforced simulations. The proposed framework allows any number of proxy locations to be used jointly, with different seasons, record lengths and statistical precision. The goal is to objectively rank several competing climate model simulations (e.g. with alternative model parameterizations or alternative forcing histories) by means of their goodness of fit to the unobservable true past climate variations, as estimated from noisy proxy data and instrumental observations.

  19. The New Alternative DSM-5 Model for Personality Disorders: Issues and Controversies

    ERIC Educational Resources Information Center

    Porter, Jeffrey S.; Risler, Edwin

    2014-01-01

    Purpose: Assess the new alternative "Diagnostic and Statistical Manual of Mental Disorders", fifth edition (DSM-5) model for personality disorders (PDs) as it is seen by its creators and critics. Method: Follow the DSM revision process by monitoring the American Psychiatric Association website and the publication of pertinent journal…

  20. A Monte Carlo Simulation Comparing the Statistical Precision of Two High-Stakes Teacher Evaluation Methods: A Value-Added Model and a Composite Measure

    ERIC Educational Resources Information Center

    Spencer, Bryden

    2016-01-01

    Value-added models are a class of growth models used in education to assign responsibility for student growth to teachers or schools. For value-added models to be used fairly, sufficient statistical precision is necessary for accurate teacher classification. Previous research indicated precision below practical limits. An alternative approach has…

  1. Two-sample statistics for testing the equality of survival functions against improper semi-parametric accelerated failure time alternatives: an application to the analysis of a breast cancer clinical trial.

    PubMed

    Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry

    2004-06-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.

  2. Capturing the DSM-5 Alternative Personality Disorder Model Traits in the Five-Factor Model's Nomological Net.

    PubMed

    Suzuki, Takakuni; Griffin, Sarah A; Samuel, Douglas B

    2017-04-01

    Several studies have shown structural and statistical similarities between the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) alternative personality disorder model and the Five-Factor Model (FFM). However, no study to date has evaluated the nomological network similarities between the two models. The relations of the Revised NEO Personality Inventory (NEO PI-R) and the Personality Inventory for DSM-5 (PID-5) with relevant criterion variables were examined in a sample of 336 undergraduate students (M age  = 19.4; 59.8% female). The resulting profiles for each instrument were statistically compared for similarity. Four of the five domains of the two models have highly similar nomological networks, with the exception being FFM Openness to Experience and PID-5 Psychoticism. Further probing of that pair suggested that the NEO PI-R domain scores obscured meaningful similarity between PID-5 Psychoticism and specific aspects and lower-order facets of Openness. The results support the notion that the DSM-5 alternative personality disorder model trait domains represent variants of the FFM domains. Similarities of Openness and Psychoticism domains were supported when the lower-order aspects and facets of Openness domain were considered. The findings support the view that the DSM-5 trait model represents an instantiation of the FFM. © 2015 Wiley Periodicals, Inc.

  3. A nonparametric spatial scan statistic for continuous data.

    PubMed

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
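
    A compact sketch of the idea: over a set of candidate circular zones, compute a standardized Wilcoxon rank-sum statistic comparing values inside versus outside each zone, take the maximum as the scan statistic, and calibrate it by Monte Carlo permutation. The window construction and standardization details here are illustrative and are not the authors' exact algorithm.

      import numpy as np
      from scipy.stats import rankdata

      def zone_statistics(values, inside_masks):
          """Standardized Wilcoxon rank-sum statistic for each candidate zone."""
          n = values.size
          ranks = rankdata(values)
          stats = []
          for mask in inside_masks:
              m = mask.sum()
              w = ranks[mask].sum()                       # rank sum inside the zone
              mu = m * (n + 1) / 2.0
              sigma = np.sqrt(m * (n - m) * (n + 1) / 12.0)
              stats.append(abs(w - mu) / sigma)
          return np.array(stats)

      def scan_test(coords, values, radii, n_perm=199, seed=0):
          rng = np.random.default_rng(seed)
          d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
          masks = [d[i] <= r for i in range(len(coords)) for r in radii]
          masks = [m for m in masks if 1 < m.sum() < len(coords)]
          observed = zone_statistics(values, masks).max()
          null = [zone_statistics(rng.permutation(values), masks).max() for _ in range(n_perm)]
          p_value = (1 + sum(s >= observed for s in null)) / (n_perm + 1)
          return observed, p_value

      coords = np.random.default_rng(3).uniform(0, 10, size=(60, 2))
      values = np.random.default_rng(4).lognormal(size=60)
      values[np.linalg.norm(coords - [5, 5], axis=1) < 1.5] += 3.0   # planted cluster
      print(scan_test(coords, values, radii=[1.0, 2.0, 3.0]))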

  4. Two-Sample Statistics for Testing the Equality of Survival Functions Against Improper Semi-parametric Accelerated Failure Time Alternatives: An Application to the Analysis of a Breast Cancer Clinical Trial

    PubMed Central

    BROËT, PHILIPPE; TSODIKOV, ALEXANDER; DE RYCKE, YANN; MOREAU, THIERRY

    2010-01-01

    This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627

  5. Teacher Evaluation: Alternate Measures of Student Growth. Q&A with Brian Gill. REL Mid-Atlantic Webinar

    ERIC Educational Resources Information Center

    Regional Educational Laboratory Mid-Atlantic, 2013

    2013-01-01

    This webinar described the findings of our literature review on alternative measures of student growth that are used in teacher evaluation. The review focused on two types of alternative growth measures: statistical growth/value-added models and teacher-developed student learning objectives. This Q&A addressed the questions participants had…

  6. Model fit evaluation in multilevel structural equation models

    PubMed Central

    Ryu, Ehri

    2014-01-01

    Assessing goodness of model fit is one of the key questions in structural equation modeling (SEM). Goodness of fit is the extent to which the hypothesized model reproduces the multivariate structure underlying the set of variables. During the earlier development of multilevel structural equation models, the “standard” approach was to evaluate the goodness of fit for the entire model across all levels simultaneously. The model fit statistics produced by the standard approach have a potential problem in detecting lack of fit in the higher-level model for which the effective sample size is much smaller. Also when the standard approach results in poor model fit, it is not clear at which level the model does not fit well. This article reviews two alternative approaches that have been proposed to overcome the limitations of the standard approach. One is a two-step procedure which first produces estimates of saturated covariance matrices at each level and then performs single-level analysis at each level with the estimated covariance matrices as input (Yuan and Bentler, 2007). The other level-specific approach utilizes partially saturated models to obtain test statistics and fit indices for each level separately (Ryu and West, 2009). Simulation studies (e.g., Yuan and Bentler, 2007; Ryu and West, 2009) have consistently shown that both alternative approaches performed well in detecting lack of fit at any level, whereas the standard approach failed to detect lack of fit at the higher level. It is recommended that the alternative approaches are used to assess the model fit in multilevel structural equation model. Advantages and disadvantages of the two alternative approaches are discussed. The alternative approaches are demonstrated in an empirical example. PMID:24550882

  7. Development Of Educational Programs In Renewable And Alternative Energy Processing: The Case Of Russia

    NASA Astrophysics Data System (ADS)

    Svirina, Anna; Shindor, Olga; Tatmyshevsky, Konstantin

    2014-12-01

    The paper deals with the main problems of Russian energy system development, which make it necessary to provide educational programs in the field of renewable and alternative energy. The process of developing curricula and defining teaching techniques on the basis of expert opinion evaluation is described, and a competence model for master's students in renewable and alternative energy processing is suggested. The data for statistical analysis were obtained from a distributed questionnaire and in-depth interviews. On the basis of these data, the curricula structure was optimized, and three models of a structure for optimizing teaching techniques were developed. The suggested educational program structure, which was adopted by employers, is presented in the paper. The findings include the quantitatively estimated importance of systemic thinking and of professional skills and knowledge as basic competences of a master's program graduate; the statistically estimated necessity of a practice-based learning approach; and optimization models for structuring curricula in renewable and alternative energy processing. These findings provide a platform for the development of educational programs.

  8. An Alternative to the 3PL: Using Asymmetric Item Characteristic Curves to Address Guessing Effects

    ERIC Educational Resources Information Center

    Lee, Sora; Bolt, Daniel M.

    2018-01-01

    Both the statistical and interpretational shortcomings of the three-parameter logistic (3PL) model in accommodating guessing effects on multiple-choice items are well documented. We consider the use of a residual heteroscedasticity (RH) model as an alternative, and compare its performance to the 3PL with real test data sets and through simulation…
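
    For context, a small sketch of the three-parameter logistic item characteristic curve whose lower-asymptote (guessing) parameter is at issue; the residual heteroscedasticity alternative itself is not reproduced here, and the item parameters below are arbitrary illustration values.

      import numpy as np

      def icc_3pl(theta, a, b, c):
          """Three-parameter logistic ICC: discrimination a, difficulty b, guessing floor c."""
          return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

      theta = np.linspace(-3, 3, 7)
      print(np.round(icc_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
      # Even very low-ability examinees answer correctly with probability near c = 0.2,
      # which is the guessing effect that the 3PL (and the RH alternative) try to capture.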

  9. Reassessing the NTCTCS Staging Systems for Differentiated Thyroid Cancer, Including Age at Diagnosis

    PubMed Central

    McLeod, Donald S.A.; Jonklaas, Jacqueline; Brierley, James D.; Ain, Kenneth B.; Cooper, David S.; Fein, Henry G.; Haugen, Bryan R.; Ladenson, Paul W.; Magner, James; Ross, Douglas S.; Skarulis, Monica C.; Steward, David L.; Xing, Mingzhao; Litofsky, Danielle R.; Maxon, Harry R.

    2015-01-01

    Background: Thyroid cancer is unique for having age as a staging variable. Recently, the commonly used age cut-point of 45 years has been questioned. Objective: This study assessed alternate staging systems on the outcome of overall survival, and compared these with current National Thyroid Cancer Treatment Cooperative Study (NTCTCS) staging systems for papillary and follicular thyroid cancer. Methods: A total of 4721 patients with differentiated thyroid cancer were assessed. Five potential alternate staging systems were generated at age cut-points in five-year increments from 35 to 70 years, and tested for model discrimination (Harrell's C-statistic) and calibration (R2). The best five models for papillary and follicular cancer were further tested with bootstrap resampling and significance testing for discrimination. Results: The best five alternate papillary cancer systems had age cut-points of 45–50 years, with the highest scoring model using 50 years. No significant difference in C-statistic was found between the best alternate and current NTCTCS systems (p = 0.200). The best five alternate follicular cancer systems had age cut-points of 50–55 years, with the highest scoring model using 50 years. All five best alternate staging systems performed better compared with the current system (p = 0.003–0.035). There was no significant difference in discrimination between the best alternate system (cut-point age 50 years) and the best system of cut-point age 45 years (p = 0.197). Conclusions: No alternate papillary cancer systems assessed were significantly better than the current system. New alternate staging systems for follicular cancer appear to be better than the current NTCTCS system, although they require external validation. PMID:26203804

  10. rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data.

    PubMed

    Shen, Shihao; Park, Juw Won; Lu, Zhi-xiang; Lin, Lan; Henry, Michael D; Wu, Ying Nian; Zhou, Qing; Xing, Yi

    2014-12-23

    Ultra-deep RNA sequencing (RNA-Seq) has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We previously developed multivariate analysis of transcript splicing (MATS), a statistical method for detecting differential alternative splicing between two RNA-Seq samples. Here we describe a new statistical model and computer program, replicate MATS (rMATS), designed for detection of differential alternative splicing from replicate RNA-Seq data. rMATS uses a hierarchical model to simultaneously account for sampling uncertainty in individual replicates and variability among replicates. In addition to the analysis of unpaired replicates, rMATS also includes a model specifically designed for paired replicates between sample groups. The hypothesis-testing framework of rMATS is flexible and can assess the statistical significance over any user-defined magnitude of splicing change. The performance of rMATS is evaluated by the analysis of simulated and real RNA-Seq data. rMATS outperformed two existing methods for replicate RNA-Seq data in all simulation settings, and RT-PCR yielded a high validation rate (94%) in an RNA-Seq dataset of prostate cancer cell lines. Our data also provide guiding principles for designing RNA-Seq studies of alternative splicing. We demonstrate that it is essential to incorporate biological replicates in the study design. Of note, pooling RNAs or merging RNA-Seq data from multiple replicates is not an effective approach to account for variability, and the result is particularly sensitive to outliers. The rMATS source code is freely available at rnaseq-mats.sourceforge.net/. As the popularity of RNA-Seq continues to grow, we expect rMATS will be useful for studies of alternative splicing in diverse RNA-Seq projects.
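
    A hedged sketch of the exon inclusion level (PSI) that MATS-style methods compare between sample groups, computed from inclusion- and skipping-junction read counts normalized by effective isoform lengths; rMATS's hierarchical model itself is not reproduced here, and the counts below are invented.

      def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
          """Exon inclusion level (PSI) from junction read counts,
          normalized by the effective lengths of the two isoforms."""
          inc = inc_reads / inc_len
          skip = skip_reads / skip_len
          return inc / (inc + skip) if (inc + skip) > 0 else float("nan")

      # One exon in two samples: a shift in PSI suggests differential splicing,
      # which rMATS then tests while modeling variability across replicates.
      psi_tumor = inclusion_level(inc_reads=150, skip_reads=30, inc_len=2, skip_len=1)
      psi_normal = inclusion_level(inc_reads=60, skip_reads=90, inc_len=2, skip_len=1)
      print(round(psi_tumor, 3), round(psi_normal, 3), round(psi_tumor - psi_normal, 3))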

  11. A Confirmatory Factor Analysis of the Structure of Statistics Anxiety Measure: An examination of four alternative models

    PubMed Central

    Vahedi, Shahram; Farrokhi, Farahman

    2011-01-01

    Objective The aim of this study is to explore the confirmatory factor analysis results of the Persian adaptation of Statistics Anxiety Measure (SAM), proposed by Earp. Method The validity and reliability assessments of the scale were performed on 298 college students chosen randomly from Tabriz University in Iran. Confirmatory factor analysis (CFA) was carried out to determine the factor structures of the Persian adaptation of SAM. Results As expected, the second order model provided a better fit to the data than the three alternative models. Conclusions Hence, SAM provides an equally valid measure for use among college students. The study both expands and adds support to the existing body of math anxiety literature. PMID:22952530

  12. Alternative approaches to predicting methane emissions from dairy cows.

    PubMed

    Mills, J A N; Kebreab, E; Yates, C M; Crompton, L A; Cammell, S B; Dhanoa, M S; Agnew, R E; France, J

    2003-12-01

    Previous attempts to apply statistical models, which correlate nutrient intake with methane production, have been of limited value where predictions are obtained for nutrient intakes and diet types outside those used in model construction. Dynamic mechanistic models have proved more suitable for extrapolation, but they remain computationally expensive and are not applied easily in practical situations. The first objective of this research focused on employing conventional techniques to generate statistical models of methane production appropriate to United Kingdom dairy systems. The second objective was to evaluate these models and a model published previously using both United Kingdom and North American data sets. Thirdly, nonlinear models were considered as alternatives to the conventional linear regressions. The United Kingdom calorimetry data used to construct the linear models also were used to develop the three nonlinear alternatives that were all of modified Mitscherlich (monomolecular) form. Of the linear models tested, an equation from the literature proved most reliable across the full range of evaluation data (root mean square prediction error = 21.3%). However, the Mitscherlich models demonstrated the greatest degree of adaptability across diet types and intake level. The most successful model for simulating the independent data was a modified Mitscherlich equation with the steepness parameter set to represent dietary starch-to-ADF ratio (root mean square prediction error = 20.6%). However, when such data were unavailable, simpler Mitscherlich forms relating dry matter or metabolizable energy intake to methane production remained better alternatives relative to their linear counterparts.
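
    A minimal sketch of fitting a monomolecular (Mitscherlich-type) curve relating intake to methane output with scipy; the functional form is a generic asymptotic one and the data are fabricated for illustration, not the parameterizations or observations reported by Mills et al.

      import numpy as np
      from scipy.optimize import curve_fit

      def mitscherlich(mei, a, b, c):
          """Monomolecular form: methane rises from b toward the asymptote a
          at a rate governed by the steepness parameter c."""
          return a - (a - b) * np.exp(-c * mei)

      # Fabricated example data: metabolizable energy intake (MJ/d) vs methane (MJ/d).
      mei = np.array([100, 150, 200, 250, 300, 350], dtype=float)
      ch4 = np.array([12.1, 15.8, 18.2, 19.9, 21.0, 21.6])

      params, _ = curve_fit(mitscherlich, mei, ch4, p0=[25.0, 5.0, 0.01])
      pred = mitscherlich(mei, *params)
      rmspe = 100 * np.sqrt(np.mean(((ch4 - pred) / ch4) ** 2))   # relative prediction error, %
      print("a=%.2f, b=%.2f, c=%.4f, RMSPE=%.1f%%" % (*params, rmspe))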

  13. STATISTICAL METHODOLOGY FOR ESTIMATING PARAMETERS IN PBPK/PD MODELS

    EPA Science Inventory

    PBPK/PD models are large dynamic models that predict the tissue concentration and biological effects of a toxicant. Before PBPK/PD models can be used in risk assessments … in the arena of toxicological hypothesis testing, models allow the consequences of alternative mechanistic hypothes…

  14. Evaluating pictogram prediction in a location-aware augmentative and alternative communication system.

    PubMed

    Garcia, Luís Filipe; de Oliveira, Luís Caldas; de Matos, David Martins

    2016-01-01

    This study compared the performance of two statistical location-aware pictogram prediction mechanisms with an all-purpose (All) pictogram prediction mechanism that has no location knowledge. The All approach used a single language model in all locations. One of the location-aware alternatives, the location-specific (Spec) approach, made use of specific language models for pictogram prediction in each location of interest. The other location-aware approach resulted from combining the Spec and the All approaches, and was designated the mixed approach (Mix). In this approach, the language models acquired knowledge from all locations, but a higher relevance was assigned to the vocabulary from the associated location. Results from simulations showed that the Mix and Spec approaches could only outperform the baseline in a statistically significant way if pictogram users reuse more than 50% and 75% of their sentences, respectively. Under low sentence reuse conditions there were no statistically significant differences between the location-aware approaches and the All approach. Under these conditions, the Mix approach performed better than the Spec approach in a statistically significant way.
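
    A tiny sketch of how a mixed (Mix) approach of this kind can be realized as linear interpolation of a location-specific unigram model with the all-purpose model, giving the local vocabulary extra weight; the pictogram counts and the interpolation weight are made up for illustration and are not taken from the study.

      from collections import Counter

      def unigram_probs(counts):
          total = sum(counts.values())
          return {w: c / total for w, c in counts.items()}

      def mix_prob(word, p_loc, p_all, lam=0.7):
          """Mix approach: location-specific model interpolated with the all-purpose model."""
          return lam * p_loc.get(word, 0.0) + (1 - lam) * p_all.get(word, 0.0)

      all_counts = Counter({"eat": 40, "drink": 30, "play": 20, "sleep": 10})
      kitchen_counts = Counter({"eat": 25, "drink": 20, "play": 2})

      p_all = unigram_probs(all_counts)
      p_kitchen = unigram_probs(kitchen_counts)
      for w in ["eat", "play", "sleep"]:
          print(w, round(mix_prob(w, p_kitchen, p_all), 3))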

  15. Linear models: permutation methods

    USGS Publications Warehouse

    Cade, B.S.; Everitt, B.S.; Howell, D.C.

    2005-01-01

    Permutation tests (see Permutation Based Inference) for the linear model have applications in behavioral studies when traditional parametric assumptions about the error term in a linear model are not tenable. Improved validity of Type I error rates can be achieved with properly constructed permutation tests. Perhaps more importantly, increased statistical power, improved robustness to effects of outliers, and detection of alternative distributional differences can be achieved by coupling permutation inference with alternative linear model estimators. For example, it is well-known that estimates of the mean in a linear model are extremely sensitive to even a single outlying value of the dependent variable compared to estimates of the median [7, 19]. Traditionally, linear modeling focused on estimating changes in the center of distributions (means or medians). However, quantile regression allows distributional changes to be estimated in all or any selected part of a distribution of responses, providing a more complete statistical picture that has relevance to many biological questions [6]...
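
    A small sketch of one simple permutation test for a single slope in a linear model, obtained by permuting the response under the null of no association; this is one common construction, not necessarily the specific procedures reviewed in this entry.

      import numpy as np

      def permutation_slope_test(x, y, n_perm=4999, seed=0):
          """Two-sided permutation p-value for the OLS slope of y on x."""
          rng = np.random.default_rng(seed)
          def slope(yy):
              return np.polyfit(x, yy, 1)[0]
          observed = slope(y)
          perm = np.array([slope(rng.permutation(y)) for _ in range(n_perm)])
          p = (1 + np.sum(np.abs(perm) >= abs(observed))) / (n_perm + 1)
          return observed, p

      rng = np.random.default_rng(7)
      x = rng.normal(size=40)
      y = 0.4 * x + rng.standard_t(df=3, size=40)   # heavy-tailed errors
      print(permutation_slope_test(x, y))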

  16. Syndromic surveillance models using Web data: the case of scarlet fever in the UK.

    PubMed

    Samaras, Loukas; García-Barriocanal, Elena; Sicilia, Miguel-Angel

    2012-03-01

    Recent research has shown the potential of Web queries as a source for syndromic surveillance, and existing studies show that these queries can be used as a basis for estimation and prediction of the development of a syndromic disease, such as influenza, using log linear (logit) statistical models. Two alternative models are applied to the relationship between cases and Web queries in this paper. We examine the applicability of using statistical methods to relate search engine queries with scarlet fever cases in the UK, taking advantage of tools to acquire the appropriate data from Google, and using an alternative statistical method based on gamma distributions. The results show that using logit models, the Pearson correlation factor between Web queries and the data obtained from the official agencies must be over 0.90, otherwise the prediction of the peak and the spread of the distributions gives significant deviations. In this paper, we describe the gamma distribution model and show that we can obtain better results in all cases using gamma transformations, and especially in those with a smaller correlation factor.
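
    A hedged sketch of a gamma-based alternative to the logit approach: weekly case counts are modeled as gamma-distributed with a log-linear mean in the query volume and fitted by maximum likelihood; this illustrates the general idea only and is not the exact gamma transformation used in the paper, and the data are simulated.

      import numpy as np
      from scipy.optimize import minimize
      from scipy.stats import gamma

      def neg_log_lik(params, queries, cases):
          b0, b1, log_shape = params
          shape = np.exp(log_shape)
          mu = np.exp(b0 + b1 * queries)               # log-linear mean in query volume
          # Gamma(shape k, scale mu/k) has mean mu.
          return -np.sum(gamma.logpdf(cases, a=shape, scale=mu / shape))

      rng = np.random.default_rng(11)
      queries = rng.uniform(0, 1, size=52)             # normalized weekly query volume
      true_mu = np.exp(2.0 + 1.5 * queries)
      cases = rng.gamma(shape=5.0, scale=true_mu / 5.0)

      fit = minimize(neg_log_lik, x0=[1.0, 1.0, 1.0], args=(queries, cases), method="Nelder-Mead")
      b0, b1, log_shape = fit.x
      print(f"b0={b0:.2f}, b1={b1:.2f}, shape={np.exp(log_shape):.2f}")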

  17. Nonparametric estimation and testing of fixed effects panel data models

    PubMed Central

    Henderson, Daniel J.; Carroll, Raymond J.; Li, Qi

    2009-01-01

    In this paper we consider the problem of estimating nonparametric panel data models with fixed effects. We introduce an iterative nonparametric kernel estimator. We also extend the estimation method to the case of a semiparametric partially linear fixed effects model. To determine whether a parametric, semiparametric or nonparametric model is appropriate, we propose test statistics to test between the three alternatives in practice. We further propose a test statistic for testing the null hypothesis of random effects against fixed effects in a nonparametric panel data regression model. Simulations are used to examine the finite sample performance of the proposed estimators and the test statistics. PMID:19444335

  18. Personality disorders in DSM-5: emerging research on the alternative model.

    PubMed

    Morey, Leslie C; Benson, Kathryn T; Busch, Alexander J; Skodol, Andrew E

    2015-04-01

    The current categorical classification of personality disorders, originally introduced in the Diagnostic and Statistical Manual of Mental Disorders (DSM-III), has been found to suffer from numerous shortcomings that hamper its usefulness for research and for clinical application. The Personality and Personality Disorders Work Group for DSM-5 was charged with developing an alternative model that would address many of these concerns. The developed model involved a hybrid dimensional/categorical model that represented personality disorders as combinations of core impairments in personality functioning with specific configurations of problematic personality traits. The Board of Trustees of the American Psychiatric Association did not accept the Task Force recommendation to implement this novel approach, and thus this alternative model was included in Sect. III of the DSM-5 among concepts requiring additional study. This review provides an overview of the emerging research on this alternative model, addressing each of the primary components of the model.

  19. Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule

    PubMed Central

    Benitez, Kathleen; Masys, Daniel

    2010-01-01

    Objective Healthcare organizations must de-identify patient records before sharing data. Many organizations rely on the Safe Harbor Standard of the HIPAA Privacy Rule, which enumerates 18 identifiers that must be suppressed (eg, ages over 89). An alternative model in the Privacy Rule, known as the Statistical Standard, can facilitate the sharing of more detailed data, but is rarely applied because of a lack of published methodologies. The authors propose an intuitive approach to de-identifying patient demographics in accordance with the Statistical Standard. Design The authors conduct an analysis of the demographics of patient cohorts in five medical centers developed for the NIH-sponsored Electronic Medical Records and Genomics network, with respect to the US census. They report the re-identification risk of patient demographics disclosed according to the Safe Harbor policy and the relative risk rate for sharing such information via alternative policies. Measurements The re-identification risk of Safe Harbor demographics ranged from 0.01% to 0.19%. The findings show alternative de-identification models can be created with risks no greater than Safe Harbor. The authors illustrate that the disclosure of patient ages over the age of 89 is possible when other features are reduced in granularity. Limitations The de-identification approach described in this paper was evaluated with demographic data only and should be evaluated with other potential identifiers. Conclusion Alternative de-identification policies to the Safe Harbor model can be derived for patient demographics to enable the disclosure of values that were previously suppressed. The method is generalizable to any environment in which population statistics are available. PMID:21169618
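
    A toy sketch of the kind of risk calculation involved: each shared record is mapped to its demographic bin in population (census) counts, and its re-identification risk is approximated as one over the number of people in that bin. The bin definitions and counts below are invented for illustration and do not reproduce the authors' methodology.

      # Toy re-identification risk estimate for demographic data sharing.
      # population_counts would come from census tabulations in practice.
      population_counts = {
          ("90+", "F", "372"): 1200,     # (age group, sex, 3-digit ZIP) -> population size
          ("90+", "M", "372"): 650,
          ("40-44", "F", "372"): 21000,
      }

      shared_records = [
          ("90+", "F", "372"),
          ("40-44", "F", "372"),
          ("90+", "M", "372"),
      ]

      risks = [1.0 / population_counts[r] for r in shared_records]
      print("per-record risks:", [round(r, 5) for r in risks])
      print("average re-identification risk: %.5f" % (sum(risks) / len(risks)))
      # An alternative policy can be justified if its estimated risk is no greater
      # than the risk of the corresponding Safe Harbor release.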

  20. Mourning dove hunting regulation strategy based on annual harvest statistics and banding data

    USGS Publications Warehouse

    Otis, D.L.

    2006-01-01

    Although managers should strive to base game bird harvest management strategies on mechanistic population models, monitoring programs required to build and continuously update these models may not be in place. Alternatively, if estimates of total harvest and harvest rates are available, then population estimates derived from these harvest data can serve as the basis for making hunting regulation decisions based on population growth rates derived from these estimates. I present a statistically rigorous approach for regulation decision-making using a hypothesis-testing framework and an assumed framework of 3 hunting regulation alternatives. I illustrate and evaluate the technique with historical data on the mid-continent mallard (Anas platyrhynchos) population. I evaluate the statistical properties of the hypothesis-testing framework using the best available data on mourning doves (Zenaida macroura). I use these results to discuss practical implementation of the technique as an interim harvest strategy for mourning doves until reliable mechanistic population models and associated monitoring programs are developed.

  1. Comparing statistical and machine learning classifiers: alternatives for predictive modeling in human factors research.

    PubMed

    Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann

    2003-01-01

    Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
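
    A brief sketch of the kind of comparison described: cross-validated accuracy of logistic regression versus a decision tree on a small tabular pass/fail prediction task. The data are synthetic stand-ins for the curriculum-score features, and genetic programming is omitted for brevity.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      X = rng.normal(size=(200, 5))                    # e.g., curriculum scores
      pass_prob = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] ** 2 - 0.5)))
      y = rng.binomial(1, pass_prob)                   # pass/fail outcome

      for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                        ("decision tree", DecisionTreeClassifier(max_depth=3, random_state=0))]:
          acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
          print(f"{name}: {acc.mean():.3f} +/- {acc.std():.3f}")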

  2. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
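
    A compact sketch of the general strategy of modeling the test statistics directly: fit a two-component normal mixture to z-statistics by EM, take each statistic's posterior null probability, and control a Bayesian FDR by thresholding the running mean of those probabilities among the rejected hypotheses. This is a generic empirical-Bayes construction, not the authors' specific model.

      import numpy as np
      from scipy.stats import norm

      def two_group_mixture_em(z, n_iter=200):
          """EM for z ~ pi0*N(0,1) + (1-pi0)*N(mu1, sd1); returns posterior null probabilities."""
          pi0, mu1, sd1 = 0.9, 2.0, 1.0
          for _ in range(n_iter):
              f0 = pi0 * norm.pdf(z, 0.0, 1.0)
              f1 = (1.0 - pi0) * norm.pdf(z, mu1, sd1)
              w1 = f1 / (f0 + f1)                  # responsibility of the alternative component
              pi0 = 1.0 - w1.mean()
              mu1 = np.sum(w1 * z) / w1.sum()
              sd1 = np.sqrt(np.sum(w1 * (z - mu1) ** 2) / w1.sum())
          f0 = pi0 * norm.pdf(z, 0.0, 1.0)
          f1 = (1.0 - pi0) * norm.pdf(z, mu1, sd1)
          return f0 / (f0 + f1)

      def bayesian_fdr_reject(post_null, level=0.10):
          """Reject hypotheses while the running mean posterior null probability stays below level."""
          order = np.argsort(post_null)
          running = np.cumsum(post_null[order]) / np.arange(1, post_null.size + 1)
          k = np.sum(running <= level)
          reject = np.zeros(post_null.size, dtype=bool)
          reject[order[:k]] = True
          return reject

      rng = np.random.default_rng(5)
      z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1, 100)])  # 10% alternatives
      post_null = two_group_mixture_em(z)
      print("rejections:", bayesian_fdr_reject(post_null).sum())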

  3. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

  4. Establishment of a center of excellence for applied mathematical and statistical research

    NASA Technical Reports Server (NTRS)

    Woodward, W. A.; Gray, H. L.

    1983-01-01

    The state of the art was assessed with regard to efforts in support of the crop production estimation problem, and alternative generic proportion estimation techniques were investigated. Topics covered include modeling the greenness profile (Badhwar's model), parameter estimation using mixture models such as CLASSY, and minimum distance estimation as an alternative to maximum likelihood estimation. Approaches to the problem of obtaining proportion estimates when the underlying distributions are asymmetric are examined, including the properties of the Weibull distribution.

  5. Bridging the gap between habitat-modeling research and bird conservation with dynamic landscape and population models

    Treesearch

    Thompson, Frank R., III

    2009-01-01

    Habitat models are widely used in bird conservation planning to assess current habitat or populations and to evaluate management alternatives. These models include species-habitat matrix or database models, habitat suitability models, and statistical models that predict abundance. While extremely useful, these approaches have some limitations.

  6. Exploring the Assessment of the DSM-5 Alternative Model for Personality Disorders With the Personality Assessment Inventory.

    PubMed

    Busch, Alexander J; Morey, Leslie C; Hopwood, Christopher J

    2017-01-01

    Section III of the Diagnostic and Statistical Manual of Mental Disorders (5th ed. [DSM-5]; American Psychiatric Association, 2013) contains an alternative model for the diagnosis of personality disorder involving the assessment of 25 traits and a global level of overall personality functioning. There is hope that this model will be increasingly used in clinical and research settings, and the ability to apply established instruments to assess these concepts could facilitate this process. This study sought to develop scoring algorithms for these alternative model concepts using scales from the Personality Assessment Inventory (PAI). A multiple regression strategy was used to predict scores in 2 undergraduate samples on DSM-5 alternative model instruments: the Personality Inventory for the DSM-5 (PID-5) and the General Personality Pathology scale (GPP; Morey et al., 2011). These regression functions resulted in scores that demonstrated promising convergent and discriminant validity across the alternative model concepts, as well as a factor structure in a cross-validation sample that was congruent with the putative structure of the alternative model traits. Results were linked to the PAI community normative data to provide normative information regarding these alternative model concepts that can be used to identify elevated traits and personality functioning level scores.

  7. Evaluating Teachers and Schools Using Student Growth Models

    ERIC Educational Resources Information Center

    Schafer, William D.; Lissitz, Robert W.; Zhu, Xiaoshu; Zhang, Yuan; Hou, Xiaodong; Li, Ying

    2012-01-01

    Interest in Student Growth Modeling (SGM) and Value Added Modeling (VAM) arises from educators concerned with measuring the effectiveness of teaching and other school activities through changes in student performance as a companion and perhaps even an alternative to status. Several formal statistical models have been proposed for year-to-year…

  8. Specification, testing, and interpretation of gene-by-measured-environment interaction models in the presence of gene-environment correlation

    PubMed Central

    Rathouz, Paul J.; Van Hulle, Carol A.; Lee Rodgers, Joseph; Waldman, Irwin D.; Lahey, Benjamin B.

    2009-01-01

    Purcell (2002) proposed a bivariate biometric model for testing and quantifying the interaction between latent genetic influences and measured environments in the presence of gene-environment correlation. Purcell’s model extends the Cholesky model to include gene-environment interaction. We examine a number of closely related alternative models that do not involve gene-environment interaction but which may fit the data as well as Purcell’s model. Because failure to consider these alternatives could lead to spurious detection of gene-environment interaction, we propose alternative models for testing gene-environment interaction in the presence of gene-environment correlation, including one based on the correlated factors model. In addition, we note mathematical errors in the calculation of effect size via variance components in Purcell’s model. We propose a statistical method for deriving and interpreting variance decompositions that are true to the fitted model. PMID:18293078

  9. Modelling the effect of structural QSAR parameters on skin penetration using genetic programming

    NASA Astrophysics Data System (ADS)

    Chung, K. K.; Do, D. Q.

    2010-09-01

    In order to model relationships between chemical structures and biological effects in quantitative structure-activity relationship (QSAR) data, an alternative artificial intelligence technique, genetic programming (GP), was investigated and compared with the traditional statistical approach. GP, with the primary advantage of generating mathematical equations, was employed to model QSAR data and to identify the most important molecular descriptors in QSAR data. The models produced by GP agreed with the statistical results, and the most predictive GP models were significantly better than the statistical models when compared using ANOVA. Recently, artificial intelligence techniques have been applied widely to analyse QSAR data. With the capability of generating mathematical equations, GP can be considered an effective and efficient method for modelling QSAR data.

  10. Confronting Alternative Cosmological Models with the Highest-Redshift Type Ia Supernovae

    NASA Astrophysics Data System (ADS)

    Shafer, Daniel; Scolnic, Daniel; Riess, Adam

    2018-01-01

    High-redshift Type Ia supernovae (SNe Ia) from the HST CANDELS and CLASH programs significantly extend the Hubble diagram with 7 SNe at z > 1.5 suitable for cosmology, including one at z = 2.3. This unique leverage helps us distinguish "alternative" cosmological models from the standard Lambda-CDM model. Analyzing the Pantheon SN compilation, which includes these high-z SNe, we employ model comparison statistics to quantify the extent to which several proposed alternative expansion histories (e.g., empty universe, power law expansion, timescape cosmology) are disfavored even with SN Ia data alone. Using mock data, we demonstrate that some likelihood analyses used in the literature to support these models are sensitive to unrealistic assumptions and are therefore unsuitable for analysis of realistic SN Ia data.

  11. Variable system: An alternative approach for the analysis of mediated moderation.

    PubMed

    Kwan, Joyce Lok Yin; Chan, Wai

    2018-06-01

    Mediated moderation (meMO) occurs when the moderation effect of the moderator (W) on the relationship between the independent variable (X) and the dependent variable (Y) is transmitted through a mediator (M). To examine this process empirically, 2 different model specifications (Type I meMO and Type II meMO) have been proposed in the literature. However, both specifications are found to be problematic, either conceptually or statistically. For example, it can be shown that each type of meMO model is statistically equivalent to a particular form of moderated mediation (moME), another process that examines the condition when the indirect effect from X to Y through M varies as a function of W. Consequently, it is difficult for one to differentiate these 2 processes mathematically. This study therefore has 2 objectives. First, we attempt to differentiate moME and meMO by proposing an alternative specification for meMO. Conceptually, this alternative specification is intuitively meaningful and interpretable, and, statistically, it offers meMO a unique representation that is no longer identical to its moME counterpart. Second, using structural equation modeling, we propose an integrated approach for the analysis of meMO as well as for other general types of conditional path models. VS, a computer software program that implements the proposed approach, has been developed to facilitate the analysis of conditional path models for applied researchers. Real examples are considered to illustrate how the proposed approach works in practice and to compare its performance against the traditional methods. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  12. Zero-state Markov switching count-data models: an empirical assessment.

    PubMed

    Malyshkina, Nataliya V; Mannering, Fred L

    2010-01-01

    In this study, a two-state Markov switching count-data model is proposed as an alternative to zero-inflated models to account for the preponderance of zeros sometimes observed in transportation count data, such as the number of accidents occurring on a roadway segment over some period of time. For this accident-frequency case, zero-inflated models assume the existence of two states: one of the states is a zero-accident count state, which has accident probabilities that are so low that they cannot be statistically distinguished from zero, and the other state is a normal-count state, in which counts can be non-negative integers that are generated by some counting process, for example, a Poisson or negative binomial. While zero-inflated models have come under some criticism with regard to accident-frequency applications - one fact is undeniable - in many applications they provide a statistically superior fit to the data. The Markov switching approach we propose seeks to overcome some of the criticism associated with the zero-accident state of the zero-inflated model by allowing individual roadway segments to switch between zero and normal-count states over time. An important advantage of this Markov switching approach is that it allows for the direct statistical estimation of the specific roadway-segment state (i.e., zero-accident or normal-count state) whereas traditional zero-inflated models do not. To demonstrate the applicability of this approach, a two-state Markov switching negative binomial model (estimated with Bayesian inference) and standard zero-inflated negative binomial models are estimated using five-year accident frequencies on Indiana interstate highway segments. It is shown that the Markov switching model is a viable alternative and results in a superior statistical fit relative to the zero-inflated models.
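
    The switching mechanism described above can be illustrated with a short simulation. The sketch below is not the authors' Bayesian estimator; it simply generates counts for one roadway segment that moves between a zero-accident state and a Poisson normal-count state according to a hypothetical two-state transition matrix (all parameter values are invented for illustration).

```python
# Minimal sketch (not the paper's Bayesian estimation): simulate a roadway
# segment that switches between a zero-accident state and a normal-count
# (Poisson) state according to a two-state Markov chain. Values are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

n_years = 20
P = np.array([[0.8, 0.2],   # transition probabilities; state 0 = zero-accident
              [0.3, 0.7]])  # state 1 = normal count
lam = 3.0                   # hypothetical Poisson mean in the normal-count state

state = 0
states, counts = [], []
for _ in range(n_years):
    states.append(state)
    counts.append(0 if state == 0 else rng.poisson(lam))
    state = rng.choice(2, p=P[state])

print("states:", states)
print("counts:", counts)
```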

  13. 10 CFR 431.17 - Determination of efficiency.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... method or methods used; the mathematical model, the engineering or statistical analysis, computer... accordance with § 431.16 of this subpart, or by application of an alternative efficiency determination method... must be: (i) Derived from a mathematical model that represents the mechanical and electrical...

  14. IDENTIFICATION OF REGIME SHIFTS IN TIME SERIES USING NEIGHBORHOOD STATISTICS

    EPA Science Inventory

    The identification of alternative dynamic regimes in ecological systems requires several lines of evidence. Previous work on time series analysis of dynamic regimes includes mainly model-fitting methods. We introduce two methods that do not use models. These approaches use state-...

  15. Influences of credibility of testimony and strength of statistical evidence on children’s and adolescents’ reasoning

    PubMed Central

    Kail, Robert V.

    2013-01-01

    According to dual-process models that include analytic and heuristic modes of processing, analytic processing is often expected to become more common with development. Consistent with this view, on reasoning problems, adolescents are more likely than children to select alternatives that are backed by statistical evidence. It is shown here that this pattern depends on the quality of the statistical evidence and the quality of the testimonial that is the typical alternative to statistical evidence. In Experiment 1, 9- and 13-year-olds (N = 64) were presented with scenarios in which solid statistical evidence was contrasted with casual or expert testimonial evidence. When testimony was casual, children relied on it but adolescents did not; when testimony was expert, both children and adolescents relied on it. In Experiment 2, 9- and 13-year-olds (N = 83) were presented with scenarios in which casual testimonial evidence was contrasted with weak or strong statistical evidence. When statistical evidence was weak, children and adolescents relied on both testimonial and statistical evidence; when statistical evidence was strong, most children and adolescents relied on it. Results are discussed in terms of their implications for dual-process accounts of cognitive development. PMID:23735681

  16. The use of imputed sibling genotypes in sibship-based association analysis: on modeling alternatives, power and model misspecification.

    PubMed

    Minică, Camelia C; Dolan, Conor V; Hottenga, Jouke-Jan; Willemsen, Gonneke; Vink, Jacqueline M; Boomsma, Dorret I

    2013-05-01

    When phenotypic, but no genotypic, data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of two statistical approaches suitable to model imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, size of the phenotypic correlations among siblings, imputation accuracy and minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we inquired whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with continuous phenotype data (height) and with a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approach are equally powerful and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that the inclusion in association analysis of imputed sibling genotypes does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes. That is, if the power benefit is small, meaning that the change in the distribution of the test statistic under the alternative is relatively small, the probability of obtaining a smaller test statistic is greater. As the genetic effects are typically hypothesized to be small, in practice, the decision on whether family-based imputation could be used as a means to increase power should be informed by prior power calculations and by the consideration of the background correlation.
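
    The distinction between the dosage and mixture approaches can be made concrete with a small numerical sketch. The example below is illustrative only and assumes a hypothetical posterior distribution over an unobserved sibling genotype and arbitrary regression parameters; it is not the mixed-model machinery used in the paper.

```python
# Illustrative sketch: for an imputed sibling genotype with posterior
# probabilities over 0/1/2 copies of the risk allele, the "dosage" approach
# uses the posterior mean, while the "mixture" approach averages the
# likelihood over the three possible genotypes. All values are hypothetical.
import numpy as np
from scipy.stats import norm

post = np.array([0.1, 0.6, 0.3])   # hypothetical P(g = 0, 1, 2 | observed data)
genos = np.array([0.0, 1.0, 2.0])

dosage = np.dot(post, genos)       # single imputed value used as a covariate

# Mixture likelihood of one sibling phenotype y under y = b0 + b1*g + e
y, b0, b1, sigma = 1.4, 0.0, 0.25, 1.0
mixture_lik = np.dot(post, norm.pdf(y, loc=b0 + b1 * genos, scale=sigma))

print(f"dosage genotype: {dosage:.2f}")
print(f"mixture likelihood contribution: {mixture_lik:.4f}")
```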

  17. Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Gray, Alexander

    1999-01-01

    We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.

  18. A scan statistic for binary outcome based on hypergeometric probability model, with an application to detecting spatial clusters of Japanese encephalitis.

    PubMed

    Zhao, Xing; Zhou, Xiao-Hua; Feng, Zijian; Guo, Pengfei; He, Hongyan; Zhang, Tao; Duan, Lei; Li, Xiaosong

    2013-01-01

    As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for the binary outcome was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis is an alternative and indirect method to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. Similar to Kulldorff's methods, we adopt a Monte Carlo test for the test of significance. Both methods are applied for detecting spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. Through a simulation with independent benchmark data, it is indicated that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise Kulldorff's statistics are superior.
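
    As a rough illustration of the Hypergeometric null, the sketch below evaluates how unusual the case count inside one candidate scanning window would be if cases were allocated to the window only in proportion to its population share. The counts are hypothetical, and this per-window tail probability is not the full scan statistic, which maximizes a likelihood over all candidate windows and assesses significance by Monte Carlo.

```python
# Hedged sketch: hypergeometric tail probability for one candidate window.
# Numbers are hypothetical, not taken from the Sichuan application.
from scipy.stats import hypergeom

total_pop = 10_000     # M: people in the whole study region
total_cases = 120      # n: total cases in the region
window_pop = 800       # N: people inside the candidate window
window_cases = 25      # k: cases observed inside the window

# Probability of seeing at least this many cases in the window under the null
p_tail = hypergeom.sf(window_cases - 1, total_pop, total_cases, window_pop)
print(f"P(X >= {window_cases}) = {p_tail:.2e}")
```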

  19. Goodness-of-Fit Assessment of Item Response Theory Models

    ERIC Educational Resources Information Center

    Maydeu-Olivares, Alberto

    2013-01-01

    The article provides an overview of goodness-of-fit assessment methods for item response theory (IRT) models. It is now possible to obtain accurate "p"-values of the overall fit of the model if bivariate information statistics are used. Several alternative approaches are described. As the validity of inferences drawn on the fitted model…

  20. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance.

    PubMed

    Brase, Gary L; Vasserman, Eugene Y; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings.
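
    The numerical-format manipulation can be illustrated with a worked Bayesian example. The figures below are hypothetical (they are not taken from the study's materials); they show that the percentage and natural-frequency framings lead to the same posterior, while the frequency framing makes the computation more transparent.

```python
# Worked example of the numerical-format manipulation; all rates are invented.
base_rate = 0.01        # 1% of messages are phishing
hit_rate = 0.90         # warning fires for 90% of phishing messages
false_alarm = 0.05      # warning fires for 5% of legitimate messages

# Percentage (probability) format: Bayes' theorem directly
p_bayes = (base_rate * hit_rate) / (
    base_rate * hit_rate + (1 - base_rate) * false_alarm)

# Natural-frequency format: imagine 10,000 messages
n = 10_000
phish_warned = base_rate * n * hit_rate            # 90 messages
legit_warned = (1 - base_rate) * n * false_alarm   # 495 messages
p_freq = phish_warned / (phish_warned + legit_warned)

print(f"P(phishing | warning), probability format:       {p_bayes:.3f}")
print(f"P(phishing | warning), natural-frequency format: {p_freq:.3f}")
```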

  1. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance

    PubMed Central

    Brase, Gary L.; Vasserman, Eugene Y.; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings. PMID:29163304

  2. A power comparison of generalized additive models and the spatial scan statistic in a case-control setting.

    PubMed

    Young, Robin L; Weinberg, Janice; Vieira, Verónica; Ozonoff, Al; Webster, Thomas F

    2010-07-19

    A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing logodds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three Cases. The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had competing or greater power estimates and sensitivities exceeding that of the spatial scan statistic.
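
    A minimal sketch of the permutation idea is given below. It uses a simple clustering statistic (mean distance of cases to their own centroid) as a stand-in for the GAM/LOESS deviance statistic used in the paper, and all point data are simulated; the purpose is only to show how case-control labels are permuted to build a null distribution.

```python
# Generic label-permutation test for spatial clustering of cases.
# The statistic is a stand-in, not the GAM/LOESS deviance used by the authors.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point data: 200 controls uniform, 50 cases clustered
controls = rng.uniform(0, 10, size=(200, 2))
cases = rng.normal(loc=[3, 3], scale=0.7, size=(50, 2))
xy = np.vstack([cases, controls])
labels = np.array([1] * 50 + [0] * 200)

def spread(points):
    """Mean distance of points to their own centroid (small = clustered)."""
    return np.linalg.norm(points - points.mean(axis=0), axis=1).mean()

observed = spread(xy[labels == 1])

null = []
for _ in range(999):
    perm = rng.permutation(labels)
    null.append(spread(xy[perm == 1]))

p_value = (1 + sum(s <= observed for s in null)) / (1 + len(null))
print(f"observed spread = {observed:.2f}, permutation p = {p_value:.3f}")
```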

  3. A power comparison of generalized additive models and the spatial scan statistic in a case-control setting

    PubMed Central

    2010-01-01

    Background A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. Results This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing logodds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three Cases. Conclusions The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had competing or greater power estimates and sensitivities exceeding that of the spatial scan statistic. PMID:20642827

  4. Statistical methodology for the analysis of dye-switch microarray experiments

    PubMed Central

    Mary-Huard, Tristan; Aubert, Julie; Mansouri-Attia, Nadera; Sandra, Olivier; Daudin, Jean-Jacques

    2008-01-01

    Background In individually dye-balanced microarray designs, each biological sample is hybridized on two different slides, once with Cy3 and once with Cy5. While this strategy ensures an automatic correction of the gene-specific labelling bias, it also induces dependencies between log-ratio measurements that must be taken into account in the statistical analysis. Results We present two original statistical procedures for the statistical analysis of individually balanced designs. These procedures are compared with the usual ML and REML mixed model procedures proposed in most statistical toolboxes, on both simulated and real data. Conclusion The UP procedure we propose as an alternative to usual mixed model procedures is more efficient and significantly faster to compute. This result provides some useful guidelines for the analysis of complex designs. PMID:18271965

  5. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    USGS Publications Warehouse

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for the Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.
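
    The regressions underlying the classic and modified forms of Hack's law can be sketched in a few lines. The example below uses synthetic data and ordinary least squares; it is not the consistent Horton regression framework developed in the paper, only an illustration of fitting ln L on ln A, with and without Strahler order as a second predictor.

```python
# Minimal sketch of Hack's law, ln L = ln c + h ln A, and a modified form that
# adds Strahler order as a second predictor. Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n = 60
lnA = rng.uniform(0, 8, n)                       # log drainage area
omega = rng.integers(1, 6, n).astype(float)      # Strahler order
lnL = 0.5 + 0.55 * lnA + 0.10 * omega + rng.normal(0, 0.15, n)

# Classic Hack's law: regress ln L on ln A only
h_classic, lnc = np.polyfit(lnA, lnL, 1)

# Modified Hack's law: ln L ~ ln A + omega
X = np.column_stack([np.ones(n), lnA, omega])
coef, *_ = np.linalg.lstsq(X, lnL, rcond=None)

print(f"classic exponent h = {h_classic:.3f}")
print(f"modified model coefficients (intercept, lnA, omega) = {np.round(coef, 3)}")
```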

  6. An Alternate Definition of the ETS Delta Scale of Item Difficulty. Program Statistics Research.

    ERIC Educational Resources Information Center

    Holland, Paul W.; Thayer, Dorothy T.

    An alternative definition of the delta scale of item difficulty used at Educational Testing Service has been developed. The traditional delta scale uses an inverse normal transformation based on normal ogive models developed years ago. However, no use is made of this fact in typical uses of item deltas. It is simply one way to make the probability…

  7. How Statisticians Speak Risk

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Redus, K.S.

    2007-07-01

    The foundation of statistics deals with (a) how to measure and collect data and (b) how to identify models using estimates of statistical parameters derived from the data. Risk is a term used by the statistical community and those that employ statistics to express the results of a statistically based study. Statistical risk is represented as a probability that, for example, a statistical model is sufficient to describe a data set; but, risk is also interpreted as a measure of worth of one alternative when compared to another. The common thread of any risk-based problem is the combination of (a) the chance an event will occur, with (b) the value of the event. This paper presents an introduction to, and some examples of, statistical risk-based decision making from a quantitative, visual, and linguistic sense. This should help in understanding areas of radioactive waste management that can be suitably expressed using statistical risk and vice-versa. (authors)

  8. A call to improve methods for estimating tree biomass for regional and national assessments

    Treesearch

    Aaron R. Weiskittel; David W. MacFarlane; Philip J. Radtke; David L.R. Affleck; Hailemariam Temesgen; Christopher W. Woodall; James A. Westfall; John W. Coulston

    2015-01-01

    Tree biomass is typically estimated using statistical models. This review highlights five limitations of most tree biomass models, which include the following: (1) biomass data are costly to collect and alternative sampling methods are used; (2) belowground data and models are generally lacking; (3) models are often developed from small and geographically limited data...

  9. 10 CFR 431.197 - Manufacturer's determination of efficiency for distribution transformers.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... methods used; the mathematical model, the engineering or statistical analysis, computer simulation or... (b)(3) of this section, or by application of an alternative efficiency determination method (AEDM... section only if: (i) The AEDM has been derived from a mathematical model that represents the electrical...

  10. Alternative models of DSM-5 PTSD: Examining diagnostic implications.

    PubMed

    Murphy, Siobhan; Hansen, Maj; Elklit, Ask; Yong Chen, Yoke; Raudzah Ghazali, Siti; Shevlin, Mark

    2018-04-01

    The factor structure of DSM-5 posttraumatic stress disorder (PTSD) has been extensively debated with evidence supporting the recently proposed seven-factor Hybrid model. However, despite myriad studies examining PTSD symptom structure few have assessed the diagnostic implications of these proposed models. This study aimed to generate PTSD prevalence estimates derived from the 7 alternative factor models and assess whether pre-established risk factors associated with PTSD (e.g., transportation accidents and sexual victimisation) produce consistent risk estimates. Seven alternative models were estimated within a confirmatory factor analytic framework using the PTSD Checklist for DSM-5 (PCL-5). Data were analysed from a Malaysian adolescent community sample (n = 481) of which 61.7% were female, with a mean age of 17.03 years. The results indicated that all models provided satisfactory model fit with statistical superiority for the Externalising Behaviours and seven-factor Hybrid models. The PTSD prevalence estimates varied substantially ranging from 21.8% for the DSM-5 model to 10.0% for the Hybrid model. Estimates of risk associated with PTSD were inconsistent across the alternative models, with substantial variation emerging for sexual victimisation. These findings have important implications for research and practice and highlight that more research attention is needed to examine the diagnostic implications emerging from the alternative models of PTSD. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. An Extension of RSS-based Model Comparison Tests for Weighted Least Squares

    DTIC Science & Technology

    2012-08-22

    use the model comparison test statistic to analyze the null hypothesis. Under the null hypothesis, the weighted least squares cost functional is J_WLS(q̂_WLS^H) = 10.3040 × 10^6. Under the alternative hypothesis, the weighted least squares cost functional is J_WLS(q̂_WLS) = 8.8394 × 10^6. Thus the model

  12. Influences of credibility of testimony and strength of statistical evidence on children's and adolescents' reasoning.

    PubMed

    Kail, Robert V

    2013-11-01

    According to dual-process models that include analytic and heuristic modes of processing, analytic processing is often expected to become more common with development. Consistent with this view, on reasoning problems, adolescents are more likely than children to select alternatives that are backed by statistical evidence. It is shown here that this pattern depends on the quality of the statistical evidence and the quality of the testimonial that is the typical alternative to statistical evidence. In Experiment 1, 9- and 13-year-olds (N=64) were presented with scenarios in which solid statistical evidence was contrasted with casual or expert testimonial evidence. When testimony was casual, children relied on it but adolescents did not; when testimony was expert, both children and adolescents relied on it. In Experiment 2, 9- and 13-year-olds (N=83) were presented with scenarios in which casual testimonial evidence was contrasted with weak or strong statistical evidence. When statistical evidence was weak, children and adolescents relied on both testimonial and statistical evidence; when statistical evidence was strong, most children and adolescents relied on it. Results are discussed in terms of their implications for dual-process accounts of cognitive development. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Modelling unsupervised online-learning of artificial grammars: linking implicit and statistical learning.

    PubMed

    Rohrmeier, Martin A; Cross, Ian

    2014-07-01

    Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Coupled local facilitation and global hydrologic inhibition drive landscape geometry in a patterned peatland

    NASA Astrophysics Data System (ADS)

    Acharya, S.; Kaplan, D. A.; Casey, S.; Cohen, M. J.; Jawitz, J. W.

    2015-05-01

    Self-organized landscape patterning can arise in response to multiple processes. Discriminating among alternative patterning mechanisms, particularly where experimental manipulations are untenable, requires process-based models. Previous modeling studies have attributed patterning in the Everglades (Florida, USA) to sediment redistribution and anisotropic soil hydraulic properties. In this work, we tested an alternate theory, the self-organizing-canal (SOC) hypothesis, by developing a cellular automata model that simulates pattern evolution via local positive feedbacks (i.e., facilitation) coupled with a global negative feedback based on hydrology. The model is forced by global hydroperiod that drives stochastic transitions between two patch types: ridge (higher elevation) and slough (lower elevation). We evaluated model performance using multiple criteria based on six statistical and geostatistical properties observed in reference portions of the Everglades landscape: patch density, patch anisotropy, semivariogram ranges, power-law scaling of ridge areas, perimeter area fractal dimension, and characteristic pattern wavelength. Model results showed strong statistical agreement with reference landscapes, but only when anisotropically acting local facilitation was coupled with hydrologic global feedback, for which several plausible mechanisms exist. Critically, the model correctly generated fractal landscapes that had no characteristic pattern wavelength, supporting the invocation of global rather than scale-specific negative feedbacks.
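
    A toy version of the SOC mechanism is sketched below. It is not the authors' calibrated cellular automata model; it only shows how an anisotropic local facilitation term and a global hydrologic feedback toward a target ridge fraction can be combined in a stochastic update rule, with all parameters invented.

```python
# Toy sketch of the SOC idea: cells switch between ridge (1) and slough (0);
# local facilitation raises the chance of becoming ridge near existing ridge,
# while a global hydrologic feedback pushes the landscape back toward a target
# ridge fraction. All parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(7)
grid = (rng.random((100, 100)) < 0.5).astype(int)
target_ridge_fraction = 0.5
facilitation, feedback = 0.15, 0.6

def ridge_neighbors(g):
    # Anisotropic neighbourhood: weight the north-south neighbours more heavily
    return (np.roll(g, 1, 0) + np.roll(g, -1, 0)) * 1.0 + \
           (np.roll(g, 1, 1) + np.roll(g, -1, 1)) * 0.25

for _ in range(200):
    global_excess = grid.mean() - target_ridge_fraction   # > 0: too much ridge
    p_ridge = 0.5 + facilitation * (ridge_neighbors(grid) - 1.25) \
                  - feedback * global_excess
    grid = (rng.random(grid.shape) < np.clip(p_ridge, 0, 1)).astype(int)

print(f"final ridge fraction: {grid.mean():.2f}")
```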

  15. Coupled local facilitation and global hydrologic inhibition drive landscape geometry in a patterned peatland

    NASA Astrophysics Data System (ADS)

    Acharya, S.; Kaplan, D. A.; Casey, S.; Cohen, M. J.; Jawitz, J. W.

    2015-01-01

    Self-organized landscape patterning can arise in response to multiple processes. Discriminating among alternative patterning mechanisms, particularly where experimental manipulations are untenable, requires process-based models. Previous modeling studies have attributed patterning in the Everglades (Florida, USA) to sediment redistribution and anisotropic soil hydraulic properties. In this work, we tested an alternate theory, the self-organizing canal (SOC) hypothesis, by developing a cellular automata model that simulates pattern evolution via local positive feedbacks (i.e., facilitation) coupled with a global negative feedback based on hydrology. The model is forced by global hydroperiod that drives stochastic transitions between two patch types: ridge (higher elevation) and slough (lower elevation). We evaluated model performance using multiple criteria based on six statistical and geostatistical properties observed in reference portions of the Everglades landscape: patch density, patch anisotropy, semivariogram ranges, power-law scaling of ridge areas, perimeter area fractal dimension, and characteristic pattern wavelength. Model results showed strong statistical agreement with reference landscapes, but only when anisotropically acting local facilitation was coupled with hydrologic global feedback, for which several plausible mechanisms exist. Critically, the model correctly generated fractal landscapes that had no characteristic pattern wavelength, supporting the invocation of global rather than scale-specific negative feedbacks.

  16. Statistical validity of using ratio variables in human kinetics research.

    PubMed

    Liu, Yuanlong; Schutz, Robert W

    2003-09-01

    The purposes of this study were to investigate the validity of the simple ratio and three alternative deflation models and examine how the variation of the numerator and denominator variables affects the reliability of a ratio variable. A simple ratio and three alternative deflation models were fitted to four empirical data sets, and common criteria were applied to determine the best model for deflation. Intraclass correlation was used to examine the component effect on the reliability of a ratio variable. The results indicate that the validity of a deflation model depends on the statistical characteristics of the particular component variables used, and an optimal deflation model for all ratio variables may not exist. Therefore, it is recommended that different models be fitted to each empirical data set to determine the best deflation model. It was found that the reliability of a simple ratio is affected by the coefficients of variation and the within- and between-trial correlations between the numerator and denominator variables. It was recommended that researchers should compute the reliability of the derived ratio scores and not assume that strong reliabilities in the numerator and denominator measures automatically lead to high reliability in the ratio measures.
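
    The dependence of ratio reliability on component variation can be demonstrated with a brief simulation. The sketch below is an assumption-laden illustration (independent true scores, arbitrary noise levels), not a reanalysis of the study's data sets: as the denominator's trial-to-trial noise grows, the test-retest correlation of the simple ratio X/Y falls even though the numerator's reliability is unchanged.

```python
# Illustrative simulation only: test-retest reliability of a simple ratio X/Y
# as a function of the denominator's measurement noise. Values are invented.
import numpy as np

rng = np.random.default_rng(3)
n = 500

def ratio_reliability(noise_sd_y):
    true_x = rng.normal(100, 15, n)
    true_y = rng.normal(70, 10, n)
    x1, x2 = true_x + rng.normal(0, 5, n), true_x + rng.normal(0, 5, n)
    y1, y2 = true_y + rng.normal(0, noise_sd_y, n), true_y + rng.normal(0, noise_sd_y, n)
    return np.corrcoef(x1 / y1, x2 / y2)[0, 1]

for sd in (2.0, 8.0, 15.0):
    print(f"denominator noise SD = {sd:4.1f} -> ratio test-retest r = {ratio_reliability(sd):.2f}")
```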

  17. Modeling the sound transmission between rooms coupled through partition walls by using a diffusion model.

    PubMed

    Billon, Alexis; Foy, Cédric; Picaut, Judicaël; Valeau, Vincent; Sakout, Anas

    2008-06-01

    In this paper, a modification of the diffusion model for room acoustics is proposed to account for sound transmission between two rooms, a source room and an adjacent room, which are coupled through a partition wall. A system of two diffusion equations, one for each room, together with a set of two boundary conditions, one for the partition wall and one for the other walls of a room, is obtained and numerically solved. The modified diffusion model is validated by numerical comparisons with the statistical theory for several coupled-room configurations by varying the coupling area surface, the absorption coefficient of each room, and the volume of the adjacent room. An experimental comparison is also carried out for two coupled classrooms. The modified diffusion model results agree very well with both the statistical theory and the experimental data. The diffusion model can then be used as an alternative to the statistical theory, especially when the statistical theory is not applicable, that is, when the reverberant sound field is not diffuse. Moreover, the diffusion model allows the prediction of the spatial distribution of sound energy within each coupled room, while the statistical theory gives only one sound level for each room.

  18. Contrasting support for alternative models of genomic variation based on microhabitat preference: species-specific effects of climate change in alpine sedges.

    PubMed

    Massatti, Rob; Knowles, L Lacey

    2016-08-01

    Deterministic processes may uniquely affect codistributed species' phylogeographic patterns such that discordant genetic variation among taxa is predicted. Yet, explicitly testing expectations of genomic discordance in a statistical framework remains challenging. Here, we construct spatially and temporally dynamic models to investigate the hypothesized effect of microhabitat preferences on the permeability of glaciated regions to gene flow in two closely related montane species. Utilizing environmental niche models from the Last Glacial Maximum and the present to inform demographic models of changes in habitat suitability over time, we evaluate the relative probabilities of two alternative models using approximate Bayesian computation (ABC) in which glaciated regions are either (i) permeable or (ii) a barrier to gene flow. Results based on the fit of the empirical data to data sets simulated using a spatially explicit coalescent under alternative models indicate that genomic data are consistent with predictions about the hypothesized role of microhabitat in generating discordant patterns of genetic variation among the taxa. Specifically, a model in which glaciated areas acted as a barrier was much more probable based on patterns of genomic variation in Carex nova, a wet-adapted species. However, in the dry-adapted Carex chalciolepis, the permeable model was more probable, although the difference in the support of the models was small. This work highlights how statistical inferences can be used to distinguish deterministic processes that are expected to result in discordant genomic patterns among species, including species-specific responses to climate change. © 2016 John Wiley & Sons Ltd.

  19. Estimating inverse probability weights using super learner when weight-model specification is unknown in a marginal structural Cox model context.

    PubMed

    Karim, Mohammad Ehsanul; Platt, Robert W

    2017-06-15

    Correct specification of the inverse probability weighting (IPW) model is necessary for consistent inference from a marginal structural Cox model (MSCM). In practical applications, researchers are typically unaware of the true specification of the weight model. Nonetheless, IPWs are commonly estimated using parametric models, such as the main-effects logistic regression model. In practice, assumptions underlying such models may not hold and data-adaptive statistical learning methods may provide an alternative. Many candidate statistical learning approaches are available in the literature. However, the optimal approach for a given dataset is impossible to predict. Super learner (SL) has been proposed as a tool for selecting an optimal learner from a set of candidates using cross-validation. In this study, we evaluate the usefulness of a SL in estimating IPW in four different MSCM simulation scenarios, in which we varied the specification of the true weight model specification (linear and/or additive). Our simulations show that, in the presence of weight model misspecification, with a rich and diverse set of candidate algorithms, SL can generally offer a better alternative to the commonly used statistical learning approaches in terms of MSE as well as the coverage probabilities of the estimated effect in an MSCM. The findings from the simulation studies guided the application of the MSCM in a multiple sclerosis cohort from British Columbia, Canada (1995-2008), to estimate the impact of beta-interferon treatment in delaying disability progression. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
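
    For orientation, the sketch below estimates stabilized inverse probability of treatment weights with a main-effects logistic regression, the simple parametric weight model the abstract contrasts with data-adaptive learners; it is not a super learner implementation, and the data are simulated.

```python
# Simple parametric stand-in for the weight model (not a super learner):
# stabilized inverse probability of treatment weights from logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(13)
n = 1000
confounders = rng.normal(size=(n, 2))
p_treat = 1 / (1 + np.exp(-(0.5 * confounders[:, 0] - 0.8 * confounders[:, 1])))
treated = rng.random(n) < p_treat

# Denominator model: P(treatment | confounders)
denom = LogisticRegression().fit(confounders, treated).predict_proba(confounders)[:, 1]
# Numerator for stabilized weights: marginal P(treatment)
num = treated.mean()

weights = np.where(treated, num / denom, (1 - num) / (1 - denom))
print(f"mean stabilized weight = {weights.mean():.2f} (should be close to 1)")
```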

  20. Statistical mechanics model for the emergence of consensus

    NASA Astrophysics Data System (ADS)

    Raffaelli, Giacomo; Marsili, Matteo

    2005-07-01

    The statistical properties of pairwise majority voting over S alternatives are analyzed in an infinite random population. We first compute the probability that the majority is transitive (i.e., that if it prefers A to B to C, then it prefers A to C) and then study the case of an interacting population. This is described by a constrained multicomponent random field Ising model whose ferromagnetic phase describes the emergence of a strong transitive majority. We derive the phase diagram, which is characterized by a tricritical point, and show that, contrary to intuition, it may be more likely for an interacting population to reach consensus on a number S of alternatives when S increases. This effect is due to the constraint imposed by transitivity on voting behavior. Indeed, if agents are allowed to express nontransitive votes, the agents’ interaction may considerably decrease the probability of a transitive majority.
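
    The non-interacting baseline can be checked with a quick Monte Carlo: draw independent, uniformly random strict rankings over S = 3 alternatives and estimate the probability that the pairwise majority relation is transitive (the classical Condorcet setup). The sketch below is illustrative and does not implement the random field Ising model analysis.

```python
# Monte Carlo estimate of P(transitive pairwise majority) for S = 3 alternatives
# with independent, uniformly random strict rankings (Condorcet setup).
import itertools
import numpy as np

rng = np.random.default_rng(11)
rankings = list(itertools.permutations(range(3)))   # 6 strict orders over A, B, C

def majority_is_transitive(n_voters):
    votes = [rankings[i] for i in rng.integers(0, 6, n_voters)]
    def margin(a, b):
        # net number of voters ranking a above b (nonzero for odd n_voters)
        return sum(np.sign(v.index(b) - v.index(a)) for v in votes)
    wins = {(a, b): margin(a, b) > 0 for a, b in itertools.permutations(range(3), 2)}
    # transitive iff the win counts form a linear order (2, 1, 0), not a cycle (1, 1, 1)
    beats = [sum(wins[(a, b)] for b in range(3) if b != a) for a in range(3)]
    return sorted(beats) == [0, 1, 2]

trials = 2000
p = np.mean([majority_is_transitive(101) for _ in range(trials)])
print(f"estimated P(transitive majority) = {p:.3f}")
```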

  1. Formulating appropriate statistical hypotheses for treatment comparison in clinical trial design and analysis.

    PubMed

    Huang, Peng; Ou, Ai-hua; Piantadosi, Steven; Tan, Ming

    2014-11-01

    We discuss the problem of properly defining treatment superiority through the specification of hypotheses in clinical trials. The need to precisely define the notion of superiority in a one-sided hypothesis test problem has been well recognized by many authors. Ideally designed null and alternative hypotheses should correspond to a partition of all possible scenarios of underlying true probability models P = {P(ω) : ω ∈ Ω} such that the alternative hypothesis H_a = {P(ω) : ω ∈ Ω_a} can be inferred upon the rejection of the null hypothesis H_o = {P(ω) : ω ∈ Ω_o}. However, in many cases, tests are carried out and recommendations are made without a precise definition of superiority or a specification of the alternative hypothesis. Moreover, in some applications, the union of probability models specified by the chosen null and alternative hypotheses does not constitute the complete model collection P (i.e., H_o ∪ H_a is smaller than P). This not only imposes a strong non-validated assumption on the underlying true models, but also leads to different superiority claims depending on which test is used instead of scientific plausibility. Different ways to partition P for testing treatment superiority often have different implications on sample size, power, and significance in both efficacy and comparative effectiveness trial design. Such differences are often overlooked. We provide a theoretical framework for evaluating the statistical properties of different specifications of superiority in typical hypothesis testing. This can help investigators to select proper hypotheses for treatment comparison in clinical trial design. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Statistical Emulation of Climate Model Projections Based on Precomputed GCM Runs*

    DOE PAGES

    Castruccio, Stefano; McInerney, David J.; Stein, Michael L.; ...

    2014-02-24

    The authors describe a new approach for emulating the output of a fully coupled climate model under arbitrary forcing scenarios that is based on a small set of precomputed runs from the model. Temperature and precipitation are expressed as simple functions of the past trajectory of atmospheric CO2 concentrations, and a statistical model is fit using a limited set of training runs. The approach is demonstrated to be a useful and computationally efficient alternative to pattern scaling and captures the nonlinear evolution of spatial patterns of climate anomalies inherent in transient climates. The approach does as well as pattern scaling in all circumstances and substantially better in many; it is not computationally demanding; and, once the statistical model is fit, it produces emulated climate output effectively instantaneously. In conclusion, it may therefore find wide application in climate impacts assessments and other policy analyses requiring rapid climate projections.

  3. Comparison of hypertabastic survival model with other unimodal hazard rate functions using a goodness-of-fit test.

    PubMed

    Tahir, M Ramzan; Tran, Quang X; Nikulin, Mikhail S

    2017-05-30

    We studied the problem of testing a hypothesized distribution in survival regression models when the data is right censored and survival times are influenced by covariates. A modified chi-squared type test, known as Nikulin-Rao-Robson statistic, is applied for the comparison of accelerated failure time models. This statistic is used to test the goodness-of-fit for hypertabastic survival model and four other unimodal hazard rate functions. The results of simulation study showed that the hypertabastic distribution can be used as an alternative to log-logistic and log-normal distribution. In statistical modeling, because of its flexible shape of hazard functions, this distribution can also be used as a competitor of Birnbaum-Saunders and inverse Gaussian distributions. The results for the real data application are shown. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  4. Hypothesis testing of a change point during cognitive decline among Alzheimer's disease patients.

    PubMed

    Ji, Ming; Xiong, Chengjie; Grundman, Michael

    2003-10-01

    In this paper, we present a statistical hypothesis test for detecting a change point over the course of cognitive decline among Alzheimer's disease patients. The model under the null hypothesis assumes a constant rate of cognitive decline over time and the model under the alternative hypothesis is a general bilinear model with an unknown change point. When the change point is unknown, however, the null distribution of the test statistics is not analytically tractable and has to be simulated by parametric bootstrap. When the alternative hypothesis that a change point exists is accepted, we propose an estimate of its location based on the Akaike's Information Criterion. We applied our method to a data set from the Neuropsychological Database Initiative by implementing our hypothesis testing method to analyze Mini Mental Status Exam scores based on a random-slope and random-intercept model with a bilinear fixed effect. Our result shows that despite a large amount of missing data, accelerated decline did occur for MMSE among AD patients. Our finding supports the clinical belief of the existence of a change point during cognitive decline among AD patients and suggests the use of change point models for the longitudinal modeling of cognitive decline in AD research.
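
    A stripped-down, fixed-effects version of the comparison can be sketched as follows: fit a single-slope decline and a continuous bilinear decline with an unknown change point, profiling an information criterion over a grid of candidate change points. This is only an illustration with simulated scores; it omits the random effects, the parametric bootstrap null distribution, and the missing-data handling used in the paper.

```python
# Simplified sketch (fixed effects only): single-slope model versus a
# continuous bilinear model with an unknown change point, profiled over a grid.
import numpy as np

rng = np.random.default_rng(5)
t = np.linspace(0, 6, 40)                        # years of follow-up (simulated)
y = 28 - 1.0 * t - 2.5 * np.maximum(t - 3.0, 0) + rng.normal(0, 1.0, t.size)

def aic(y, yhat, k):
    n = y.size
    rss = np.sum((y - yhat) ** 2)
    return n * np.log(rss / n) + 2 * k

# Null model: constant rate of decline
X0 = np.column_stack([np.ones_like(t), t])
b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
aic_null = aic(y, X0 @ b0, k=2)

# Alternative: bilinear model, profiled over candidate change points
best = (np.inf, None)
for cp in np.linspace(1.0, 5.0, 41):
    X1 = np.column_stack([np.ones_like(t), t, np.maximum(t - cp, 0)])
    b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
    best = min(best, (aic(y, X1 @ b1, k=4), cp))   # k counts the change point too

print(f"AIC constant slope = {aic_null:.1f}")
print(f"AIC bilinear = {best[0]:.1f} at change point near {best[1]:.1f} years")
```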

  5. Statistical methods for investigating quiescence and other temporal seismicity patterns

    USGS Publications Warehouse

    Matthews, M.V.; Reasenberg, P.A.

    1988-01-01

    We propose a statistical model and a technique for objective recognition of one of the most commonly cited seismicity patterns: microearthquake quiescence. We use a Poisson process model for seismicity and define a process with quiescence as one with a particular type of piece-wise constant intensity function. From this model, we derive a statistic for testing stationarity against a 'quiescence' alternative. The large-sample null distribution of this statistic is approximated from simulated distributions of appropriate functionals applied to Brownian bridge processes. We point out the restrictiveness of the particular model we propose and of the quiescence idea in general. The fact that there are many point processes which have neither constant nor quiescent rate functions underscores the need to test for and describe nonuniformity thoroughly. We advocate the use of the quiescence test in conjunction with various other tests for nonuniformity and with graphical methods such as density estimation. Ideally, these methods may promote accurate description of temporal seismicity distributions and useful characterizations of interesting patterns. © 1988 Birkhäuser Verlag.

  6. Performance of Reclassification Statistics in Comparing Risk Prediction Models

    PubMed Central

    Paynter, Nina P.

    2012-01-01

    Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information. PMID:21294152
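
    The reclassification measures themselves are straightforward to compute from paired risk predictions. The sketch below evaluates a category-free (continuous) NRI and the IDI on simulated data; it is illustrative only and does not reproduce the simulation design or the reclassification calibration statistic examined in the article.

```python
# Sketch of the continuous (category-free) NRI and the IDI from predicted
# risks under a reference and an expanded model; data are simulated.
import numpy as np

rng = np.random.default_rng(9)
n = 2000
event = rng.random(n) < 0.2
p_old = np.clip(0.2 + rng.normal(0, 0.10, n) + 0.05 * event, 0.01, 0.99)
p_new = np.clip(p_old + 0.04 * np.where(event, 1, -1) + rng.normal(0, 0.02, n), 0.01, 0.99)

up = p_new > p_old
down = p_new < p_old

nri_events = up[event].mean() - down[event].mean()
nri_nonevents = down[~event].mean() - up[~event].mean()
nri = nri_events + nri_nonevents

idi = (p_new[event].mean() - p_old[event].mean()) - \
      (p_new[~event].mean() - p_old[~event].mean())

print(f"continuous NRI = {nri:.3f}  (events {nri_events:.3f}, non-events {nri_nonevents:.3f})")
print(f"IDI = {idi:.3f}")
```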

  7. The MAX Statistic is Less Powerful for Genome Wide Association Studies Under Most Alternative Hypotheses.

    PubMed

    Shifflett, Benjamin; Huang, Rong; Edland, Steven D

    2017-01-01

    Genotypic association studies are prone to inflated type I error rates if multiple hypothesis testing is performed, e.g., sequentially testing for recessive, multiplicative, and dominant risk. Alternatives to multiple hypothesis testing include the model-independent genotypic χ2 test, the efficiency robust MAX statistic, which corrects for multiple comparisons but with some loss of power, or a single Armitage test for multiplicative trend, which has optimal power when the multiplicative model holds but with some loss of power when dominant or recessive models underlie the genetic association. We used Monte Carlo simulations to describe the relative performance of these three approaches under a range of scenarios. All three approaches maintained their nominal type I error rates. The genotypic χ2 and MAX statistics were more powerful when testing a strictly recessive genetic effect or when testing a dominant effect when the allele frequency was high. The Armitage test for multiplicative trend was most powerful for the broad range of scenarios where heterozygote risk is intermediate between recessive and dominant risk. Moreover, all tests had limited power to detect recessive genetic risk unless the sample size was large, and conversely all tests were relatively well powered to detect dominant risk. Taken together, these results suggest the general utility of the multiplicative trend test when the underlying genetic model is unknown.
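
    The Armitage trend test for a 2 x 3 genotype table can be computed directly, as in the hedged sketch below. The genotype counts are hypothetical, and the implementation follows the standard trend-test formula with multiplicative scores (0, 1, 2) rather than any study-specific code.

```python
# Armitage (Cochran-Armitage) trend test for a 2 x 3 genotype table.
# Counts below are hypothetical.
import numpy as np
from scipy.stats import norm

cases = np.array([240, 210, 50])      # genotype counts aa, Aa, AA in cases
controls = np.array([290, 180, 30])   # genotype counts in controls
scores = np.array([0.0, 1.0, 2.0])    # multiplicative (additive) scores

R, S = cases.sum(), controls.sum()
N = R + S
col = cases + controls

T = np.sum(scores * (cases * S - controls * R))
var_T = (R * S / N) * (np.sum(scores**2 * col * (N - col))
                       - 2 * np.sum(np.triu(np.outer(scores, scores), k=1)
                                    * np.outer(col, col)))
z = T / np.sqrt(var_T)
p = 2 * norm.sf(abs(z))
print(f"z = {z:.2f}, two-sided p = {p:.3g}")
```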

  8. Probabilistic Evaluation of Competing Climate Models

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Chatterjee, S.; Heyman, M.; Cressie, N.

    2017-12-01

    A standard paradigm for assessing the quality of climate model simulations is to compare what these models produce for past and present time periods, to observations of the past and present. Many of these comparisons are based on simple summary statistics called metrics. Here, we propose an alternative: evaluation of competing climate models through probabilities derived from tests of the hypothesis that climate-model-simulated and observed time sequences share common climate-scale signals. The probabilities are based on the behavior of summary statistics of climate model output and observational data, over ensembles of pseudo-realizations. These are obtained by partitioning the original time sequences into signal and noise components, and using a parametric bootstrap to create pseudo-realizations of the noise sequences. The statistics we choose come from working in the space of decorrelated and dimension-reduced wavelet coefficients. We compare monthly sequences of CMIP5 model output of average global near-surface temperature anomalies to similar sequences obtained from the well-known HadCRUT4 data set, as an illustration.

  9. Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.

    PubMed

    Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A

    2010-11-01

    The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P= 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
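
    The multiplicative ICISS itself is simple to compute once survival risk ratios (SRRs) are available, as in the sketch below. The SRR values shown are placeholders for illustration; real SRRs are estimated from large registry data, and the paper's comparison additionally involves worst-injury and covariate-adjusted variants.

```python
# Minimal ICISS sketch: the score is the product of survival risk ratios (SRRs)
# for each of a patient's ICD-coded injuries. SRR values here are invented.
srr = {
    "S06.5": 0.85,   # traumatic subdural haemorrhage (hypothetical SRR)
    "S27.0": 0.92,   # traumatic pneumothorax (hypothetical SRR)
    "S72.0": 0.97,   # fracture of neck of femur (hypothetical SRR)
}

def iciss(injury_codes, srr_table):
    """Multiplicative ICISS: predicted probability of survival."""
    score = 1.0
    for code in injury_codes:
        score *= srr_table.get(code, 1.0)   # unknown codes treated as neutral
    return score

patient = ["S06.5", "S27.0", "S72.0"]
print(f"ICISS (predicted survival) = {iciss(patient, srr):.3f}")
# A worst-injury variant would instead use min(srr_table[c] for c in patient).
```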

  10. Shallow Turbulence in Rivers and Estuaries

    DTIC Science & Technology

    2012-09-30

    objectives are to: 1. Determine spatial patterns of shallow turbulence from in-situ and remote sensing data and investigate the effects and...production through a model parameter study, and determine the optimal model configuration that statistically reproduces the shallow turbulence...more probable cause. According to Nezu et al. (1993), longitudinal vorticity streets would cause alternating upwelling (boils) and down welling

  11. Utilization of Lymphoblastoid Cell Lines as a System for the Molecular Modeling of Autism

    ERIC Educational Resources Information Center

    Baron, Colin A.; Liu, Stephenie Y.; Hicks, Chindo; Gregg, Jeffrey P.

    2006-01-01

    In order to provide an alternative approach for understanding the biology and genetics of autism, we performed statistical analysis of gene expression profiles of lymphoblastoid cell lines derived from children with autism and their families. The goal was to assess the feasibility of using this model in identifying autism-associated genes.…

  12. Statistical distribution of mechanical properties for three graphite-epoxy material systems

    NASA Technical Reports Server (NTRS)

    Reese, C.; Sorem, J., Jr.

    1981-01-01

    Graphite-epoxy composites are playing an increasing role as viable alternative materials in structural applications, necessitating thorough investigation into the predictability and reproducibility of their material strength properties. This investigation was concerned with tension, compression, and short beam shear coupon testing of large samples from three different material suppliers to determine their statistical strength behavior. Statistical results indicate that a two-parameter Weibull distribution model provides better overall characterization of material behavior for the graphite-epoxy systems tested than does the standard Normal distribution model that is employed for most design work. While either a Weibull or Normal distribution model provides adequate predictions for average strength values, the Weibull model provides better characterization in the lower tail region, where the predictions are of maximum design interest. The two sets of the same material were found to have essentially the same material properties, indicating that repeatability can be achieved.
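    A minimal sketch of the comparison described above, assuming simulated coupon strengths rather than the original test data: fit a two-parameter Weibull distribution (location fixed at zero) and a Normal distribution to the same sample and compare their lower-tail quantiles, where design allowables are set.

```python
import numpy as np
from math import gamma
from scipy import stats

rng = np.random.default_rng(2)

# simulated coupon tensile strengths (ksi); illustrative, not the original test data
strength = stats.weibull_min.rvs(c=20.0, scale=220.0, size=150, random_state=rng)

# two-parameter Weibull fit (location fixed at zero) versus a Normal fit
shape, _, scale = stats.weibull_min.fit(strength, floc=0)
mu, sigma = stats.norm.fit(strength)

# compare the models where design interest is highest: the lower tail
for q in (0.01, 0.05, 0.10):
    w_q = stats.weibull_min.ppf(q, shape, scale=scale)
    n_q = stats.norm.ppf(q, mu, sigma)
    print(f"{q:.0%} quantile: Weibull {w_q:7.1f} ksi   Normal {n_q:7.1f} ksi")

# average strength is similar under either model
print(f"mean strength: Weibull {scale * gamma(1 + 1/shape):.1f}, Normal {mu:.1f}")
```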

  13. Empirical support for global integrated assessment modeling: Productivity trends and technological change in developing countries' agriculture and electric power sectors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sathaye, Jayant A.

    2000-04-01

    Integrated assessment (IA) modeling of climate policy is increasingly global in nature, with models incorporating regional disaggregation. The existing empirical basis for IA modeling, however, largely arises from research on industrialized economies. Given the growing importance of developing countries in determining long-term global energy and carbon emissions trends, filling this gap with improved statistical information on developing countries' energy and carbon-emissions characteristics is an important priority for enhancing IA modeling. Earlier research at LBNL on this topic has focused on assembling and analyzing statistical data on productivity trends and technological change in the energy-intensive manufacturing sectors of five developing countries: India, Brazil, Mexico, Indonesia, and South Korea. The proposed work will extend this analysis to the agriculture and electric power sectors in India, South Korea, and two other developing countries. It will also examine the impact of alternative model specifications on estimates of productivity growth and technological change for each of the three sectors, and estimate the contribution of various capital inputs (imported vs. indigenous, rigid vs. malleable) to productivity growth and technological change. The project has already produced a data resource on the manufacturing sector which is being shared with IA modelers. This will be extended to the agriculture and electric power sectors, which would also be made accessible to IA modeling groups seeking to enhance the empirical descriptions of developing country characteristics. The project will entail basic statistical and econometric analysis of productivity and energy trends in these developing country sectors, with parameter estimates also made available to modeling groups. The parameter estimates will be developed using alternative model specifications that could be directly utilized by the existing IAMs for the manufacturing, agriculture, and electric power sectors.

  14. A Weibull statistics-based lignocellulose saccharification model and a built-in parameter accurately predict lignocellulose hydrolysis performance.

    PubMed

    Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu

    2015-09-01

    Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment are discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
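    A Weibull-type saccharification curve of the kind described above can be fitted to a time course with a few lines of code. The sketch below assumes the functional form y(t) = ymax*(1 - exp(-(t/lambda)^n)) and uses made-up data, so the parameter values are purely illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull_conversion(t, lam, n, ymax):
    """Weibull-type saccharification curve: fraction of substrate hydrolyzed at time t."""
    return ymax * (1.0 - np.exp(-(t / lam) ** n))

# made-up time course (hours, fraction of glucan converted)
t = np.array([1, 2, 4, 8, 12, 24, 48, 72, 96], dtype=float)
y = np.array([0.06, 0.10, 0.18, 0.31, 0.40, 0.55, 0.68, 0.73, 0.75])

(lam, n, ymax), _ = curve_fit(
    weibull_conversion, t, y, p0=[24.0, 0.8, 0.8],
    bounds=([1e-6, 1e-6, 0.0], [np.inf, np.inf, 1.0]),
)
print(f"lambda (characteristic time) = {lam:.1f} h, n = {n:.2f}, plateau = {ymax:.2f}")
# a smaller lambda indicates a better-performing saccharification system overall
```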

  15. The researcher and the consultant: from testing to probability statements.

    PubMed

    Hamra, Ghassan B; Stang, Andreas; Poole, Charles

    2015-09-01

    In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.

  16. Rational approximations to rational models: alternative algorithms for category learning.

    PubMed

    Sanborn, Adam N; Griffiths, Thomas L; Navarro, Daniel J

    2010-10-01

    Rational models of cognition typically consider the abstract computational problems posed by the environment, assuming that people are capable of optimally solving those problems. This differs from more traditional formal models of cognition, which focus on the psychological processes responsible for behavior. A basic challenge for rational models is thus explaining how optimal solutions can be approximated by psychological processes. We outline a general strategy for answering this question, namely to explore the psychological plausibility of approximation algorithms developed in computer science and statistics. In particular, we argue that Monte Carlo methods provide a source of rational process models that connect optimal solutions to psychological processes. We support this argument through a detailed example, applying this approach to Anderson's (1990, 1991) rational model of categorization (RMC), which involves a particularly challenging computational problem. Drawing on a connection between the RMC and ideas from nonparametric Bayesian statistics, we propose 2 alternative algorithms for approximate inference in this model. The algorithms we consider include Gibbs sampling, a procedure appropriate when all stimuli are presented simultaneously, and particle filters, which sequentially approximate the posterior distribution with a small number of samples that are updated as new data become available. Applying these algorithms to several existing datasets shows that a particle filter with a single particle provides a good description of human inferences.

  17. Unbiased split variable selection for random survival forests using maximally selected rank statistics.

    PubMed

    Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas

    2017-04-15

    The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.

  18. Estimating economic thresholds for pest control: an alternative procedure.

    PubMed

    Ramirez, O A; Saunders, J L

    1999-04-01

    An alternative methodology to determine profit maximizing economic thresholds is developed and illustrated. An optimization problem based on the main biological and economic relations involved in determining a profit maximizing economic threshold is first advanced. From it, a more manageable model of 2 nonsimultaneous reduced-from equations is derived, which represents a simpler but conceptually and statistically sound alternative. The model recognizes that yields and pest control costs are a function of the economic threshold used. Higher (less strict) economic thresholds can result in lower yields and, therefore, a lower gross income from the sale of the product, but could also be less costly to maintain. The highest possible profits will be obtained by using the economic threshold that results in a maximum difference between gross income and pest control cost functions.
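    A toy numerical version of the reduced-form idea above, with assumed (not estimated) yield and cost functions: both expected yield and control cost decline as the threshold is relaxed, and the profit-maximizing threshold is found by a grid search over their difference.

```python
import numpy as np

# Illustrative reduced-form relations (functional forms and parameters are assumptions):
# a stricter (lower) threshold preserves yield but costs more to maintain.
thresholds = np.linspace(0.5, 10.0, 200)      # pests per plant that trigger control
price = 300.0                                  # $ per tonne of product

def expected_yield(et):                        # tonnes/ha, declining with laxer thresholds
    return 5.0 / (1.0 + 0.02 * et**1.5)

def control_cost(et):                          # $/ha, declining with laxer thresholds
    return 400.0 * np.exp(-0.3 * et)

profit = price * expected_yield(thresholds) - control_cost(thresholds)
best = thresholds[np.argmax(profit)]
print(f"profit-maximizing economic threshold = {best:.2f} pests per plant")
```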

  19. Do labeled versus unlabeled treatments of alternatives' names influence stated choice outputs? Results from a mode choice study.

    PubMed

    Jin, Wen; Jiang, Hai; Liu, Yimin; Klampfl, Erica

    2017-01-01

    Discrete choice experiments have been widely applied to elicit behavioral preferences in the literature. In many of these experiments, the alternatives are named alternatives, meaning that they are naturally associated with specific names. For example, in a mode choice study, the alternatives can be associated with names such as car, taxi, bus, and subway. A fundamental issue that arises in stated choice experiments is whether to treat the alternatives' names as labels (that is, labeled treatment), or as attributes (that is, unlabeled treatment) in the design as well as the presentation phases of the choice sets. In this research, we investigate the impact of labeled versus unlabeled treatments of alternatives' names on the outcome of stated choice experiments, a question that has not been thoroughly investigated in the literature. Using results from a mode choice study, we find that the labeled or the unlabeled treatment of alternatives' names in either the design or the presentation phase of the choice experiment does not statistically affect the estimates of the coefficient parameters. We then proceed to measure the influence toward the willingness-to-pay (WTP) estimates. By using a random-effects model to relate the conditional WTP estimates to the socioeconomic characteristics of the individuals and the labeled versus unlabeled treatments of alternatives' names, we find that: a) Given the treatment of alternatives' names in the presentation phase, the treatment of alternatives' names in the design phase does not statistically affect the estimates of the WTP measures; and b) Given the treatment of alternatives' names in the design phase, the labeled treatment of alternatives' names in the presentation phase causes the corresponding WTP estimates to be slightly higher.

  20. Gene-expression programming for flip-bucket spillway scour.

    PubMed

    Guven, Aytac; Azamathulla, H Md

    2012-01-01

    During the last two decades, researchers have noticed that the use of soft computing techniques as an alternative to conventional statistical methods based on controlled laboratory or field data gave significantly better results. Gene-expression programming (GEP), which is an extension of genetic programming (GP), has nowadays attracted the attention of researchers in the prediction of hydraulic data. This study presents GEP as an alternative tool in the prediction of scour downstream of a flip-bucket spillway. Actual field measurements were used to develop GEP models. The proposed GEP models are compared with the earlier conventional GP results of others (Azamathulla et al. 2008b; RMSE = 2.347, δ = 0.377, R = 0.842) and those of commonly used regression-based formulae. The predictions of the GEP models were in close agreement with the measured values, and considerably better than those of conventional GP and the regression-based formulae. The results are tabulated in terms of statistical error measures (GEP1; RMSE = 1.596, δ = 0.109, R = 0.917) and illustrated via scatter plots.

  1. Sexual network drivers of HIV and herpes simplex virus type 2 transmission

    PubMed Central

    Omori, Ryosuke; Abu-Raddad, Laith J.

    2017-01-01

    Objectives: HIV and herpes simplex virus type 2 (HSV-2) infections are sexually transmitted and propagate in sexual networks. Using mathematical modeling, we aimed to quantify effects of key network statistics on infection transmission, and extent to which HSV-2 prevalence can be a proxy of HIV prevalence. Design/methods: An individual-based simulation model was constructed to describe sex partnering and infection transmission, and was parameterized with representative natural history, transmission, and sexual behavior data. Correlations were assessed on model outcomes (HIV/HSV-2 prevalences) and multiple linear regressions were conducted to estimate adjusted associations and effect sizes. Results: HIV prevalence was one-third or less of HSV-2 prevalence. HIV and HSV-2 prevalences were associated with a Spearman's rank correlation coefficient of 0.64 (95% confidence interval: 0.58–0.69). Collinearities among network statistics were detected, most notably between concurrency versus mean and variance of number of partners. Controlling for confounding, unmarried mean/variance of number of partners (or alternatively concurrency) were the strongest predictors of HIV prevalence. Meanwhile, unmarried/married mean/variance of number of partners (or alternatively concurrency), and clustering coefficient were the strongest predictors of HSV-2 prevalence. HSV-2 prevalence was a strong predictor of HIV prevalence by proxying effects of network statistics. Conclusion: Network statistics produced similar and differential effects on HIV/HSV-2 transmission, and explained most of the variation in HIV and HSV-2 prevalences. HIV prevalence reflected primarily mean and variance of number of partners, but HSV-2 prevalence was affected by a range of network statistics. HSV-2 prevalence (as a proxy) can forecast a population's HIV epidemic potential, thereby informing interventions. PMID:28514276

  2. How to interpret the results of medical time series data analysis: Classical statistical approaches versus dynamic Bayesian network modeling.

    PubMed

    Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall

    2016-01-01

    Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approaches, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis also discusses several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach (1) is much more flexible in terms of modeling effort and (2) offers an individualized risk assessment, which is more cumbersome to obtain with classical statistical approaches.

  3. Cancer Survival Estimates Due to Non-Uniform Loss to Follow-Up and Non-Proportional Hazards

    PubMed

    K M, Jagathnath Krishna; Mathew, Aleyamma; Sara George, Preethi

    2017-06-25

    Background: Cancer survival estimates depend on loss to follow-up (LFU) and non-proportional hazards (non-PH). If LFU is high, survival will be over-estimated. If a hazard is non-PH, rank tests will provide biased inference and the Cox model will provide a biased hazard ratio. We assessed the bias due to LFU and a non-PH factor in cancer survival and provided alternate methods for unbiased inference and hazard ratios. Materials and Methods: Kaplan-Meier survival curves were plotted using a realistic breast cancer (BC) data-set with >40% 5-year LFU and compared with another BC data-set with <15% 5-year LFU to assess the bias in survival due to high LFU. Age at diagnosis in the latter data set was used to illustrate the bias due to a non-PH factor. The log-rank test was employed to assess the bias in the p-value and the Cox model was used to assess the bias in the hazard ratio for the non-PH factor. The Schoenfeld statistic was used to test the non-PH of age. For the non-PH factor, we employed the Renyi statistic for inference and a time-dependent Cox model for the hazard ratio. Results: Five-year BC survival was 69% (SE: 1.1%) vs. 90% (SE: 0.7%) for data with low vs. high LFU, respectively. Age (<45, 46-54 & >54 years) was a non-PH factor (p-value: 0.036). Survival by age was significant with the log-rank test (p-value: 0.026), but not significant using the Renyi statistic (p=0.067). The hazard ratio (HR) for age using the Cox model was 1.012 (95% CI: 1.004-1.019), whereas the time-dependent Cox model gave an estimate in the other direction (HR: 0.997; 95% CI: 0.997-0.998). Conclusion: Over-estimated survival was observed for cancer data with high LFU. The log-rank statistic and Cox model provided biased results for the non-PH factor. For data with non-PH factors, the Renyi statistic and time-dependent Cox model can be used as alternate methods to obtain unbiased inference and estimates.
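    As a hedged illustration of the workflow described above (the study itself does not specify software), the Python lifelines package provides the Kaplan-Meier estimator, the Cox model, and a Schoenfeld-residual-based proportional hazards test; the file and column names below are assumptions.

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import proportional_hazard_test

# assumed columns: 'time' (months), 'event' (1=death, 0=censored or lost), 'age_group'
df = pd.read_csv("breast_cancer_registry.csv")   # hypothetical file

# Kaplan-Meier curve; heavy loss to follow-up appears as dense early censoring
kmf = KaplanMeierFitter()
kmf.fit(df["time"], event_observed=df["event"])
print(kmf.survival_function_.tail())

# Cox model and a Schoenfeld-residual test of the proportional hazards assumption
cph = CoxPHFitter()
cph.fit(df[["time", "event", "age_group"]], duration_col="time", event_col="event")
ph_test = proportional_hazard_test(cph, df, time_transform="rank")
ph_test.print_summary()   # a small p-value flags age_group as non-proportional

# if PH fails, alternatives include Renyi-type (supremum) tests and a time-dependent
# Cox model, e.g. lifelines' CoxTimeVaryingFitter fitted to episode-split data
```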

  4. Comparison of LIDAR system performance for alternative single-mode receiver architectures: modeling and experimental validation

    NASA Astrophysics Data System (ADS)

    Toliver, Paul; Ozdur, Ibrahim; Agarwal, Anjali; Woodward, T. K.

    2013-05-01

    In this paper, we describe a detailed performance comparison of alternative single-pixel, single-mode LIDAR architectures including (i) linear-mode APD-based direct-detection, (ii) optically-preamplified PIN receiver, (iii) PINbased coherent-detection, and (iv) Geiger-mode single-photon-APD counting. Such a comparison is useful when considering next-generation LIDAR on a chip, which would allow one to leverage extensive waveguide-based structures and processing elements developed for telecom and apply them to small form-factor sensing applications. Models of four LIDAR transmit and receive systems are described in detail, which include not only the dominant sources of receiver noise commonly assumed in each of the four detection limits, but also additional noise terms present in realistic implementations. These receiver models are validated through the analysis of detection statistics collected from an experimental LIDAR testbed. The receiver is reconfigurable into four modes of operation, while transmit waveforms and channel characteristics are held constant. The use of a diffuse hard target highlights the importance of including speckle noise terms in the overall system analysis. All measurements are done at 1550 nm, which offers multiple system advantages including less stringent eye safety requirements and compatibility with available telecom components, optical amplification, and photonic integration. Ultimately, the experimentally-validated detection statistics can be used as part of an end-to-end system model for projecting rate, range, and resolution performance limits and tradeoffs of alternative integrated LIDAR architectures.

  5. Statistics of voids in hierarchical universes

    NASA Technical Reports Server (NTRS)

    Fry, J. N.

    1986-01-01

    As one alternative to the N-point galaxy correlation function statistics, the distribution of holes or the probability that a volume of given size and shape be empty of galaxies can be considered. The probability of voids resulting from a variety of hierarchical patterns of clustering is considered, and these are compared with the results of numerical simulations and with observations. A scaling relation required by the hierarchical pattern of higher order correlation functions is seen to be obeyed in the simulations, and the numerical results show a clear difference between neutrino models and cold-particle models; voids are more likely in neutrino universes. Observational data do not yet distinguish but are close to being able to distinguish between models.

  6. Optimal region of latching activity in an adaptive Potts model for networks of neurons

    NASA Astrophysics Data System (ADS)

    Abdollah-nia, Mohammad-Farshad; Saeedghalati, Mohammadkarim; Abbassian, Abdolhossein

    2012-02-01

    In statistical mechanics, the Potts model is a model for interacting spins with more than two discrete states. Neural networks which exhibit features of learning and associative memory can also be modeled by a system of Potts spins. A spontaneous behavior of hopping from one discrete attractor state to another (referred to as latching) has been proposed to be associated with higher cognitive functions. Here we propose a model in which both the stochastic dynamics of Potts models and an adaptive potential function are present. A latching dynamics is observed in a limited region of the noise (temperature) versus adaptation parameter space. We hence suggest noise as a fundamental factor in such alternations alongside adaptation. From a dynamical systems point of view, the noise-adaptation alternations may be the underlying mechanism for multi-stability in attractor-based models. An optimality criterion for realistic models is finally inferred.

  7. Score tests for independence in semiparametric competing risks models.

    PubMed

    Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul

    2009-12-01

    A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

  8. Twice random, once mixed: applying mixed models to simultaneously analyze random effects of language and participants.

    PubMed

    Janssen, Dirk P

    2012-03-01

    Psychologists, psycholinguists, and other researchers using language stimuli have been struggling for more than 30 years with the problem of how to analyze experimental data that contain two crossed random effects (items and participants). The classical analysis of variance does not apply; alternatives have been proposed but have failed to catch on, and a statistically unsatisfactory procedure of using two approximations (known as F(1) and F(2)) has become the standard. A simple and elegant solution using mixed model analysis has been available for 15 years, and recent improvements in statistical software have made mixed model analysis widely available. The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages. The article also introduces the DJMIXED add-on package for SPSS, which makes entering the models and reporting their results as straightforward as possible.
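    The article's worked examples target SPSS; as a rough Python analogue (an assumption, not the authors' code), crossed random intercepts for participants and items can be expressed in statsmodels as variance components inside a single all-encompassing group, as sketched below with hypothetical file and column names.

```python
import pandas as pd
import statsmodels.formula.api as smf

# assumed columns: 'rt' (response), 'condition', 'subject', 'item'
d = pd.read_csv("lexical_decision.csv")   # hypothetical file

# crossed random intercepts for participants and items, written as variance
# components within one all-encompassing group (a common statsmodels idiom)
d["all"] = 1
model = smf.mixedlm(
    "rt ~ condition",
    data=d,
    groups="all",
    vc_formula={"subject": "0 + C(subject)", "item": "0 + C(item)"},
)
result = model.fit()
print(result.summary())   # fixed effect of condition plus the two variance components
```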

  9. Modeling Count Outcomes from HIV Risk Reduction Interventions: A Comparison of Competing Statistical Models for Count Responses

    PubMed Central

    Xia, Yinglin; Morrison-Beedy, Dianne; Ma, Jingming; Feng, Changyong; Cross, Wendi; Tu, Xin

    2012-01-01

    Modeling count data from sexual behavioral outcomes involves many challenges, especially when the data exhibit a preponderance of zeros and overdispersion. In particular, the popular Poisson log-linear model is not appropriate for modeling such outcomes. Although alternatives exist for addressing both issues, they are not widely and effectively used in sexual health research, especially in HIV prevention intervention and related studies. In this paper, we discuss how to analyze count outcomes with an excess of zeros and overdispersion and introduce appropriate model-fit indices for comparing the performance of competing models, using data from a real study on HIV prevention intervention. The in-depth look at these common issues arising from studies involving behavioral outcomes will promote sound statistical analyses and facilitate research in this and other related areas. PMID:22536496
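    A minimal sketch of the kind of comparison discussed above, using statsmodels and simulated zero-heavy counts (the data-generating process and settings are assumptions, not the study's data): fit Poisson, negative binomial, and zero-inflated variants and compare information criteria.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import (ZeroInflatedNegativeBinomialP,
                                              ZeroInflatedPoisson)

rng = np.random.default_rng(3)

# simulated zero-heavy count outcome with one covariate
n = 1000
x = rng.normal(size=n)
X = sm.add_constant(x)
not_structural_zero = rng.binomial(1, 0.6, size=n)          # ~40% structural zeros
mean = np.exp(0.5 + 0.4 * x)
y = not_structural_zero * rng.negative_binomial(2, 2.0 / (2.0 + mean), size=n)

fits = {
    "Poisson": sm.Poisson(y, X).fit(disp=0),
    "NegBin": sm.NegativeBinomial(y, X).fit(disp=0),
    "ZIP": ZeroInflatedPoisson(y, X, exog_infl=X).fit(disp=0, maxiter=200),
    "ZINB": ZeroInflatedNegativeBinomialP(y, X, exog_infl=X).fit(disp=0, maxiter=200),
}
for name, res in fits.items():
    print(f"{name:8s} AIC = {res.aic:9.1f}   BIC = {res.bic:9.1f}")
```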

  10. Statistical emulators of maize, rice, soybean and wheat yields from global gridded crop models

    DOE PAGES

    Blanc, Élodie

    2017-01-26

    This study provides statistical emulators of crop yields based on global gridded crop model simulations from the Inter-Sectoral Impact Model Intercomparison Project Fast Track project. The ensemble of simulations is used to build a panel of annual crop yields from five crop models and corresponding monthly summer weather variables for over a century at the grid cell level globally. This dataset is then used to estimate, for each crop and gridded crop model, the statistical relationship between yields, temperature, precipitation and carbon dioxide. This study considers a new functional form to better capture the non-linear response of yields to weather, especially for extreme temperature and precipitation events, and now accounts for the effect of soil type. In- and out-of-sample validations show that the statistical emulators are able to replicate the spatial patterns of crop yield levels and changes over time projected by the crop models reasonably well, although the accuracy of the emulators varies by model and by region. This study therefore provides a reliable and accessible alternative to global gridded crop yield models. By emulating crop yields for several models using parsimonious equations, the tools provide a computationally efficient method to account for uncertainty in climate change impact assessments.
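    A heavily simplified sketch of what such an emulator can look like; the quadratic-in-weather specification, file name, and column names below are assumptions, not the paper's exact functional form.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# assumed panel: one row per grid cell and year, with crop-model 'yield_t',
# growing-season temperature 'T' (deg C), precipitation 'P' (mm) and 'CO2' (ppm)
panel = pd.read_csv("gridded_crop_model_output.csv")   # hypothetical file

train = panel.sample(frac=0.8, random_state=0)
test = panel.drop(train.index)

# a parsimonious response surface in weather and CO2 (an assumed specification)
emulator = smf.ols(
    "np.log(yield_t) ~ T + I(T**2) + P + I(P**2) + T:P + np.log(CO2)",
    data=train,
).fit()
print(emulator.params)

# out-of-sample check against the crop model the emulator is meant to replace
pred = np.exp(emulator.predict(test))
rmse = np.sqrt(np.mean((pred - test["yield_t"]) ** 2))
print(f"holdout RMSE = {rmse:.2f}")
```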

  11. Statistical emulators of maize, rice, soybean and wheat yields from global gridded crop models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blanc, Élodie

    This study provides statistical emulators of crop yields based on global gridded crop model simulations from the Inter-Sectoral Impact Model Intercomparison Project Fast Track project. The ensemble of simulations is used to build a panel of annual crop yields from five crop models and corresponding monthly summer weather variables for over a century at the grid cell level globally. This dataset is then used to estimate, for each crop and gridded crop model, the statistical relationship between yields, temperature, precipitation and carbon dioxide. This study considers a new functional form to better capture the non-linear response of yields to weather, especially for extreme temperature and precipitation events, and now accounts for the effect of soil type. In- and out-of-sample validations show that the statistical emulators are able to replicate the spatial patterns of crop yield levels and changes over time projected by the crop models reasonably well, although the accuracy of the emulators varies by model and by region. This study therefore provides a reliable and accessible alternative to global gridded crop yield models. By emulating crop yields for several models using parsimonious equations, the tools provide a computationally efficient method to account for uncertainty in climate change impact assessments.

  12. Examining the DSM-5 alternative model of personality disorders operationalization of obsessive-compulsive personality disorder in a mental health sample.

    PubMed

    Liggett, Jacqueline; Sellbom, Martin

    2018-06-21

    The current study evaluated the continuity between the diagnostic operationalizations of obsessive-compulsive personality disorder (OCPD) in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, both as traditionally operationalized and from the perspective of the alternative model of personality disorders. Using both self-report and informant measures, the study had the following four aims: (a) to examine the extent to which self-report and informant data correspond, (b) to investigate whether both self-report and informant measures of the alternative model of OCPD can predict traditional OCPD, (c) to determine if any traits additional to those proposed in the alternative model of OCPD can predict traditional OCPD, and (d) to investigate whether a measure of OCPD-specific impairment is better at predicting traditional OCPD than are measures of general impairment in personality functioning. A mental health sample of 214 participants was recruited and administered measures of both the traditional and alternative models of OCPD. Self-report data moderately corresponded with informant data, which is consistent with the literature. Results further confirmed rigid perfectionism as the core trait of OCPD. Perseveration and workaholism were also associated with OCPD. Hostility was identified as a trait deserving further research. A measure of OCPD-specific impairment demonstrated its ability to incrementally predict OCPD over general measures of impairment. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  13. Addressing the mischaracterization of extreme rainfall in regional climate model simulations - A synoptic pattern based bias correction approach

    NASA Astrophysics Data System (ADS)

    Li, Jingwan; Sharma, Ashish; Evans, Jason; Johnson, Fiona

    2018-01-01

    Addressing systematic biases in regional climate model simulations of extreme rainfall is a necessary first step before assessing changes in future rainfall extremes. Commonly used bias correction methods are designed to match statistics of the overall simulated rainfall with observations. This assumes that change in the mix of different types of extreme rainfall events (i.e. convective and non-convective) in a warmer climate is of little relevance in the estimation of overall change, an assumption that is not supported by empirical or physical evidence. This study proposes an alternative approach to account for the potential change of alternate rainfall types, characterized here by synoptic weather patterns (SPs) using self-organizing maps classification. The objective of this study is to evaluate the added influence of SPs on the bias correction, which is achieved by comparing the corrected distribution of future extreme rainfall with that using conventional quantile mapping. A comprehensive synthetic experiment is first defined to investigate the conditions under which the additional information of SPs makes a significant difference to the bias correction. Using over 600,000 synthetic cases, statistically significant differences are found to be present in 46% cases. This is followed by a case study over the Sydney region using a high-resolution run of the Weather Research and Forecasting (WRF) regional climate model, which indicates a small change in the proportions of the SPs and a statistically significant change in the extreme rainfall over the region, although the differences between the changes obtained from the two bias correction methods are not statistically significant.
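    The sketch below illustrates the contrast between conventional empirical quantile mapping and a version applied separately within each synoptic-pattern (SP) class; the SOM classification itself is not reproduced, and the SP labels, parameters, and toy data are assumptions.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_future):
    """Empirical quantile mapping: pass each future value through the historical
    model CDF and read off the corresponding observed quantile."""
    q = np.searchsorted(np.sort(model_hist), model_future) / float(len(model_hist))
    return np.quantile(obs_hist, np.clip(q, 0.0, 1.0))

def sp_conditioned_map(model_hist, obs_hist, model_future, sp_hist, sp_future):
    """Quantile mapping applied separately within each synoptic-pattern class;
    assumes observations and historical model output carry the same SP labels."""
    corrected = np.empty_like(model_future, dtype=float)
    for sp in np.unique(sp_future):
        h, f = sp_hist == sp, sp_future == sp
        corrected[f] = quantile_map(model_hist[h], obs_hist[h], model_future[f])
    return corrected

# toy daily rainfall (mm): the model is too dry and mixes two SP classes
rng = np.random.default_rng(7)
sp_hist = rng.integers(0, 2, 5000)
sp_future = rng.integers(0, 2, 5000)
obs_hist = rng.gamma(0.6 + 0.8 * sp_hist, 8.0)
model_hist = rng.gamma(0.6 + 0.8 * sp_hist, 5.0)
model_future = rng.gamma(0.6 + 0.9 * sp_future, 5.0)

plain = quantile_map(model_hist, obs_hist, model_future)
by_sp = sp_conditioned_map(model_hist, obs_hist, model_future, sp_hist, sp_future)
print("99th percentile:", np.percentile(plain, 99), "vs", np.percentile(by_sp, 99))
```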

  14. Detecting temporal change in freshwater fisheries surveys: statistical power and the important linkages between management questions and monitoring objectives

    USGS Publications Warehouse

    Wagner, Tyler; Irwin, Brian J.; James R. Bence,; Daniel B. Hayes,

    2016-01-01

    Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
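    A minimal simulation-based power calculation of the kind reviewed above, under assumed survey parameters: generate an index series with a known linear trend and noise, fit ordinary least squares each time, and record how often the slope is detected.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def trend_power(n_years, slope, sigma, n_sim=2000, alpha=0.05):
    """Proportion of simulated surveys in which an OLS test detects a linear trend."""
    years = np.arange(n_years)
    detected = 0
    for _ in range(n_sim):
        index = 100.0 + slope * years + rng.normal(0.0, sigma, size=n_years)
        detected += stats.linregress(years, index).pvalue < alpha
    return detected / n_sim

# e.g. a 2-unit-per-year decline in an abundance index with substantial survey noise
for n_years in (5, 10, 20, 30):
    print(f"{n_years:2d} years of monitoring: power = {trend_power(n_years, -2.0, 15.0):.2f}")
```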

  15. Occupational injury costs and alternative employment in construction trades.

    PubMed

    Waehrer, Geetha M; Dong, Xiuwen S; Miller, Ted; Men, Yurong; Haile, Elizabeth

    2007-11-01

    To present the costs of fatal and non-fatal days-away-from-work injuries in 50 construction occupations. Our results also provide indirect evidence on the cost exposure of alternative construction workers such as independent contractors, on-call or day labor, contract workers, and temporary workers. We combine data from the Bureau of Labor Statistics on average annual incidence from 2000 to 2002 with updated per-case costs from an existing cost model for occupational injuries. The Current Population Survey provides data on the percentage of alternative construction workers. Construction laborers and carpenters were the two costliest occupations, with 40% of the industry's injury costs. The 10 costliest construction occupations also have a high percentage of alternative workers. The construction industry has both a high rate of alternative employment and high costs of work injury. Alternative workers, often lacking workers' compensation, are especially exposed to injury costs.

  16. Statistical modeling of yield and variance instability in conventional and organic cropping systems

    USDA-ARS?s Scientific Manuscript database

    Cropping systems research was undertaken to address declining crop diversity and verify competitiveness of alternatives to the predominant conventional cropping system in the northern Corn Belt. To understand and capitalize on temporal yield variability within corn and soybean fields, we quantified ...

  17. A STATISTICAL MODELING METHODOLOGY FOR THE DETECTION, QUANTIFICATION, AND PREDICTION OF ECOLOGICAL THRESHOLDS

    EPA Science Inventory

    This study will provide a general methodology for integrating threshold information from multiple species ecological metrics, allow for prediction of changes of alternative stable states, and provide a risk assessment tool that can be applied to adaptive management. The integr...

  18. Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate

    NASA Astrophysics Data System (ADS)

    Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno

    2017-03-01

    This paper presents statistical analyses of rock engineering properties and the measured penetration rate of a tunnel boring machine (TBM) based on data from an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four statistical regression models (two linear and two nonlinear) are built to predict the ROP of the TBM. Finally, a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict TBM performance. The R-squared value (R2) of the fuzzy logic model is the highest, at 0.714, compared with 0.667 for the next-best model, the multiple-variable nonlinear regression.

  19. Description of surface transport in the region of the Belizean Barrier Reef based on observations and alternative high-resolution models

    NASA Astrophysics Data System (ADS)

    Lindo-Atichati, D.; Curcic, M.; Paris, C. B.; Buston, P. M.

    2016-10-01

    The gains from implementing high-resolution versus less costly low-resolution models to describe coastal circulation are not always clear, often lacking statistical evaluation. Here we construct a hierarchy of ocean-atmosphere models operating at multiple scales within a 1 × 1° domain of the Belizean Barrier Reef (BBR). The various components of the atmosphere-ocean models are evaluated with in situ observations of surface drifters, wind and sea surface temperature. First, we compare the dispersion and velocity of 55 surface drifters released in the field in summer 2013 to the dispersion and velocity of simulated drifters under alternative model configurations. Increasing the resolution of the ocean model (from 1/12° to 1/100°, from 1 day to 1 h) and atmosphere model forcing (from 1/2° to 1/100°, from 6 h to 1 h), and incorporating tidal forcing incrementally reduces discrepancy between simulated and observed velocities and dispersion. Next, in trying to understand why the high-resolution models improve prediction, we find that resolving both the diurnal sea-breeze and semi-diurnal tides is key to improving the Lagrangian statistics and transport predictions along the BBR. Notably, the model with the highest ocean-atmosphere resolution and with tidal forcing generates a higher number of looping trajectories and sub-mesoscale coherent structures that are otherwise unresolved. Finally, simulations conducted with this model from June to August of 2013 show an intensification of the velocity fields throughout the summer and reveal a mesoscale anticyclonic circulation around Glovers Reef, and sub-mesoscale cyclonic eddies formed in the vicinity of Columbus Island. This study provides a general framework to assess the best surface transport prediction from alternative ocean-atmosphere models using metrics derived from high frequency drifters' data and meteorological stations.

  20. Evaluating model structure adequacy: The case of the Maggia Valley groundwater system, southern Switzerland

    USGS Publications Warehouse

    Hill, Mary C.; L. Foglia,; S. W. Mehl,; P. Burlando,

    2013-01-01

    Model adequacy is evaluated with alternative models rated using model selection criteria (AICc, BIC, and KIC) and three other statistics. Model selection criteria are tested with cross-validation experiments and insights for using alternative models to evaluate model structural adequacy are provided. The study is conducted using the computer codes UCODE_2005 and MMA (MultiModel Analysis). One recharge alternative is simulated using the TOPKAPI hydrological model. The predictions evaluated include eight heads and three flows located where ecological consequences and model precision are of concern. Cross-validation is used to obtain measures of prediction accuracy. Sixty-four models were designed deterministically and differ in representation of river, recharge, bedrock topography, and hydraulic conductivity. Results include: (1) What may seem like inconsequential choices in model construction may be important to predictions. Analysis of predictions from alternative models is advised. (2) None of the model selection criteria consistently identified models with more accurate predictions. This is a disturbing result that suggests reconsidering the utility of model selection criteria, and/or the cross-validation measures used in this work to measure model accuracy. (3) KIC displayed poor performance for the present regression problems; theoretical considerations suggest that the difficulties are associated with wide variations in the sensitivity term of KIC resulting from the models being nonlinear and the problems being ill-posed due to parameter correlations and insensitivity. The other criteria performed somewhat better, and similarly to each other. (4) Quantities with high leverage are more difficult to predict. The results are expected to be generally applicable to models of environmental systems.
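    For reference, the two most familiar criteria in the list above can be computed directly from each alternative model's maximized log-likelihood; the sketch below uses made-up values, and KIC is omitted because it additionally requires the sensitivity (Fisher information) term discussed in the abstract.

```python
import numpy as np

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample corrected AIC."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    return -2.0 * loglik + k * np.log(n)

# illustrative log-likelihoods and parameter counts for three alternative
# groundwater models fitted to the same n observations (values are made up)
n = 85
models = {"river_A": (-210.4, 7), "river_B": (-207.9, 9), "river_C": (-206.8, 12)}
for name, (ll, k) in models.items():
    print(f"{name}: AICc = {aicc(ll, k, n):7.1f}   BIC = {bic(ll, k, n):7.1f}")
# as the study above cautions, ranking models by these criteria does not
# guarantee that the top-ranked model gives the most accurate predictions
```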

  1. Introducing linear functions: an alternative statistical approach

    NASA Astrophysics Data System (ADS)

    Nolan, Caroline; Herbert, Sandra

    2015-12-01

    The introduction of linear functions is the turning point where many students decide if mathematics is useful or not. This means the role of parameters and variables in linear functions could be considered to be `threshold concepts'. There is recognition that linear functions can be taught in context through the exploration of linear modelling examples, but this has its limitations. Currently, statistical data is easily attainable, and graphics or computer algebra system (CAS) calculators are common in many classrooms. The use of this technology provides ease of access to different representations of linear functions as well as the ability to fit a least-squares line for real-life data. This means these calculators could support a possible alternative approach to the introduction of linear functions. This study compares the results of an end-of-topic test for two classes of Australian middle secondary students at a regional school to determine if such an alternative approach is feasible. In this study, test questions were grouped by concept and subjected to concept by concept analysis of the means of test results of the two classes. This analysis revealed that the students following the alternative approach demonstrated greater competence with non-standard questions.

  2. Predictors of the number of under-five malnourished children in Bangladesh: application of the generalized poisson regression model

    PubMed Central

    2013-01-01

    Background Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. To our knowledge, most of the available studies that addressed the issue of malnutrition among under-five children considered categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study the malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative to other statistical methods and (ii) to find some predictors of this outcome variable. Methods The data are extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques, namely the Chi-square test and the GPR model. Results The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance < mean) property. Our study also identifies several significant predictors of the outcome variable, namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Conclusions Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative to other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh. PMID:23297699
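    statsmodels offers a generalized Poisson likelihood whose dispersion parameter can accommodate under-dispersion; the sketch below is an assumed workflow (file and variable names are placeholders), not the authors' analysis.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import GeneralizedPoisson, Poisson

# assumed file: one row per family, with the count outcome plus covariates
bdhs = pd.read_csv("bdhs2007_families.csv")   # hypothetical file
covariates = ["mother_edu", "father_edu", "wealth_index", "sanitation",
              "water_source", "children_ever_born"]
X = sm.add_constant(pd.get_dummies(bdhs[covariates], drop_first=True).astype(float))
y = bdhs["n_malnourished"]

gp = GeneralizedPoisson(y, X).fit(disp=0)
po = Poisson(y, X).fit(disp=0)
print(gp.summary())
# a negative estimated dispersion parameter (alpha) is consistent with
# under-dispersion (variance < mean); compare gp.aic against po.aic
```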

  3. Pathways Between Marriage and Parenting for Wives and Husbands: The Role of Coparenting

    PubMed Central

    Morrill, Melinda

    2016-01-01

    As family systems research has expanded, so have investigations into how marital partners coparent together. Although coparenting research has increasingly found support for the influential role of coparenting on both marital relationships and parenting practices, coparenting has traditionally been investigated as part of an indirect system which begins with marital health, is mediated by coparenting processes, and then culminates in each partner's parenting. The field has not tested how this traditional model compares to the equally plausible alternative model in which coparenting simultaneously predicts both marital relationships and parenting practices. Furthermore, statistical and practical limitations have typically resulted in only one parent being analyzed in these models. This study used model-fitting analyses to include both wives and husbands in a test of these two alternative models of the role of coparenting in the family system. Our data suggested that both the traditional indirect model (marital health to coparenting to parenting practices), and the alternative predictor model where coparenting alliance directly and simultaneously predicts marital health and parenting practices, fit for both spouses. This suggests that dynamic and multiple roles may be played by coparenting in the overall family system, and raises important practical implications for family clinicians. PMID:20377635

  4. Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics

    PubMed Central

    2011-01-01

    Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
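    The standard Q and I2 quantities referred to above can be computed in a few lines from study-level effects and standard errors; the sketch below uses made-up trial data, and the closing comment indicates (without implementing) how the generalised Q statistic differs.

```python
import numpy as np
from scipy import stats

# toy per-trial log hazard ratios and their standard errors (illustrative only)
effects = np.array([-0.22, -0.10, -0.35, 0.05, -0.18, -0.40])
se = np.array([0.10, 0.15, 0.20, 0.12, 0.18, 0.25])

w = 1.0 / se**2                              # inverse-variance (fixed effects) weights
pooled = np.sum(w * effects) / np.sum(w)

Q = np.sum(w * (effects - pooled) ** 2)      # Cochran's Q
df = len(effects) - 1
I2 = max(0.0, (Q - df) / Q) * 100.0          # I-squared, as a percentage
p_het = stats.chi2.sf(Q, df)

print(f"pooled logHR = {pooled:.3f}, Q = {Q:.2f} (p = {p_het:.3f}), I2 = {I2:.1f}%")
# the 'generalised' Q replaces w with weights 1/(se**2 + tau2) for candidate values of
# the between-trial variance tau2, and asks which tau2 values are compatible with Q
```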

  5. Problems With Risk Reclassification Methods for Evaluating Prediction Models

    PubMed Central

    Pepe, Margaret S.

    2011-01-01

    For comparing the performance of a baseline risk prediction model with one that includes an additional predictor, a risk reclassification analysis strategy has been proposed. The first step is to cross-classify risks calculated according to the 2 models for all study subjects. Summary measures including the percentage of reclassification and the percentage of correct reclassification are calculated, along with 2 reclassification calibration statistics. The author shows that interpretations of the proposed summary measures and P values are problematic. The author's recommendation is to display the reclassification table, because it shows interesting information, but to use alternative methods for summarizing and comparing model performance. The Net Reclassification Index has been suggested as one alternative method. The author argues for reporting components of the Net Reclassification Index because they are more clinically relevant than is the single numerical summary measure. PMID:21555714
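    A small sketch of the author's recommendation to report the components of the Net Reclassification Index separately, using the category-free definition and simulated risks (all values and names are illustrative assumptions):

```python
import numpy as np

def nri_components(risk_old, risk_new, outcome):
    """Event and non-event components of the Net Reclassification Index
    (category-free version: any upward or downward change in predicted risk counts)."""
    up = risk_new > risk_old
    down = risk_new < risk_old
    events = outcome == 1
    nri_events = up[events].mean() - down[events].mean()
    nri_nonevents = down[~events].mean() - up[~events].mean()
    return nri_events, nri_nonevents

# toy example: the 'new' model shifts risks slightly toward the true outcome
rng = np.random.default_rng(5)
outcome = rng.binomial(1, 0.3, size=500)
risk_old = np.clip(0.3 + 0.1 * rng.normal(size=500), 0.01, 0.99)
risk_new = np.clip(risk_old + 0.05 * (outcome - 0.3) + 0.02 * rng.normal(size=500),
                   0.01, 0.99)

ev, nev = nri_components(risk_old, risk_new, outcome)
print(f"event NRI = {ev:.3f}, non-event NRI = {nev:.3f}  (report both, not just the sum)")
```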

  6. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic, distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0), and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvement in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles, but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors, but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.
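    The regression quantile estimates to which these tests apply can be obtained with statsmodels; the rank score and permutation tests themselves are not implemented there, so the sketch below (with simulated, heteroscedastic data echoing the trout example) shows only the point estimates and the default asymptotic intervals.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)

# toy ecological example in the spirit of the application above:
# trout density versus stream channel width:depth, with heterogeneous errors
n = 200
wd = rng.uniform(5, 60, size=n)
density = np.maximum(0.0, 30.0 - 0.4 * wd + rng.normal(0.0, 0.15 * wd, size=n))
d = pd.DataFrame({"density": density, "wd": wd})

for tau in (0.50, 0.90, 0.99):
    fit = smf.quantreg("density ~ wd", d).fit(q=tau)
    lo, hi = fit.conf_int().loc["wd"]
    print(f"tau = {tau:.2f}: slope = {fit.params['wd']:6.3f}  (95% CI {lo:.3f}, {hi:.3f})")
```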

  7. Analysis of visual quality improvements provided by known tools for HDR content

    NASA Astrophysics Data System (ADS)

    Kim, Jaehwan; Alshina, Elena; Lee, JongSeok; Park, Youngo; Choi, Kwang Pyo

    2016-09-01

    In this paper, the visual quality of different solutions for high dynamic range (HDR) compression is analyzed using MPEG test content. We also simulate a method for efficient HDR compression that is based on the statistical properties of the signal. The method is compliant with the HEVC specification and also easily compatible with other alternative methods which might require HEVC specification changes. It was subjectively tested on commercial TVs and compared with alternative solutions for HDR coding. Subjective visual quality tests were performed on a commercial SUHD TV (Samsung JS9500) with maximum luminance up to 1000 nit. The solution based on statistical properties shows not only improved objective performance but also improved visual quality compared to other HDR solutions, while remaining compatible with the HEVC specification.

  8. Utilizing interview and self-report assessment of the Five-Factor Model to examine convergence with the alternative model for personality disorders.

    PubMed

    Helle, Ashley C; Trull, Timothy J; Widiger, Thomas A; Mullins-Sweatt, Stephanie N

    2017-07-01

    An alternative model for personality disorders is included in Section III (Emerging Models and Measures) of Diagnostic and Statistical Manual of Mental Disorders, (5th ed.; DSM-5). The DSM-5 dimensional trait model is an extension of the Five-Factor Model (FFM; American Psychiatric Association, 2013). The Personality Inventory for DSM-5 (PID-5) assesses the 5 domains and 25 traits in the alternative model. The current study expands on recent research to examine the relationship of the PID-5 with an interview measure of the FFM. The Structured Interview for the Five Factor Model of Personality (SIFFM) assesses the 5 bipolar domains and 30 facets of the FFM. Research has indicated that the SIFFM captures maladaptive aspects of personality (as well as adaptive). The SIFFM, NEO PI-R, and PID-5 were administered to participants to examine their respective convergent and discriminant validity. Results provide evidence for the convergence of the 2 models using self-report and interview measures of the FFM. Clinical implications and future directions are discussed, particularly a call for the development of a structured interview for the assessment of the DSM-5 dimensional trait model. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  9. Genetic Programming as Alternative for Predicting Development Effort of Individual Software Projects

    PubMed Central

    Chavoya, Arturo; Lopez-Martin, Cuauhtemoc; Andalon-Garcia, Irma R.; Meda-Campaña, M. E.

    2012-01-01

    Statistical and genetic programming techniques have been used to predict the software development effort of large software projects. In this paper, a genetic programming model was used for predicting the effort required in individually developed projects. Accuracy obtained from a genetic programming model was compared against one generated from the application of a statistical regression model. A sample of 219 projects developed by 71 practitioners was used for generating the two models, whereas another sample of 130 projects developed by 38 practitioners was used for validating them. The models used two kinds of lines of code as well as programming language experience as independent variables. Accuracy results from the model obtained with genetic programming suggest that it could be used to predict the software development effort of individual projects when these projects have been developed in a disciplined manner within a development-controlled environment. PMID:23226305
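
    For context, a minimal version of the statistical regression comparator described above might look like the sketch below, which fits effort on two lines-of-code measures plus language experience and scores the fit with MMRE and PRED(25); the data and feature names are synthetic assumptions, not the 219-project sample.

    ```python
    # Hedged sketch of a statistical regression baseline for effort prediction
    # (the comparator in the abstract), scored with MMRE and PRED(25).
    # The data and feature names are synthetic placeholders.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n = 219
    new_loc = rng.integers(50, 500, n)          # new and changed lines of code (assumed feature)
    reused_loc = rng.integers(0, 200, n)        # reused lines of code (assumed feature)
    experience = rng.integers(1, 10, n)         # programming-language experience, years (assumed)
    effort = 20 + 0.05 * new_loc + 0.02 * reused_loc - 0.5 * experience + rng.normal(0, 3, n)

    X = np.column_stack([new_loc, reused_loc, experience])
    model = LinearRegression().fit(X, effort)
    pred = model.predict(X)

    mre = np.abs(effort - pred) / effort        # magnitude of relative error per project
    print(f"MMRE = {mre.mean():.3f}, PRED(25) = {np.mean(mre <= 0.25):.2f}")
    ```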

  10. The Effects of Local Economic Conditions on Navy Enlistments.

    DTIC Science & Technology

    1980-03-18

    Standard Metropolitan Statistical Area (SMSA) as the basic economic unit, cross-sectional regression models were constructed for enlistment rate, recruiter...to eligible population suggesting that a cheaper alternative to raising military wages would be to increase the number of recruiters. Arima (1978...is faced with a number of criteria that must be satisfied by an acceptable test variable. As with other variables included in the model, economic

  11. Parametric evaluation of the cost effectiveness of Shuttle payload vibroacoustic test plans

    NASA Technical Reports Server (NTRS)

    Stahle, C. V.; Gongloff, H. R.; Keegan, W. B.; Young, J. P.

    1978-01-01

    Consideration is given to alternate vibroacoustic test plans for sortie and free flyer Shuttle payloads. Statistical decision models for nine test plans provide a viable method of evaluating the cost effectiveness of alternate vibroacoustic test plans and the associated test levels. The methodology is a major step toward the development of a useful tool for the quantitative tailoring of vibroacoustic test programs to sortie and free flyer payloads. A broader application of the methodology is now possible by the use of the OCTAVE computer code.

  12. PAH concentrations simulated with the AURAMS-PAH chemical transport model over Canada and the USA

    NASA Astrophysics Data System (ADS)

    Galarneau, E.; Makar, P. A.; Zheng, Q.; Narayan, J.; Zhang, J.; Moran, M. D.; Bari, M. A.; Pathela, S.; Chen, A.; Chlumsky, R.

    2013-07-01

    The off-line Eulerian AURAMS chemical transport model was adapted to simulate the atmospheric fate of seven PAHs: phenanthrene, anthracene, fluoranthene, pyrene, benz[a]anthracene, chrysene + triphenylene, and benzo[a]pyrene. The model was then run for the year 2002 with hourly output on a grid covering southern Canada and the continental USA with 42 km horizontal grid spacing. Model predictions were compared to ~5000 24-h average PAH measurements from 45 sites, eight of which also provided data on particle/gas partitioning which had been modelled using two alternative schemes. This is the first known regional modelling study for PAHs over a North American domain and the first modelling study at any scale to compare alternative particle/gas partitioning schemes against paired field measurements. Annual average modelled total (gas + particle) concentrations were statistically indistinguishable from measured values for fluoranthene, pyrene and benz[a]anthracene whereas the model underestimated concentrations of phenanthrene, anthracene and chrysene + triphenylene. Significance for benzo[a]pyrene performance was close to the statistical threshold and depended on the particle/gas partitioning scheme employed. On a day-to-day basis, the model simulated total PAH concentrations to the correct order of magnitude the majority of the time. Model performance differed substantially between measurement locations and the limited available evidence suggests that the model spatial resolution was too coarse to capture the distribution of concentrations in densely populated areas. A more detailed analysis of the factors influencing modelled particle/gas partitioning is warranted based on the findings in this study.

  13. Developing International Guidelines on Volcanic Hazard Assessments for Nuclear Facilities

    NASA Astrophysics Data System (ADS)

    Connor, Charles

    2014-05-01

    Worldwide, tremendous progress has been made in recent decades in forecasting volcanic events, such as episodes of volcanic unrest, eruptions, and the potential impacts of eruptions. Generally these forecasts are divided into two categories. Short-term forecasts are prepared in response to unrest at volcanoes, rely on geophysical monitoring and related observations, and have the goal of forecasting events on timescales of hours to weeks to provide time for evacuation of people, shutdown of facilities, and implementation of related safety measures. Long-term forecasts are prepared to better understand the potential impacts of volcanism in the future and to plan for potential volcanic activity. Long-term forecasts are particularly useful to better understand and communicate the potential consequences of volcanic events for populated areas around volcanoes and for siting critical infrastructure, such as nuclear facilities. Recent work by an international team, through the auspices of the International Atomic Energy Agency, has focused on developing guidelines for long-term volcanic hazard assessments. These guidelines have now been implemented for hazard assessment for nuclear facilities in nations including Indonesia, the Philippines, Armenia, Chile, and the United States. On any time scale, all volcanic hazard assessments rely on a geologically reasonable conceptual model of volcanism. Such conceptual models are usually built upon years or decades of geological studies of specific volcanic systems, analogous systems, and development of a process-level understanding of volcanic activity. Conceptual models are used to bound potential rates of volcanic activity, potential magnitudes of eruptions, and to understand temporal and spatial trends in volcanic activity. It is these conceptual models that provide essential justification for assumptions made in statistical model development and the application of numerical models to generate quantitative forecasts. It is a tremendous challenge in quantitative volcanic hazard assessments to encompass alternative conceptual models, and to create models that are robust to evolving understanding of specific volcanic systems by the scientific community. A central question in volcanic hazards forecasts is quantifying rates of volcanic activity. Especially for long-dormant volcanic systems, data from the geologic record may be sparse, individual events may be missing or unrecognized in the geologic record, and patterns of activity may be episodic or otherwise nonstationary. This leads to uncertainty in forecasting long-term rates of activity. Hazard assessments strive to quantify such uncertainty, for example by comparing observed rates of activity with alternative parametric and nonparametric models. Numerical models are presented that characterize the spatial distribution of potential volcanic events. These spatial density models serve as the basis for application of numerical models of specific phenomena such as development of lava flow, tephra fallout, and a host of other volcanic phenomena. Monte Carlo techniques (random sampling, stratified sampling, importance sampling) are methods used to sample vent location and other key eruption parameters, such as eruption volume, magma rheology, and eruption column height for probabilistic models. The development of coupled scenarios (e.g., the probability of tephra accumulation on a slope resulting in subsequent debris flows) is also assessed through these methods, usually with the aid of event trees.
The primary products of long-term forecasts are a statistical model of the conditional probability of the potential effects of volcanism, should an eruption occur, and the probability of such activity occurring. It is emphasized that hazard forecasting is an iterative process, and broad consideration must be given to alternative conceptual models of volcanism, weighting of volcanological data in the analyses, and alternative statistical and numerical models. This structure is amenable to expert elicitation in order to weight alternative models and to explore alternative scenarios.
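
    A minimal sketch of the Monte Carlo step described above: a spatial density model is estimated from past vent locations (here a Gaussian KDE), from which vent positions and other eruption parameters are sampled for each realization. All locations, distributions, and parameter values are illustrative placeholders, not values from any real hazard assessment.

    ```python
    # Hedged sketch of Monte Carlo sampling of eruption scenarios: a spatial density
    # model for vents plus illustrative distributions for volume and column height.
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(42)
    past_vents = rng.normal(loc=[0.0, 0.0], scale=[5.0, 2.0], size=(30, 2))  # km, placeholder record

    kde = gaussian_kde(past_vents.T)               # spatial density model for potential new vents

    n_sim = 10_000
    vents = kde.resample(n_sim).T                  # sampled vent locations (km)
    log10_volume = rng.normal(-2.0, 0.8, n_sim)    # log10 eruption volume in km^3 (placeholder)
    column_height = rng.lognormal(2.0, 0.5, n_sim) # eruption column height in km (placeholder)

    # Example hazard question: probability that a sampled vent lies within 10 km of a site.
    site = np.array([8.0, 0.0])
    frac_close = np.mean(np.linalg.norm(vents - site, axis=1) < 10.0)
    print(f"P(vent within 10 km of site) = {frac_close:.3f}")
    print(f"median simulated column height = {np.median(column_height):.1f} km, "
          f"median log10 volume = {np.median(log10_volume):.2f}")
    ```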

  14. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, Antoine; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.
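
    As a concrete starting point of the kind the overview discusses, the sketch below fits a binomial GLM of species presence/absence on two environmental predictors with statsmodels; the data are synthetic and the predictor names are assumptions.

    ```python
    # Hedged sketch: binomial (logistic) GLM of species presence/absence against
    # environmental predictors.  Data and predictor names are synthetic.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 400
    elevation = rng.uniform(200, 2200, n)        # m (illustrative predictor)
    precip = rng.uniform(400, 1600, n)           # mm/yr (illustrative predictor)
    eta = -4.0 + 0.002 * elevation + 0.001 * precip
    presence = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))   # synthetic presence/absence

    X = sm.add_constant(np.column_stack([elevation, precip]))
    fit = sm.GLM(presence, X, family=sm.families.Binomial()).fit()
    print(fit.summary())
    # A GAM would replace the linear terms with smooth functions of the same predictors;
    # predictor selection and evaluation then use deviance, AIC, or cross-validation.
    ```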

  15. Use of a statistical model of the whole femur in a large scale, multi-model study of femoral neck fracture risk.

    PubMed

    Bryan, Rebecca; Nair, Prasanth B; Taylor, Mark

    2009-09-18

    Interpatient variability is often overlooked in orthopaedic computational studies due to the substantial challenges involved in sourcing and generating large numbers of bone models. A statistical model of the whole femur incorporating both geometric and material property variation was developed as a potential solution to this problem. The statistical model was constructed using principal component analysis, applied to 21 individual computed tomography scans. To test the ability of the statistical model to generate realistic, unique, finite element (FE) femur models, it was used as a source of 1000 femurs to drive a study on femoral neck fracture risk. The study simulated the impact of an oblique fall to the side, a scenario known to account for a large proportion of hip fractures in the elderly and to have a lower fracture load than alternative loading approaches. FE model generation, application of subject specific loading and boundary conditions, FE processing and post processing of the solutions were completed automatically. The generated models were within the bounds of the training data used to create the statistical model with a high mesh quality, able to be used directly by the FE solver without remeshing. The results indicated that 28 of the 1000 femurs were at highest risk of fracture. Closer analysis revealed the percentage of cortical bone in the proximal femur to be a crucial differentiator between the failed and non-failed groups. The likely fracture location was indicated to be intertrochanteric. Comparison to previous computational, clinical and experimental work revealed support for these findings.
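
    The core construction can be sketched as follows: principal component analysis over stacked geometry and material-property vectors from the training scans, with new instances generated by sampling component scores inside the training bounds. The dimensions and data below are placeholders, not the actual femur meshes.

    ```python
    # Hedged sketch of a PCA-based statistical model: fit PCA to training vectors,
    # then generate new instances by sampling component scores within the training
    # range.  Dimensions and data are illustrative placeholders.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(7)
    n_train, n_features = 21, 3000      # e.g., node coordinates + material values, flattened
    training = rng.normal(size=(n_train, n_features)) * np.linspace(1, 3, n_features)

    pca = PCA(n_components=10).fit(training)
    scores = pca.transform(training)

    # Sample new score vectors uniformly within the training bounds so that generated
    # instances stay inside the envelope of the training data (as in the abstract).
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    new_scores = rng.uniform(lo, hi, size=(1000, scores.shape[1]))
    new_instances = pca.inverse_transform(new_scores)   # 1000 synthetic femur vectors
    print(new_instances.shape)
    ```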

  16. Analysis of Pulsed Flow Modification Alternatives, Lower Missouri River, 2005

    USGS Publications Warehouse

    Jacobson, Robert B.

    2008-01-01

    The graphical, tabular, and statistical data presented in this report resulted from analysis of alternative flow regime designs considered by a group of Missouri River managers, stakeholders, and scientists during the summer of 2005. This plenary group was charged with designing a flow regime with increased spring flow pulses to support reproduction and survival of the endangered pallid sturgeon. Environmental flow components extracted from the reference natural flow regime were used to design and assess performance of alternative flow regimes. The analysis is based on modeled flow releases from Gavins Point Dam (near Yankton, South Dakota) for nine design alternatives and two reference scenarios; the reference scenarios are the run-of-the-river and the water-control plan implemented in 2004. The alternative designs were developed by the plenary group with the goal of providing pulsed spring flows, while retaining traditional social and economic uses of the river.

  17. Competing risks models and time-dependent covariates

    PubMed Central

    Barnett, Adrian; Graves, Nick

    2008-01-01

    New statistical models for analysing survival data in an intensive care unit context have recently been developed. Two models that offer significant advantages over standard survival analyses are competing risks models and multistate models. Wolkewitz and colleagues used a competing risks model to examine survival times for nosocomial pneumonia and mortality. Their model was able to incorporate time-dependent covariates and so examine how risk factors that changed with time affected the chances of infection or death. We briefly explain how an alternative modelling technique (using logistic regression) can more fully exploit time-dependent covariates for this type of data. PMID:18423067
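
    The alternative the authors point to, a discrete-time (person-period) logistic regression in which time-dependent covariates enter directly, can be sketched as below; the ICU data, covariate names, and effect sizes are synthetic, and the competing-event part is omitted for brevity.

    ```python
    # Hedged sketch of a discrete-time (person-day) logistic regression with a
    # time-dependent covariate.  Data are synthetic; the competing event (death)
    # is omitted to keep the example short.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    rows = []
    for pid in range(300):
        vent_day = rng.integers(2, 10)                 # day ventilation starts (synthetic)
        for day in range(1, 21):                       # up to 20 ICU days per patient
            ventilated = int(day >= vent_day)          # time-dependent covariate
            p_inf = 1.0 / (1.0 + np.exp(-(-5.0 + 1.2 * ventilated + 0.05 * day)))
            infected = rng.binomial(1, p_inf)
            rows.append({"pid": pid, "day": day, "ventilated": ventilated, "infected": infected})
            if infected:
                break                                  # leave the risk set at the first event
    person_days = pd.DataFrame(rows)

    fit = smf.logit("infected ~ ventilated + day", data=person_days).fit(disp=False)
    print(fit.summary())
    ```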

  18. Evaluation of Regression Models of Balance Calibration Data Using an Empirical Criterion

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert; Volden, Thomas R.

    2012-01-01

    An empirical criterion for assessing the significance of individual terms of regression models of wind tunnel strain gage balance outputs is evaluated. The criterion is based on the percent contribution of a regression model term. It considers a term to be significant if its percent contribution exceeds the empirical threshold of 0.05%. The criterion has the advantage that it can easily be computed using the regression coefficients of the gage outputs and the load capacities of the balance. First, a definition of the empirical criterion is provided. Then, it is compared with an alternate statistical criterion that is widely used in regression analysis. Finally, calibration data sets from a variety of balances are used to illustrate the connection between the empirical and the statistical criterion. A review of these results indicated that the empirical criterion seems to be suitable only for a crude assessment of the significance of a regression model term, because the boundary between a significant and an insignificant term cannot be defined very well. Therefore, regression model term reduction should only be performed by using the more universally applicable statistical criterion.
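
    One plausible implementation of the percent-contribution idea is sketched below: each term is evaluated at the balance load capacities, its absolute contribution is expressed as a percentage of the predicted output at capacity, and terms below 0.05% are flagged. The exact definition and the coefficients here are assumptions for illustration, not the paper's values.

    ```python
    # Hedged sketch of one plausible percent-contribution criterion (an assumption,
    # not necessarily the paper's exact definition): evaluate each term at the load
    # capacities and compare its share of the predicted output to the 0.05 % threshold.
    capacities = {"N1": 2500.0, "N2": 2500.0}        # load capacities (illustrative units)
    terms = {                                        # term -> (load exponents, coefficient), illustrative
        "N1":    ({"N1": 1},          1.2e-2),
        "N2":    ({"N2": 1},          3.0e-3),
        "N1*N2": ({"N1": 1, "N2": 1}, 4.0e-9),
        "N1^2":  ({"N1": 2},          6.0e-8),
    }

    def term_at_capacity(exponents):
        value = 1.0
        for load, power in exponents.items():
            value *= capacities[load] ** power
        return value

    output_at_capacity = sum(c * term_at_capacity(e) for e, c in terms.values())
    for name, (exponents, coeff) in terms.items():
        pct = 100.0 * abs(coeff * term_at_capacity(exponents)) / abs(output_at_capacity)
        verdict = "significant" if pct > 0.05 else "insignificant (below 0.05 % threshold)"
        print(f"{name:6s} {pct:9.4f} %  {verdict}")
    ```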

  19. Uncertainties in obtaining high reliability from stress-strength models

    NASA Technical Reports Server (NTRS)

    Neal, Donald M.; Matthews, William T.; Vangel, Mark G.

    1992-01-01

    There has been a recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences are identified of incorrectly assuming a particular statistical distribution for stress or strength data used in obtaining the high reliability values. The computation of the reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
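
    A minimal Monte Carlo version of the stress-strength computation R = P(strength > stress) is sketched below; it illustrates the paper's central point that two strength models matched in mean and standard deviation can still give different high-reliability estimates. Distributions and parameters are illustrative.

    ```python
    # Hedged sketch: Monte Carlo estimate of R = P(strength > stress) under two
    # strength models with (approximately) matched mean and standard deviation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    n = 2_000_000
    stress = rng.normal(100.0, 10.0, n)

    strength_normal = rng.normal(160.0, 12.0, n)
    cv = 12.0 / 160.0
    shape = cv ** (-1.086)                               # common approximation for the Weibull shape
    scale = 160.0 / stats.weibull_min(shape).mean()      # scale chosen to match the mean
    strength_weibull = scale * rng.weibull(shape, n)

    for name, strength in [("normal", strength_normal), ("Weibull", strength_weibull)]:
        r = np.mean(strength > stress)
        print(f"{name:8s} R = {r:.6f}   (unreliability = {1 - r:.2e})")
    ```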

  20. Statistics, Handle with Care: Detecting Multiple Model Components with the Likelihood Ratio Test

    NASA Astrophysics Data System (ADS)

    Protassov, Rostislav; van Dyk, David A.; Connors, Alanna; Kashyap, Vinay L.; Siemiginowska, Aneta

    2002-05-01

    The likelihood ratio test (LRT) and the related F-test, popularized in astrophysics by Eadie and coworkers in 1971, Bevington in 1969, Lampton, Margon, & Bowyer, in 1976, Cash in 1979, and Avni in 1978, do not (even asymptotically) adhere to their nominal χ2 and F-distributions in many statistical tests common in astrophysics, thereby casting many marginal line or source detections and nondetections into doubt. Although the above authors illustrate the many legitimate uses of these statistics, in some important cases it can be impossible to compute the correct false positive rate. For example, it has become common practice to use the LRT or the F-test to detect a line in a spectral model or a source above background despite the lack of certain required regularity conditions. (These applications were not originally suggested by Cash or by Bevington.) In these and other settings that involve testing a hypothesis that is on the boundary of the parameter space, contrary to common practice, the nominal χ2 distribution for the LRT or the F-distribution for the F-test should not be used. In this paper, we characterize an important class of problems in which the LRT and the F-test fail and illustrate this nonstandard behavior. We briefly sketch several possible acceptable alternatives, focusing on Bayesian posterior predictive probability values. We present this method in some detail since it is a simple, robust, and intuitive approach. This alternative method is illustrated using the gamma-ray burst of 1997 May 8 (GRB 970508) to investigate the presence of an Fe K emission line during the initial phase of the observation. There are many legitimate uses of the LRT and the F-test in astrophysics, and even when these tests are inappropriate, there remain several statistical alternatives (e.g., judicious use of error bars and Bayes factors). Nevertheless, there are numerous cases of the inappropriate use of the LRT and similar tests in the literature, bringing substantive scientific results into question.
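
    In the same spirit, the toy sketch below calibrates a line-detection statistic by simulation under the fitted null (a parametric-bootstrap simplification of the posterior predictive idea) instead of relying on the nominal χ2 reference; the spectrum is synthetic, not GRB 970508.

    ```python
    # Hedged toy illustration: calibrate a line-detection statistic by simulation
    # under the fitted null instead of the nominal chi^2 (a parametric-bootstrap
    # simplification of the posterior predictive approach).  Data are synthetic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n_bins, line_bin = 50, 20
    observed = rng.poisson(10.0, n_bins)
    observed[line_bin] += 6                      # a weak putative "line"

    def lrt(counts):
        # Null: flat Poisson background.  Alt: background + nonnegative line in line_bin.
        b0 = counts.mean()
        others = np.delete(counts, line_bin)
        if counts[line_bin] > others.mean():
            b1, a1 = others.mean(), counts[line_bin] - others.mean()
        else:
            b1, a1 = b0, 0.0
        rate0 = np.full(n_bins, b0)
        rate1 = np.full(n_bins, b1)
        rate1[line_bin] += a1
        ll = lambda rate: np.sum(counts * np.log(rate) - rate)
        return 2.0 * (ll(rate1) - ll(rate0))

    t_obs = lrt(observed)
    b_null = observed.mean()
    t_null = np.array([lrt(rng.poisson(b_null, n_bins)) for _ in range(5000)])

    print(f"simulation-based p-value = {np.mean(t_null >= t_obs):.4f}")
    print(f"nominal chi2(1) p-value  = {stats.chi2.sf(t_obs, df=1):.4f}  (boundary case: not valid)")
    ```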

  1. Tools and Techniques for Basin-Scale Climate Change Assessment

    NASA Astrophysics Data System (ADS)

    Zagona, E.; Rajagopalan, B.; Oakley, W.; Wilson, N.; Weinstein, P.; Verdin, A.; Jerla, C.; Prairie, J. R.

    2012-12-01

    The Department of Interior's WaterSMART Program seeks to secure and stretch water supplies to benefit future generations and identify adaptive measures to address climate change. Under WaterSMART, Basin Studies are comprehensive water studies to explore options for meeting projected imbalances in water supply and demand in specific basins. Such studies could be most beneficial with application of recent scientific advances in climate projections, stochastic simulation, operational modeling and robust decision-making, as well as computational techniques to organize and analyze many alternatives. A new integrated set of tools and techniques to facilitate these studies includes the following components: Future supply scenarios are produced by the Hydrology Simulator, which uses non-parametric K-nearest neighbor resampling techniques to generate ensembles of hydrologic traces based on historical data, optionally conditioned on long paleo-reconstructed data using various Markov chain techniques. Resampling can also be conditioned on climate change projections from, e.g., downscaled GCM projections to capture increased variability; spatial and temporal disaggregation is also provided. The simulations produced are ensembles of hydrologic inputs to the RiverWare operations/infrastructure decision modeling software. Alternative demand scenarios can be produced with the Demand Input Tool (DIT), an Excel-based tool that allows modifying future demands by groups such as states; sectors, e.g., agriculture, municipal, energy; and hydrologic basins. The demands can be scaled at future dates or changes ramped over specified time periods. Resulting data is imported directly into the decision model. Different model files can represent infrastructure alternatives and different Policy Sets represent alternative operating policies, including options for noticing when conditions point to unacceptable vulnerabilities, which trigger dynamic execution of changes in operations or other options. The over-arching Study Manager provides a graphical tool to create combinations of future supply scenarios, demand scenarios, infrastructure and operating policy alternatives; each scenario is executed as an ensemble of RiverWare runs, driven by the hydrologic supply. The Study Manager sets up and manages multiple executions on multi-core hardware. The sizeable outputs are typically direct model outputs or post-processed indicators of performance based on model outputs. Post-processing statistical analysis of the outputs is possible using the Graphical Policy Analysis Tool or other statistical packages. Several Basin Studies undertaken have used RiverWare to evaluate future scenarios. The Colorado River Basin Study, the most complex and extensive to date, has taken advantage of these tools and techniques to generate supply scenarios, produce alternative demand scenarios and to set up and execute the many combinations of supplies, demands, policies, and infrastructure alternatives. The tools and techniques will be described with example applications.

  2. Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

    PubMed

    Lin, Feng-Chang; Zhu, Jun

    2012-01-01

    We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.

  3. Optimizing spectral wave estimates with adjoint-based sensitivity maps

    NASA Astrophysics Data System (ADS)

    Orzech, Mark; Veeramony, Jay; Flampouris, Stylianos

    2014-04-01

    A discrete numerical adjoint has recently been developed for the stochastic wave model SWAN. In the present study, this adjoint code is used to construct spectral sensitivity maps for two nearshore domains. The maps display the correlations of spectral energy levels throughout the domain with the observed energy levels at a selected location or region of interest (LOI/ROI), providing a full spectrum of values at all locations in the domain. We investigate the effectiveness of sensitivity maps based on significant wave height (Hs) in determining alternate offshore instrument deployment sites when a chosen nearshore location or region is inaccessible. Wave and bathymetry datasets are employed from one shallower, small-scale domain (Duck, NC) and one deeper, larger-scale domain (San Diego, CA). The effects of seasonal changes in wave climate, errors in bathymetry, and multiple assimilation points on sensitivity map shapes and model performance are investigated. Model accuracy is evaluated by comparing spectral statistics as well as with an RMS skill score, which estimates a mean model-data error across all spectral bins. Results indicate that data assimilation from identified high-sensitivity alternate locations consistently improves model performance at nearshore LOIs, while assimilation from low-sensitivity locations results in lesser or no improvement. Use of sub-sampled or alongshore-averaged bathymetry has a domain-specific effect on model performance when assimilating from a high-sensitivity alternate location. When multiple alternate assimilation locations are used from areas of lower sensitivity, model performance may be worse than with a single, high-sensitivity assimilation point.

  4. The proposed 'concordance-statistic for benefit' provided a useful metric when modeling heterogeneous treatment effects.

    PubMed

    van Klaveren, David; Steyerberg, Ewout W; Serruys, Patrick W; Kent, David M

    2018-02-01

    Clinical prediction models that support treatment decisions are usually evaluated for their ability to predict the risk of an outcome rather than treatment benefit (the difference between outcome risk with vs. without therapy). We aimed to define performance metrics for a model's ability to predict treatment benefit. We analyzed data of the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial and of three recombinant tissue plasminogen activator trials. We assessed alternative prediction models with a conventional risk concordance-statistic (c-statistic) and a novel c-statistic for benefit. We defined observed treatment benefit by the outcomes in pairs of patients matched on predicted benefit but discordant for treatment assignment. The 'c-for-benefit' represents the probability that from two randomly chosen matched patient pairs with unequal observed benefit, the pair with greater observed benefit also has a higher predicted benefit. Compared to a model without treatment interactions, the SYNTAX score II had improved ability to discriminate treatment benefit (c-for-benefit 0.590 vs. 0.552), despite having similar risk discrimination (c-statistic 0.725 vs. 0.719). However, for the simplified stroke-thrombolytic predictive instrument (TPI) vs. the original stroke-TPI, the c-for-benefit (0.584 vs. 0.578) was similar. The proposed methodology has the potential to measure a model's ability to predict treatment benefit not captured with conventional performance metrics. Copyright © 2017 Elsevier Inc. All rights reserved.
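
    A compact sketch of the construction described above: treated and control patients are matched on predicted benefit, observed benefit is the within-pair outcome difference, and the c-for-benefit is the concordance between predicted and observed benefit over pairs of matched pairs. The data are synthetic and the sort-and-zip matching is a simplification.

    ```python
    # Hedged sketch of the c-for-benefit construction: match arms on predicted
    # benefit, take the within-pair outcome difference as observed benefit, and
    # compute concordance across pairs of matched pairs.  Data are synthetic and
    # the matching (sort each arm and zip) is a simplification.
    import numpy as np

    rng = np.random.default_rng(9)
    n = 400
    treated = np.arange(n) % 2 == 0
    pred_benefit = rng.uniform(0.0, 0.3, n)               # model-predicted risk reduction
    p_event = np.clip(0.4 - treated * pred_benefit, 0.05, 0.95)
    outcome = rng.binomial(1, p_event)                    # 1 = adverse event

    t_idx = np.argsort(pred_benefit[treated])
    c_idx = np.argsort(pred_benefit[~treated])
    m = min(len(t_idx), len(c_idx))
    pb_pair = (pred_benefit[treated][t_idx][:m] + pred_benefit[~treated][c_idx][:m]) / 2
    ob_pair = outcome[~treated][c_idx][:m] - outcome[treated][t_idx][:m]   # control minus treated

    # Concordance over pairs of pairs with unequal observed benefit
    # (ties in predicted benefit are ignored for brevity).
    num = den = 0
    for i in range(m):
        for j in range(i + 1, m):
            if ob_pair[i] != ob_pair[j]:
                den += 1
                num += (pb_pair[i] > pb_pair[j]) == (ob_pair[i] > ob_pair[j])
    print(f"c-for-benefit = {num / den:.3f}")
    ```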

  5. Alternative Matching Scores to Control Type I Error of the Mantel-Haenszel Procedure for DIF in Dichotomously Scored Items Conforming to 3PL IRT and Nonparametric 4PBCB Models

    ERIC Educational Resources Information Center

    Monahan, Patrick O.; Ankenmann, Robert D.

    2010-01-01

    When the matching score is either less than perfectly reliable or not a sufficient statistic for determining latent proficiency in data conforming to item response theory (IRT) models, Type I error (TIE) inflation may occur for the Mantel-Haenszel (MH) procedure or any differential item functioning (DIF) procedure that matches on summed-item…

  6. Summary Diagrams for Coupled Hydrodynamic-Ecosystem Model Skill Assessment

    DTIC Science & Technology

    2009-01-01

    reference point have the smallest unbiased RMSD value (Fig. 3). It would appear that the cluster of model points closest to the reference point may...total RMSD values. This is particularly the case for phytoplankton absorption (Fig. 3B) where the cluster of points closest to the reference...pattern statistics and the bias (difference of mean values) each contribute to the magnitude of the total Root-Mean-Square Difference (RMSD). An alternative skill score and
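
    The decomposition referenced in this excerpt, the total RMSD splitting into the bias and the unbiased (centered) RMSD, can be verified numerically in a few lines; the data below are illustrative.

    ```python
    # Hedged sketch of the RMSD decomposition referenced in the excerpt:
    # RMSD_total^2 = bias^2 + RMSD_unbiased^2.  Data are illustrative.
    import numpy as np

    rng = np.random.default_rng(2)
    obs = rng.normal(1.0, 0.5, 200)                   # e.g., observed phytoplankton absorption
    model = 0.8 * obs + 0.3 + rng.normal(0, 0.2, 200)

    bias = model.mean() - obs.mean()
    rmsd_total = np.sqrt(np.mean((model - obs) ** 2))
    rmsd_unbiased = np.sqrt(np.mean(((model - model.mean()) - (obs - obs.mean())) ** 2))
    print(f"bias={bias:.3f}  total RMSD={rmsd_total:.3f}  "
          f"unbiased RMSD={rmsd_unbiased:.3f}  check={np.hypot(bias, rmsd_unbiased):.3f}")
    ```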

  7. Multilevel modelling: Beyond the basic applications.

    PubMed

    Wright, Daniel B; London, Kamala

    2009-05-01

    Over the last 30 years statistical algorithms have been developed to analyse datasets that have a hierarchical/multilevel structure. Particularly within developmental and educational psychology these techniques have become common where the sample has an obvious hierarchical structure, like pupils nested within a classroom. We describe two areas beyond the basic applications of multilevel modelling that are important to psychology: modelling the covariance structure in longitudinal designs and using generalized linear multilevel modelling as an alternative to methods from signal detection theory (SDT). Detailed code for all analyses is described using packages for the freeware R.
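
    The article's worked code uses R packages; as a rough Python analogue, a minimal random-intercept model (pupils nested in classrooms) with statsmodels MixedLM might look like the sketch below, with synthetic data and variable names.

    ```python
    # Hedged Python analogue of a basic multilevel (random-intercept) model;
    # the article's own examples use R.  Data and names are synthetic.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(8)
    n_class, n_pupil = 20, 25
    classroom = np.repeat(np.arange(n_class), n_pupil)
    class_effect = rng.normal(0, 2, n_class)[classroom]       # classroom-level random intercepts
    hours = rng.uniform(0, 10, n_class * n_pupil)
    score = 50 + 1.5 * hours + class_effect + rng.normal(0, 5, n_class * n_pupil)
    df = pd.DataFrame({"score": score, "hours": hours, "classroom": classroom})

    model = smf.mixedlm("score ~ hours", df, groups=df["classroom"])
    result = model.fit()
    print(result.summary())   # fixed effect of hours plus classroom-level variance
    ```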

  8. Defining the formative discharge for alternate bars in alluvial rivers

    NASA Astrophysics Data System (ADS)

    Redolfi, M.; Carlin, M.; Tubino, M.; Adami, L.; Zolezzi, G.

    2017-12-01

    We investigate the properties of alternate bars in long straight reaches of channelized streams subject to an unsteady, irregular flow regime. To this aim we propose a novel integration of a statistical approach with the analytical perturbation model of Tubino (1991), which predicts the evolution of bar properties (namely amplitude and wavelength) as a consequence of a flood. The outcomes of our integrated modelling approach are probability distributions of the bar properties, which depend essentially on two ingredients: (i) the statistical properties of the flow regime (duration, frequency and magnitude of the flood events), and (ii) the reach-averaged hydro-geomorphic characteristics of the channel (bed material, channel gradient and width). This allows us to define a "bar-forming" discharge value as the flow value which would reproduce the most likely bar properties in a river reach under unsteady flow. Alternate bars often migrate downstream and grow or decline during flood events. The timescale of bar growth and migration is often comparable with the duration of the floods: consequently, bar properties such as height and wavelength do not respond instantaneously to discharge variations (i.e. quasi-equilibrium response) but may depend on previous flood events. Theoretical results are compared with observations in three Alpine, channelized gravel bed rivers with encouraging outcomes.

  9. A systematic review of Bayesian articles in psychology: The last 25 years.

    PubMed

    van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah

    2017-06-01

    Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  10. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    PubMed

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.
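
    For readers new to the area, a deliberately minimal illustration of single-sequence structure prediction by dynamic programming is the Nussinov base-pair maximization below; it is far simpler than the nearest-neighbor, SCFG, or discriminative models compared in the paper and is shown only to make the parse-and-score idea concrete.

    ```python
    # Hedged, deliberately minimal illustration of structure prediction by dynamic
    # programming: the classic Nussinov base-pair maximization.  It is not the
    # nearest-neighbor or SCFG machinery discussed in the paper.
    PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

    def nussinov_max_pairs(seq, min_loop=3):
        n = len(seq)
        dp = [[0] * n for _ in range(n)]
        for span in range(min_loop + 1, n):          # increasing subsequence lengths
            for i in range(n - span):
                j = i + span
                best = dp[i + 1][j]                  # position i left unpaired
                if (seq[i], seq[j]) in PAIRS:
                    best = max(best, dp[i + 1][j - 1] + 1)   # i pairs with j
                for k in range(i + 1, j):            # bifurcation into two substructures
                    best = max(best, dp[i][k] + dp[k + 1][j])
                dp[i][j] = best
        return dp[0][n - 1]

    print(nussinov_max_pairs("GGGAAAUCC"))   # expect 3 base pairs for this toy hairpin
    ```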

  11. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

    PubMed Central

    Rivas, Elena; Lang, Raymond; Eddy, Sean R.

    2012-01-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308

  12. Psychological Assessment with the DSM-5 Alternative Model for Personality Disorders: Tradition and Innovation

    PubMed Central

    Waugh, Mark H.; Hopwood, Christopher J.; Krueger, Robert F.; Morey, Leslie C.; Pincus, Aaron L.; Wright, Aidan G. C.

    2016-01-01

    The Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-5) Section III Alternative Model for Personality Disorders (AMPD; APA, 2013) represents an innovative system for simultaneous psychiatric classification and psychological assessment of personality disorders (PD). The AMPD combines major paradigms of personality assessment and provides an original, heuristic, flexible, and practical framework that enriches clinical thinking and practice. Origins, emerging research, and clinical application of the AMPD for diagnosis and psychological assessment are reviewed. The AMPD integrates assessment and research traditions, facilitates case conceptualization, is easy to learn and use, and assists in providing patient feedback. New as well as existing tests and psychometric methods may be used to operationalize the AMPD for clinical assessments. PMID:28450760

  13. Psychological Assessment with the DSM-5 Alternative Model for Personality Disorders: Tradition and Innovation.

    PubMed

    Waugh, Mark H; Hopwood, Christopher J; Krueger, Robert F; Morey, Leslie C; Pincus, Aaron L; Wright, Aidan G C

    2017-04-01

    The Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-5) Section III Alternative Model for Personality Disorders (AMPD; APA, 2013) represents an innovative system for simultaneous psychiatric classification and psychological assessment of personality disorders (PD). The AMPD combines major paradigms of personality assessment and provides an original, heuristic, flexible, and practical framework that enriches clinical thinking and practice. Origins, emerging research, and clinical application of the AMPD for diagnosis and psychological assessment are reviewed. The AMPD integrates assessment and research traditions, facilitates case conceptualization, is easy to learn and use, and assists in providing patient feedback. New as well as existing tests and psychometric methods may be used to operationalize the AMPD for clinical assessments.

  14. DSM-5 alternative personality disorder model traits as maladaptive extreme variants of the five-factor model: An item-response theory analysis.

    PubMed

    Suzuki, Takakuni; Samuel, Douglas B; Pahlen, Shandell; Krueger, Robert F

    2015-05-01

    Over the past two decades, evidence has suggested that personality disorders (PDs) can be conceptualized as extreme, maladaptive variants of general personality dimensions, rather than discrete categorical entities. Recognizing this literature, the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) alternative PD model in Section III defines PDs partially through 25 maladaptive traits that fall within 5 domains. Empirical evidence based on the self-report measure of these traits, the Personality Inventory for DSM-5 (PID-5), suggests that these five higher-order domains share a structure and correlate in meaningful ways with the five-factor model (FFM) of general personality. In the current study, item response theory was used to compare the DSM-5 alternative PD model traits to those from a normative FFM inventory (the International Personality Item Pool-NEO [IPIP-NEO]) in terms of their measurement precision along the latent dimensions. Within a combined sample of 3,517 participants, results strongly supported the conclusion that the DSM-5 alternative PD model traits and IPIP-NEO traits are complementary measures of 4 of the 5 FFM domains (with perhaps the exception of openness to experience vs. psychoticism). Importantly, the two measures yield largely overlapping information curves on these four domains. Differences that did emerge suggested that the PID-5 scales generally have higher thresholds and provide more information at the upper levels, whereas the IPIP-NEO generally had an advantage at the lower levels. These results support the general conceptualization that 4 domains of the DSM-5 alternative PD model traits are maladaptive, extreme versions of the FFM. (PsycINFO Database Record (c) 2015 APA, all rights reserved).

  15. Sensitivity of wildlife habitat models to uncertainties in GIS data

    NASA Technical Reports Server (NTRS)

    Stoms, David M.; Davis, Frank W.; Cogan, Christopher B.

    1992-01-01

    Decision makers need to know the reliability of output products from GIS analysis. For many GIS applications, it is not possible to compare these products to an independent measure of 'truth'. Sensitivity analysis offers an alternative means of estimating reliability. In this paper, we present a GIS-based statistical procedure for estimating the sensitivity of wildlife habitat models to uncertainties in input data and model assumptions. The approach is demonstrated in an analysis of habitat associations derived from a GIS database for the endangered California condor. Alternative data sets were generated to compare results over a reasonable range of assumptions about several sources of uncertainty. Sensitivity analysis indicated that condor habitat associations are relatively robust, and the results have increased our confidence in our initial findings. Uncertainties and methods described in the paper have general relevance for many GIS applications.

  16. Computational methods to extract meaning from text and advance theories of human cognition.

    PubMed

    McNamara, Danielle S

    2011-01-01

    Over the past two decades, researchers have made great advances in the area of computational methods for extracting meaning from text. This research has to a large extent been spurred by the development of latent semantic analysis (LSA), a method for extracting and representing the meaning of words using statistical computations applied to large corpora of text. Since the advent of LSA, researchers have developed and tested alternative statistical methods designed to detect and analyze meaning in text corpora. This research exemplifies how statistical models of semantics play an important role in our understanding of cognition and contribute to the field of cognitive science. Importantly, these models afford large-scale representations of human knowledge and allow researchers to explore various questions regarding knowledge, discourse processing, text comprehension, and language. This topic includes the latest progress by the leading researchers in the endeavor to go beyond LSA. Copyright © 2010 Cognitive Science Society, Inc.
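
    A minimal LSA-style pipeline (term-document matrix, truncated SVD, cosine similarity in the reduced space) can be sketched in a few lines; the tiny corpus below is illustrative.

    ```python
    # Hedged sketch of an LSA-style pipeline: term-document matrix, truncated SVD,
    # and cosine similarity in the reduced space.  The tiny corpus is illustrative.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "the doctor examined the patient in the clinic",
        "the physician treated the patient at the hospital",
        "the striker scored a goal in the final minute",
        "the team won the match after a late goal",
    ]

    tdm = TfidfVectorizer().fit_transform(docs)       # documents x terms
    lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tdm)
    print(cosine_similarity(lsa).round(2))            # related documents should score higher
    ```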

  17. Assessing the Health of LiFePO4 Traction Batteries through Monotonic Echo State Networks

    PubMed Central

    Anseán, David; Otero, José; Couso, Inés

    2017-01-01

    A soft sensor is presented that approximates certain health parameters of automotive rechargeable batteries from on-vehicle measurements of current and voltage. The sensor is based on a model of the open circuit voltage curve. This last model is implemented through monotonic neural networks and estimates the over-potentials arising from the evolution in time of the lithium concentration in the electrodes of the battery. The proposed soft sensor is able to exploit the information contained in operational records of the vehicle better than the alternatives, this being particularly true when the charge or discharge currents are between moderate and high. The accuracy of the neural model has been compared to different alternatives, including data-driven statistical models, first principle-based models, fuzzy observers and other recurrent neural networks with different topologies. It is concluded that monotonic echo state networks can outperform well-established first-principle models. The algorithms have been validated with automotive LiFePO4 cells.

  18. Eutrophication risk assessment in coastal embayments using simple statistical models.

    PubMed

    Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G

    2003-09-01

    A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) with the concentration of the limiting nutrient--usually nitrogen--and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.

  19. Detecting Answer Copying Using Alternate Test Forms and Seat Locations in Small-Scale Examinations

    ERIC Educational Resources Information Center

    van der Ark, L. Andries; Emons, Wilco H. M.; Sijtsma, Klaas

    2008-01-01

    Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair, and the other in addition suggests who is copier and who is source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the…

  20. Stochastic Partial Differential Equation Solver for Hydroacoustic Modeling: Improvements to Paracousti Sound Propagation Solver

    NASA Astrophysics Data System (ADS)

    Preston, L. A.

    2017-12-01

    Marine hydrokinetic (MHK) devices offer a clean, renewable alternative energy source for the future. Responsible utilization of MHK devices, however, requires that the effects of acoustic noise produced by these devices on marine life and marine-related human activities be well understood. Paracousti is a 3-D full waveform acoustic modeling suite that can accurately propagate MHK noise signals in the complex bathymetry found in the near-shore to open ocean environment and considers real properties of the seabed, water column, and air-surface interface. However, this is a deterministic simulation that assumes the environment and source are exactly known. In reality, environmental and source characteristics are often only known in a statistical sense. Thus, to fully characterize the expected noise levels within the marine environment, this uncertainty in environmental and source factors should be incorporated into the acoustic simulations. One method is to use Monte Carlo (MC) techniques where simulation results from a large number of deterministic solutions are aggregated to provide statistical properties of the output signal. However, MC methods can be computationally prohibitive since they can require tens of thousands or more simulations to build up an accurate representation of those statistical properties. An alternative method, using the technique of stochastic partial differential equations (SPDE), allows computation of the statistical properties of output signals at a small fraction of the computational cost of MC. We are developing a SPDE solver for the 3-D acoustic wave propagation problem called Paracousti-UQ to help regulators and operators assess the statistical properties of environmental noise produced by MHK devices. In this presentation, we present the SPDE method and compare statistical distributions of simulated acoustic signals in simple models to MC simulations to show the accuracy and efficiency of the SPDE method. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

  1. Valuing a log: alternative approaches.

    Treesearch

    R.V. Nagubadi; R.D. Fight; R.J. Barbour

    2003-01-01

    The gross value of products that can be manufactured from a tree is the starting point for a residual-value appraisal of a forest operation involving the harvest of trees suitable for making forest products. The amount of detail in a model of gross product value will affect the statistical properties of the estimate and the amount of ancillary information that is...

  2. Clinical Issues in the Use of the "DSM-III-R" with African American Children: A Diagnostic Paradigm.

    ERIC Educational Resources Information Center

    Johnson, Ronn

    1993-01-01

    Reviews concerns related to diagnoses in delivery of mental health services, specifically, application of the principles of the Diagnostic and Statistical Manual of Mental Health Disorders (DSM-III-R) to African-American children. An alternative diagnostic model is proposed, and recommendations are made for enhancing the diagnostic process. (SLD)

  3. Synthesis of Single-Case Experimental Data: A Comparison of Alternative Multilevel Approaches

    ERIC Educational Resources Information Center

    Ferron, John; Van den Noortgate, Wim; Beretvas, Tasha; Moeyaert, Mariola; Ugille, Maaike; Petit-Bois, Merlande; Baek, Eun Kyeng

    2013-01-01

    Single-case or single-subject experimental designs (SSED) are used to evaluate the effect of one or more treatments on a single case. Although SSED studies are growing in popularity, the results are in theory case-specific. One systematic and statistical approach for combining single-case data within and across studies is multilevel modeling. The…

  4. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed

    Kong, A; Cox, N J

    1997-11-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.

  5. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed Central

    Kong, A; Cox, N J

    1997-01-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested. PMID:9345087

  6. New statistical potential for quality assessment of protein models and a survey of energy functions

    PubMed Central

    2010-01-01

    Background: Scoring functions, such as molecular mechanics force fields and statistical potentials, are fundamentally important tools in protein structure modeling and quality assessment. Results: The performances of a number of publicly available scoring functions are compared with statistical rigor, with an emphasis on knowledge-based potentials. We explored the effect on accuracy of alternative choices for representing interaction center types and other features of scoring functions, such as using information on solvent accessibility, on torsion angles, accounting for secondary structure preferences and side chain orientation. Partially based on the observations made, we present a novel residue-based statistical potential, which employs a shuffled reference state definition and takes into account the mutual orientation of residue side chains. Atom- and residue-level statistical potentials and Linux executables to calculate the energy of a given protein proposed in this work can be downloaded from http://www.fiserlab.org/potentials. Conclusions: Among the most influential terms we observed a critical role of a proper reference state definition and the benefits of including information about the microenvironment of interaction centers. Molecular mechanical potentials were also tested and found to be over-sensitive to small local imperfections in a structure, requiring unfeasibly long energy relaxation before energy scores started to correlate with model quality.
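
    A toy version of a knowledge-based potential is sketched below: observed contact frequencies for coarse residue classes are converted into pseudo-energies against an independent-pairing reference state, E(i,j) = -ln(f_obs/f_ref). This generic reference differs from the shuffled reference state proposed in the paper; the counts and classes are made up for illustration.

    ```python
    # Hedged toy sketch of a knowledge-based potential: pseudo-energies from the
    # log-odds of observed vs. reference contact frequencies.  The independent-pairing
    # reference is generic, not the paper's shuffled reference state; counts are toy numbers.
    import numpy as np

    classes = ["HYD", "POL", "CHG"]                    # coarse residue classes (illustrative)
    obs_counts = np.array([[900., 300., 150.],         # contacts counted in "native" structures
                           [300., 400., 250.],
                           [150., 250., 350.]])

    freq_obs = obs_counts / obs_counts.sum()
    marg = freq_obs.sum(axis=1)
    freq_ref = np.outer(marg, marg)                    # reference state: independent pairing

    energy = -np.log(freq_obs / freq_ref)              # pseudo-energy table

    def contact_score(contacts):
        # Sum pseudo-energies over a model's list of contacting class pairs.
        return sum(energy[i, j] for i, j in contacts)

    print(np.round(energy, 2))
    print("score of a toy model's contacts:", round(contact_score([(0, 0), (1, 2), (2, 2)]), 2))
    ```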

  7. An alternative covariance estimator to investigate genetic heterogeneity in populations.

    PubMed

    Heslot, Nicolas; Jannink, Jean-Luc

    2015-11-26

    For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship. We propose an alternative covariance estimator named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model and parameters are estimated by REML and, in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the problem of the design of training populations. K-kernel was compared to other covariance estimators or kernels to examine its fit to the data, cross-validated accuracy and suitability for GWAS on several datasets. It provides a significantly better fit to the data than the genomic best linear unbiased prediction model and, in some cases it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. No or small gains were observed in cross-validated prediction accuracy. This alternative covariance estimator can be used to gain insight into trait-specific genetic heterogeneity by identifying relevant sub-populations that lack genetic correlation between them. Genetic correlation can be 0 between identified sub-populations by performing automatic selection of relevant sets of individuals to be included in the training population. It may also increase statistical power in GWAS.

  8. Nonequilibrium critical behavior of model statistical systems and methods for the description of its features

    NASA Astrophysics Data System (ADS)

    Prudnikov, V. V.; Prudnikov, P. V.; Mamonova, M. V.

    2017-11-01

    This paper reviews features in critical behavior of far-from-equilibrium macroscopic systems and presents current methods of describing them by referring to some model statistical systems such as the three-dimensional Ising model and the two-dimensional XY model. The paper examines the critical relaxation of homogeneous and structurally disordered systems subjected to abnormally strong fluctuation effects involved in ordering processes in solids at second-order phase transitions. Interest in such systems is due to the aging properties and fluctuation-dissipation theorem violations predicted for and observed in systems slowly evolving from a nonequilibrium initial state. It is shown that these features of nonequilibrium behavior show up in the magnetic properties of magnetic superstructures consisting of alternating nanoscale-thick magnetic and nonmagnetic layers and can be observed not only near the film’s critical ferromagnetic ordering temperature Tc, but also over the wide temperature range T ⩽ Tc.

  9. Plurality of Type A evaluations of uncertainty

    NASA Astrophysics Data System (ADS)

    Possolo, Antonio; Pintar, Adam L.

    2017-10-01

    The evaluations of measurement uncertainty involving the application of statistical methods to measurement data (Type A evaluations as specified in the Guide to the Expression of Uncertainty in Measurement, GUM) comprise the following three main steps: (i) developing a statistical model that captures the pattern of dispersion or variability in the experimental data, and that relates the data either to the measurand directly or to some intermediate quantity (input quantity) that the measurand depends on; (ii) selecting a procedure for data reduction that is consistent with this model and that is fit for the purpose that the results are intended to serve; (iii) producing estimates of the model parameters, or predictions based on the fitted model, and evaluations of uncertainty that qualify either those estimates or these predictions, and that are suitable for use in subsequent uncertainty propagation exercises. We illustrate these steps in uncertainty evaluations related to the measurement of the mass fraction of vanadium in a bituminous coal reference material, including the assessment of the homogeneity of the material, and to the calibration and measurement of the amount-of-substance fraction of a hydrochlorofluorocarbon in air, and of the age of a meteorite. Our goal is to expose the plurality of choices that can reasonably be made when taking each of the three steps outlined above, and to show that different choices typically lead to different estimates of the quantities of interest, and to different evaluations of the associated uncertainty. In all the examples, the several alternatives considered represent choices that comparably competent statisticians might make, but who differ in the assumptions that they are prepared to rely on, and in their selection of approach to statistical inference. They represent also alternative treatments that the same statistician might give to the same data when the results are intended for different purposes.

  10. On the analysis of very small samples of Gaussian repeated measurements: an alternative approach.

    PubMed

    Westgate, Philip M; Burchett, Woodrow W

    2017-03-15

    The analysis of very small samples of Gaussian repeated measurements can be challenging. First, due to a very small number of independent subjects contributing outcomes over time, statistical power can be quite small. Second, nuisance covariance parameters must be appropriately accounted for in the analysis in order to maintain the nominal test size. However, available statistical strategies that ensure valid statistical inference may lack power, whereas more powerful methods may have the potential for inflated test sizes. Therefore, we explore an alternative approach to the analysis of very small samples of Gaussian repeated measurements, with the goal of maintaining valid inference while also improving statistical power relative to other valid methods. This approach uses generalized estimating equations with a bias-corrected empirical covariance matrix that accounts for all small-sample aspects of nuisance correlation parameter estimation in order to maintain valid inference. Furthermore, the approach utilizes correlation selection strategies with the goal of choosing the working structure that will result in the greatest power. In our study, we show that when accurate modeling of the nuisance correlation structure impacts the efficiency of regression parameter estimation, this method can improve power relative to existing methods that yield valid inference. Copyright © 2017 John Wiley & Sons, Ltd.
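
    A hedged sketch of the general strategy described above, using simulated data and assuming that statsmodels' GEE implementation exposes the 'bias_reduced' (Mancl-DeRouen-type) covariance option; it is an illustration, not the authors' exact procedure: fit a Gaussian GEE under candidate working correlation structures and compare the bias-corrected standard errors.

```python
# Sketch: Gaussian GEE with a bias-corrected empirical covariance matrix.
# Assumes statsmodels exposes the 'bias_reduced' covariance option for GEE;
# this is an illustration, not the authors' exact procedure.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_subj, n_time = 10, 4                      # very small sample of subjects
subj = np.repeat(np.arange(n_subj), n_time)
time = np.tile(np.arange(n_time), n_subj)
group = np.repeat(rng.integers(0, 2, n_subj), n_time)
y = 0.5 * group + 0.2 * time + rng.normal(size=n_subj * n_time) \
    + np.repeat(rng.normal(size=n_subj), n_time)   # subject-level noise

df = pd.DataFrame({"y": y, "group": group, "time": time, "subj": subj})
X = sm.add_constant(df[["group", "time"]])

for cov_struct in (sm.cov_struct.Independence(), sm.cov_struct.Exchangeable()):
    model = sm.GEE(df["y"], X, groups=df["subj"],
                   family=sm.families.Gaussian(), cov_struct=cov_struct)
    res = model.fit(cov_type="bias_reduced")   # small-sample-corrected SEs
    print(type(cov_struct).__name__,
          "estimate:", round(res.params["group"], 3),
          "SE:", round(res.bse["group"], 3))     # compare SEs across structures
```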

  11. An Inexpensive Bismuth-Petrolatum Dressing for Treatment of Burns

    PubMed Central

    Chattopadhyay, Arhana; Chang, Kathleen; Nguyen, Khoa; Galvez, Michael G.; Legrand, Anais; Davis, Christopher; McGoldrick, Rory; Long, Chao; Pham, Hung

    2016-01-01

    Background: Xeroform remains the current standard for treating superficial partial-thickness burns but can be prohibitively expensive in developing countries with prevalent burn injuries. This study (1) describes the production of an alternative low-cost dressing and (2) compares the alternative dressing and Xeroform using the metrics of cost-effectiveness, antimicrobial activity, and biocompatibility in vitro, and wound healing in vivo. Methods: To produce the alternative dressing, 3% bismuth tribromophenate powder was combined with petroleum jelly by hand and applied to Kerlix gauze. To assess cost-effectiveness, the unit costs of Xeroform and components of the alternative dressing were compared. To assess antimicrobial properties, the dressings were placed on agar plated with Escherichia coli and the Kirby-Bauer assay performed. To assess biocompatibility, the dressings were incubated with human dermal fibroblasts and cells stained with methylene blue. To assess in vivo wound healing, dressings were applied to excisional wounds on rats and the rate of re-epithelialization calculated. Results: The alternative dressing costs 34% of the least expensive brand of Xeroform. Antimicrobial assays showed that both dressings had similar bacteriostatic effects. Biocompatibility assays showed that there was no statistical difference (P < 0.05) in the cytotoxicity of Xeroform, alternative dressing, and Kerlix gauze. Finally, the in vivo healing model showed no statistical difference (P < 0.05) in mean re-epithelialization time between Xeroform (13.0 ± 1.6 days) and alternative dressing (13.5 ± 1.0 days). Conclusions: Xeroform is biocompatible, reduces infection, and enhances healing of burn wounds by preventing desiccation and mechanical trauma. Handmade petrolatum gauze may be a low-cost replacement for Xeroform. Future studies will focus on clinical trials in burn units. PMID:27482485

  12. The effects of context on multidimensional spatial cognitive models. Ph.D. Thesis - Arizona Univ.

    NASA Technical Reports Server (NTRS)

    Dupnick, E. G.

    1979-01-01

    Spatial cognitive models obtained by multidimensional scaling represent cognitive structure by defining alternatives as points in a coordinate space based on relevant dimensions such that interstimulus dissimilarities perceived by the individual correspond to distances between the respective alternatives. The dependence of spatial models on the context of the judgments required of the individual was investigated. Context, which is defined as a perceptual interpretation and cognitive understanding of a judgment situation, was analyzed and classified with respect to five characteristics: physical environment, social environment, task definition, individual perspective, and temporal setting. Four experiments designed to produce changes in the characteristics of context and to test the effects of these changes upon individual cognitive spaces are described with focus on experiment design, objectives, statistical analysis, results, and conclusions. The hypothesis is advanced that an individual can be characterized as having a master cognitive space for a set of alternatives. When the context changes, the individual appears to change the dimension weights to give a new spatial configuration. Factor analysis was used in the interpretation and labeling of cognitive space dimensions.

  13. The DSM-5 alternative model of personality disorders from the perspective of adult attachment: a study in community-dwelling adults.

    PubMed

    Fossati, Andrea; Krueger, Robert F; Markon, Kristian E; Borroni, Serena; Maffei, Cesare; Somma, Antonella

    2015-04-01

    To assess how the maladaptive personality domains and facets that were included in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) Alternative Model of Personality Disorders relate to adult attachment styles, 480 Italian nonclinical adults were administered the Personality Inventory for DSM-5 (PID-5) and the Attachment Style Questionnaire (ASQ). To evaluate the uniqueness of the associations between the PID-5 scales and the ASQ scales, the participants were also administered the Big Five Inventory (BFI). Multiple regression analyses showed that the ASQ scales significantly predicted both PID-5 domain scales and BFI scales; however, the relationships were different both qualitatively and quantitatively. With the exception of the PID-5 risk taking scale (adjusted R² = 0.02), all other PID-5 trait scales were significantly predicted by the ASQ scales, median adjusted R² value = 0.25, all ps < 0.001. Our findings suggest that the maladaptive personality domains and traits listed in the DSM-5 Alternative Model of Personality Disorders show meaningful associations with adult attachment styles.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nunes, Rafael C.; Abreu, Everton M.C.; Neto, Jorge Ananias

    Based on the relationship between thermodynamics and gravity, we propose, with the aid of Verlinde's formalism, an alternative interpretation of the dynamical evolution of the Friedmann-Robertson-Walker Universe. This description takes into account the entropy and temperature intrinsic to the horizon of the universe due to the information holographically stored there, through non-Gaussian statistical theories proposed by Tsallis and Kaniadakis. The effect of these non-Gaussian statistics in the cosmological context is to change the strength of the gravitational constant. In this paper, we consider the wCDM model modified by the non-Gaussian statistics and investigate the compatibility of these non-Gaussian modifications with the cosmological observations. In order to analyze to what extent the cosmological data constrain these non-extensive statistics, we use type Ia supernovae, baryon acoustic oscillations, the Hubble expansion rate function and linear growth of matter density perturbations data. We show that Tsallis' statistics is favored at the 1σ confidence level.

  15. A new statistical approach to climate change detection and attribution

    NASA Astrophysics Data System (ADS)

    Ribes, Aurélien; Zwiers, Francis W.; Azaïs, Jean-Marc; Naveau, Philippe

    2017-01-01

    We propose here a new statistical approach to climate change detection and attribution that is based on additive decomposition and simple hypothesis testing. Most current statistical methods for detection and attribution rely on linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. Climate modelling uncertainty is difficult to take into account with regression-based methods and is almost never treated explicitly. As an alternative to this approach, our statistical model is only based on the additivity assumption; the proposed method does not regress observations onto expected response patterns. We introduce estimation and testing procedures based on likelihood maximization, and show that climate modelling uncertainty can easily be accounted for. Some discussion is provided on how to practically estimate the climate modelling uncertainty based on an ensemble of opportunity. Our approach is based on the "models are statistically indistinguishable from the truth" paradigm, where the difference between any given model and the truth has the same distribution as the difference between any pair of models, but other choices might also be considered. The properties of this approach are illustrated and discussed based on synthetic data. Lastly, the method is applied to the linear trend in global mean temperature over the period 1951-2010. Consistent with the last IPCC assessment report, we find that most of the observed warming over this period (+0.65 K) is attributable to anthropogenic forcings (+0.67 ± 0.12 K, 90 % confidence range), with a very limited contribution from natural forcings (-0.01 ± 0.02 K).

  16. Statistical Hypothesis Testing in Intraspecific Phylogeography: NCPA versus ABC

    PubMed Central

    Templeton, Alan R.

    2009-01-01

    Nested clade phylogeographic analysis (NCPA) and approximate Bayesian computation (ABC) have been used to test phylogeographic hypotheses. Multilocus NCPA tests null hypotheses, whereas ABC discriminates among a finite set of alternatives. The interpretive criteria of NCPA are explicit and allow complex models to be built from simple components. The interpretive criteria of ABC are ad hoc and require the specification of a complete phylogeographic model. The conclusions from ABC are often influenced by implicit assumptions arising from the many parameters needed to specify a complex model. These complex models confound many assumptions so that biological interpretations are difficult. Sampling error is accounted for in NCPA, but ABC ignores important sources of sampling error that create pseudo-statistical power. NCPA generates the full sampling distribution of its statistics, but ABC only yields local probabilities, which in turn make it impossible to distinguish between a well-fitting model, a non-informative model, and an over-determined model. Both NCPA and ABC use approximations, but convergences of the approximations used in NCPA are well defined whereas those in ABC are not. NCPA can analyze a large number of locations, but ABC cannot. Finally, the dimensionality of the tested hypotheses is known in NCPA, but not for ABC. As a consequence, the “probabilities” generated by ABC are not true probabilities and are statistically non-interpretable. Accordingly, ABC should not be used for hypothesis testing, but simulation approaches are valuable when used in conjunction with NCPA or other methods that do not rely on highly parameterized models. PMID:19192182

  17. Non-convex Statistical Optimization for Sparse Tensor Graphical Model

    PubMed Central

    Sun, Wei; Wang, Zhaoran; Liu, Han; Cheng, Guang

    2016-01-01

    We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, which was not observed in previous work. Our theoretical results are backed by thorough numerical studies. PMID:28316459

  18. Nonlinear estimation of parameters in biphasic Arrhenius plots.

    PubMed

    Puterman, M L; Hrboticky, N; Innis, S M

    1988-05-01

    This paper presents a formal procedure for the statistical analysis of data on the thermotropic behavior of membrane-bound enzymes generated using the Arrhenius equation and compares the analysis to several alternatives. Data are modeled by a bent hyperbola. Nonlinear regression is used to obtain estimates and standard errors of the intersection of line segments, defined as the transition temperature, and slopes, defined as energies of activation of the enzyme reaction. The methodology allows formal tests of the adequacy of a biphasic model rather than either a single straight line or a curvilinear model. Examples using data on the thermotropic behavior of pig brain synaptosomal acetylcholinesterase are given. The data support the biphasic temperature dependence of this enzyme. The methodology represents a formal procedure for statistical validation of any biphasic data and allows for calculation of all line parameters with estimates of precision.
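
    As an illustration of the kind of fit described (a sketch on simulated data, not the authors' code), the snippet below fits a continuous two-segment Arrhenius model with scipy, estimating the breakpoint (the transition temperature on the 1/T scale) and the two slopes (proportional to activation energies), with standard errors taken from the estimated covariance matrix.

```python
# Sketch: fit a biphasic (two-segment, continuous) Arrhenius plot.
# x = 1/T, y = ln(rate); x0 is the breakpoint, b1/b2 the segment slopes.
import numpy as np
from scipy.optimize import curve_fit

def biphasic(x, a, b1, b2, x0):
    """Continuous piecewise-linear model: slope b1 below x0, b2 above."""
    return np.where(x < x0, a + b1 * (x - x0), a + b2 * (x - x0))

# Simulated Arrhenius data with a break at x0 = 3.4e-3 1/K (about 21 deg C).
rng = np.random.default_rng(3)
x = np.linspace(3.1e-3, 3.7e-3, 30)             # 1/T in 1/K
y = biphasic(x, -2.0, -4000.0, -9000.0, 3.4e-3) + rng.normal(0, 0.05, x.size)

popt, pcov = curve_fit(biphasic, x, y, p0=(-2.0, -3000.0, -8000.0, 3.4e-3))
se = np.sqrt(np.diag(pcov))
a, b1, b2, x0 = popt
R = 8.314  # J/(mol K); on the Arrhenius scale, slope = -Ea/R
print("transition at T =", 1.0 / x0, "K  (+/-", se[3] / x0**2, ")")
print("Ea below/above break:", -b1 * R, -b2 * R, "J/mol")

# A formal test of biphasic vs single-line behaviour can compare residual sums
# of squares of this model against a simple linear fit (e.g. via an F-test).
```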

  19. Arctic curves in path models from the tangent method

    NASA Astrophysics Data System (ADS)

    Di Francesco, Philippe; Lapa, Matthew F.

    2018-04-01

    Recently, Colomo and Sportiello introduced a powerful method, known as the tangent method, for computing the arctic curve in statistical models which have a (non- or weakly-) intersecting lattice path formulation. We apply the tangent method to compute arctic curves in various models: the domino tiling of the Aztec diamond for which we recover the celebrated arctic circle; a model of Dyck paths equivalent to the rhombus tiling of a half-hexagon for which we find an arctic half-ellipse; another rhombus tiling model with an arctic parabola; the vertically symmetric alternating sign matrices, where we find the same arctic curve as for unconstrained alternating sign matrices. The latter case involves lattice paths that are non-intersecting but that are allowed to have osculating contact points, for which the tangent method was argued to still apply. For each problem we estimate the large size asymptotics of a certain one-point function using LU decomposition of the corresponding Gessel–Viennot matrices, and a reformulation of the result amenable to asymptotic analysis.

  20. Parametric regression model for survival data: Weibull regression model as an example

    PubMed Central

    2016-01-01

    The Weibull regression model is one of the most popular parametric regression models in that it provides an estimate of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature compared with the semi-parametric proportional hazards model. To make clinical investigators familiar with the Weibull regression model, this article introduces some basic knowledge of the model and then illustrates how to fit it with R software. The SurvRegCensCov package is useful for converting estimated coefficients to clinically relevant statistics such as the hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by categorical variables. The eha package provides an alternative way to fit the Weibull regression model, and its check.dist() function helps to assess goodness of fit. Variable selection is based on the importance of a covariate, which can be tested using the anova() function; alternatively, backward elimination starting from a full model is an efficient way to develop the model. Visualizing the Weibull regression model after model development provides another way to report the findings. PMID:28149846
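
    Although the article works in R, the same mechanics can be sketched in Python (an illustrative stand-in, not the article's workflow): fit a Weibull accelerated-failure-time model by maximum likelihood and convert a coefficient into an event time ratio, exp(beta), and the corresponding hazard ratio, exp(-shape * beta).

```python
# Sketch: Weibull AFT regression by maximum likelihood, with ETR and HR.
# Parametrization: scale_i = exp(x_i' beta), shape k; the hazard scales as
# scale_i^(-k), so ETR_j = exp(beta_j) and HR_j = exp(-k * beta_j).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 300
X = np.column_stack([np.ones(n), rng.integers(0, 2, n)])   # intercept + group
beta_true, k_true = np.array([2.0, 0.5]), 1.5
scale = np.exp(X @ beta_true)
t = scale * rng.weibull(k_true, size=n)                    # event times
c = rng.uniform(0, 2 * t.mean(), size=n)                   # censoring times
time, event = np.minimum(t, c), (t <= c).astype(float)

def negloglik(params):
    beta, log_k = params[:-1], params[-1]
    k = np.exp(log_k)
    lin = X @ beta
    z = (time / np.exp(lin)) ** k
    # events contribute the log-density, censored observations log-survival
    return -np.sum(event * (np.log(k) - k * lin + (k - 1) * np.log(time)) - z)

start = np.zeros(X.shape[1] + 1)
fit = minimize(negloglik, start, method="BFGS")
beta_hat, k_hat = fit.x[:-1], np.exp(fit.x[-1])

etr = np.exp(beta_hat[1])          # event time ratio for the group covariate
hr = np.exp(-k_hat * beta_hat[1])  # corresponding hazard ratio
print("shape:", k_hat, "ETR:", etr, "HR:", hr)
```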

  1. Nowcasting and Forecasting the Monthly Food Stamps Data in the US Using Online Search Data

    PubMed Central

    Fantazzini, Dean

    2014-01-01

    We propose the use of Google online search data for nowcasting and forecasting the number of food stamps recipients. We perform a large out-of-sample forecasting exercise with almost 3000 competing models with forecast horizons up to 2 years ahead, and we show that models including Google search data statistically outperform the competing models at all considered horizons. These results hold also with several robustness checks, considering alternative keywords, a falsification test, different out-of-samples, directional accuracy and forecasts at the state-level. PMID:25369315

  2. Full-Duplex Bidirectional Secure Communications Under Perfect and Distributionally Ambiguous Eavesdropper's CSI

    NASA Astrophysics Data System (ADS)

    Li, Qiang; Zhang, Ying; Lin, Jingran; Wu, Sissi Xiaoxiao

    2017-09-01

    Consider a full-duplex (FD) bidirectional secure communication system, where two communication nodes, named Alice and Bob, simultaneously transmit and receive confidential information from each other, and an eavesdropper, named Eve, overhears the transmissions. Our goal is to maximize the sum secrecy rate (SSR) of the bidirectional transmissions by optimizing the transmit covariance matrices at Alice and Bob. To tackle this SSR maximization (SSRM) problem, we develop an alternating difference-of-concave (ADC) programming approach to alternately optimize the transmit covariance matrices at Alice and Bob. We show that the ADC iteration has a semi-closed-form beamforming solution, and is guaranteed to converge to a stationary solution of the SSRM problem. Besides the SSRM design, this paper also deals with a robust SSRM transmit design under a moment-based random channel state information (CSI) model, where only some roughly estimated first- and second-order statistics of Eve's CSI are available, but the exact distribution or other high-order statistics is not known. This moment-based error model is new and different from the widely used bounded-sphere error model and the Gaussian random error model. Under the considered CSI error model, the robust SSRM is formulated as an outage probability-constrained SSRM problem. By leveraging the Lagrangian duality theory and DC programming, a tractable safe solution to the robust SSRM problem is derived. The effectiveness and the robustness of the proposed designs are demonstrated through simulations.

  3. Statistical appearance models based on probabilistic correspondences.

    PubMed

    Krüger, Julia; Ehrhardt, Jan; Handels, Heinz

    2017-04-01

    Model-based image analysis is indispensable in medical image processing. One key aspect of building statistical shape and appearance models is the determination of one-to-one correspondences in the training data set. At the same time, the identification of these correspondences is the most challenging part of such methods. In our earlier work, we developed an alternative method using correspondence probabilities instead of exact one-to-one correspondences for a statistical shape model (Hufnagel et al., 2008). In this work, a new approach for statistical appearance models without one-to-one correspondences is proposed. A sparse image representation is used to build a model that combines point position and appearance information at the same time. Probabilistic correspondences between the derived multi-dimensional feature vectors are used to avoid the need for extensive preprocessing to find landmarks and correspondences, as well as to reduce the dependence of the generated model on the landmark positions. Model generation and model fitting can now be expressed by optimizing a single global criterion derived from a maximum a-posteriori (MAP) approach with respect to model parameters that directly affect both shape and appearance of the considered objects inside the images. The proposed approach describes statistical appearance modeling in a concise and flexible mathematical framework. Besides eliminating the demand for costly correspondence determination, the method allows for additional constraints, such as topological regularity, in the modeling process. In the evaluation, the model was applied to segmentation and landmark identification in hand X-ray images. The results demonstrate the feasibility of the model to detect hand contours as well as the positions of the joints between finger bones for unseen test images. Further, we evaluated the model on brain data of stroke patients to show the ability of the proposed model to handle partially corrupted data and to demonstrate a possible employment of the correspondence probabilities to indicate these corrupted/pathological areas. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation during the first week of admission and again six months later. All data were first analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary to, rather than an alternative to, available classical regression models such as linear regression.

  5. Developmental Pathways from Parental Socioeconomic Status to Adolescent Substance Use: Alternative and Complementary Reinforcement.

    PubMed

    Lee, Jungeun Olivia; Cho, Junhan; Yoon, Yoewon; Bello, Mariel S; Khoddam, Rubin; Leventhal, Adam M

    2018-02-01

    Although lower socioeconomic status has been linked to increased youth substance use, much less research has determined potential mechanisms explaining the association. The current longitudinal study tested whether alternative (i.e., pleasure gained from activities without any concurrent use of substances) and complementary (i.e., pleasure gained from activities in tandem with substance use) reinforcement mediate the link between lower socioeconomic status and youth substance use. Further, we tested whether alternative and complementary reinforcement and youth substance use gradually unfold over time and then intersect with one another in a cascading manner. Potential sex differences are also examined. Data were drawn from a longitudinal survey of substance use and mental health among high school students in Los Angeles. Data collection involved four semiannual assessment waves beginning in fall 2013 (N = 3395; M baseline age = 14.1; 47% Hispanic, 16.2% Asian, 16.1% multiethnic, 15.7% White, and 5% Black; 53.4% female). The results from a negative binomial path model suggested that lower parental socioeconomic status (i.e., lower parental education) was significantly related to an increased number of substances used by youth. The final path model revealed that the inverse association was statistically mediated by adolescents' diminished engagement in pleasurable substance-free activities (i.e., alternative reinforcers) and elevated engagement in pleasurable activities paired with substance use (i.e., complementary reinforcers). The direct effect of lower parental education on adolescent substance use was not statistically significant after accounting for the hypothesized mediating mechanisms. No sex differences were detected. Increasing access to and engagement in pleasant activities of high quality that do not need a reinforcement enhancer, such as substances, may be useful in interrupting the link between lower parental socioeconomic status and youth substance use.

  6. Comparison of the predictions of cosmologies alternative to the standard model with cosmic microwave background data

    NASA Astrophysics Data System (ADS)

    Piccirilli, M. P.; Landau, S. J.; León, G.

    2016-08-01

    The cosmic microwave background radiation is one of the most powerful tools to study the early Universe and its evolution, and it also provides a method to test different cosmological scenarios. We consider alternative inflationary models where the emergence of the seeds of cosmic structure from a perfectly isotropic and homogeneous universe can be explained by the self-induced collapse of the inflaton wave function. Some of these alternative models may turn out to be indistinguishable from the standard model, while others must be compared with observational data through statistical analysis. In this article we show results concerning the first Planck release, the Atacama Cosmology Telescope, the South Pole Telescope, the WMAP and Sloan Digital Sky Survey datasets, finding good agreement between data and theoretical predictions. In future work, we aim to obtain tighter limits on the cosmological parameters using the latest Planck release.

  7. Statistical Models for the Analysis of Zero-Inflated Pain Intensity Numeric Rating Scale Data.

    PubMed

    Goulet, Joseph L; Buta, Eugenia; Bathulapalli, Harini; Gueorguieva, Ralitza; Brandt, Cynthia A

    2017-03-01

    Pain intensity is often measured in clinical and research settings using the 0 to 10 numeric rating scale (NRS). NRS scores are recorded as discrete values, and in some samples they may display a high proportion of zeroes and a right-skewed distribution. Despite this, statistical methods for normally distributed data are frequently used in the analysis of NRS data. We present results from an observational cross-sectional study examining the association of NRS scores with patient characteristics using data collected from a large cohort of 18,935 veterans in Department of Veterans Affairs care diagnosed with a potentially painful musculoskeletal disorder. The mean (variance) NRS pain was 3.0 (7.5), and 34% of patients reported no pain (NRS = 0). We compared the following statistical models for analyzing NRS scores: linear regression, generalized linear models (Poisson and negative binomial), zero-inflated and hurdle models for data with an excess of zeroes, and a cumulative logit model for ordinal data. We examined model fit, interpretability of results, and whether conclusions about the predictor effects changed across models. In this study, models that accommodate zero inflation provided a better fit than the other models. These models should be considered for the analysis of NRS data with a large proportion of zeroes. We examined and analyzed pain data from a large cohort of veterans with musculoskeletal disorders. We found that many reported no current pain on the NRS on the diagnosis date. We present several alternative statistical methods for the analysis of pain intensity data with a large proportion of zeroes. Published by Elsevier Inc.
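
    A sketch of the model comparison described above, on simulated 0-10 scores with excess zeroes, and assuming that statsmodels provides the ZeroInflatedNegativeBinomialP class used below (the Poisson and negative binomial classes are standard): fit the candidate models and compare AIC.

```python
# Sketch: comparing count models for zero-inflated 0-10 pain scores.
# Simulated data; assumes statsmodels' ZeroInflatedNegativeBinomialP class.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(5)
n = 2000
x = rng.integers(0, 2, n)                       # e.g. an exposure indicator
is_zero = rng.random(n) < 0.34                  # structural zeroes (no pain)
counts = rng.negative_binomial(2, 2 / (2 + np.exp(0.8 + 0.3 * x)), n)
y = np.where(is_zero, 0, np.clip(counts, 0, 10))   # NRS-like 0-10 scores

X = sm.add_constant(x)
fits = {
    "Poisson": sm.Poisson(y, X).fit(disp=False),
    "NegBin": sm.NegativeBinomial(y, X).fit(disp=False),
    "ZINB": ZeroInflatedNegativeBinomialP(y, X, exog_infl=X, p=2).fit(disp=False),
}
for name, res in fits.items():
    print(name, "AIC:", res.aic)        # the zero-inflated fit is usually best here
```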

  8. Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.

    PubMed

    Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J

    2016-10-03

    Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations, and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank - up to 26 mm; this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models, our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd. All rights reserved.

  9. A pseudo-sequential choice model for valuing multi-attribute environmental policies or programs in contingent valuation applications

    Treesearch

    Dmitriy Volinskiy; John C Bergstrom; Christopher M Cornwell; Thomas P Holmes

    2010-01-01

    The assumption of independence of irrelevant alternatives in a sequential contingent valuation format should be questioned. Statistically, most valuation studies treat nonindependence as a consequence of unobserved individual effects. Another approach is to consider an inferential process in which any particular choice is part of a general choosing strategy of a survey...

  10. Reweighting anthropometric data using a nearest neighbour approach.

    PubMed

    Kumar, Kannan Anil; Parkinson, Matthew B

    2018-07-01

    When designing products and environments, detailed data on body size and shape are seldom available for the specific user population. One way to mitigate this issue is to reweight available data such that they provide an accurate estimate of the target population of interest. This is done by assigning a statistical weight to each individual in the reference data, increasing or decreasing their influence on statistical models of the whole. This paper presents a new approach to reweighting these data. Instead of stratified sampling, the proposed method uses a clustering algorithm to identify relationships between the detailed and reference populations using their height, mass, and body mass index (BMI). The newly weighted data are shown to provide more accurate estimates than traditional approaches. The improved accuracy that accompanies this method provides designers with an alternative to data synthesis techniques as they seek appropriate data to guide their design practice. Practitioner Summary: Design practice is best guided by data on body size and shape that accurately represents the target user population. This research presents an alternative to data synthesis (e.g. regression or proportionality constants) for adapting data from one population for use in modelling another.
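
    The sketch below illustrates the general idea of reweighting by matching on height, mass, and BMI (a simplified nearest-neighbour stand-in for the clustering approach described, with made-up data): each member of a small target sample "votes" for its closest reference individual, and the vote counts become the statistical weights of the reference data.

```python
# Sketch: nearest-neighbour reweighting of anthropometric reference data.
# Simplified illustration, not the paper's clustering algorithm.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(6)

def make_pop(n, h_mean, m_mean):
    h = rng.normal(h_mean, 70, n)          # stature in mm
    m = rng.normal(m_mean, 12, n)          # mass in kg
    bmi = m / (h / 1000.0) ** 2
    return np.column_stack([h, m, bmi])

reference = make_pop(5000, 1700, 75)        # large, detailed reference survey
target = make_pop(300, 1760, 85)            # small sample of the user population

# Standardize so height, mass and BMI contribute comparably to distance.
mu, sd = reference.mean(axis=0), reference.std(axis=0)
nn = NearestNeighbors(n_neighbors=1).fit((reference - mu) / sd)
_, idx = nn.kneighbors((target - mu) / sd)

# Each reference individual's weight = number of target individuals it matched.
weights = np.bincount(idx.ravel(), minlength=len(reference)).astype(float)

# Weighted statistics of the reference data now approximate the target.
w_mean_mass = np.average(reference[:, 1], weights=weights)
print("reference mean mass:", reference[:, 1].mean(),
      "reweighted:", w_mean_mass, "target:", target[:, 1].mean())
```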

  11. Time-course variation of statistics embedded in music: Corpus study on implicit learning and knowledge.

    PubMed

    Daikoku, Tatsuya

    2018-01-01

    Learning and knowledge of transitional probability in sequences like music, called statistical learning and knowledge, are considered implicit processes that occur without the intention to learn and without awareness of what one knows. This implicit statistical knowledge can alternatively be expressed via an abstract medium such as musical melody, which suggests that this knowledge is reflected in the melodies written by a composer. This study investigates how statistics in music vary over a composer's lifetime. Transitional probabilities of highest-pitch sequences in Ludwig van Beethoven's piano sonatas were calculated based on different hierarchical Markov models. Each interval pattern was ordered based on the sonata opus number. The transitional probabilities of sequential patterns that are universal in music gradually decreased, suggesting that time-course variations of statistics in music reflect time-course variations of a composer's statistical knowledge. This study sheds new light on novel methodologies that may be able to evaluate the time-course variation of a composer's implicit knowledge using musical scores.
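
    A minimal sketch of the underlying computation (on a made-up pitch sequence, not the Beethoven corpus): estimate transitional probabilities P(next | context) by counting; higher-order Markov models simply use longer contexts.

```python
# Sketch: transitional probabilities of a pitch sequence.
# Toy sequence; the study applies the same idea to highest-pitch lines
# extracted from scores, at several Markov orders.
from collections import Counter, defaultdict

pitches = [60, 62, 64, 62, 60, 64, 65, 67, 65, 64, 62, 60]   # MIDI numbers

def transition_probabilities(seq, order=1):
    """P(next symbol | previous `order` symbols), estimated by counting."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        context = tuple(seq[i - order:i])
        counts[context][seq[i]] += 1
    return {ctx: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
            for ctx, ctr in counts.items()}

tp1 = transition_probabilities(pitches, order=1)
tp2 = transition_probabilities(pitches, order=2)
print(tp1[(62,)])      # distribution of notes following pitch 62
print(tp2[(60, 62)])   # distribution following the context (60, 62)
```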

  12. Behavioral and Neural Signatures of Reduced Updating of Alternative Options in Alcohol-Dependent Patients during Flexible Decision-Making.

    PubMed

    Reiter, Andrea M F; Deserno, Lorenz; Kallert, Thomas; Heinze, Hans-Jochen; Heinz, Andreas; Schlagenhauf, Florian

    2016-10-26

    Addicted individuals continue substance use despite the knowledge of harmful consequences and often report having no choice but to consume. Computational psychiatry accounts have linked this clinical observation to difficulties in making flexible and goal-directed decisions in dynamic environments via consideration of potential alternative choices. To probe this in alcohol-dependent patients (n = 43) versus healthy volunteers (n = 35), human participants performed an anticorrelated decision-making task during functional neuroimaging. Via computational modeling, we investigated behavioral and neural signatures of inference regarding the alternative option. While healthy control subjects exploited the anticorrelated structure of the task to guide decision-making, alcohol-dependent patients were relatively better explained by a model-free strategy due to reduced inference on the alternative option after punishment. Whereas model-free prediction error signals were preserved, alcohol-dependent patients exhibited blunted medial prefrontal signatures of inference on the alternative option. This reduction was associated with patients' behavioral deficit in updating the alternative choice option and their obsessive-compulsive drinking habits. All results remained significant when adjusting for potential confounders (e.g., neuropsychological measures and gray matter density). A disturbed integration of alternative choice options implemented by the medial prefrontal cortex appears to be one important explanation for the puzzling question of why addicted individuals continue drug consumption despite negative consequences. In addiction, patients maintain substance use despite devastating consequences and often report having no choice but to consume. These clinical observations have been theoretically linked to disturbed mechanisms of inference, for example, to difficulties when learning statistical regularities of the environmental structure to guide decisions. Using computational modeling, we demonstrate disturbed inference on alternative choice options in alcohol addiction. Patients' neglect of "what might have happened" was accompanied by blunted coding of inference regarding alternative choice options in the medial prefrontal cortex. An impaired integration of alternative choice options implemented by the medial prefrontal cortex might contribute to ongoing drug consumption in the face of evident negative consequences. Copyright © 2016 the authors.

  13. The prediction of intelligence in preschool children using alternative models to regression.

    PubMed

    Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E

    2011-12-01

    Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Predictions of Stanford-Binet 5 FSIQ scores for preschool-aged children are used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than did the more traditional regression approach. Implications of these results are discussed.
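
    As a sketch of the kind of comparison reported (with simulated demographic predictors and scores, not the study's data), the snippet below contrasts cross-validated prediction error for ordinary linear regression and a regression tree.

```python
# Sketch: regression tree vs. multiple linear regression for score prediction.
# Simulated demographic predictors and an FSIQ-like outcome; illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 500
gender = rng.integers(0, 2, n)
ethnicity = rng.integers(0, 4, n)
parent_edu = rng.integers(8, 21, n)            # years of parental education
X = np.column_stack([gender, ethnicity, parent_edu])

# Outcome with a non-linear, interaction-style dependence on education.
fsiq = 85 + 1.5 * np.clip(parent_edu - 12, 0, None) * (1 + 0.3 * gender) \
       + rng.normal(0, 10, n)

for name, model in [("linear regression", LinearRegression()),
                    ("regression tree", DecisionTreeRegressor(max_depth=4))]:
    rmse = -cross_val_score(model, X, fsiq, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(name, "CV RMSE:", round(rmse, 2))
```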

  14. Quantitative structure - mesothelioma potency model ...

    EPA Pesticide Factsheets

    Cancer potencies of mineral and synthetic elongated particle (EP) mixtures, including asbestos fibers, are influenced by changes in fiber dose composition, bioavailability, and biodurability in combination with relevant cytotoxic dose-response relationships. A unique and comprehensive rat intra-pleural (IP) dose characterization data set with a wide variety of EP size, shape, crystallographic, chemical, and bio-durability properties facilitated extensive statistical analyses of 50 rat IP exposure test results for evaluation of alternative dose pleural mesothelioma response models. Utilizing logistic regression, maximum likelihood evaluations of thousands of alternative dose metrics based on hundreds of individual EP dimensional variations within each test sample, four major findings emerged: (1) data for simulations of short-term EP dose changes in vivo (mild acid leaching) provide superior predictions of tumor incidence compared to non-acid leached data; (2) sum of the EP surface areas (ΣSA) from these mildly acid-leached samples provides the optimum holistic dose response model; (3) progressive removal of dose associated with very short and/or thin EPs significantly degrades resultant ΣEP or ΣSA dose-based predictive model fits, as judged by Akaike’s Information Criterion (AIC); and (4) alternative, biologically plausible model adjustments provide evidence for reduced potency of EPs with length/width (aspect) ratios 80 µm. Regar

  15. Examining the DSM-5 alternative personality disorder model operationalization of antisocial personality disorder and psychopathy in a male correctional sample.

    PubMed

    Wygant, Dustin B; Sellbom, Martin; Sleep, Chelsea E; Wall, Tina D; Applegate, Kathryn C; Krueger, Robert F; Patrick, Christopher J

    2016-07-01

    For decades, it has been known that the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnosis of Antisocial Personality Disorder (ASPD) is an inadequate operationalization of psychopathy (Crego & Widiger, 2015). The DSM-5 alternative model of personality disorders provides an opportunity to rectify some of these long-held concerns. The current study compared the Section III alternative model's trait-based conception of ASPD with the categorical model from the main diagnostic codes section of DSM-5 in terms of associations with differing models of psychopathy. We also evaluated the validity of the trait-based conception more broadly in relation to measures of antisocial tendencies as well as psychopathy. Participants were 200 male inmates who were administered a battery of self-report and interview-based researcher rating measures of relevant constructs. Analyses showed that Section III ASPD outperformed Section II ASPD in predicting scores on Hare's (2003) Psychopathy Checklist-Revised (PCL-R; r = .88 vs. .59). Additionally, aggregate scores for Section III ASPD performed well in capturing variance in differing ASPD and psychopathy measures. Finally, we found that the Section III ASPD impairment criteria added incrementally to the Section III ASPD traits in predicting PCL-R psychopathy and SCID-II ASPD. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  16. Nonlinear Hebbian Learning as a Unifying Principle in Receptive Field Formation.

    PubMed

    Brito, Carlos S N; Gerstner, Wulfram

    2016-09-01

    The development of sensory receptive fields has been modeled in the past by a variety of models including normative models such as sparse coding or independent component analysis and bottom-up models such as spike-timing dependent plasticity or the Bienenstock-Cooper-Munro model of synaptic plasticity. Here we show that the above variety of approaches can all be unified into a single common principle, namely nonlinear Hebbian learning. When nonlinear Hebbian learning is applied to natural images, receptive field shapes were strongly constrained by the input statistics and preprocessing, but exhibited only modest variation across different choices of nonlinearities in neuron models or synaptic plasticity rules. Neither overcompleteness nor sparse network activity is necessary for the development of localized receptive fields. The analysis of alternative sensory modalities such as auditory models or V2 development leads to the same conclusions. In all examples, receptive fields can be predicted a priori by reformulating an abstract model as nonlinear Hebbian learning. Thus nonlinear Hebbian learning and natural statistics can account for many aspects of receptive field formation across models and sensory modalities.
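
    A minimal sketch of the principle named here, under assumptions of my own (toy two-dimensional whitened inputs, a cubic nonlinearity, Oja-style normalization): the update Δw ∝ f(y)·x with y = wᵀx drives the weight vector toward the most non-Gaussian direction of the input statistics.

```python
# Sketch: nonlinear Hebbian learning with Oja-style weight normalization.
# Toy whitened data; the cubic nonlinearity f(y) = y**3 is an illustrative choice.
import numpy as np

rng = np.random.default_rng(8)

# Two independent unit-variance sources, one heavy-tailed (sparse) and one
# Gaussian, mixed by an orthogonal matrix so the data are already white.
n = 20000
s = np.column_stack([rng.laplace(scale=1 / np.sqrt(2), size=n),
                     rng.normal(size=n)])
mix = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
x = s @ mix.T

w = rng.normal(size=2)
w /= np.linalg.norm(w)
lr = 5e-4
for _ in range(3):                              # a few passes over the data
    for xi in x:
        y = w @ xi
        w += lr * (y ** 3) * (xi - y * w)       # nonlinear Hebb + Oja decay
        w /= np.linalg.norm(w)

# The learned weight vector aligns with the sparse source direction,
# here +/- [1, 1] / sqrt(2), as selected by the input statistics.
print("learned direction:", w)
```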

  17. Nonlinear Hebbian Learning as a Unifying Principle in Receptive Field Formation

    PubMed Central

    Gerstner, Wulfram

    2016-01-01

    The development of sensory receptive fields has been modeled in the past by a variety of models including normative models such as sparse coding or independent component analysis and bottom-up models such as spike-timing dependent plasticity or the Bienenstock-Cooper-Munro model of synaptic plasticity. Here we show that the above variety of approaches can all be unified into a single common principle, namely nonlinear Hebbian learning. When nonlinear Hebbian learning is applied to natural images, receptive field shapes were strongly constrained by the input statistics and preprocessing, but exhibited only modest variation across different choices of nonlinearities in neuron models or synaptic plasticity rules. Neither overcompleteness nor sparse network activity is necessary for the development of localized receptive fields. The analysis of alternative sensory modalities such as auditory models or V2 development leads to the same conclusions. In all examples, receptive fields can be predicted a priori by reformulating an abstract model as nonlinear Hebbian learning. Thus nonlinear Hebbian learning and natural statistics can account for many aspects of receptive field formation across models and sensory modalities. PMID:27690349

  18. Modeling and forecasting the distribution of Vibrio vulnificus in Chesapeake Bay

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jacobs, John M.; Rhodes, M.; Brown, C. W.

    The aim is to construct statistical models to predict the presence, abundance and potential virulence of Vibrio vulnificus in surface waters. A variety of statistical techniques were used in concert to identify water quality parameters associated with V. vulnificus presence, abundance and virulence markers in the interest of developing strong predictive models for use in regional oceanographic modeling systems. A suite of models is provided to represent the best model fit and alternatives using environmental variables that allow them to be put to immediate use in current ecological forecasting efforts. Conclusions: Environmental parameters such as temperature, salinity and turbidity are capable of accurately predicting abundance and distribution of V. vulnificus in Chesapeake Bay. Forcing these empirical models with output from ocean modeling systems allows for spatially explicit forecasts for up to 48 h in the future. This study uses one of the largest data sets compiled to model Vibrio in an estuary, enhances our understanding of environmental correlates with abundance, distribution and presence of potentially virulent strains and offers a method to forecast these pathogens that may be replicated in other regions.

  19. A Statistical Framework for Protein Quantitation in Bottom-Up MS-Based Proteomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karpievitch, Yuliya; Stanley, Jeffrey R.; Taverner, Thomas

    2009-08-15

    Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include the presence of low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level. Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based, algorithms for filtering of proteins and peptides as well as imputation of missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives. Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/). Contact: adabney@stat.tamu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

  20. An alternative way to evaluate chemistry-transport model variability

    NASA Astrophysics Data System (ADS)

    Menut, Laurent; Mailler, Sylvain; Bessagnet, Bertrand; Siour, Guillaume; Colette, Augustin; Couvidat, Florian; Meleux, Frédérik

    2017-03-01

    A simple and complementary model evaluation technique for regional chemistry transport is discussed. The methodology is based on the concept that we can learn about model performance by comparing the simulation results with observational data available for time periods other than the period originally targeted. First, the statistical indicators selected in this study (spatial and temporal correlations) are computed for a given time period, using colocated observation and simulation data in time and space. Second, the same indicators are used to calculate scores for several other years while conserving the spatial locations and Julian days of the year. The difference between the results provides useful insights on the model capability to reproduce the observed day-to-day and spatial variability. In order to synthesize the large amount of results, a new indicator is proposed, designed to compare several error statistics between all the years of validation and to quantify whether the period and area being studied were well captured by the model for the correct reasons.

  1. Statistical Validation of Surrogate Endpoints: Another Look at the Prentice Criterion and Other Criteria.

    PubMed

    Saraf, Sanatan; Mathew, Thomas; Roy, Anindya

    2015-01-01

    For the statistical validation of surrogate endpoints, an alternative formulation is proposed for testing Prentice's fourth criterion, under a bivariate normal model. In such a setup, the criterion involves inference concerning an appropriate regression parameter, and the criterion holds if the regression parameter is zero. Testing such a null hypothesis has been criticized in the literature since it can only be used to reject a poor surrogate, and not to validate a good surrogate. In order to circumvent this, an equivalence hypothesis is formulated for the regression parameter, namely the hypothesis that the parameter is equivalent to zero. Such an equivalence hypothesis is formulated as an alternative hypothesis, so that the surrogate endpoint is statistically validated when the null hypothesis is rejected. Confidence intervals for the regression parameter and tests for the equivalence hypothesis are proposed using bootstrap methods and small sample asymptotics, and their performances are numerically evaluated and recommendations are made. The choice of the equivalence margin is a regulatory issue that needs to be addressed. The proposed equivalence testing formulation is also adopted for other parameters that have been proposed in the literature on surrogate endpoint validation, namely, the relative effect and proportion explained.
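
    A sketch of the equivalence-testing logic (simulated surrogate and true-endpoint data, and an arbitrary equivalence margin, both my assumptions): bootstrap the relevant regression parameter and declare the surrogate validated only if the confidence interval lies entirely inside the margin.

```python
# Sketch: bootstrap equivalence test for a regression parameter being
# "equivalent to zero" (TOST-style), as in surrogate-endpoint validation.
# Simulated data; the margin delta is an arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(9)
n = 150
treat = rng.integers(0, 2, n)
surrogate = 0.8 * treat + rng.normal(size=n)
true_end = 1.0 * surrogate + 0.05 * treat + rng.normal(size=n)  # small residual effect

def residual_treatment_coef(t, s, y):
    """Coefficient of treatment in a regression of y on (surrogate, treatment)."""
    X = np.column_stack([np.ones_like(s), s, t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[2]

boot = np.array([
    residual_treatment_coef(treat[idx], surrogate[idx], true_end[idx])
    for idx in (rng.integers(0, n, n) for _ in range(2000))
])
lo, hi = np.percentile(boot, [5, 95])    # 90% CI, matching a 5% TOST level
delta = 0.2                               # equivalence margin (a regulatory choice)
print("90% CI for residual treatment effect:", (lo, hi))
print("surrogate validated (CI within +/- delta):", -delta < lo and hi < delta)
```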

  2. Weather extremes in very large, high-resolution ensembles: the weatherathome experiment

    NASA Astrophysics Data System (ADS)

    Allen, M. R.; Rosier, S.; Massey, N.; Rye, C.; Bowery, A.; Miller, J.; Otto, F.; Jones, R.; Wilson, S.; Mote, P.; Stone, D. A.; Yamazaki, Y. H.; Carrington, D.

    2011-12-01

    Resolution and ensemble size are often seen as alternatives in climate modelling. Models with sufficient resolution to simulate many classes of extreme weather cannot normally be run often enough to assess the statistics of rare events, still less how these statistics may be changing. As a result, assessments of the impact of external forcing on regional climate extremes must be based either on statistical downscaling from relatively coarse-resolution models, or statistical extrapolation from 10-year to 100-year events. Under the weatherathome experiment, part of the climateprediction.net initiative, we have compiled the Met Office Regional Climate Model HadRM3P to run on personal computers volunteered by the general public at 25 and 50 km resolution, embedded within the HadAM3P global atmosphere model. With a global network of about 50,000 volunteers, this allows us to run time-slice ensembles of essentially unlimited size, exploring the statistics of extreme weather under a range of scenarios for surface forcing and atmospheric composition, allowing for uncertainty in both boundary conditions and model parameters. Current experiments, developed with the support of Microsoft Research, focus on three regions: the Western USA, Europe and Southern Africa. We initially simulate the period 1959-2010 to establish which variables are realistically simulated by the model and on what scales. Our next experiments are focussing on the Event Attribution problem, exploring how the probability of various types of extreme weather would have been different over the recent past in a world unaffected by human influence, following the design of Pall et al (2011), but extended to a longer period and higher spatial resolution. We will present the first results of this unique, global, participatory experiment and discuss the implications for the attribution of recent weather events to anthropogenic influence on climate.

  3. Huffman and linear scanning methods with statistical language models.

    PubMed

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
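
    The essence of Huffman scanning is to assign short switch sequences to likely characters. The sketch below (with made-up letter probabilities standing in for a statistical language model) builds a binary Huffman code with heapq; in a scanning interface, each 0/1 corresponds to accepting or rejecting the currently highlighted subset.

```python
# Sketch: building a Huffman code over characters from language-model
# probabilities. Probabilities here are illustrative, not from a real LM.
import heapq
import itertools

probs = {"e": 0.13, "t": 0.09, "a": 0.08, "o": 0.075, "i": 0.07,
         "n": 0.067, "s": 0.063, " ": 0.18, "h": 0.06, "r": 0.055,
         "<other>": 0.13}

def huffman_code(probs):
    """Return a dict symbol -> bit string, shorter strings for likelier symbols."""
    counter = itertools.count()        # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

code = huffman_code(probs)
for sym in sorted(code, key=lambda s: len(code[s])):
    print(repr(sym), code[sym])        # frequent symbols need fewer switch hits
```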

  4. Simulating Metabolism with Statistical Thermodynamics

    PubMed Central

    Cannon, William R.

    2014-01-01

    New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed. PMID:25089525

  5. Simulating metabolism with statistical thermodynamics.

    PubMed

    Cannon, William R

    2014-01-01

    New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed.
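
    The following toy calculation (an assumption-laden sketch, not the model described in the report) illustrates the basic statistical-thermodynamic ingredient: Boltzmann weighting of states by free energy, which fixes relative metabolite levels at equilibrium.

```python
import numpy as np

# Toy system: metabolites A, B, C connected by reversible reactions, with
# assumed standard free energies (kJ/mol) measured relative to A.
R, T = 8.314e-3, 298.15           # gas constant in kJ/(mol K), temperature in K
dG = {"A": 0.0, "B": -5.0, "C": -12.0}

# Boltzmann factors give the equilibrium relative populations.
weights = {m: np.exp(-g / (R * T)) for m, g in dG.items()}
Z = sum(weights.values())
for m, w in weights.items():
    print(f"{m}: equilibrium fraction {w / Z:.3f}")
```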

  6. Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers

    ERIC Educational Resources Information Center

    Keiffer, Greggory L.; Lane, Forrest C.

    2016-01-01

    Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
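
    A minimal sketch of propensity score matching on simulated (hypothetical) data is given below; it is not the paper's example, but shows the two basic steps of estimating propensity scores and matching treated to comparison cases before estimating a treatment effect.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical observational data: two covariates drive both group
# membership (intact groups, no randomization) and the outcome.
n = 500
X = rng.normal(size=(n, 2))
p_treat = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
treated = rng.binomial(1, p_treat)
y = 2.0 * treated + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Step 1: estimate propensity scores with logistic regression.
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: 1:1 nearest-neighbour matching on the propensity score
# (with replacement, for simplicity).
treat_idx = np.where(treated == 1)[0]
ctrl_idx = np.where(treated == 0)[0]
pairs = [(i, ctrl_idx[np.argmin(np.abs(ps[i] - ps[ctrl_idx]))]) for i in treat_idx]

# Step 3: average treated-minus-control difference over matched pairs.
att = np.mean([y[i] - y[j] for i, j in pairs])
print(f"matched estimate of the treatment effect: {att:.2f} (true value: 2.0)")
```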

  7. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670

  8. Why Current Statistics of Complementary Alternative Medicine Clinical Trials is Invalid.

    PubMed

    Pandolfi, Maurizio; Carreras, Giulia

    2018-06-07

    It is not sufficiently known that frequentist statistics cannot provide direct information on the probability that the research hypothesis tested is correct. The error resulting from this misunderstanding is compounded when the hypotheses under scrutiny have precarious scientific bases, as those of complementary and alternative medicine (CAM) generally do. In such cases, it is mandatory to use inferential statistics that take into account the prior probability that the hypothesis tested is true, such as Bayesian statistics. The authors show that, under such circumstances, no real statistical significance can be achieved in CAM clinical trials. In this respect, CAM trials involving human material are also hardly defensible from an ethical viewpoint.
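
    The authors' argument can be illustrated with a simple calculation (a sketch with assumed numbers, not taken from the paper): the probability that a hypothesis is true after a "significant" result depends strongly on its prior plausibility.

```python
def posterior_given_significance(prior, alpha=0.05, power=0.80):
    """P(hypothesis true | p < alpha) for a study with the stated
    power and prior plausibility of the research hypothesis."""
    true_positive = power * prior
    false_positive = alpha * (1.0 - prior)
    return true_positive / (true_positive + false_positive)

# Assumed priors: a well-motivated hypothesis vs. a scientifically
# precarious one, as the authors argue is typical of CAM.
for prior in (0.5, 0.01):
    post = posterior_given_significance(prior)
    print(f"prior {prior:4.2f} -> posterior {post:.2f}")
# prior 0.50 -> posterior ~0.94; prior 0.01 -> posterior ~0.14
```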

  9. Illustrating the practice of statistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamada, Christina A; Hamada, Michael S

    2009-01-01

    The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within his/her budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional information. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.

  10. Cosmological Constraints from Fourier Phase Statistics

    NASA Astrophysics Data System (ADS)

    Ali, Kamran; Obreschkow, Danail; Howlett, Cullan; Bonvin, Camille; Llinares, Claudio; Oliveira Franco, Felipe; Power, Chris

    2018-06-01

    Most statistical inference from cosmic large-scale structure relies on two-point statistics, i.e. on the galaxy-galaxy correlation function (2PCF) or the power spectrum. These statistics capture the full information encoded in the Fourier amplitudes of the galaxy density field but do not describe the Fourier phases of the field. Here, we quantify the information contained in the line correlation function (LCF), a three-point Fourier phase correlation function. Using cosmological simulations, we estimate the Fisher information (at redshift z = 0) of the 2PCF, LCF and their combination, regarding the cosmological parameters of the standard ΛCDM model, as well as a Warm Dark Matter (WDM) model and the f(R) and Symmetron modified gravity models. The galaxy bias is accounted for at the level of a linear bias. The relative information of the 2PCF and the LCF depends on the survey volume, sampling density (shot noise) and the bias uncertainty. For a volume of 1 h^{-3} Gpc^3, sampled with points of mean density \bar{n} = 2 × 10^{-3} h^3 Mpc^{-3} and a bias uncertainty of 13%, the LCF improves the parameter constraints by about 20% in the ΛCDM cosmology and potentially even more in alternative models. Finally, since a linear bias only affects the Fourier amplitudes (2PCF), but not the phases (LCF), the combination of the 2PCF and the LCF can be used to break the degeneracy between the linear bias and σ_8, present in two-point statistics.

  11. Statistical iterative material image reconstruction for spectral CT using a semi-empirical forward model

    NASA Astrophysics Data System (ADS)

    Mechlem, Korbinian; Ehn, Sebastian; Sellerer, Thorsten; Pfeiffer, Franz; Noël, Peter B.

    2017-03-01

    In spectral computed tomography (spectral CT), the additional information about the energy dependence of attenuation coefficients can be exploited to generate material selective images. These images have found applications in various areas such as artifact reduction, quantitative imaging or clinical diagnosis. However, significant noise amplification on material decomposed images remains a fundamental problem of spectral CT. Most spectral CT algorithms separate the process of material decomposition and image reconstruction. Separating these steps is suboptimal because the full statistical information contained in the spectral tomographic measurements cannot be exploited. Statistical iterative reconstruction (SIR) techniques provide an alternative, mathematically elegant approach to obtaining material selective images with improved tradeoffs between noise and resolution. Furthermore, image reconstruction and material decomposition can be performed jointly. This is accomplished by a forward model which directly connects the (expected) spectral projection measurements and the material selective images. To obtain this forward model, detailed knowledge of the different photon energy spectra and the detector response was assumed in previous work. However, accurately determining the spectrum is often difficult in practice. In this work, a new algorithm for statistical iterative material decomposition is presented. It uses a semi-empirical forward model which relies on simple calibration measurements. Furthermore, an efficient optimization algorithm based on separable surrogate functions is employed. This partially negates one of the major shortcomings of SIR, namely high computational cost and long reconstruction times. Numerical simulations and real experiments show strongly improved image quality and reduced statistical bias compared to projection-based material decomposition.

  12. Sensitivity tests and ensemble hazard assessment for tephra fallout at Campi Flegrei, Italy

    NASA Astrophysics Data System (ADS)

    Selva, Jacopo; Costa, Antonio; De Natale, Giuseppe; Di Vito, Mauro; Isaia, Roberto; Macedonio, Giovanni

    2017-04-01

    We present the results of a statistical study on tephra dispersion in the case of reactivation of the Campi Flegrei volcano. We considered the full spectrum of possible eruptions, in terms of size and position of eruptive vents. To represent the spectrum of possible eruptive sizes, four classes of eruptions were considered, of which only three are explosive (small, medium, and large) and can produce a significant quantity of volcanic ash. Hazard assessments are made through dispersion simulations of ash and lapilli, considering the full variability of winds, eruptive vents, and eruptive sizes. The results are presented in the form of four families of hazard curves conditional on the occurrence of an eruption: 1) small eruptive size from any vent; 2) medium eruptive size from any vent; 3) large eruptive size from any vent; 4) any size from any vent. The epistemic uncertainty (i.e. that associated with the level of scientific knowledge of the phenomena) in the estimation of hazard curves was quantified using alternative scientifically acceptable approaches. The choice of such alternative models was made after a comprehensive sensitivity analysis which considered different weather databases, alternative modelling of the possible opening of eruptive vents, tephra total grain-size distributions (TGSD), the relative mass of fine particles, and the effect of aggregation. The results of this sensitivity analysis show that the dominant uncertainty is related to the choice of TGSD, the mass of fine ash, and the potential effects of ash aggregation. The latter is particularly relevant in case of magma-water interaction during an eruptive phase, when most of the fine ash can form accretionary lapilli that could contribute significantly to increasing the tephra load in the proximal region. The variability induced by the use of different weather databases is comparatively insignificant. The hazard curves, together with the quantification of epistemic uncertainty, were finally calculated through a statistical model based on ensemble mixing of selected alternative models, e.g. different choices for the estimate of the total erupted mass, the mass of fine ash, the effects of aggregation, etc. Hazard and probability maps were produced at different confidence levels with respect to the epistemic uncertainty (mean, median, 16th percentile, and 84th percentile).

  13. Colloquium: Statistical mechanics of money, wealth, and income

    NASA Astrophysics Data System (ADS)

    Yakovenko, Victor M.; Rosser, J. Barkley, Jr.

    2009-10-01

    This Colloquium reviews statistical models for money, wealth, and income distributions developed in the econophysics literature since the late 1990s. By analogy with the Boltzmann-Gibbs distribution of energy in physics, it is shown that the probability distribution of money is exponential for certain classes of models with interacting economic agents. Alternative scenarios are also reviewed. Data analysis of the empirical distributions of wealth and income reveals a two-class distribution. The majority of the population belongs to the lower class, characterized by the exponential (“thermal”) distribution, whereas a small fraction of the population in the upper class is characterized by the power-law (“superthermal”) distribution. The lower part is very stable, stationary in time, whereas the upper part is highly dynamical and out of equilibrium.
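
    A minimal agent-based sketch of the kind of exchange model reviewed here (with an assumed exchange rule, not a specific model from the Colloquium) shows how a conserved total of money relaxes towards an approximately exponential distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random pairwise exchanges that conserve total money: a randomly chosen
# pair pools its money and splits the pool with a uniform random share.
n_agents, n_exchanges = 1_000, 500_000
money = np.full(n_agents, 100.0)
for _ in range(n_exchanges):
    i, j = rng.integers(n_agents, size=2)
    if i == j:
        continue
    pool = money[i] + money[j]
    share = rng.random()
    money[i], money[j] = share * pool, (1.0 - share) * pool

# For an exponential (Boltzmann-Gibbs) distribution, ~63% of agents
# end up below the mean.
print(f"mean money: {money.mean():.1f}")
print(f"fraction of agents below the mean: {np.mean(money < money.mean()):.2f}")
```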

  14. A nonparametric smoothing method for assessing GEE models with longitudinal binary data.

    PubMed

    Lin, Kuo-Chin; Chen, Yi-Ju; Shyr, Yu

    2008-09-30

    Studies involving longitudinal binary responses are widely applied in health and biomedical sciences research and frequently analyzed by the generalized estimating equations (GEE) method. This article proposes an alternative goodness-of-fit test based on a nonparametric smoothing approach for assessing the adequacy of GEE-fitted models, which can be regarded as an extension of the goodness-of-fit test of le Cessie and van Houwelingen (Biometrics 1991; 47:1267-1282). The expectation and approximate variance of the proposed test statistic are derived. The asymptotic distribution of the proposed test statistic in terms of a scaled chi-squared distribution and the power performance of the proposed test are discussed by simulation studies. The testing procedure is demonstrated on two real data sets. Copyright (c) 2008 John Wiley & Sons, Ltd.

  15. Detecting Multiple Model Components with the Likelihood Ratio Test

    NASA Astrophysics Data System (ADS)

    Protassov, R. S.; van Dyk, D. A.

    2000-05-01

    The likelihood ratio test (LRT) and the F-test, popularized in astrophysics by Bevington (Data Reduction and Error Analysis in the Physical Sciences) and Cash (1977, ApJ, 228, 939), do not (even asymptotically) adhere to their nominal χ2 and F distributions in many statistical tests commonly used in astrophysics. The many legitimate uses of the LRT (see, e.g., the examples given in Cash (1977)) notwithstanding, it can be impossible to compute the false positive rate of the LRT or related tests such as the F-test. For example, although Cash (1977) did not suggest the LRT for detecting a line profile in a spectral model, it has become common practice despite the lack of certain required mathematical regularity conditions. Contrary to common practice, the nominal distribution of the LRT statistic should not be used in these situations. In this paper, we characterize an important class of problems where the LRT fails, show the non-standard behavior of the test in this setting, and provide a Bayesian alternative to the LRT, i.e., posterior predictive p-values. We emphasize that there are many legitimate uses of the LRT in astrophysics, and even when the LRT is inappropriate, there remain several statistical alternatives (e.g., judicious use of error bars and Bayes factors). We illustrate this point in our analysis of GRB 970508, which was studied by Piro et al. (ApJ, 514:L73-L77, 1999).
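
    A toy Monte Carlo (not the GRB analysis, and with an assumed Gaussian setup) illustrates why the nominal distribution can fail: when the parameter of interest lies on a boundary, e.g. a non-negative signal amplitude, the LRT statistic under the null is a 50:50 mixture of zero and χ2 with one degree of freedom rather than a pure χ2.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Test mu = 0 against mu > 0 for n Gaussian observations with known sigma = 1.
# The null value sits on the boundary of the parameter space.
n, sims = 50, 20_000
lrt = np.empty(sims)
for s in range(sims):
    y = rng.normal(0.0, 1.0, size=n)   # data generated under the null
    mu_hat = max(0.0, y.mean())        # MLE constrained to mu >= 0
    lrt[s] = n * mu_hat**2             # 2 * (loglik_alt - loglik_null)

crit = stats.chi2.ppf(0.95, df=1)
print(f"share of simulations with LRT exactly 0: {np.mean(lrt == 0.0):.2f}")        # ~0.5
print(f"rejection rate at the nominal chi2(1) 5% cut: {np.mean(lrt > crit):.3f}")   # ~0.025
```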

  16. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media.

    PubMed

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-06-21

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties about its predictability, but a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement for hydrofracking, the fast decay of recovery, and the ultra-low permeability inherent to their nanoporosity are specific features of these reservoirs that challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated, with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short-time decay, consistent with the Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water by CO2 or propane eliminates the barriers, therefore raising hopes for clean and efficient recovery.

  17. Null but not void: considerations for hypothesis testing.

    PubMed

    Shaw, Pamela A; Proschan, Michael A

    2013-01-30

    Standard statistical theory teaches us that once the null and alternative hypotheses have been defined for a parameter, the choice of the statistical test is clear. Standard theory does not teach us how to choose the null or alternative hypothesis appropriate to the scientific question of interest. Neither does it tell us that in some cases, depending on which alternatives are realistic, we may want to define our null hypothesis differently. Problems in statistical practice are frequently not as pristinely summarized as the classic theory in our textbooks. In this article, we present examples in statistical hypothesis testing in which seemingly simple choices are in fact rich with nuance that, when given full consideration, make the choice of the right hypothesis test much less straightforward. Published 2012. This article is a US Government work and is in the public domain in the USA.

  18. PCA as a practical indicator of OPLS-DA model reliability.

    PubMed

    Worley, Bradley; Powers, Robert

    Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.
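
    The flavour of the Monte Carlo analysis can be reproduced with a small sketch on synthetic data (assumed dimensions and noise levels, not the authors' NMR datasets): as Gaussian noise is added, the between-group distance in PCA scores-space shrinks.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)

# Hypothetical two-group "spectral" data: 2 x 10 samples, 100 variables,
# with a genuine mean shift between the groups.
n_per_group, n_vars = 10, 100
base = np.vstack([rng.normal(0.0, 1.0, (n_per_group, n_vars)),
                  rng.normal(1.0, 1.0, (n_per_group, n_vars))])
labels = np.array([0] * n_per_group + [1] * n_per_group)

for noise_sd in (0.0, 2.0, 5.0, 10.0):
    X = base + rng.normal(0.0, noise_sd, base.shape)
    scores = PCA(n_components=2).fit_transform(X)
    sep = np.linalg.norm(scores[labels == 0].mean(axis=0) -
                         scores[labels == 1].mean(axis=0))
    print(f"added noise sd {noise_sd:4.1f}: between-group distance in PC space = {sep:.2f}")
```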

  19. Relevance of the c-statistic when evaluating risk-adjustment models in surgery.

    PubMed

    Merkow, Ryan P; Hall, Bruce L; Cohen, Mark E; Dimick, Justin B; Wang, Edward; Chow, Warren B; Ko, Clifford Y; Bilimoria, Karl Y

    2012-05-01

    The measurement of hospital quality based on outcomes requires risk adjustment. The c-statistic is a popular tool used to judge model performance, but it can be limited, particularly when evaluating specific operations in focused populations. Our objectives were to examine the interpretation and relevance of the c-statistic when used in models with increasingly similar case mix and to consider an alternative perspective on model calibration based on a graphical depiction of model fit. From the American College of Surgeons National Surgical Quality Improvement Program (2008-2009), patients were identified who underwent a general surgery procedure, and procedure groups were increasingly restricted: colorectal-all, colorectal-elective cases only, and colorectal-elective cancer cases only. Mortality and serious morbidity outcomes were evaluated using logistic regression-based risk adjustment, and model c-statistics and calibration curves were used to compare model performance. During the study period, 323,427 general, 47,605 colorectal-all, 39,860 colorectal-elective, and 21,680 colorectal cancer patients were studied. Mortality ranged from 1.0% in general surgery to 4.1% in the colorectal-all group, and serious morbidity ranged from 3.9% in general surgery to 12.4% in the colorectal-all procedural group. As case mix was restricted, c-statistics progressively declined from the general to the colorectal cancer surgery cohorts for both mortality and serious morbidity (mortality: 0.949 to 0.866; serious morbidity: 0.861 to 0.668). Calibration was evaluated graphically by examining predicted vs observed numbers of events over risk deciles. For both mortality and serious morbidity, there was no qualitative difference in calibration identified between the procedure groups. In the present study, we demonstrate how the c-statistic can become less informative and, in certain circumstances, can lead to incorrect model-based conclusions as case mix is restricted and patients become more homogeneous. Although it remains an important tool, caution is advised when the c-statistic is advanced as the sole measure of model performance. Copyright © 2012 American College of Surgeons. All rights reserved.
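
    For readers unfamiliar with the quantities involved, the short sketch below (synthetic risks, not the NSQIP data) computes a c-statistic as the area under the ROC curve and a decile-based calibration table of predicted versus observed events.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)

# Synthetic risk-adjustment example: predicted mortality risks and outcomes
# drawn from those risks (i.e. a perfectly calibrated model by construction).
n = 20_000
risk = rng.beta(1, 30, size=n)       # predicted probabilities, mean ~3%
died = rng.binomial(1, risk)         # observed outcomes

print(f"c-statistic (ROC AUC): {roc_auc_score(died, risk):.3f}")

# Calibration by risk decile: predicted vs. observed number of events.
edges = np.quantile(risk, np.linspace(0, 1, 11))
decile = np.digitize(risk, edges[1:-1])          # bins 0..9
for d in range(10):
    m = decile == d
    print(f"decile {d + 1:2d}: predicted {risk[m].sum():7.1f}  observed {died[m].sum():5d}")
```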

  20. Efficient Posterior Probability Mapping Using Savage-Dickey Ratios

    PubMed Central

    Penny, William D.; Ridgway, Gerard R.

    2013-01-01

    Statistical Parametric Mapping (SPM) is the dominant paradigm for mass-univariate analysis of neuroimaging data. More recently, a Bayesian approach termed Posterior Probability Mapping (PPM) has been proposed as an alternative. PPM offers two advantages: (i) inferences can be made about effect size thus lending a precise physiological meaning to activated regions, (ii) regions can be declared inactive. This latter facility is most parsimoniously provided by PPMs based on Bayesian model comparisons. To date these comparisons have been implemented by an Independent Model Optimization (IMO) procedure which separately fits null and alternative models. This paper proposes a more computationally efficient procedure based on Savage-Dickey approximations to the Bayes factor, and Taylor-series approximations to the voxel-wise posterior covariance matrices. Simulations show the accuracy of this Savage-Dickey-Taylor (SDT) method to be comparable to that of IMO. Results on fMRI data show excellent agreement between SDT and IMO for second-level models, and reasonable agreement for first-level models. This Savage-Dickey test is a Bayesian analogue of the classical SPM-F and allows users to implement model comparison in a truly interactive manner. PMID:23533640
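
    The Savage-Dickey idea itself is simple and can be shown in a few lines for a conjugate toy model (a Gaussian mean with known variance, which is an assumption of this sketch and not the SPM implementation): the Bayes factor in favour of the null is the ratio of the posterior to the prior density at the null value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Nested models: the null fixes theta = 0; the alternative has prior theta ~ N(0, tau^2).
sigma, tau, n = 1.0, 1.0, 30
y = rng.normal(0.2, sigma, size=n)            # data with a weak true effect

# Conjugate posterior for theta under the alternative.
post_var = 1.0 / (n / sigma**2 + 1.0 / tau**2)
post_mean = post_var * n * y.mean() / sigma**2

# Savage-Dickey density ratio evaluated at theta = 0.
bf01 = stats.norm.pdf(0.0, post_mean, np.sqrt(post_var)) / stats.norm.pdf(0.0, 0.0, tau)
print(f"Savage-Dickey Bayes factor in favour of the null: BF01 = {bf01:.2f}")
```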

  1. On temporal stochastic modeling of precipitation, nesting models across scales

    NASA Astrophysics Data System (ADS)

    Paschalis, Athanasios; Molnar, Peter; Fatichi, Simone; Burlando, Paolo

    2014-01-01

    We analyze the performance of composite stochastic models of temporal precipitation which can satisfactorily reproduce precipitation properties across a wide range of temporal scales. The rationale is that a combination of stochastic precipitation models, each most appropriate for a specific limited range of temporal scales, leads to better overall performance across a wider range of scales than single models alone. We investigate different model combinations. For the coarse (daily) scale these are models based on alternating renewal processes, Markov chains, and Poisson cluster models, which are then combined with a microcanonical multiplicative random cascade model to disaggregate precipitation to finer (minute) scales. The composite models were tested on data at four sites in different climates. The results show that model combinations improve performance, compared with single models, in key statistics such as the probability distribution of precipitation depth, the autocorrelation structure, intermittency, and the reproduction of extremes, while remaining reasonably parsimonious. No model combination was found to outperform the others at all sites and for all statistics; however, we provide insight into the capabilities of specific model combinations. The results for the four different climates are similar, which suggests a degree of generality and wider applicability of the approach.
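
    The disaggregation step can be illustrated with a minimal microcanonical multiplicative random cascade (assumed beta-distributed weights; parameter values are illustrative, not fitted to the four sites): a coarse-scale total is split recursively into halves so that mass is conserved at every level.

```python
import numpy as np

rng = np.random.default_rng(8)

def cascade(total, levels):
    """Split `total` over 2**levels intervals with mass-conserving random weights."""
    mass = np.array([total])
    for _ in range(levels):
        w = rng.beta(0.7, 0.7, size=mass.size)          # assumed weight distribution
        mass = np.column_stack([w * mass, (1.0 - w) * mass]).ravel()
    return mass

daily_total = 24.0                         # mm of rain on one day
fine = cascade(daily_total, levels=7)      # 2**7 = 128 intervals (~11 min each)
print(f"mass conserved: {fine.sum():.1f} mm, peak interval: {fine.max():.2f} mm, "
      f"dry fraction (<0.01 mm): {np.mean(fine < 0.01):.2f}")
```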

  2. Air pollution exposure prediction approaches used in air pollution epidemiology studies.

    PubMed

    Özkaynak, Halûk; Baxter, Lisa K; Dionisio, Kathie L; Burke, Janet

    2013-01-01

    Epidemiological studies of the health effects of outdoor air pollution have traditionally relied upon surrogates of personal exposures, most commonly ambient concentration measurements from central-site monitors. However, this approach may introduce exposure prediction errors and misclassification of exposures for pollutants that are spatially heterogeneous, such as those associated with traffic emissions (e.g., carbon monoxide, elemental carbon, nitrogen oxides, and particulate matter). We review alternative air quality and human exposure metrics applied in recent air pollution health effect studies discussed during the International Society of Exposure Science 2011 conference in Baltimore, MD. Symposium presenters considered various alternative exposure metrics, including: central site or interpolated monitoring data, regional pollution levels predicted using the national scale Community Multiscale Air Quality model or from measurements combined with local-scale (AERMOD) air quality models, hybrid models that include satellite data, statistically blended modeling and measurement data, concentrations adjusted by home infiltration rates, and population-based human exposure model (Stochastic Human Exposure and Dose Simulation, and Air Pollutants Exposure models) predictions. These alternative exposure metrics were applied in epidemiological applications to health outcomes, including daily mortality and respiratory hospital admissions, daily hospital emergency department visits, daily myocardial infarctions, and daily adverse birth outcomes. This paper summarizes the research projects presented during the symposium, with full details of the work presented in individual papers in this journal issue.

  3. Permutation tests for goodness-of-fit testing of mathematical models to experimental data.

    PubMed

    Fişek, M Hamit; Barlas, Zeynep

    2013-03-01

    This paper presents statistical procedures for improving the goodness-of-fit testing of theoretical models to data obtained from laboratory experiments. We use an experimental study in the expectation states research tradition, carried out in the "standardized experimental situation" associated with the program, to illustrate the application of our procedures. We briefly review the expectation states research program and the fundamentals of resampling statistics as we develop our procedures in the resampling context. The first procedure we develop is a modification of the chi-square test, which has been the primary statistical tool for assessing goodness of fit in the EST research program but has problems associated with its use. We discuss these problems and suggest a procedure to overcome them. The second procedure we present, the "Average Absolute Deviation" test, is a new test proposed as an alternative to the chi-square test, being simpler and more informative. The third and fourth procedures are permutation versions of Jonckheere's test for ordered alternatives and Kendall's tau-b, a rank-order correlation coefficient. The fifth procedure is a new rank-order goodness-of-fit test, which we call the "Deviation from Ideal Ranking" index, which we believe may be more useful than other rank-order tests for assessing the goodness of fit of models to experimental data. The application of these procedures to the sample data is illustrated in detail. We then present another laboratory study from a different experimental paradigm, the "network exchange" paradigm, and describe how our procedures may be applied to this data set. Copyright © 2012 Elsevier Inc. All rights reserved.
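
    As one hedged reading of the "Average Absolute Deviation" idea (the statistic and the resampling scheme below are assumptions for illustration, not the authors' exact procedure), a model's predicted proportions can be compared with observed proportions and the deviation referred to a Monte Carlo distribution generated under the model itself.

```python
import numpy as np

rng = np.random.default_rng(6)

def aad(observed, expected):
    """Average absolute deviation between observed and model-predicted proportions."""
    return np.mean(np.abs(observed - expected))

# Model-predicted choice probabilities in four experimental conditions,
# with hypothetical observed proportions from 40 trials per condition.
expected_p = np.array([0.70, 0.60, 0.55, 0.45])
n_trials = np.array([40, 40, 40, 40])
observed_p = np.array([0.65, 0.68, 0.50, 0.40])

stat_obs = aad(observed_p, expected_p)

# Reference distribution: resample data from the model's own predictions.
sims = np.array([aad(rng.binomial(n_trials, expected_p) / n_trials, expected_p)
                 for _ in range(10_000)])
p_value = np.mean(sims >= stat_obs)
print(f"AAD = {stat_obs:.3f}, Monte Carlo p-value = {p_value:.3f}")
```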

  4. Effective model approach to the dense state of QCD matter

    NASA Astrophysics Data System (ADS)

    Fukushima, Kenji

    2011-12-01

    The first-principle approach to the dense state of QCD matter, i.e. the lattice-QCD simulation at finite baryon density, is not under theoretical control for the moment. The effective model study based on QCD symmetries is a practical alternative. However, the model parameters that are fixed by hadronic properties in the vacuum may have unknown dependence on the baryon chemical potential. We propose a new prescription to constrain the effective model parameters by the matching condition with the thermal Statistical Model. In the transitional region where thermal quantities blow up in the Statistical Model, deconfined quarks and gluons should smoothly take over the relevant degrees of freedom from hadrons and resonances. We use the Polyakov-loop coupled Nambu-Jona-Lasinio (PNJL) model as an effective description on the quark side and show how the matching condition is satisfied by a simple ansatz on the Polyakov loop potential. Our results favor a phase diagram with the chiral phase transition located at slightly higher temperature than deconfinement, which stays close to the chemical freeze-out points.

  5. Bayesian demography 250 years after Bayes

    PubMed Central

    Bijak, Jakub; Bryant, John

    2016-01-01

    Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889

  6. Children's knowledge of the earth: a new methodological and statistical approach.

    PubMed

    Straatemeier, Marthe; van der Maas, Han L J; Jansen, Brenda R J

    2008-08-01

    In the field of children's knowledge of the earth, much debate has concerned the question of whether children's naive knowledge, that is, their knowledge before they acquire the standard scientific theory, is coherent (i.e., theory-like) or fragmented. We conducted two studies with large samples (N=328 and N=381) using a new paper-and-pencil test, denoted the EARTH (EArth Representation Test for cHildren), to discriminate between these two alternatives. We performed latent class analyses on the responses to the EARTH to test mental models associated with these alternatives. The naive mental models, as formulated by Vosniadou and Brewer, were not supported by the results. The results indicated that children's knowledge of the earth becomes more consistent as children grow older. These findings support the view that children's naive knowledge is fragmented.

  7. Using micro-simulation to investigate the safety impacts of transit design alternatives at signalized intersections.

    PubMed

    Li, Lu; Persaud, Bhagwant; Shalaby, Amer

    2017-03-01

    This study investigates the use of crash prediction models and micro-simulation to develop an effective surrogate safety assessment measure at the intersection level. With the use of these tools, hypothetical scenarios can be developed and explored to evaluate the safety impacts of design alternatives in a controlled environment, in which factors not directly associated with the design alternatives can be fixed. Micro-simulation models are developed, calibrated, and validated. Traffic conflicts in the micro-simulation models are estimated and linked with observed crash frequency, which greatly alleviates the lengthy time needed to collect sufficient crash data for evaluating alternatives, due to the rare and infrequent nature of crash events. A set of generalized linear models with negative binomial error structure is developed to correlate the simulated conflicts with the observed crash frequency in Toronto, Ontario, Canada. Crash prediction models are also developed for crashes of different impact types and for transit-involved crashes. The resulting statistical significance and the goodness-of-fit of the models suggest adequate predictive ability. Based on the established correlation between simulated conflicts and observed crashes, scenarios are developed in the micro-simulation models to investigate the safety effects of individual transit line elements by making hypothetical modifications to such elements and estimating changes in crash frequency from the resulting changes in conflicts. The findings imply that the existing transit signal priority schemes can have a negative effect on safety performance, and that the existing near-side stop positioning and streetcar transit type can be safer at their current state than if they were to be replaced by their respective counterparts. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Modelling parasite aggregation: disentangling statistical and ecological approaches.

    PubMed

    Yakob, Laith; Soares Magalhães, Ricardo J; Gray, Darren J; Milinovich, Gabriel; Wardrop, Nicola; Dunning, Rebecca; Barendregt, Jan; Bieri, Franziska; Williams, Gail M; Clements, Archie C A

    2014-05-01

    The overdispersion in macroparasite infection intensity among host populations is commonly simulated using a constant negative binomial aggregation parameter. We describe an alternative to utilising the negative binomial approach and demonstrate important disparities in intervention efficacy projections that can come about from opting for pattern-fitting models that are not process-explicit. We present model output in the context of the epidemiology and control of soil-transmitted helminths due to the significant public health burden imposed by these parasites, but our methods are applicable to other infections with demonstrable aggregation in parasite numbers among hosts. Copyright © 2014. Published by Elsevier Ltd.
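
    The conventional pattern-fitting description that the authors move beyond can be sketched in a few lines (assumed parameter values): worm burdens drawn from a negative binomial via its gamma-Poisson mixture, where a smaller aggregation parameter k concentrates the parasites in fewer hosts.

```python
import numpy as np

rng = np.random.default_rng(7)

def nb_burdens(mean, k, n_hosts):
    """Negative binomial worm burdens via the gamma-Poisson mixture."""
    return rng.poisson(rng.gamma(shape=k, scale=mean / k, size=n_hosts))

for k in (0.1, 0.5, 5.0):
    w = nb_burdens(mean=20.0, k=k, n_hosts=100_000)
    top10_share = np.sort(w)[-len(w) // 10:].sum() / w.sum()
    print(f"k = {k:3.1f}: variance/mean = {w.var() / w.mean():6.1f}, "
          f"parasites carried by the most infected 10% of hosts = {top10_share:.2f}")
```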

  9. Acquisition and extinction in autoshaping.

    PubMed

    Kakade, Sham; Dayan, Peter

    2002-07-01

    C. R. Gallistel and J. Gibbon (2000) presented quantitative data on the speed with which animals acquire behavioral responses during autoshaping, together with a statistical model of learning intended to account for them. Although this model captures the form of the dependencies among critical variables, its detailed predictions are substantially at variance with the data. In the present article, further key data on the speed of acquisition are used to motivate an alternative model of learning, in which animals can be interpreted as paying different amounts of attention to stimuli according to estimates of their differential reliabilities as predictors.

  10. PAH concentrations simulated with the AURAMS-PAH chemical transport model over Canada and the USA

    NASA Astrophysics Data System (ADS)

    Galarneau, E.; Makar, P. A.; Zheng, Q.; Narayan, J.; Zhang, J.; Moran, M. D.; Bari, M. A.; Pathela, S.; Chen, A.; Chlumsky, R.

    2014-04-01

    The offline Eulerian AURAMS (A Unified Regional Air quality Modelling System) chemical transport model was adapted to simulate airborne concentrations of seven PAHs (polycyclic aromatic hydrocarbons): phenanthrene, anthracene, fluoranthene, pyrene, benz[a]anthracene, chrysene + triphenylene, and benzo[a]pyrene. The model was then run for the year 2002 with hourly output on a grid covering southern Canada and the continental USA with 42 km horizontal grid spacing. Model predictions were compared to ~5000 24 h-average PAH measurements from 45 sites, most of which were located in urban or industrial areas. Eight of the measurement sites also provided data on particle/gas partitioning which had been modelled using two alternative schemes. This is the first known regional modelling study for PAHs over a North American domain and the first modelling study at any scale to compare alternative particle/gas partitioning schemes against paired field measurements. The goal of the study was to provide output concentration maps of use to assessing human inhalation exposure to PAHs in ambient air. Annual average modelled total (gas + particle) concentrations were statistically indistinguishable from measured values for fluoranthene, pyrene and benz[a]anthracene whereas the model underestimated concentrations of phenanthrene, anthracene and chrysene + triphenylene. Significance for benzo[a]pyrene performance was close to the statistical threshold and depended on the particle/gas partitioning scheme employed. On a day-to-day basis, the model simulated total PAH concentrations to the correct order of magnitude the majority of the time. The model showed seasonal differences in prediction quality for volatile species which suggests that a missing emission source such as air-surface exchange should be included in future versions. Model performance differed substantially between measurement locations and the limited available evidence suggests that the model's spatial resolution was too coarse to capture the distribution of concentrations in densely populated areas. A more detailed analysis of the factors influencing modelled particle/gas partitioning is warranted based on the findings in this study.

  11. Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool.

    PubMed

    Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi

    2007-10-01

    Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to quantify the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.

  12. f(R) gravity modifications: from the action to the data

    NASA Astrophysics Data System (ADS)

    Lazkoz, Ruth; Ortiz-Baños, María; Salzano, Vincenzo

    2018-03-01

    It is a very well established matter nowadays that many modified gravity models can offer a sound alternative to General Relativity for the description of the accelerated expansion of the universe. But it is also equally well known that no clear and sharp discrimination between any alternative theory and the classical one has been found so far. In this work, we attempt to formulate a different approach starting from the general class of f(R) theories as test probes: we try to reformulate f(R) Lagrangian terms as explicit functions of the redshift, i.e., as f(z). In this context, the f(R) setting corresponding to the consensus cosmological model, the ΛCDM model, can be written as a polynomial including just a constant and a third-order term. Starting from this result, we propose various different polynomial parameterizations f(z), including new terms which would allow for deviations from ΛCDM, and we thoroughly compare them with observational data. While on the one hand we have found no statistical preference for our proposals (even though some of them are as good as ΛCDM by Bayesian evidence comparison), we think that our novel approach could provide a different perspective for the development of new and observationally reliable alternative models of gravity.

  13. Cosmological constraints from a joint analysis of cosmic growth and expansion

    NASA Astrophysics Data System (ADS)

    Moresco, M.; Marulli, F.

    2017-10-01

    Combining measurements of the expansion history of the Universe and of the growth rate of cosmic structures is key to discriminating between alternative cosmological frameworks and to testing gravity. Recently, Linder proposed a new diagram to investigate the joint evolutionary track of these two quantities. In this letter, we collect the most recent cosmic growth and expansion rate data sets to provide the state-of-the-art observational constraints on this diagram. By performing a joint statistical analysis of both probes, we test the standard Λ cold dark matter (ΛCDM) model, confirming a mild tension between cosmic microwave background predictions from the Planck mission and cosmic growth measurements at low redshift (z < 2). Then we test alternative models allowing the variation of one single cosmological parameter at a time. In particular, we find a larger growth index than the one predicted by general relativity, γ = 0.65^{+0.05}_{-0.04}. However, a standard model with a total neutrino mass of 0.26 ± 0.10 eV provides a similarly accurate description of the current data. By simulating an additional data set consistent with next-generation dark-energy mission forecasts, we show that growth rate constraints at z > 1 will be crucial to discriminate between alternative models.

  14. A less field-intensive robust design for estimating demographic parameters with Mark-resight data

    USGS Publications Warehouse

    McClintock, B.T.; White, Gary C.

    2009-01-01

    The robust design has become popular among animal ecologists as a means for estimating population abundance and related demographic parameters with mark-recapture data. However, two drawbacks of traditional mark-recapture are financial cost and repeated disturbance to animals. Mark-resight methodology may in many circumstances be a less expensive and less invasive alternative to mark-recapture, but the models developed to date for these data have overwhelmingly concentrated only on the estimation of abundance. Here we introduce a mark-resight model analogous to that used in mark-recapture for the simultaneous estimation of abundance, apparent survival, and transition probabilities between observable and unobservable states. The model may be implemented using standard statistical computing software, but it has also been incorporated into the freeware package Program MARK. We illustrate the use of our model with mainland New Zealand Robin (Petroica australis) data collected to ascertain whether this methodology may be a reliable alternative for monitoring endangered populations of a closely related species inhabiting the Chatham Islands. We found this method to be a viable alternative to traditional mark-recapture when cost or disturbance to species is of particular concern in long-term population monitoring programs. © 2009 by the Ecological Society of America.

  15. Characterization and classification of oral tissues using excitation and emission matrix: a statistical modeling approach

    NASA Astrophysics Data System (ADS)

    Kanniyappan, Udayakumar; Gnanatheepaminstein, Einstein; Prakasarao, Aruna; Dornadula, Koteeswaran; Singaravelu, Ganesan

    2017-02-01

    Cancer is one of the most common human threats around the world, and diagnosis based on optical spectroscopy, especially fluorescence techniques, has been established as a standard approach among scientists for exploring the biochemical and morphological changes in tissues. In this regard, the present work aims to extract the spectral signatures of the various fluorophores present in oral tissues using parallel factor analysis (PARAFAC). Statistical analysis is then performed to show its diagnostic potential in distinguishing malignant and premalignant from normal oral tissues. Hence, the present study may lead to a possible and/or alternative tool for oral cancer diagnosis.

  16. A statistical model for interpreting computerized dynamic posturography data

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Metter, E. Jeffrey; Paloski, William H.

    2002-01-01

    Computerized dynamic posturography (CDP) is widely used for assessment of altered balance control. CDP trials are quantified using the equilibrium score (ES), which ranges from zero to 100, as a decreasing function of peak sway angle. The problem of how best to model and analyze ESs from a controlled study is considered. The ES often exhibits a skewed distribution in repeated trials, which can lead to incorrect inference when applying standard regression or analysis of variance models. Furthermore, CDP trials are terminated when a patient loses balance. In these situations, the ES is not observable, but is assigned the lowest possible score, zero. As a result, the response variable has a mixed discrete-continuous distribution, further compromising inference obtained by standard statistical methods. Here, we develop alternative methodology for analyzing ESs under a stochastic model extending the ES to a continuous latent random variable that always exists, but is unobserved in the event of a fall. Loss of balance occurs conditionally, with probability depending on the realized latent ES. After fitting the model by a form of quasi-maximum-likelihood, one may perform statistical inference to assess the effects of explanatory variables. An example is provided, using data from the NIH/NIA Baltimore Longitudinal Study on Aging.

  17. pyblocxs: Bayesian Low-Counts X-ray Spectral Analysis in Sherpa

    NASA Astrophysics Data System (ADS)

    Siemiginowska, A.; Kashyap, V.; Refsdal, B.; van Dyk, D.; Connors, A.; Park, T.

    2011-07-01

    Typical X-ray spectra have low counts and should be modeled using the Poisson distribution. However, the χ2 statistic is often applied as an alternative, and the data are assumed to follow the Gaussian distribution. A variety of weights to the statistic or a binning of the data is performed to overcome the low-counts issues. However, such modifications introduce biases and/or a loss of information. Standard modeling packages such as XSPEC and Sherpa provide the Poisson likelihood and allow computation of rudimentary MCMC chains, but so far do not allow for setting up a full Bayesian model. We have implemented a sophisticated Bayesian MCMC-based algorithm to carry out spectral fitting of low-counts sources in the Sherpa environment. The code is a Python extension to Sherpa and allows fitting a predefined Sherpa model to high-energy X-ray spectral data and other generic data. We present the algorithm and discuss several issues related to the implementation, including flexible definition of priors and allowing for variations in the calibration information.

  18. Using simulation modeling to improve patient flow at an outpatient orthopedic clinic.

    PubMed

    Rohleder, Thomas R; Lewkonia, Peter; Bischak, Diane P; Duffy, Paul; Hendijani, Rosa

    2011-06-01

    We report on the use of discrete event simulation modeling to support process improvements at an orthopedic outpatient clinic. The clinic was effective in treating patients, but waiting time and congestion in the clinic created patient dissatisfaction and staff morale issues. The modeling helped to identify improvement alternatives including optimized staffing levels, better patient scheduling, and an emphasis on staff arriving promptly. Quantitative results from the modeling provided motivation to implement the improvements. Statistical analysis of data taken before and after the implementation indicate that waiting time measures were significantly improved and overall patient time in the clinic was reduced.

  19. Cross-validation and Peeling Strategies for Survival Bump Hunting using Recursive Peeling Methods

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    We introduce a framework to build a survival/risk bump hunting model with a censored time-to-event response. Our Survival Bump Hunting (SBH) method is based on a recursive peeling procedure that uses a specific survival peeling criterion derived from non/semi-parametric statistics such as the hazard ratio, the log-rank test or the Nelson-Aalen estimator. To optimize the tuning parameter of the model and validate it, we introduce an objective function based on survival or prediction-error statistics, such as the log-rank test and the concordance error rate. We also describe two alternative cross-validation techniques adapted to the joint task of decision-rule making by recursive peeling and survival estimation. Numerical analyses show the importance of replicated cross-validation and the differences between criteria and techniques in both low- and high-dimensional settings. Although several non-parametric survival models exist, none addresses the problem of directly identifying local extrema. We show how SBH efficiently estimates extreme survival/risk subgroups unlike other models. This provides an insight into the behavior of commonly used models and suggests alternatives to be adopted in practice. Finally, our SBH framework was applied to a clinical dataset. In it, we identified subsets of patients characterized by clinical and demographic covariates with a distinct extreme survival outcome, for which tailored medical interventions could be made. An R package PRIMsrc (Patient Rule Induction Method in Survival, Regression and Classification settings) is available on CRAN (Comprehensive R Archive Network) and GitHub. PMID:27034730

  20. An empiric estimate of the value of life: updating the renal dialysis cost-effectiveness standard.

    PubMed

    Lee, Chris P; Chertow, Glenn M; Zenios, Stefanos A

    2009-01-01

    Proposals to make decisions about coverage of new technology by comparing the technology's incremental cost-effectiveness with the traditional benchmark of dialysis imply that the incremental cost-effectiveness ratio of dialysis is seen as a proxy for the value of a statistical year of life. The frequently used ratio for dialysis has, however, not been updated to reflect more recently available data on dialysis. We developed a computer simulation model for the end-stage renal disease population and compared cost, life expectancy, and quality-adjusted life expectancy of current dialysis practice relative to three less costly alternatives and to no dialysis. We estimated incremental cost-effectiveness ratios for these alternatives relative to the next least costly alternative and to no dialysis and analyzed the population distribution of the ratios. Model parameters and costs were estimated using data from the Medicare population and a large integrated health-care delivery system between 1996 and 2003. The sensitivity of results to model assumptions was tested using 38 scenarios of one-way sensitivity analysis, in which the parameters informing the cost, utility, mortality, morbidity, and other components of the model were perturbed by +/-50%. The incremental cost-effectiveness ratio of current dialysis practice relative to the next least costly alternative is on average $129,090 per quality-adjusted life-year (QALY) ($61,294 per year), but its distribution within the population is wide; the interquartile range is $71,890 per QALY, while the 1st and 99th percentiles are $65,496 and $488,360 per QALY, respectively. Higher incremental cost-effectiveness ratios were associated with older age and more comorbid conditions. Sensitivity to model parameters was comparatively small, with most of the scenarios leading to a change of less than 10% in the ratio. The value of a statistical year of life implied by dialysis practice currently averages $129,090 per QALY ($61,294 per year), but is distributed widely within the dialysis population. The spread suggests that coverage decisions using dialysis as the benchmark may need to incorporate percentile values (which are higher than the average) to be consistent with the Rawlsian principles of justice of preserving the rights and interests of society's most vulnerable patient groups.
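
    For reference, the incremental cost-effectiveness ratio itself is a simple quantity; the sketch below uses made-up numbers, not output from the simulation model described above.

```python
def icer(cost, qaly, cost_alt, qaly_alt):
    """Incremental cost per QALY gained relative to the next least costly alternative."""
    return (cost - cost_alt) / (qaly - qaly_alt)

# Hypothetical lifetime cost (USD) and quality-adjusted life expectancy (QALYs).
current_practice = (350_000, 4.0)
next_least_costly = (220_000, 3.0)
print(f"ICER: ${icer(*current_practice, *next_least_costly):,.0f} per QALY gained")
```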

  1. Data driven propulsion system weight prediction model

    NASA Astrophysics Data System (ADS)

    Gerth, Richard J.

    1994-10-01

    The objective of the research was to develop a method to predict the weight of paper engines, i.e., engines that are in the early stages of development. The impetus for the project was the Single Stage To Orbit (SSTO) project, where engineers need to evaluate alternative engine designs. Since the SSTO is a performance-driven project, the performance models for alternative designs were well understood. The next tradeoff is weight. Since it is known that engine weight varies with thrust levels, a model is required that would allow discrimination between engines that produce the same thrust. Above all, the model had to be rooted in data with assumptions that could be justified based on the data. The general approach was to collect data on as many existing engines as possible and build a statistical model of engine weight as a function of various component performance parameters. This was considered a reasonable level at which to begin the project because the data would be readily available, and it would be at the level of most paper engines, prior to detailed component design.
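
    A minimal sketch of the kind of data-driven weight model described here, assuming a simple power-law relation between engine weight and a single performance parameter (thrust); the engine data below are invented for illustration only.

```python
import numpy as np

# Hypothetical engines: thrust (kN) and dry weight (kg).
thrust = np.array([200.0, 450.0, 900.0, 1800.0, 3500.0])
weight = np.array([350.0, 700.0, 1300.0, 2400.0, 4300.0])

# Fit weight ~ a * thrust^b by least squares in log-log space.
b, log_a = np.polyfit(np.log(thrust), np.log(weight), deg=1)
a = np.exp(log_a)

# Predicted weight of a "paper engine" with 1200 kN of thrust.
print(a * 1200.0 ** b)
```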

  2. Validation of the alternating conditional estimation algorithm for estimation of flexible extensions of Cox's proportional hazards model with nonlinear constraints on the parameters.

    PubMed

    Wynant, Willy; Abrahamowicz, Michal

    2016-11-01

    Standard optimization algorithms for maximizing likelihood may not be applicable to the estimation of those flexible multivariable models that are nonlinear in their parameters. For applications where the model's structure permits separating estimation of mutually exclusive subsets of parameters into distinct steps, we propose the alternating conditional estimation (ACE) algorithm. We validate the algorithm, in simulations, for estimation of two flexible extensions of Cox's proportional hazards model where the standard maximum partial likelihood estimation does not apply, with simultaneous modeling of (1) nonlinear and time-dependent effects of continuous covariates on the hazard, and (2) nonlinear interaction and main effects of the same variable. We also apply the algorithm in real-life analyses to estimate nonlinear and time-dependent effects of prognostic factors for mortality in colon cancer. Analyses of both simulated and real-life data illustrate good statistical properties of the ACE algorithm and its ability to yield new potentially useful insights about the data structure. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
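
    A generic sketch of the alternating idea behind ACE, assuming a toy objective that is awkward to optimize jointly but easy to optimize over each parameter block separately; it is not the flexible Cox extension estimated in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def neg_objective(alpha, beta, data):
    # Toy stand-in objective: y ~ alpha * exp(-beta * x), fitted by least squares.
    x, y = data
    return np.sum((y - alpha[0] * np.exp(-beta[0] * x)) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 100)
y = 2.0 * np.exp(-0.7 * x) + 0.05 * rng.standard_normal(x.size)

alpha, beta = np.array([1.0]), np.array([1.0])
for _ in range(20):
    # Step 1: update the first block conditional on the second.
    alpha = minimize(lambda a: neg_objective(a, beta, (x, y)), alpha).x
    # Step 2: update the second block conditional on the first.
    beta = minimize(lambda b: neg_objective(alpha, b, (x, y)), beta).x
print(alpha, beta)   # approaches [2.0] and [0.7]
```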

  3. Modeling Rabbit Responses to Single and Multiple Aerosol ...

    EPA Pesticide Factsheets

    Survival models are developed here to predict response and time-to-response for mortality in rabbits following exposures to single or multiple aerosol doses of Bacillus anthracis spores. Hazard function models were developed for a multiple dose dataset to predict the probability of death through specifying dose-response functions and the time between exposure and death (time-to-death, TTD). Among the models developed, the best-fitting survival model (baseline model) has an exponential dose-response model with a Weibull TTD distribution. Alternative models assessed employ different underlying dose-response functions and use the assumption that, in a multiple dose scenario, earlier doses affect the hazard functions of each subsequent dose. In addition, published mechanistic models are analyzed and compared with models developed in this paper. None of the alternative models that were assessed provided a statistically significant improvement in fit over the baseline model. The general approach utilizes simple empirical data analysis to develop parsimonious models with limited reliance on mechanistic assumptions. The baseline model predicts TTDs consistent with reported results from three independent high-dose rabbit datasets. More accurate survival models depend upon future development of dose-response datasets specifically designed to assess potential multiple dose effects on response and time-to-response. The process used in this paper to dev

  4. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable estimation of copy number with a prediction value. Despite recent progress in the statistical analysis of real-time PCR, few publications have integrated these advancements in real-time PCR based transgene copy number determination. Three experimental designs and four data-quality-control-integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two-group t-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of the transgene was compared with that of the internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with the reference gene without a standard curve, but rather directly on the basis of fluorescence data. Two different multiple regression models were proposed to analyze the data based on two different approaches to amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods allow real-time PCR-based transgene copy number estimation to be more reliable and precise. Proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
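
    A hedged sketch of the first design (external calibration curve) with invented Ct values: a linear regression of Ct on log10 copy number, the amplification efficiency implied by the slope, and the copy number estimate for an unknown sample.

```python
import numpy as np

# Serially diluted standards (illustrative values only).
log10_copies = np.array([2.0, 3.0, 4.0, 5.0, 6.0])   # known template amounts
ct = np.array([30.1, 26.8, 23.4, 20.0, 16.7])        # measured Ct values

slope, intercept = np.polyfit(log10_copies, ct, deg=1)

# Efficiency implied by the standard curve (a slope of -3.32 corresponds to 100%).
efficiency = 10.0 ** (-1.0 / slope) - 1.0

# Copy number of an unknown sample back-calculated from its Ct.
ct_unknown = 22.1
print(10.0 ** ((ct_unknown - intercept) / slope), efficiency)
```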

  5. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts.

    PubMed

    Preisser, John S; Long, D Leann; Stamm, John W

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. © 2017 S. Karger AG, Basel.
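
    For readers who want to try a conventional (non-marginalized) zero-inflated negative binomial fit on similar zero-heavy counts, a hedged sketch using statsmodels is shown below. The data are synthetic, and the marginalized variants discussed in the article are not part of statsmodels, so this only illustrates the standard ZINB model.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

# Synthetic caries-like counts with excess zeros for two groups (0 = control, 1 = treated).
rng = np.random.default_rng(1)
group = rng.integers(0, 2, size=400)
counts = np.where(rng.random(400) < 0.5, 0, rng.poisson(3 - group))

X = sm.add_constant(group)   # covariates for both the count and the inflation components
model = ZeroInflatedNegativeBinomialP(counts, X, exog_infl=X, p=2)
result = model.fit(method="bfgs", maxiter=500, disp=False)
print(result.summary())
```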

  6. Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts

    PubMed Central

    Preisser, John S.; Long, D. Leann; Stamm, John W.

    2017-01-01

    Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two datasets, one consisting of fictional dmft counts in two groups and the other on DMFS among schoolchildren from a randomized clinical trial (RCT) comparing three toothpaste formulations to prevent incident dental caries, are analysed with negative binomial hurdle (NBH), zero-inflated negative binomial (ZINB), and marginalized zero-inflated negative binomial (MZINB) models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the RCT were similar despite their distinctive interpretations. Choice of statistical model class should match the study’s purpose, while accounting for the broad decline in children’s caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. PMID:28291962

  7. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed Central

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-01-01

    Background Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Results Expression levels of sex-chromosome genes were used as true internal biological controls to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. Conclusion In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analyses of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects. PMID:12962547
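
    A small sketch of the multiple-testing step mentioned above, using Benjamini-Hochberg false discovery rate control as the alternative to Bonferroni; the p-values are synthetic stand-ins for per-probeset sex-difference tests.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(12)
pvals = np.concatenate([rng.uniform(0.0, 1.0, 990),     # null probesets
                        rng.uniform(0.0, 1e-4, 10)])    # truly sex-linked probesets

reject_bonf = pvals < 0.05 / pvals.size                                # Bonferroni
reject_fdr, pvals_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(reject_bonf.sum(), reject_fdr.sum())   # FDR control retains at least as many discoveries
```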

  8. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-09-08

    Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Expression levels of sex-chromosome genes were used as true internal biological controls to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analyses of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects.

  9. Visual preference and ecological assessments for designed alternative brownfield rehabilitations.

    PubMed

    Lafortezza, Raffaele; Corry, Robert C; Sanesi, Giovanni; Brown, Robert D

    2008-11-01

    This paper describes an integrative method for quantifying, analyzing, and comparing the effects of alternative rehabilitation approaches with visual preference. The method was applied to a portion of a major industrial area located in southern Italy. Four alternative approaches to rehabilitation (alternative designs) were developed and analyzed. The scenarios consisted of the cleanup of the brownfields plus: (1) the addition of ground cover species; (2) the addition of ground cover species and a few trees randomly distributed; (3) the addition of ground cover species and a few trees in small groups; and (4) the addition of ground cover species and several trees in large groups. The approaches were analyzed and compared to the baseline condition through the use of cost-surface modeling (CSM) and visual preference assessment (VPA). Statistical results showed that alternatives that were more ecologically functional for forest bird species dispersal were also more visually preferable. Some differences were identified based on user groups and location of residence. The results of the study are used to identify implications for enhancing both ecological attributes and visual preferences of rehabilitating landscapes through planning and design.

  10. Effectiveness of an Alternative Dental Workforce Model on the Oral Health of Low-Income Children in a School-Based Setting

    PubMed Central

    Walker, Mary; Gadbury-Amyot, Cynthia; Liu, Ying; Kelly, Patricia; Branson, Bonnie

    2015-01-01

    Objectives. We evaluated the effect of an alternative dental workforce program—Kansas’s Extended Care Permit (ECP) program—as a function of changes in oral health. Methods. We examined data from the 2008 to 2012 electronic medical records of children (n = 295) in a Midwestern US suburb who participated in a school-based oral health program in which preventive oral health care was delivered by ECP dental hygienists. We examined changes in oral health status as a function of sealants, caries, restorations, and treatment urgency with descriptive statistics, multivariate analysis of variance, Kruskal–Wallis test, and Pearson correlations. Results. The number of encounters with the ECP dental hygienist had a statistically significant effect on changes in decay (P = .014), restorations (P = .002), and treatment urgency (P = .038). Based on Pearson correlations, as encounters increased, there was a significant decrease in decay (–0.12), increase in restorations (0.21), and decrease in treatment urgency (–0.15). Conclusions. Increasing numbers of encounters with alternative providers (ECP dental hygienists), such as with school-based oral health programs, can improve the oral health status of low-income children who would not otherwise have received oral health services. PMID:26180957

  11. Statistical analysis of particle trajectories in living cells

    NASA Astrophysics Data System (ADS)

    Briane, Vincent; Kervrann, Charles; Vimond, Myriam

    2018-06-01

    Recent advances in molecular biology and fluorescence microscopy imaging have made possible the inference of the dynamics of molecules in living cells. Such inference allows us to understand and determine the organization and function of the cell. The trajectories of particles (e.g., biomolecules) in living cells, computed with the help of object tracking methods, can be modeled with diffusion processes. Three types of diffusion are considered: (i) free diffusion, (ii) subdiffusion, and (iii) superdiffusion. The mean-square displacement (MSD) is generally used to discriminate the three types of particle dynamics. We propose here a nonparametric three-decision test as an alternative to the MSD method. The rejection of the null hypothesis, i.e., free diffusion, is accompanied by claims of the direction of the alternative (subdiffusion or superdiffusion). We study the asymptotic behavior of the test statistic under the null hypothesis and under parametric alternatives which are currently considered in the biophysics literature. In addition, we adapt the multiple-testing procedure of Benjamini and Hochberg to fit with the three-decision-test setting, in order to apply the test procedure to a collection of independent trajectories. The performance of our procedure is much better than the MSD method as confirmed by Monte Carlo experiments. The method is demonstrated on real data sets corresponding to protein dynamics observed in fluorescence microscopy.
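
    A minimal sketch of the MSD baseline that the proposed test is compared against: estimate the anomalous exponent alpha from MSD(t) ~ t^alpha on a trajectory and classify it as subdiffusive (alpha < 1), free (alpha near 1), or superdiffusive (alpha > 1). The trajectory below is synthetic Brownian motion.

```python
import numpy as np

def msd(track, max_lag):
    """Time-averaged mean-square displacement of a (T, d) trajectory."""
    return np.array([np.mean(np.sum((track[lag:] - track[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

rng = np.random.default_rng(2)
track = np.cumsum(rng.standard_normal((500, 2)), axis=0)   # simulated free (Brownian) motion

lags = np.arange(1, 21)
alpha, _ = np.polyfit(np.log(lags), np.log(msd(track, 20)), deg=1)
print(alpha)   # close to 1 for free diffusion
```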

  12. A practical approach for the scale-up of roller compaction process.

    PubMed

    Shi, Weixian; Sprockel, Omar L

    2016-09-01

    An alternative approach for the scale-up of ribbon formation during roller compaction was investigated, which required only one batch at the commercial scale to set the operational conditions. The scale-up of ribbon formation was based on a probability method. It was sufficient in describing the mechanism of ribbon formation at both scales. In this method, a statistical relationship between roller compaction parameters and ribbon attributes (thickness and density) was first defined with DoE using a pilot Alexanderwerk WP120 roller compactor. While the milling speed was included in the design, it has no practical effect on granule properties within the study range despite its statistical significance. The statistical relationship was then adapted to a commercial Alexanderwerk WP200 roller compactor with one experimental run. The experimental run served as a calibration of the statistical model parameters. The proposed transfer method was then confirmed by conducting a mapping study on the Alexanderwerk WP200 using a factorial DoE, which showed a match between the predictions and the verification experiments. The study demonstrates the applicability of the roller compaction transfer method using the statistical model from the development scale calibrated with one experiment point at the commercial scale. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Genomic Prediction Accounting for Genotype by Environment Interaction Offers an Effective Framework for Breeding Simultaneously for Adaptation to an Abiotic Stress and Performance Under Normal Cropping Conditions in Rice.

    PubMed

    Ben Hassen, Manel; Bartholomé, Jérôme; Valè, Giampiero; Cao, Tuong-Vi; Ahmadi, Nourollah

    2018-05-09

    Developing rice varieties adapted to alternate wetting and drying water management is crucial for the sustainability of irrigated rice cropping systems. Here we report the first study exploring the feasibility of breeding rice for adaptation to alternate wetting and drying using genomic prediction methods that account for genotype by environment interactions. Two breeding populations (a reference panel of 284 accessions and a progeny population of 97 advanced lines) were evaluated under alternate wetting and drying and continuous flooding management systems. The predictive ability of genomic prediction for response variables (index of relative performance and the slope of the joint regression) and for multi-environment genomic prediction models were compared. For the three traits considered (days to flowering, panicle weight and nitrogen-balance index), significant genotype by environment interactions were observed in both populations. In cross-validation, predictive ability for the index was on average lower (0.31) than that of the slope of the joint regression (0.64), whatever the trait considered. Similar results were found for progeny validation. Both cross-validation and progeny validation experiments showed that the performance of multi-environment models predicting unobserved phenotypes of untested entries was similar to the performance of single-environment models, with differences in predictive ability ranging from -6% to 4% depending on the trait and on the statistical model concerned. The predictive ability of multi-environment models predicting unobserved phenotypes of entries evaluated under both water management systems outperformed single-environment models by an average of 30%. Practical implications for breeding rice for adaptation to the alternate wetting and drying system are discussed. Copyright © 2018, G3: Genes, Genomes, Genetics.

  14. A sup-score test for the cure fraction in mixture models for long-term survivors.

    PubMed

    Hsu, Wei-Wen; Todem, David; Kim, KyungMann

    2016-12-01

    The evaluation of cure fractions in oncology research under the well known cure rate model has attracted considerable attention in the literature, but most of the existing testing procedures have relied on restrictive assumptions. A common assumption has been to restrict the cure fraction to a constant under alternatives to homogeneity, thereby neglecting any information from covariates. This article extends the literature by developing a score-based statistic that incorporates covariate information to detect cure fractions, with the existing testing procedure serving as a special case. A complication of this extension, however, is that the implied hypotheses are not typical and standard regularity conditions to conduct the test may not even hold. Using empirical processes arguments, we construct a sup-score test statistic for cure fractions and establish its limiting null distribution as a functional of mixtures of chi-square processes. In practice, we suggest a simple resampling procedure to approximate this limiting distribution. Our simulation results show that the proposed test can greatly improve efficiency over tests that neglect the heterogeneity of the cure fraction under the alternative. The practical utility of the methodology is illustrated using ovarian cancer survival data with long-term follow-up from the surveillance, epidemiology, and end results registry. © 2016, The International Biometric Society.

  15. Spatio-temporal Eigenvector Filtering: Application on Bioenergy Crop Impacts

    NASA Astrophysics Data System (ADS)

    Wang, M.; Kamarianakis, Y.; Georgescu, M.

    2017-12-01

    A suite of 10-year ensemble-based simulations was conducted to investigate the hydroclimatic impacts due to large-scale deployment of perennial bioenergy crops across the continental United States. Given the large size of the simulated dataset (about 60 TB), traditional hierarchical spatio-temporal statistical modelling cannot be implemented for the evaluation of physics parameterizations and biofuel impacts. In this work, we propose a filtering algorithm that takes into account the spatio-temporal autocorrelation structure of the data while avoiding spatial confounding. This method is used to quantify the robustness of simulated hydroclimatic impacts associated with bioenergy crops to alternative physics parameterizations and observational datasets. Results are evaluated against those obtained from three alternative Bayesian spatio-temporal specifications.

  16. Pattern-Based Inverse Modeling for Characterization of Subsurface Flow Models with Complex Geologic Heterogeneity

    NASA Astrophysics Data System (ADS)

    Golmohammadi, A.; Jafarpour, B.; M Khaninezhad, M. R.

    2017-12-01

    Calibration of heterogeneous subsurface flow models leads to ill-posed nonlinear inverse problems, where too many unknown parameters are estimated from limited response measurements. When the underlying parameters form complex (non-Gaussian) structured spatial connectivity patterns, classical variogram-based geostatistical techniques cannot describe the underlying connectivity patterns. Modern pattern-based geostatistical methods that incorporate higher-order spatial statistics are more suitable for describing such complex spatial patterns. Moreover, when the underlying unknown parameters are discrete (geologic facies distribution), conventional model calibration techniques that are designed for continuous parameters cannot be applied directly. In this paper, we introduce a novel pattern-based model calibration method to reconstruct discrete and spatially complex facies distributions from dynamic flow response data. To reproduce complex connectivity patterns during model calibration, we impose a feasibility constraint to ensure that the solution follows the expected higher-order spatial statistics. For model calibration, we adopt a regularized least-squares formulation, involving data mismatch, pattern connectivity, and feasibility constraint terms. Using an alternating directions optimization algorithm, the regularized objective function is divided into a continuous model calibration problem, followed by mapping the solution onto the feasible set. The feasibility constraint to honor the expected spatial statistics is implemented using a supervised machine learning algorithm. The two steps of the model calibration formulation are repeated until the convergence criterion is met. Several numerical examples are used to evaluate the performance of the developed method.

  17. Fast mean and variance computation of the diffuse sound transmission through finite-sized thick and layered wall and floor systems

    NASA Astrophysics Data System (ADS)

    Decraene, Carolina; Dijckmans, Arne; Reynders, Edwin P. B.

    2018-05-01

    A method is developed for computing the mean and variance of the diffuse field sound transmission loss of finite-sized layered wall and floor systems that consist of solid, fluid and/or poroelastic layers. This is achieved by coupling a transfer matrix model of the wall or floor to statistical energy analysis subsystem models of the adjacent room volumes. The modal behavior of the wall is approximately accounted for by projecting the wall displacement onto a set of sinusoidal lateral basis functions. This hybrid modal transfer matrix-statistical energy analysis method is validated on multiple wall systems: a thin steel plate, a polymethyl methacrylate panel, a thick brick wall, a sandwich panel, a double-leaf wall with poro-elastic material in the cavity, and a double glazing. The predictions are compared with experimental data and with results obtained using alternative prediction methods such as the transfer matrix method with spatial windowing, the hybrid wave based-transfer matrix method, and the hybrid finite element-statistical energy analysis method. These comparisons confirm the prediction accuracy of the proposed method and the computational efficiency against the conventional hybrid finite element-statistical energy analysis method.

  18. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach

    PubMed Central

    Kneifel, Joshua; Webb, David

    2016-01-01

    Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF. PMID:27956756
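
    A hedged sketch of a weather-driven daily-energy regression of the kind described here; the two predictors stand in for daily-aggregated weather variables (for example outdoor temperature and solar irradiance), and the data are synthetic rather than NZERTF measurements.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
temperature = rng.uniform(-5.0, 30.0, 365)     # daily mean outdoor temperature (C)
irradiance = rng.uniform(1.0, 8.0, 365)        # daily solar irradiance (kWh/m^2)
energy = (25.0 + 0.9 * np.abs(temperature - 18.0) - 1.5 * irradiance
          + rng.normal(0.0, 2.0, 365))         # daily net consumption (kWh)

X = sm.add_constant(np.column_stack([np.abs(temperature - 18.0), irradiance]))
fit = sm.OLS(energy, X).fit()
print(fit.params)                              # intercept and the two weather coefficients

x_new = np.array([[1.0, abs(22.0 - 18.0), 5.5]])   # [const, |T - 18|, irradiance] for one new day
print(fit.predict(x_new))
```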

  19. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach.

    PubMed

    Kneifel, Joshua; Webb, David

    2016-09-01

    Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF.

  20. Water resources management: Hydrologic characterization through hydrograph simulation may bias streamflow statistics

    NASA Astrophysics Data System (ADS)

    Farmer, W. H.; Kiang, J. E.

    2017-12-01

    The development, deployment and maintenance of water resources management infrastructure and practices rely on hydrologic characterization, which requires an understanding of local hydrology. With regards to streamflow, this understanding is typically quantified with statistics derived from long-term streamgage records. However, a fundamental problem is how to characterize local hydrology without the luxury of streamgage records, a problem that complicates water resources management at ungaged locations and for long-term future projections. This problem has typically been addressed through the development of point estimators, such as regression equations, to estimate particular statistics. Physically-based precipitation-runoff models, which are capable of producing simulated hydrographs, offer an alternative to point estimators. The advantage of simulated hydrographs is that they can be used to compute any number of streamflow statistics from a single source (the simulated hydrograph) rather than relying on a diverse set of point estimators. However, the use of simulated hydrographs introduces a degree of model uncertainty that is propagated through to estimated streamflow statistics and may have drastic effects on management decisions. We compare the accuracy and precision of streamflow statistics (e.g. the mean annual streamflow, the annual maximum streamflow exceeded in 10% of years, and the minimum seven-day average streamflow exceeded in 90% of years, among others) derived from point estimators (e.g. regressions, kriging, machine learning) to that of statistics derived from simulated hydrographs across the continental United States. Initial results suggest that the error introduced through hydrograph simulation may substantially bias the resulting hydrologic characterization.
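
    A small sketch of the appeal of hydrograph-based characterization noted above: several streamflow statistics derived from a single daily series. The series here is synthetic; with an observed or simulated hydrograph in its place, the same few lines yield the statistics discussed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
dates = pd.date_range("1990-01-01", "2009-12-31", freq="D")
flow = pd.Series(np.exp(rng.normal(2.0, 0.8, dates.size)), index=dates)   # synthetic daily flow

annual_mean = flow.groupby(flow.index.year).mean()
annual_max = flow.groupby(flow.index.year).max()
seven_day_min = flow.rolling(7).mean().groupby(flow.index.year).min()

print(annual_mean.mean())              # mean annual streamflow
print(annual_max.quantile(0.90))       # annual maximum exceeded in 10% of years
print(seven_day_min.quantile(0.10))    # 7-day minimum average exceeded in 90% of years
```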

  1. Modeling panel detection frequencies by queuing system theory: an application in gas chromatography olfactometry.

    PubMed

    Bult, Johannes H F; van Putten, Bram; Schifferstein, Hendrik N J; Roozen, Jacques P; Voragen, Alphons G J; Kroeze, Jan H A

    2004-10-01

    In continuous vigilance tasks, the number of coincident panel responses to stimuli provides an index of stimulus detectability. To determine whether this number is due to chance, panel noise levels have been approximated by the maximum coincidence level obtained in stimulus-free conditions. This study proposes an alternative method by which to assess noise levels, derived from queuing system theory (QST). Instead of critical coincidence levels, QST modeling estimates the duration of coinciding responses in the absence of stimuli. The proposed method has the advantage over previous approaches that it yields more reliable noise estimates and allows for statistical testing. The method was applied in an olfactory detection experiment using 16 panelists in stimulus-present and stimulus-free conditions. We propose that QST may be used as an alternative to signal detection theory for analyzing data from continuous vigilance tasks.

  2. Nine time steps: ultra-fast statistical consistency testing of the Community Earth System Model (pyCECT v3.0)

    NASA Astrophysics Data System (ADS)

    Milroy, Daniel J.; Baker, Allison H.; Hammerling, Dorit M.; Jessup, Elizabeth R.

    2018-02-01

    The Community Earth System Model Ensemble Consistency Test (CESM-ECT) suite was developed as an alternative to requiring bitwise identical output for quality assurance. This objective test provides a statistical measurement of consistency between an accepted ensemble created by small initial temperature perturbations and a test set of CESM simulations. In this work, we extend the CESM-ECT suite with an inexpensive and robust test for ensemble consistency that is applied to Community Atmospheric Model (CAM) output after only nine model time steps. We demonstrate that adequate ensemble variability is achieved with instantaneous variable values at the ninth step, despite rapid perturbation growth and heterogeneous variable spread. We refer to this new test as the Ultra-Fast CAM Ensemble Consistency Test (UF-CAM-ECT) and demonstrate its effectiveness in practice, including its ability to detect small-scale events and its applicability to the Community Land Model (CLM). The new ultra-fast test facilitates CESM development, porting, and optimization efforts, particularly when used to complement information from the original CESM-ECT suite of tools.

  3. Fast Identification of Biological Pathways Associated with a Quantitative Trait Using Group Lasso with Overlaps

    PubMed Central

    Silver, Matt; Montana, Giovanni

    2012-01-01

    Where causal SNPs (single nucleotide polymorphisms) tend to accumulate within biological pathways, the incorporation of prior pathways information into a statistical model is expected to increase the power to detect true associations in a genetic association study. Most existing pathways-based methods rely on marginal SNP statistics and do not fully exploit the dependence patterns among SNPs within pathways. We use a sparse regression model, with SNPs grouped into pathways, to identify causal pathways associated with a quantitative trait. Notable features of our “pathways group lasso with adaptive weights” (P-GLAW) algorithm include the incorporation of all pathways in a single regression model, an adaptive pathway weighting procedure that accounts for factors biasing pathway selection, and the use of a bootstrap sampling procedure for the ranking of important pathways. P-GLAW takes account of the presence of overlapping pathways and uses a novel combination of techniques to optimise model estimation, making it fast to run, even on whole genome datasets. In a comparison study with an alternative pathways method based on univariate SNP statistics, our method demonstrates high sensitivity and specificity for the detection of important pathways, showing the greatest relative gains in performance where marginal SNP effect sizes are small. PMID:22499682
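
    A generic proximal-gradient sketch of the group lasso idea (predictors grouped into "pathways"); it omits the adaptive pathway weights, overlap handling, and bootstrap ranking of P-GLAW, and the data and group structure are synthetic.

```python
import numpy as np

def group_lasso(X, y, groups, lam=1.0, n_iter=500):
    n, p = X.shape
    w = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2                 # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        w = w - step * X.T @ (X @ w - y)                   # gradient step on the squared loss
        for g in groups:                                   # proximal step: block soft-thresholding
            norm_g = np.linalg.norm(w[g])
            if norm_g > 0.0:
                w[g] = max(0.0, 1.0 - step * lam / norm_g) * w[g]
    return w

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 30))
true_w = np.zeros(30)
true_w[:5] = 1.0                                           # only the first "pathway" is causal
y = X @ true_w + 0.1 * rng.standard_normal(200)

groups = [np.arange(0, 5), np.arange(5, 15), np.arange(15, 30)]
print(np.round(group_lasso(X, y, groups, lam=5.0), 2))     # nonzero block = selected pathway
```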

  4. Accounting for the Impact of Management Scenarios on Typha Domingensis (Cattail) in an Everglades Wetland

    NASA Astrophysics Data System (ADS)

    Lagerwall, Gareth; Kiker, Gregory; Muñoz-Carpena, Rafael; Wang, Naiming

    2017-01-01

    The coupled regional simulation model, and the transport and reaction simulation engine were recently adapted to simulate ecology, specifically Typha domingensis (Cattail) dynamics in the Everglades. While Cattail is a native Everglades species, it has become invasive over the years due to an altered habitat over the last few decades, taking over historically Cladium jamaicense (Sawgrass) areas. Two models of different levels of algorithmic complexity were developed in previous studies, and are used here to determine the impact of various management decisions on the average Cattail density within Water Conservation Area 2A in the Everglades. A Global Uncertainty and Sensitivity Analysis was conducted to test the importance of these management scenarios, as well as the effectiveness of using zonal statistics. Management scenarios included high, medium and low initial water depths, soil phosphorus concentrations, initial Cattail and Sawgrass densities, as well as annually alternating water depths and soil phosphorus concentrations, and a steadily decreasing soil phosphorus concentration. Analysis suggests that zonal statistics are good indicators of regional trends, and that high soil phosphorus concentration is a pre-requisite for expansive Cattail growth. It is a complex task to manage Cattail expansion in this region, requiring the close management and monitoring of water depth and soil phosphorus concentration, and possibly other factors not considered in the model complexities. However, this modeling framework with user-definable complexities and management scenarios, can be considered a useful tool in analyzing many more alternatives, which could be used to aid management decisions in the future.

  5. Accounting for the Impact of Management Scenarios on Typha Domingensis (Cattail) in an Everglades Wetland.

    PubMed

    Lagerwall, Gareth; Kiker, Gregory; Muñoz-Carpena, Rafael; Wang, Naiming

    2017-01-01

    The coupled regional simulation model, and the transport and reaction simulation engine were recently adapted to simulate ecology, specifically Typha domingensis (Cattail) dynamics in the Everglades. While Cattail is a native Everglades species, it has become invasive over the years due to an altered habitat over the last few decades, taking over historically Cladium jamaicense (Sawgrass) areas. Two models of different levels of algorithmic complexity were developed in previous studies, and are used here to determine the impact of various management decisions on the average Cattail density within Water Conservation Area 2A in the Everglades. A Global Uncertainty and Sensitivity Analysis was conducted to test the importance of these management scenarios, as well as the effectiveness of using zonal statistics. Management scenarios included high, medium and low initial water depths, soil phosphorus concentrations, initial Cattail and Sawgrass densities, as well as annually alternating water depths and soil phosphorus concentrations, and a steadily decreasing soil phosphorus concentration. Analysis suggests that zonal statistics are good indicators of regional trends, and that high soil phosphorus concentration is a pre-requisite for expansive Cattail growth. It is a complex task to manage Cattail expansion in this region, requiring the close management and monitoring of water depth and soil phosphorus concentration, and possibly other factors not considered in the model complexities. However, this modeling framework with user-definable complexities and management scenarios, can be considered a useful tool in analyzing many more alternatives, which could be used to aid management decisions in the future.

  6. Critical dimensions of trans-sacral corridors assessed by 3D CT models: Relevance for implant positioning in fractures of the sacrum.

    PubMed

    Wagner, Daniel; Kamer, Lukas; Sawaguchi, Takeshi; Geoff Richards, R; Noser, Hansrudi; Uesugi, Masafumi; Ossendorf, Christian; Rommens, Pol M

    2017-11-01

    Trans-sacral implants can be used alternatively to sacro-iliac screws in the treatment of osteoporosis-associated fragility fractures of the pelvis and the sacrum. We investigated trans-sacral corridor dimensions, the number of individuals amenable to trans-sacral fixation, as well as the osseous boundaries and shape of the S1 corridor. 3D models were reconstructed from pelvic CT scans from 92 Europeans and 64 Japanese. A corridor of <12 mm was considered critical for trans-sacral implant positioning, and <8 mm as impossible. A statistical model of the trans-sacral corridor S1 was computed. The limiting cranio-caudal diameter was 11.6 mm (±5.4) for S1 and 14 mm (±2.4) for S2. Trans-sacral implant positioning was critical in 52% of cases for S1, and in 21% for S2. The S1 corridor was impossible in 26%, with no impossible corridor in S2. Antero-superiorly, the S1 corridor was limited not only by the sacrum but in 40% by the iliac fossa. The statistical model demonstrated a consistent oval shape of the cross-section of corridor S1. Considering the variability in size and shape of trans-sacral corridors in S1, thorough anatomical knowledge and preoperative planning are mandatory when using trans-sacral implants. In critical cases, S2 is a veritable alternative. © 2017 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 35:2577-2584, 2017.

  7. Predicting survival of Escherichia coli O157:H7 in dry fermented sausage using artificial neural networks.

    PubMed

    Palanichamy, A; Jayas, D S; Holley, R A

    2008-01-01

    The Canadian Food Inspection Agency required the meat industry to ensure that Escherichia coli O157:H7 does not survive (i.e., experiences a ≥5 log CFU/g reduction) in dry fermented sausage (salami) during processing, after a series of foodborne illness outbreaks resulting from this pathogenic bacterium occurred. The industry needs an effective technique like predictive modeling for estimating bacterial viability, because traditional microbiological enumeration is a time-consuming and laborious method. The accuracy and speed of artificial neural networks (ANNs) make them an attractive alternative (developed from predictive microbiology), especially for on-line processing in industry. Data from a study of the interactive effects of different levels of pH, water activity, and the concentration of allyl isothiocyanate at various times during sausage manufacture in reducing numbers of E. coli O157:H7 were collected. The data were used to develop predictive models using a general regression neural network (GRNN), a form of ANN, and a statistical linear polynomial regression technique. Both models were compared for their predictive error using various statistical indices. GRNN predictions for the training and test data sets had less serious errors than the statistical model predictions. GRNN models were better for the training set and slightly better for the test set than the statistical model. Also, the GRNN accurately predicted the level of allyl isothiocyanate required to ensure a 5-log reduction when an appropriate production set was created by interpolation. Because they are simple to generate, fast, and accurate, ANN models may be of value for industrial use in dry fermented sausage manufacture to reduce the hazard associated with E. coli O157:H7 in fresh beef and permit production of consistently safe products from this raw material.
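
    A minimal sketch of the core of a general regression neural network: the prediction is a Gaussian-kernel-weighted average of the training targets (Nadaraya-Watson regression). The covariates here (pH, water activity, allyl isothiocyanate, time) and the responses are invented, not the study's dataset.

```python
import numpy as np

def grnn_predict(X_train, y_train, x_new, sigma=0.5):
    d2 = np.sum((X_train - x_new) ** 2, axis=1)    # squared distances to training patterns
    w = np.exp(-d2 / (2.0 * sigma ** 2))           # Gaussian pattern-layer activations
    return np.sum(w * y_train) / np.sum(w)         # weighted average of observed log reductions

rng = np.random.default_rng(6)
X_train = rng.uniform([4.6, 0.86, 0.0, 0.0], [5.8, 0.96, 50.0, 30.0], size=(60, 4))
y_train = 0.1 * X_train[:, 2] + 0.05 * X_train[:, 3] + rng.normal(0.0, 0.2, 60)  # toy log reduction

mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
x_new = np.array([5.0, 0.90, 25.0, 20.0])
print(grnn_predict((X_train - mu) / sd, y_train, (x_new - mu) / sd))
```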

  8. ΛCDM model with dissipative nonextensive viscous dark matter

    NASA Astrophysics Data System (ADS)

    Gimenes, H. S.; Viswanathan, G. M.; Silva, R.

    2018-03-01

    Many models in cosmology typically assume the standard bulk viscosity. We study an alternative interpretation for the origin of the bulk viscosity. Using nonadditive statistics proposed by Tsallis, we propose a bulk viscosity component that can only exist by a nonextensive effect through the nonextensive/dissipative correspondence (NexDC). In this paper, we consider a ΛCDM model for a flat universe with a dissipative nonextensive viscous dark matter component, following the Eckart theory of bulk viscosity, without any perturbative approach. In order to analyze cosmological constraints, we use one of the most recent observations of Type Ia Supernova, baryon acoustic oscillations and cosmic microwave background data.

  9. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions.

    PubMed

    Ernst, Anja F; Albers, Casper J

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking.
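
    A short sketch of the central point, with synthetic data: the normality assumption concerns the regression errors, so diagnostics should be run on the residuals rather than on the raw variables.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
x = rng.exponential(1.0, 200)                    # heavily skewed predictor (not a violation)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 200)    # the errors themselves are normal

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(stats.shapiro(x).pvalue)            # "non-normal variable": irrelevant to the assumption
print(stats.shapiro(fit.resid).pvalue)    # residual normality: the assumption that matters
```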

  10. Statistical Interior Tomography

    PubMed Central

    Xu, Qiong; Wang, Ge; Sieren, Jered; Hoffman, Eric A.

    2011-01-01

    This paper presents a statistical interior tomography (SIT) approach making use of compressed sensing (CS) theory. With the projection data modeled by the Poisson distribution, an objective function with a total variation (TV) regularization term is formulated in a maximum a posteriori (MAP) framework to solve the interior problem. An alternating minimization method is used to optimize the objective function with an initial image from the direct inversion of the truncated Hilbert transform. The proposed SIT approach is extensively evaluated with both numerical and real datasets. The results demonstrate that SIT is robust with respect to data noise and down-sampling, and has better resolution and less bias than its deterministic counterpart in the case of low count data. PMID:21233044

  11. Regression assumptions in clinical psychology research practice—a systematic review of common misconceptions

    PubMed Central

    Ernst, Anja F.

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking. PMID:28533971

  12. The Statistical Segment Length of DNA: Opportunities for Biomechanical Modeling in Polymer Physics and Next-Generation Genomics.

    PubMed

    Dorfman, Kevin D

    2018-02-01

    The development of bright bisintercalating dyes for deoxyribonucleic acid (DNA) in the 1990s, most notably YOYO-1, revolutionized the field of polymer physics in the ensuing years. These dyes, in conjunction with modern molecular biology techniques, permit the facile observation of polymer dynamics via fluorescence microscopy and thus direct tests of different theories of polymer dynamics. At the same time, they have played a key role in advancing an emerging next-generation method known as genome mapping in nanochannels. The effect of intercalation on the bending energy of DNA as embodied by a change in its statistical segment length (or, alternatively, its persistence length) has been the subject of significant controversy. The precise value of the statistical segment length is critical for the proper interpretation of polymer physics experiments and controls the phenomena underlying the aforementioned genomics technology. In this perspective, we briefly review the model of DNA as a wormlike chain and a trio of methods (light scattering, optical or magnetic tweezers, and atomic force microscopy (AFM)) that have been used to determine the statistical segment length of DNA. We then outline the disagreement in the literature over the role of bisintercalation on the bending energy of DNA, and how a multiscale biomechanical approach could provide an important model for this scientifically and technologically relevant problem.
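
    For reference, the standard wormlike-chain (Kratky-Porod) relations link the two stiffness measures mentioned here, the persistence length l_p and the statistical segment (Kuhn) length b, for a chain of contour length L:

```latex
\langle R^{2} \rangle \;=\; 2\,l_{p}L \;-\; 2\,l_{p}^{2}\left(1 - e^{-L/l_{p}}\right)
\;\xrightarrow{\;L \gg l_{p}\;}\; 2\,l_{p}L \;=\; b\,L,
\qquad b = 2\,l_{p}.
```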

  13. Estimating Required Contingency Funds for Construction Projects using Multiple Linear Regression

    DTIC Science & Technology

    2006-03-01

    Breusch-Pagan test, in which the null hypothesis states that the residuals have constant variance. The alternate hypothesis is that the residuals do not ... variance, the Breusch-Pagan test provides statistical evidence that the assumption is justified. For the proposed model, the p-value is 0.173 ... entire test sample.

  14. Bayesian inference and decision theory - A framework for decision making in natural resource management

    USGS Publications Warehouse

    Dorazio, R.M.; Johnson, F.A.

    2003-01-01

    Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
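
    A toy sketch of the decision-theoretic step described here: average the consequence (utility) of each candidate management action over posterior draws of the unknown state, then choose the action with the highest expected utility. The posterior draws, actions, and utility function below are invented stand-ins for MCMC output and a real management problem.

```python
import numpy as np

rng = np.random.default_rng(8)
posterior_capacity = rng.normal(1000.0, 150.0, 5000)   # e.g., posterior draws of habitat capacity

def utility(action_acres, capacity):
    # Toy utility: benefit saturates at the carrying capacity, cost is linear in effort.
    return np.minimum(action_acres, capacity) - 0.3 * action_acres

actions = [600.0, 900.0, 1200.0]
expected_utility = {a: np.mean(utility(a, posterior_capacity)) for a in actions}
print(max(expected_utility, key=expected_utility.get), expected_utility)
```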

  15. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media

    PubMed Central

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-01-01

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties on its predictability. But a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement of hydrofracking, fast recovery decay and ultra-low permeability—inherent to their nanoporosity—are specificities of these reservoirs, which challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short time decay, consistent with Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water by CO2 or propane eliminates the barriers, therefore raising hopes for clean/efficient recovery. PMID:27327254

  16. Children's Rights, School Exclusion and Alternative Educational Provision

    ERIC Educational Resources Information Center

    McCluskey, Gillean; Riddell, Sheila; Weedon, Elisabet

    2015-01-01

    This paper examines findings from a recent study in Wales of school exclusion and alternative educational provision. Many, but not all, children in alternative provision have been excluded from school. The most recent statistics reveal that nearly 90% of pupils in alternative provision have special educational needs, nearly 70% are entitled to…

  17. A Computationally Efficient Hypothesis Testing Method for Epistasis Analysis using Multifactor Dimensionality Reduction

    PubMed Central

    Pattin, Kristine A.; White, Bill C.; Barney, Nate; Gui, Jiang; Nelson, Heather H.; Kelsey, Karl R.; Andrew, Angeline S.; Karagas, Margaret R.; Moore, Jason H.

    2008-01-01

    Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free data mining method for detecting, characterizing, and interpreting epistasis in the absence of significant main effects in genetic and epidemiologic studies of complex traits such as disease susceptibility. The goal of MDR is to change the representation of the data using a constructive induction algorithm to make nonadditive interactions easier to detect using any classification method such as naïve Bayes or logistic regression. Traditionally, MDR constructed variables have been evaluated with a naïve Bayes classifier that is combined with 10-fold cross validation to obtain an estimate of predictive accuracy or generalizability of epistasis models. Traditionally, we have used permutation testing to statistically evaluate the significance of models obtained through MDR. The advantage of permutation testing is that it controls for false-positives due to multiple testing. The disadvantage is that permutation testing is computationally expensive. This is an important issue that arises in the context of detecting epistasis on a genome-wide scale. The goal of the present study was to develop and evaluate several alternatives to large-scale permutation testing for assessing the statistical significance of MDR models. Using data simulated from 70 different epistasis models, we compared the power and type I error rate of MDR using a 1000-fold permutation test with hypothesis testing using an extreme value distribution (EVD). We find that this new hypothesis testing method provides a reasonable alternative to the computationally expensive 1000-fold permutation test and is 50 times faster. We then demonstrate this new method by applying it to a genetic epidemiology study of bladder cancer susceptibility that was previously analyzed using MDR and assessed using a 1000-fold permutation test. PMID:18671250
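    The sketch below illustrates the general idea of replacing a large permutation test with an extreme-value fit to a modest number of permuted statistics, in the spirit of the study; the observed statistic and null values are synthetic, and the exact EVD construction in the paper may differ.

```python
# Replace a 1000-fold permutation test with a GEV fit to a smaller null sample
# and read the p-value from its upper tail. Synthetic numbers only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
observed_stat = 0.62                       # e.g., cross-validated accuracy (hypothetical)

# Null statistics from a small number of label permutations (stand-in values).
perm_stats = rng.normal(0.5, 0.03, size=100)

# Classic permutation p-value (would normally need ~1000+ permutations).
p_perm = (np.sum(perm_stats >= observed_stat) + 1) / (len(perm_stats) + 1)

# EVD alternative: fit a generalized extreme value distribution to the null
# statistics and evaluate the upper-tail probability of the observed value.
shape, loc, scale = stats.genextreme.fit(perm_stats)
p_evd = stats.genextreme.sf(observed_stat, shape, loc=loc, scale=scale)

print(f"permutation p = {p_perm:.4f}, EVD p = {p_evd:.4f}")
```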

  18. Exponential series approaches for nonparametric graphical models

    NASA Astrophysics Data System (ADS)

    Janofsky, Eric

    Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
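    As a point of reference for the precision-matrix comparison mentioned in the thesis, a minimal graphical-lasso example (the Gaussian special case) is sketched below with scikit-learn; it is not the exponential-series or score-matching estimator itself.

```python
# Sparse precision-matrix estimation with the graphical lasso, the Gaussian
# benchmark mentioned above; a chain-structured true graph is used as test data.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(3)
p = 10
precision = np.eye(p)
for i in range(p - 1):                       # chain-structured true precision matrix
    precision[i, i + 1] = precision[i + 1, i] = 0.4
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(precision), size=500)

model = GraphicalLasso(alpha=0.05).fit(X)
est = model.precision_

# Recovered edge set = nonzero off-diagonal entries of the estimated precision.
edges = np.argwhere(np.triu(np.abs(est) > 1e-3, k=1))
true_edges = int(np.sum(np.triu(precision != 0, k=1)))
print(f"{len(edges)} edges recovered; the true chain graph has {true_edges}")
```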

  19. Transient probabilities for queues with applications to hospital waiting list management.

    PubMed

    Joy, Mark; Jones, Simon

    2005-08-01

    In this paper we study queuing systems within the NHS. Recently imposed government performance targets lead NHS executives to investigate and instigate alternative management strategies, thereby imposing structural changes on the queues. Under such circumstances, it is most unlikely that such systems are in equilibrium. It is crucial, in our opinion, to recognise this state of affairs in order to make a balanced assessment of the role of queue management in the modern NHS. From a mathematical perspective it should be emphasised that measures of the state of a queue based upon the assumption of statistical equilibrium (a pervasive methodology in the study of queues) are simply wrong in the above scenario. To base strategic decisions around such ideas is therefore highly questionable and it is one of the purposes of this paper to offer alternatives: we present some (recent) research whose results generate performance measures and measures of risk, for example, of waiting-times growing unacceptably large; we emphasise that these results concern the transient behaviour of the queueing model; there is no assumption of statistical equilibrium. We also demonstrate that our results are computationally tractable.
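    A generic illustration of transient (non-equilibrium) queue analysis: the sketch integrates the Kolmogorov forward equations of a capacity-truncated M/M/1 queue to obtain time-dependent state probabilities; the rates, horizon and initial queue length are arbitrary assumptions, not the authors' NHS model.

```python
# Transient state probabilities for a capacity-truncated M/M/1 queue, obtained
# by integrating the Kolmogorov forward equations dp/dt = p Q; no equilibrium
# assumption is made. Illustrative parameters only.
import numpy as np
from scipy.integrate import solve_ivp

lam, mu, cap = 0.9, 1.0, 100          # arrival rate, service rate, waiting-list cap

# Generator matrix Q of the birth-death chain on states 0..cap.
Q = np.zeros((cap + 1, cap + 1))
for n in range(cap + 1):
    if n < cap:
        Q[n, n + 1] = lam
    if n > 0:
        Q[n, n - 1] = mu
    Q[n, n] = -Q[n].sum()

p0 = np.zeros(cap + 1)
p0[20] = 1.0                          # start with 20 patients already queued

sol = solve_ivp(lambda t, p: p @ Q, (0, 52), p0, t_eval=[4, 13, 26, 52])
for t, p in zip(sol.t, sol.y.T):
    print(f"t = {t:4.0f} weeks: P(queue > 40) = {p[41:].sum():.3f}")
```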

  20. Mediation Analysis with Survival Outcomes: Accelerated Failure Time vs. Proportional Hazards Models.

    PubMed

    Gelfand, Lois A; MacKinnon, David P; DeRubeis, Robert J; Baraldi, Amanda N

    2016-01-01

    Survival time is an important type of outcome variable in treatment research. Currently, limited guidance is available regarding performing mediation analyses with survival outcomes, which generally do not have normally distributed errors, and contain unobserved (censored) events. We present considerations for choosing an approach, using a comparison of semi-parametric proportional hazards (PH) and fully parametric accelerated failure time (AFT) approaches for illustration. We compare PH and AFT models and procedures in their integration into mediation models and review their ability to produce coefficients that estimate causal effects. Using simulation studies modeling Weibull-distributed survival times, we compare statistical properties of mediation analyses incorporating PH and AFT approaches (employing SAS procedures PHREG and LIFEREG, respectively) under varied data conditions, some including censoring. A simulated data set illustrates the findings. AFT models integrate more easily than PH models into mediation models. Furthermore, mediation analyses incorporating LIFEREG produce coefficients that can estimate causal effects, and demonstrate superior statistical properties. Censoring introduces bias in the coefficient estimate representing the treatment effect on outcome: underestimation in LIFEREG and overestimation in PHREG. With LIFEREG, this bias can be addressed using an alternative estimate obtained from combining other coefficients, whereas this is not possible with PHREG. When Weibull assumptions are not violated, there are compelling advantages to using LIFEREG over PHREG for mediation analyses involving survival-time outcomes. Irrespective of the procedures used, the interpretation of coefficients, effects of censoring on coefficient estimates, and statistical properties should be taken into account when reporting results.
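    A bare-bones analogue of the AFT-based mediation analysis described above: regress the mediator on treatment and the log survival time on treatment and mediator, then take the product of coefficients. The sketch below omits censoring and uses synthetic Weibull-type times, so it only illustrates the mechanics, not the LIFEREG/PHREG comparison itself.

```python
# Product-of-coefficients mediation with an AFT-style outcome model: regress
# log survival time on treatment and mediator (no censoring, synthetic data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 1_000
treat = rng.integers(0, 2, n)                       # randomized treatment
mediator = 0.5 * treat + rng.normal(0, 1, n)        # a: treatment -> mediator
# Weibull-type survival times whose scale depends on the mediator and treatment.
log_T = 1.0 + 0.4 * mediator + 0.2 * treat + 0.6 * np.log(rng.weibull(1.5, n))

a = sm.OLS(mediator, sm.add_constant(treat)).fit().params[1]
outcome_fit = sm.OLS(log_T, sm.add_constant(np.column_stack([treat, mediator]))).fit()
b = outcome_fit.params[2]                            # mediator -> outcome, given treatment

print(f"a = {a:.3f}, b = {b:.3f}, mediated (indirect) effect a*b = {a * b:.3f}")
```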

  1. Seventeen-year follow-up of the prospective randomized Nordic CIS study: BCG monotherapy versus alternating therapy with mitomycin C and BCG in patients with carcinoma in situ of the urinary bladder.

    PubMed

    Kaasinen, Eero; Wijkström, Hans; Rintala, Erkki; Mestad, Oddvar; Jahnson, Staffan; Malmström, Per-Uno

    2016-10-01

    The aim of this study was to compare the long-term efficacy of BCG monotherapy to alternating therapy of mitomycin C (MMC) and BCG in patients with carcinoma in situ (CIS). Between 1992 and 1997, 321 patients with CIS were randomized from Finland, Norway and Sweden in a prospective multicenter trial into two treatment groups. The alternating therapy comprised six weekly instillations of MMC 40 mg followed by 10 instillations of BCG (Connaught 120 mg) or MMC alternating monthly for 1 year. BCG monotherapy followed the same 6 + 10 schedule. Stratification was done by nationality and CIS category. Primary endpoints were time to first recurrence and time to progression. Secondary endpoints were disease-specific mortality and overall survival. The main statistical methods were the proportional subdistribution hazards model and Cox proportional hazards model with the cumulative incidence and Kaplan-Meier analyses. The median follow-up time was 9.9 years (maximum 19.9 years) in the BCG group and 8.9 years (maximum 20.3 years) in the alternating group. The risk of recurrence was significantly lower in the BCG group than in the alternating group (49 vs 59% at 15 years, respectively; hazard ratio 0.74, 95% confidence interval 0.54-1.00, p = 0.048). There were no significant differences in the other endpoints. Patients who progressed after 2 years were particularly prone to dying from bladder carcinoma. Younger patients performed worse than older ones. BCG monotherapy including monthly maintenance was effective and better than the alternating therapy. The risk of dying from bladder carcinoma after progression was high.

  2. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method

    PubMed Central

    Roux, Benoît; Weare, Jonathan

    2013-01-01

    An issue of general interest in computer simulations is to incorporate information from experiments into a structural model. An important caveat in pursuing this goal is to avoid corrupting the resulting model with spurious and arbitrary biases. While the problem of biasing thermodynamic ensembles can be formulated rigorously using the maximum entropy method introduced by Jaynes, the approach can be cumbersome in practical applications with the need to determine multiple unknown coefficients iteratively. A popular alternative strategy to incorporate the information from experiments is to rely on restrained-ensemble molecular dynamics simulations. However, the fundamental validity of this computational strategy remains in question. Here, it is demonstrated that the statistical distribution produced by restrained-ensemble simulations is formally consistent with the maximum entropy method of Jaynes. This clarifies the underlying conditions under which restrained-ensemble simulations will yield results that are consistent with the maximum entropy method. PMID:23464140

  3. Optimizing Integrated Terminal Airspace Operations Under Uncertainty

    NASA Technical Reports Server (NTRS)

    Bosson, Christabelle; Xue, Min; Zelinski, Shannon

    2014-01-01

    In the terminal airspace, integrated departures and arrivals have the potential to increase operations efficiency. Recent research has developed genetic-algorithm-based schedulers for integrated arrival and departure operations under uncertainty. This paper presents an alternate method using a machine job-shop scheduling formulation to model the integrated airspace operations. A multistage stochastic programming approach is chosen to formulate the problem and candidate solutions are obtained by solving sample average approximation problems with finite sample size. Because approximate solutions are computed, the proposed algorithm incorporates the computation of statistical bounds to estimate the optimality of the candidate solutions. A proof-of-concept study is conducted on a baseline implementation of a simple problem considering a fleet mix of 14 aircraft evolving in a model of the Los Angeles terminal airspace. A more thorough statistical analysis is also performed to evaluate the impact of the number of scenarios considered in the sampled problem. To handle extensive sampling computations, a multithreading technique is introduced.

  4. The intermediates take it all: asymptotics of higher criticism statistics and a powerful alternative based on equal local levels.

    PubMed

    Gontscharuk, Veronika; Landwehr, Sandra; Finner, Helmut

    2015-01-01

    The higher criticism (HC) statistic, which can be seen as a normalized version of the famous Kolmogorov-Smirnov statistic, has a long history, dating back to the mid-seventies. Originally, HC statistics were used in connection with goodness of fit (GOF) tests but they recently gained some attention in the context of testing the global null hypothesis in high dimensional data. The continuing interest in HC seems to be inspired by a series of nice asymptotic properties related to this statistic. For example, unlike Kolmogorov-Smirnov tests, GOF tests based on the HC statistic are known to be asymptotically sensitive in the moderate tails, hence it is favorably applied for detecting the presence of signals in sparse mixture models. However, some questions around the asymptotic behavior of the HC statistic are still open. We focus on two of them, namely, why a specific intermediate range is crucial for GOF tests based on the HC statistic and why the convergence of the HC distribution to the limiting one is extremely slow. Moreover, the inconsistency in the asymptotic and finite behavior of the HC statistic prompts us to provide a new HC test that has better finite properties than the original HC test while showing the same asymptotics. This test is motivated by the asymptotic behavior of the so-called local levels related to the original HC test. By means of numerical calculations and simulations we show that the new HC test is typically more powerful than the original HC test in normal mixture models. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
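    For concreteness, the sketch below computes the original higher criticism statistic from sorted p-values in a synthetic sparse normal mixture; the equal-local-levels alternative proposed in the paper is not implemented here.

```python
# The (original) higher criticism statistic computed from sorted p-values:
# HC = max over the lower order statistics of
#      sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, n_signal = 10_000, 50
z = rng.normal(0, 1, n)
z[:n_signal] += 3.0                       # a few weak signals among pure noise
pvals = np.sort(stats.norm.sf(z))         # one-sided p-values, ascending

i = np.arange(1, n + 1)
hc_terms = np.sqrt(n) * (i / n - pvals) / np.sqrt(pvals * (1 - pvals))
hc = hc_terms[: n // 2].max()             # restrict to the lower half, as is customary
print(f"HC statistic: {hc:.2f}")
```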

  5. The comparison of proportional hazards and accelerated failure time models in analyzing the first birth interval survival data

    NASA Astrophysics Data System (ADS)

    Faruk, Alfensi

    2018-03-01

    Survival analysis is a branch of statistics, which is focused on the analysis of time-to-event data. In multivariate survival analysis, the proportional hazards (PH) is the most popular model in order to analyze the effects of several covariates on the survival time. However, the assumption of constant hazards in PH model is not always satisfied by the data. The violation of the PH assumption leads to the misinterpretation of the estimation results and decreases the power of the related statistical tests. On the other hand, the accelerated failure time (AFT) models do not assume the constant hazards in the survival data as in PH model. The AFT models, moreover, can be used as the alternative to PH model if the constant hazards assumption is violated. The objective of this research was to compare the performance of PH model and the AFT models in analyzing the significant factors affecting the first birth interval (FBI) data in Indonesia. In this work, the discussion was limited to three AFT models which were based on Weibull, exponential, and log-normal distributions. The analysis by using graphical approach and a statistical test showed that the non-proportional hazards exist in the FBI data set. Based on the Akaike information criterion (AIC), the log-normal AFT model was the most appropriate model among the other considered models. Results of the best fitted model (log-normal AFT model) showed that the covariates such as women’s educational level, husband’s educational level, contraceptive knowledge, access to mass media, wealth index, and employment status were among factors affecting the FBI in Indonesia.
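    A stripped-down version of the model-selection step described above: fit exponential, Weibull and log-normal distributions to uncensored, covariate-free synthetic durations and compare them by AIC. The study's actual AFT models include covariates and censoring, so this is only a sketch of the AIC comparison.

```python
# Choosing among exponential, Weibull and log-normal duration models by AIC,
# here as intercept-only fits to synthetic interval data (months).
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
t = rng.lognormal(mean=3.0, sigma=0.5, size=500)     # synthetic durations

candidates = {
    "exponential": stats.expon,
    "weibull": stats.weibull_min,
    "log-normal": stats.lognorm,
}
for name, dist in candidates.items():
    params = dist.fit(t, floc=0)                     # location fixed at 0
    loglik = dist.logpdf(t, *params).sum()
    k = len(params) - 1                              # free parameters
    aic = 2 * k - 2 * loglik
    print(f"{name:>11}: AIC = {aic:.1f}")
```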

  6. A more powerful exact test of noninferiority from binary matched-pairs data.

    PubMed

    Lloyd, Chris J; Moldovan, Max V

    2008-08-15

    Assessing the therapeutic noninferiority of one medical treatment compared with another is often based on the difference in response rates from a matched binary pairs design. This paper develops a new exact unconditional test for noninferiority that is more powerful than available alternatives. There are two new elements presented in this paper. First, we introduce the likelihood ratio statistic as an alternative to the previously proposed score statistic of Nam (Biometrics 1997; 53:1422-1430). Second, we eliminate the nuisance parameter by estimation followed by maximization as an alternative to the partial maximization of Berger and Boos (Am. Stat. Assoc. 1994; 89:1012-1016) or traditional full maximization. Based on an extensive numerical study, we recommend tests based on the score statistic, the nuisance parameter being controlled by estimation followed by maximization. 2008 John Wiley & Sons, Ltd

  7. Contingent and Alternative Work Arrangements, Defined.

    ERIC Educational Resources Information Center

    Polivka, Anne E.

    1996-01-01

    Discusses the definitions of contingent workers and alternative work arrangements used by the Bureau of Labor Statistics to analyze data, and presents aggregate estimates of the number of workers in each group. Discusses the overlap between contingent workers and workers in alternative arrangements. (Author/JOW)

  8. Statistical and Machine Learning forecasting methods: Concerns and ways forward

    PubMed Central

    Makridakis, Spyros; Assimakopoulos, Vassilios

    2018-01-01

    Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784
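    A toy version of the statistical-versus-ML comparison on a single synthetic monthly series, scoring an additive Holt-Winters model against a lag-feature random forest with sMAPE; it does not reproduce the M3 data, methods, or horizons.

```python
# Statistical benchmark (Holt-Winters) vs an ML benchmark (random forest on
# lagged values) for one synthetic monthly series, scored by sMAPE.
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
t = np.arange(144)                                   # twelve years of monthly data
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 3, t.size)
train, test, h = y[:-18], y[-18:], 18

def smape(actual, forecast):
    return 100 * np.mean(2 * np.abs(forecast - actual) / (np.abs(actual) + np.abs(forecast)))

hw = ExponentialSmoothing(train, trend="add", seasonal="add", seasonal_periods=12).fit()
f_stat = hw.forecast(h)

lags = 12
X = np.array([train[i:i + lags] for i in range(train.size - lags)])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, train[lags:])
history, f_ml = list(train), []
for _ in range(h):                                   # iterated one-step forecasts
    f_ml.append(rf.predict(np.array(history[-lags:]).reshape(1, -1))[0])
    history.append(f_ml[-1])

print(f"sMAPE  Holt-Winters: {smape(test, f_stat):.2f}%   "
      f"random forest: {smape(test, np.array(f_ml)):.2f}%")
```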

  9. Alternative Regression Equations for Estimation of Annual Peak-Streamflow Frequency for Undeveloped Watersheds in Texas using PRESS Minimization

    USGS Publications Warehouse

    Asquith, William H.; Thompson, David B.

    2008-01-01

    The U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, investigated a refinement of the regional regression method and developed alternative equations for estimation of peak-streamflow frequency for undeveloped watersheds in Texas. A common model for estimation of peak-streamflow frequency is based on the regional regression method. The current (2008) regional regression equations for 11 regions of Texas are based on log10 transformations of all regression variables (drainage area, main-channel slope, and watershed shape). Exclusive use of log10 transformation does not fully linearize the relations between the variables. As a result, some systematic bias remains in the current equations. The bias results in overestimation of peak streamflow for both the smallest and largest watersheds. The bias increases with increasing recurrence interval. The primary source of the bias is the discernible curvilinear relation in log10 space between peak streamflow and drainage area. Bias is demonstrated by selected residual plots with superimposed LOWESS trend lines. To address the bias, a statistical framework based on minimization of the PRESS statistic through power transformation of drainage area is described and implemented, and the resulting regression equations are reported. The equations derived from PRESS minimization have smaller PRESS statistics and residual standard errors than the log10-exclusive equations. Selected residual plots for the PRESS-minimized equations are presented to demonstrate that systematic bias in regional regression equations for peak-streamflow frequency estimation in Texas can be reduced. Because the overall error is similar to the error associated with previous equations and because the bias is reduced, the PRESS-minimized equations reported here provide alternative equations for peak-streamflow frequency estimation.
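    The sketch below shows the PRESS computation that such a framework minimizes, comparing a plain log10 transformation of drainage area against two candidate power transformations on synthetic data; the exponents and data are illustrative, not the report's regional equations.

```python
# PRESS = sum((e_i / (1 - h_ii))^2), computed from the hat matrix of an OLS fit,
# evaluated for a log10 transform and two candidate power transforms of area.
import numpy as np

rng = np.random.default_rng(8)
n = 120
area = 10 ** rng.uniform(0, 4, n)                         # drainage area (hypothetical)
log_q = 1.5 + 0.9 * area ** 0.3 + rng.normal(0, 0.25, n)  # curvilinear in log10 space

def press(x, y):
    """Prediction error sum of squares for an OLS fit of y on [1, x]."""
    X = np.column_stack([np.ones(len(y)), x])
    H = X @ np.linalg.solve(X.T @ X, X.T)                 # hat matrix
    resid = y - H @ y
    return np.sum((resid / (1 - np.diag(H))) ** 2)

for label, x in [("log10(area)", np.log10(area)),
                 ("area^0.5", area ** 0.5),
                 ("area^0.3", area ** 0.3)]:
    print(f"{label:>11}: PRESS = {press(x, log_q):.2f}")
```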

  10. Model averaging techniques for quantifying conceptual model uncertainty.

    PubMed

    Singh, Abhishek; Mishra, Srikanta; Ruskauff, Greg

    2010-01-01

    In recent years a growing understanding has emerged regarding the need to expand the modeling paradigm to include conceptual model uncertainty for groundwater models. Conceptual model uncertainty is typically addressed by formulating alternative model conceptualizations and assessing their relative likelihoods using statistical model averaging approaches. Several model averaging techniques and likelihood measures have been proposed in the recent literature for this purpose with two broad categories--Monte Carlo-based techniques such as Generalized Likelihood Uncertainty Estimation or GLUE (Beven and Binley 1992) and criterion-based techniques that use metrics such as the Bayesian and Kashyap Information Criteria (e.g., the Maximum Likelihood Bayesian Model Averaging or MLBMA approach proposed by Neuman 2003) and Akaike Information Criterion-based model averaging (AICMA) (Poeter and Anderson 2005). These different techniques can often lead to significantly different relative model weights and ranks because of differences in the underlying statistical assumptions about the nature of model uncertainty. This paper provides a comparative assessment of the four model averaging techniques (GLUE, MLBMA with KIC, MLBMA with BIC, and AIC-based model averaging) mentioned above for the purpose of quantifying the impacts of model uncertainty on groundwater model predictions. Pros and cons of each model averaging technique are examined from a practitioner's perspective using two groundwater modeling case studies. Recommendations are provided regarding the use of these techniques in groundwater modeling practice.
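    As a concrete example of criterion-based averaging, the sketch below converts hypothetical AIC values for four conceptual models into Akaike weights and forms a model-averaged prediction; the numbers are made up, and the paper's GLUE and MLBMA variants use different likelihood measures.

```python
# Akaike-weight model averaging: w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j).
import numpy as np

aic = np.array([312.4, 315.1, 309.8, 321.6])          # four conceptual models (made up)
pred = np.array([14.2, 15.8, 13.5, 17.0])             # e.g., predicted head at a well (made up)

delta = aic - aic.min()
w = np.exp(-0.5 * delta)
w /= w.sum()

avg = float(w @ pred)
print("model weights:", np.round(w, 3))
print("model-averaged prediction:", round(avg, 2))
print("between-model variance term:", round(float(w @ (pred - avg) ** 2), 2))
```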

  11. Alternatives to the Randomized Controlled Trial

    PubMed Central

    West, Stephen G.; Duan, Naihua; Pequegnat, Willo; Gaist, Paul; Des Jarlais, Don C.; Holtgrave, David; Szapocznik, José; Fishbein, Martin; Rapkin, Bruce; Clatts, Michael; Mullen, Patricia Dolan

    2008-01-01

    Public health researchers are addressing new research questions (e.g., effects of environmental tobacco smoke, Hurricane Katrina) for which the randomized controlled trial (RCT) may not be a feasible option. Drawing on the potential outcomes framework (Rubin Causal Model) and Campbellian perspectives, we consider alternative research designs that permit relatively strong causal inferences. In randomized encouragement designs, participants are randomly invited to participate in one of the treatment conditions, but are allowed to decide whether to receive treatment. In quantitative assignment designs, treatment is assigned on the basis of a quantitative measure (e.g., need, merit, risk). In observational studies, treatment assignment is unknown and presumed to be nonrandom. Major threats to the validity of each design and statistical strategies for mitigating those threats are presented. PMID:18556609

  12. SIMRAND I- SIMULATION OF RESEARCH AND DEVELOPMENT PROJECTS

    NASA Technical Reports Server (NTRS)

    Miles, R. F.

    1994-01-01

    The Simulation of Research and Development Projects program (SIMRAND) aids in the optimal allocation of R&D resources needed to achieve project goals. SIMRAND models the system subsets or project tasks as various network paths to a final goal. Each path is described in terms of task variables such as cost per hour, cost per unit, availability of resources, etc. Uncertainty is incorporated by treating task variables as probabilistic random variables. SIMRAND calculates the measure of preference for each alternative network. The networks yielding the highest utility function (or certainty equivalence) are then ranked as the optimal network paths. SIMRAND has been used in several economic potential studies at NASA's Jet Propulsion Laboratory involving solar dish power systems and photovoltaic array construction. However, any project having tasks which can be reduced to equations and related by measures of preference can be modeled. SIMRAND analysis consists of three phases: reduction, simulation, and evaluation. In the reduction phase, analytical techniques from probability theory and simulation techniques are used to reduce the complexity of the alternative networks. In the simulation phase, a Monte Carlo simulation is used to derive statistics on the variables of interest for each alternative network path. In the evaluation phase, the simulation statistics are compared and the networks are ranked in preference by a selected decision rule. The user must supply project subsystems in terms of equations based on variables (for example, parallel and series assembly line tasks in terms of number of items, cost factors, time limits, etc). The associated cumulative distribution functions and utility functions for each variable must also be provided (allowable upper and lower limits, group decision factors, etc). SIMRAND is written in Microsoft FORTRAN 77 for batch execution and has been implemented on an IBM PC series computer operating under DOS.

  13. Improving Non-Destructive Concrete Strength Tests Using Support Vector Machines

    PubMed Central

    Shih, Yi-Fan; Wang, Yu-Ren; Lin, Kuo-Liang; Chen, Chin-Wen

    2015-01-01

    Non-destructive testing (NDT) methods are important alternatives when destructive tests are not feasible to examine the in situ concrete properties without damaging the structure. The rebound hammer test and the ultrasonic pulse velocity test are two popular NDT methods to examine the properties of concrete. The rebound of the hammer depends on the hardness of the test specimen and ultrasonic pulse travelling speed is related to density, uniformity, and homogeneity of the specimen. Both of these two methods have been adopted to estimate the concrete compressive strength. Statistical analysis has been implemented to establish the relationship between hammer rebound values/ultrasonic pulse velocities and concrete compressive strength. However, the estimated results can be unreliable. As a result, this research proposes an Artificial Intelligence model using support vector machines (SVMs) for the estimation. Data from 95 cylinder concrete samples are collected to develop and validate the model. The results show that combined NDT methods (also known as SonReb method) yield better estimations than single NDT methods. The results also show that the SVMs model is more accurate than the statistical regression model. PMID:28793627
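    A minimal scikit-learn version of the combined (SonReb-style) estimation described above, training a support vector regressor on rebound number and ultrasonic pulse velocity; the 95 samples generated here are synthetic stand-ins for the paper's cylinder data.

```python
# SVR on the two NDT readings (rebound number, ultrasonic pulse velocity).
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
n = 95
rebound = rng.uniform(20, 50, n)
upv = rng.uniform(3.5, 5.0, n)                                    # km/s
strength = 0.9 * rebound + 8 * upv - 25 + rng.normal(0, 3, n)     # MPa, synthetic

X = np.column_stack([rebound, upv])
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
r2 = cross_val_score(model, X, strength, cv=5, scoring="r2")
print(f"5-fold cross-validated R^2: {r2.mean():.2f} +/- {r2.std():.2f}")
```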

  14. Genetic programming based models in plant tissue culture: An addendum to traditional statistical approach.

    PubMed

    Mridula, Meenu R; Nair, Ashalatha S; Kumar, K Satheesh

    2018-02-01

    In this paper, we compared the efficacy of an observation-based modeling approach using a genetic algorithm with regular statistical analysis as an alternative methodology in plant research. Preliminary experimental data on in vitro rooting was taken for this study with an aim to understand the effect of charcoal and naphthalene acetic acid (NAA) on successful rooting and also to optimize the two variables for maximum result. Observation-based modelling, as well as the traditional approach, could identify NAA as a critical factor in rooting of the plantlets under the experimental conditions employed. Symbolic regression analysis using the software deployed here optimised the treatments studied and was successful in identifying the complex non-linear interaction among the variables, with minimalistic preliminary data. The presence of charcoal in the culture medium has a significant impact on root generation by reducing basal callus mass formation. Such an approach is advantageous for establishing in vitro culture protocols as these models will have significant potential for saving time and expenditure in plant tissue culture laboratories, and it further reduces the need for specialised background.
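    A sketch of the symbolic-regression step, assuming the third-party gplearn package is available; the NAA/charcoal rooting data below are synthetic stand-ins, not the study's measurements.

```python
# Genetic-programming symbolic regression on a two-factor rooting experiment
# (hypothetical data; assumes the gplearn package is installed).
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(16)
naa = rng.uniform(0.0, 2.0, 60)                  # mg/L, hypothetical range
charcoal = rng.integers(0, 2, 60).astype(float)  # 0 = absent, 1 = present
roots = 2 + 6 * naa * np.exp(-naa) + 1.5 * charcoal + rng.normal(0, 0.4, 60)

X = np.column_stack([naa, charcoal])
est = SymbolicRegressor(population_size=1000, generations=20,
                        function_set=("add", "sub", "mul", "div"),
                        parsimony_coefficient=0.01, random_state=0)
est.fit(X, roots)
print(est._program)   # best evolved expression in terms of X0 (NAA) and X1 (charcoal)
```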

  15. Efficient ensemble forecasting of marine ecology with clustered 1D models and statistical lateral exchange: application to the Red Sea

    NASA Astrophysics Data System (ADS)

    Dreano, Denis; Tsiaras, Kostas; Triantafyllou, George; Hoteit, Ibrahim

    2017-07-01

    Forecasting the state of large marine ecosystems is important for many economic and public health applications. However, advanced three-dimensional (3D) ecosystem models, such as the European Regional Seas Ecosystem Model (ERSEM), are computationally expensive, especially when implemented within an ensemble data assimilation system requiring several parallel integrations. As an alternative to 3D ecological forecasting systems, we propose to implement a set of regional one-dimensional (1D) water-column ecological models that run at a fraction of the computational cost. The 1D model domains are determined using a Gaussian mixture model (GMM)-based clustering method and satellite chlorophyll-a (Chl-a) data. Regionally averaged Chl-a data is assimilated into the 1D models using the singular evolutive interpolated Kalman (SEIK) filter. To laterally exchange information between subregions and improve the forecasting skills, we introduce a new correction step to the assimilation scheme, in which we assimilate a statistical forecast of future Chl-a observations based on information from neighbouring regions. We apply this approach to the Red Sea and show that the assimilative 1D ecological models can forecast surface Chl-a concentration with high accuracy. The statistical assimilation step further improves the forecasting skill by as much as 50%. This general approach of clustering large marine areas and running several interacting 1D ecological models is very flexible. It allows many combinations of clustering, filtering and regression techniques to be used and can be applied to build efficient forecasting systems in other large marine ecosystems.
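    The clustering step used to define the 1D model domains can be sketched with scikit-learn's Gaussian mixture model, as below; each "pixel" here carries two synthetic Chl-a features rather than real Red Sea satellite data.

```python
# GMM clustering of satellite pixels into subregions based on Chl-a features.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(10)
n_pixels = 2_000
# Two synthetic provinces: oligotrophic (low, flat) and bloom-prone (high, seasonal).
mean_chl = np.concatenate([rng.normal(0.1, 0.03, n_pixels // 2),
                           rng.normal(0.6, 0.15, n_pixels // 2)])
seasonal_amp = np.concatenate([rng.normal(0.05, 0.02, n_pixels // 2),
                               rng.normal(0.4, 0.10, n_pixels // 2)])
X = np.column_stack([mean_chl, seasonal_amp])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
labels = gmm.predict(X)
print("pixels per subregion:", np.bincount(labels))
print("subregion mean Chl-a:", np.round([mean_chl[labels == k].mean() for k in range(2)], 2))
```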

  16. Detecting influential observations in nonlinear regression modeling of groundwater flow

    USGS Publications Warehouse

    Yager, Richard M.

    1998-01-01

    Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
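    For readers unfamiliar with the influence statistics involved, the sketch below computes Cook's D and DFBETAS for an ordinary linear regression with one deliberately influential observation; the paper applies the analogous quantities to nonlinear groundwater-model regression.

```python
# Cook's D and DFBETAS from a linear regression with one planted outlier.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
x = rng.uniform(0, 10, 40)
y = 2.0 + 0.8 * x + rng.normal(0, 0.5, 40)
x[0], y[0] = 25.0, 5.0                      # an outlier with high leverage

fit = sm.OLS(y, sm.add_constant(x)).fit()
infl = fit.get_influence()
cooks_d = infl.cooks_distance[0]            # first element holds the distances
dfbetas = infl.dfbetas                      # one column per estimated parameter

worst = int(np.argmax(cooks_d))
print(f"most influential observation: {worst}, Cook's D = {cooks_d[worst]:.2f}")
print("its DFBETAS (const, slope):", np.round(dfbetas[worst], 2))
```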

  17. Groundwater-level prediction using multiple linear regression and artificial neural network techniques: a comparative assessment

    NASA Astrophysics Data System (ADS)

    Sahoo, Sasmita; Jha, Madan K.

    2013-12-01

    The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques in predicting transient water levels over a groundwater basin was compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that there is better agreement between the ANN-predicted groundwater levels and the observed groundwater levels at all the sites, compared to the MLR. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.
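    A compact head-to-head comparison in the spirit of the study, fitting multiple linear regression and a small feed-forward network to synthetic groundwater-level data with scikit-learn; the inputs are hypothetical stand-ins for the rainfall, temperature, and river-stage series used at the Japanese sites.

```python
# MLR vs a small feed-forward ANN for groundwater-level prediction (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(12)
n = 600
rain, temp, stage = rng.gamma(2, 5, n), rng.normal(25, 4, n), rng.normal(3, 1, n)
gwl = 10 + 0.05 * rain - 0.1 * temp + 1.5 * np.tanh(stage - 3) + rng.normal(0, 0.3, n)
X = np.column_stack([rain, temp, stage])

X_tr, X_te, y_tr, y_te = train_test_split(X, gwl, test_size=0.3, random_state=0)
mlr = LinearRegression().fit(X_tr, y_tr)
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                                 random_state=0)).fit(X_tr, y_tr)

print(f"MLR test R^2: {r2_score(y_te, mlr.predict(X_te)):.3f}")
print(f"ANN test R^2: {r2_score(y_te, ann.predict(X_te)):.3f}")
```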

  18. Performance characteristics of a visual-search human-model observer with sparse PET image data

    NASA Astrophysics Data System (ADS)

    Gifford, Howard C.

    2012-02-01

    As predictors of human performance in detection-localization tasks, statistical model observers can have problems with tasks that are primarily limited by target contrast or structural noise. Model observers with a visual-search (VS) framework may provide a more reliable alternative. This framework provides for an initial holistic search that identifies suspicious locations for analysis by a statistical observer. A basic VS observer for emission tomography focuses on hot "blobs" in an image and uses a channelized nonprewhitening (CNPW) observer for analysis. In [1], we investigated this model for a contrast-limited task with SPECT images; herein, a statistical-noise-limited task involving PET images is considered. An LROC study used 2D image slices with liver, lung and soft-tissue tumors. Human and model observers read the images in coronal, sagittal and transverse display formats. The study thus measured the detectability of tumors in a given organ as a function of display format. The model observers were applied under several task variants that tested their response to structural noise both at the organ boundaries alone and over the organs as a whole. As measured by correlation with the human data, the VS observer outperformed the CNPW scanning observer.

  19. Consider the Alternative: The Effects of Causal Knowledge on Representing and Using Alternative Hypotheses in Judgments under Uncertainty

    ERIC Educational Resources Information Center

    Hayes, Brett K.; Hawkins, Guy E.; Newell, Ben R.

    2016-01-01

    Four experiments examined the locus of impact of causal knowledge on consideration of alternative hypotheses in judgments under uncertainty. Two possible loci were examined; overcoming neglect of the alternative when developing a representation of a judgment problem and improving utilization of statistics associated with the alternative…

  20. Atomic clocks and the continuous-time random-walk

    NASA Astrophysics Data System (ADS)

    Formichella, Valerio; Camparo, James; Tavella, Patrizia

    2017-11-01

    Atomic clocks play a fundamental role in many fields, most notably they generate Universal Coordinated Time and are at the heart of all global navigation satellite systems. Notwithstanding their excellent timekeeping performance, their output frequency does vary: it can display deterministic frequency drift; diverse continuous noise processes result in nonstationary clock noise (e.g., random-walk frequency noise, modelled as a Wiener process); and the clock frequency may display sudden changes (i.e., "jumps"). Typically, the clock's frequency instability is evaluated by the Allan or Hadamard variances, whose functional forms can identify the different operative noise processes. Here, we show that the Allan and Hadamard variances of a particular continuous-time random-walk, the compound Poisson process, have the same functional form as for a Wiener process with drift. The compound Poisson process, introduced as a model for observed frequency jumps, is an alternative to the Wiener process for modelling random walk frequency noise. This alternative model closely fits the behavior of the rubidium clocks flying on GPS Block-IIR satellites. Further, starting from jump statistics, the model can be improved by considering a more general form of continuous-time random-walk, and this could bring new insights into the physics of atomic clocks.
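    The sketch below computes the overlapping Allan variance of two simulated fractional-frequency records, a Wiener process and a compound Poisson jump process, to illustrate the equivalence of their variance signatures discussed above; the noise amplitudes and jump rate are arbitrary.

```python
# Overlapping Allan variance of a Wiener frequency record and a compound
# Poisson (jump) frequency record; both show the tau^{+1} signature of
# random-walk frequency noise. Parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(13)
n, tau0 = 100_000, 1.0                                            # samples, interval (s)

wiener = np.cumsum(rng.normal(0, 1e-13, n))                       # Wiener frequency noise
jumps = rng.poisson(1e-3, n) * rng.normal(0, 5e-12, n)            # rare frequency jumps
compound_poisson = np.cumsum(jumps)

def avar(y, m):
    """Overlapping Allan variance of fractional frequency y at tau = m * tau0."""
    ybar = np.convolve(y, np.ones(m) / m, mode="valid")           # averages over m samples
    d = ybar[m:] - ybar[:-m]
    return 0.5 * np.mean(d ** 2)

for m in (1, 10, 100, 1000):
    print(f"tau = {m * tau0:6.0f} s  AVAR(Wiener) = {avar(wiener, m):.3e}  "
          f"AVAR(compound Poisson) = {avar(compound_poisson, m):.3e}")
```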

  1. Statistical Application and Cost Saving in a Dental Survey.

    PubMed

    Chyou, Po-Huang; Schroeder, Dixie; Schwei, Kelsey; Acharya, Amit

    2017-06-01

    To effectively achieve a robust survey response rate in a timely manner, an alternative approach to survey distribution, informed by statistical modeling, was applied to efficiently and cost-effectively achieve the targeted rate of return. A prospective environmental scan surveying adoption of health information technology utilization within their practices was undertaken in a national pool of dental professionals (N=8000) using an alternative method of sampling. The piloted approach to rate of cohort sampling targeted a response rate of 400 completed surveys from among randomly targeted eligible providers who were contacted using replicated subsampling leveraging mailed surveys. Two replicated subsample mailings (n=1000 surveys/mailings) were undertaken to project the true response rate and estimate the total number of surveys required to achieve the final target. Cost effectiveness and non-response bias analyses were performed. The final mailing required approximately 24% fewer mailings compared to targeting of the entire cohort, with a final survey capture exceeding the expected target. An estimated $5000 in cost savings was projected by applying the alternative approach. Non-response analyses found no evidence of bias relative to demographics, practice demographics, or topically-related survey questions. The outcome of this pilot study suggests that this approach to survey studies will accomplish targeted enrollment in a cost effective manner. Future studies are needed to validate this approach in the context of other survey studies. © 2017 Marshfield Clinic.

  2. Statistical Application and Cost Saving in a Dental Survey

    PubMed Central

    Chyou, Po-Huang; Schroeder, Dixie; Schwei, Kelsey; Acharya, Amit

    2017-01-01

    Objective To effectively achieve a robust survey response rate in a timely manner, an alternative approach to survey distribution, informed by statistical modeling, was applied to efficiently and cost-effectively achieve the targeted rate of return. Design A prospective environmental scan surveying adoption of health information technology utilization within their practices was undertaken in a national pool of dental professionals (N=8000) using an alternative method of sampling. The piloted approach to rate of cohort sampling targeted a response rate of 400 completed surveys from among randomly targeted eligible providers who were contacted using replicated subsampling leveraging mailed surveys. Methods Two replicated subsample mailings (n=1000 surveys/mailings) were undertaken to project the true response rate and estimate the total number of surveys required to achieve the final target. Cost effectiveness and non-response bias analyses were performed. Results The final mailing required approximately 24% fewer mailings compared to targeting of the entire cohort, with a final survey capture exceeding the expected target. An estimated $5000 in cost savings was projected by applying the alternative approach. Non-response analyses found no evidence of bias relative to demographics, practice demographics, or topically-related survey questions. Conclusion The outcome of this pilot study suggests that this approach to survey studies will accomplish targeted enrollment in a cost effective manner. Future studies are needed to validate this approach in the context of other survey studies. PMID:28373286

  3. The latent structure of personality functioning: Investigating criterion a from the alternative model for personality disorders in DSM-5.

    PubMed

    Zimmermann, Johannes; Böhnke, Jan R; Eschstruth, Rhea; Mathews, Alessa; Wenzel, Kristin; Leising, Daniel

    2015-08-01

    The alternative model for the classification of personality disorders (PD) in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) Section III comprises 2 major components: impairments in personality functioning (Criterion A) and maladaptive personality traits (Criterion B). In this study, we investigated the latent structure of Criterion A (a) within subdomains, (b) across subdomains, and (c) in conjunction with the Criterion B trait facets. Data were gathered as part of an online study that collected other-ratings by 515 laypersons and 145 therapists. Laypersons were asked to assess 1 of their personal acquaintances, whereas therapists were asked to assess 1 of their patients, using 135 items that captured features of Criteria A and B. We were able to show that (a) the structure within the Criterion A subdomains can be appropriately modeled using generalized graded unfolding models, with results suggesting that the items are indeed related to common underlying constructs but often deviate from their theoretically expected severity level; (b) the structure across subdomains is broadly in line with a model comprising 2 strongly correlated factors of self- and interpersonal functioning, with some notable deviations from the theoretical model; and (c) the joint structure of the Criterion A subdomains and the Criterion B facets broadly resembles the expected model of 2 plus 5 factors, albeit the loading pattern suggests that the distinction between Criteria A and B is somewhat blurry. Our findings provide support for several major assumptions of the alternative DSM-5 model for PD but also highlight aspects of the model that need to be further refined. (c) 2015 APA, all rights reserved).

  4. Quantum theory of multiscale coarse-graining.

    PubMed

    Han, Yining; Jin, Jaehyeok; Wagner, Jacob W; Voth, Gregory A

    2018-03-14

    Coarse-grained (CG) models serve as a powerful tool to simulate molecular systems at much longer temporal and spatial scales. Previously, CG models and methods have been built upon classical statistical mechanics. The present paper develops a theory and numerical methodology for coarse-graining in quantum statistical mechanics, by generalizing the multiscale coarse-graining (MS-CG) method to quantum Boltzmann statistics. A rigorous derivation of the sufficient thermodynamic consistency condition is first presented via imaginary time Feynman path integrals. It identifies the optimal choice of CG action functional and effective quantum CG (qCG) force field to generate a quantum MS-CG (qMS-CG) description of the equilibrium system that is consistent with the quantum fine-grained model projected onto the CG variables. A variational principle then provides a class of algorithms for optimally approximating the qMS-CG force fields. Specifically, a variational method based on force matching, which was also adopted in the classical MS-CG theory, is generalized to quantum Boltzmann statistics. The qMS-CG numerical algorithms and practical issues in implementing this variational minimization procedure are also discussed. Then, two numerical examples are presented to demonstrate the method. Finally, as an alternative strategy, a quasi-classical approximation for the thermal density matrix expressed in the CG variables is derived. This approach provides an interesting physical picture for coarse-graining in quantum Boltzmann statistical mechanics in which the consistency with the quantum particle delocalization is obviously manifest, and it opens up an avenue for using path integral centroid-based effective classical force fields in a coarse-graining methodology.

  5. Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

    NASA Astrophysics Data System (ADS)

    Kacprzak, T.; Kirk, D.; Friedrich, O.; Amara, A.; Refregier, A.; Marian, L.; Dietrich, J. P.; Suchyta, E.; Aleksić, J.; Bacon, D.; Becker, M. R.; Bonnett, C.; Bridle, S. L.; Chang, C.; Eifler, T. F.; Hartley, W. G.; Huff, E. M.; Krause, E.; MacCrann, N.; Melchior, P.; Nicola, A.; Samuroff, S.; Sheldon, E.; Troxel, M. A.; Weller, J.; Zuntz, J.; Abbott, T. M. C.; Abdalla, F. B.; Armstrong, R.; Benoit-Lévy, A.; Bernstein, G. M.; Bernstein, R. A.; Bertin, E.; Brooks, D.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Evrard, A. E.; Neto, A. Fausti; Flaugher, B.; Fosalba, P.; Frieman, J.; Gerdes, D. W.; Goldstein, D. A.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jarvis, M.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Miller, C. J.; Miquel, R.; Mohr, J. J.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Romer, A. K.; Roodman, A.; Rykoff, E. S.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Zhang, Y.; DES Collaboration

    2016-12-01

    Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg² field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range 0 < S/N < 4. … Peaks with S/N > 4 would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. We discuss prospects for future peak statistics analysis with upcoming DES data.

  6. Prediction of In Vivo Knee Joint Kinematics Using a Combined Dual Fluoroscopy Imaging and Statistical Shape Modeling Technique

    PubMed Central

    Li, Jing-Sheng; Tsai, Tsung-Yuan; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Freiberg, Andrew; Rubash, Harry E.; Li, Guoan

    2014-01-01

    The use of computed tomography (CT) or magnetic resonance (MR) images to construct 3D knee models is widespread in biomedical engineering research. The statistical shape modeling (SSM) method is an alternative way to provide a fast, cost-efficient, and subject-specific knee modeling technique. This study aimed to evaluate the feasibility of using a combined dual-fluoroscopic imaging system (DFIS) and SSM method to investigate in vivo knee kinematics. Three subjects were studied during treadmill walking. The data were compared with the kinematics obtained using a CT-based modeling technique. Geometric root-mean-square (RMS) errors between the knee models constructed using the SSM and CT-based modeling techniques were 1.16 mm and 1.40 mm for the femur and tibia, respectively. For the kinematics of the knee during the treadmill gait, the SSM model can predict the knee kinematics with RMS errors within 3.3 deg for rotation and within 2.4 mm for translation throughout the stance phase of the gait cycle compared with those obtained using the CT-based knee models. The data indicated that the combined DFIS and SSM technique could be used for quick evaluation of knee joint kinematics. PMID:25320846

  7. Psychosocial job factors and biological cardiovascular risk factors in Mexican workers.

    PubMed

    Garcia-Rojas, Isabel Judith; Choi, BongKyoo; Krause, Niklas

    2015-03-01

    Psychosocial job factors (PJF) have been implicated in the development of cardiovascular disease. The paucity of data from developing economies including Mexico hampers the development of worksite intervention efforts in those regions. This cross-sectional study of 2,330 Mexican workers assessed PJF (job strain [JS], social support [SS], and job insecurity [JI]) and biological cardiovascular disease risk factors [CVDRF] by questionnaire and on-site physical examinations. Alternative formulations of the JS scales were developed based on factor analysis and literature review. Associations between both traditional and alternative job factor scales with CVDRF were examined in multiple regression models, adjusting for physical workload, and socio-demographic factors. Alternative formulations of the job demand and control scales resulted in substantial changes in effect sizes or statistical significance when compared with the original scales. JS and JI showed hypothesized associations with most CVDRF, but they were inversely associated with diastolic blood pressure and some adiposity measures. SS was mainly protective against CVDRF. Among Mexican workers, alternative PJF scales predicted health outcomes better than traditional scales, and psychosocial stressors were associated with most CVDRF. © 2015 Wiley Periodicals, Inc.

  8. Center of Excellence for Applied Mathematical and Statistical Research in support of development of multicrop production monitoring capability

    NASA Technical Reports Server (NTRS)

    Woodward, W. A.; Gray, H. L.

    1983-01-01

    Efforts in support of the development of multicrop production monitoring capability are reported. In particular, segment level proportion estimation techniques based upon a mixture model were investigated. Efforts have dealt primarily with evaluation of current techniques and development of alternative ones. A comparison of techniques is provided on both simulated and LANDSAT data along with an analysis of the quality of profile variables obtained from LANDSAT data.

  9. Median nitrate concentrations in groundwater in the New Jersey Highlands Region estimated using regression models and land-surface characteristics

    USGS Publications Warehouse

    Baker, Ronald J.; Chepiga, Mary M.; Cauller, Stephen J.

    2015-01-01

    The Kaplan-Meier method of estimating summary statistics from left-censored data was applied in order to include nondetects (left-censored data) in median nitrate-concentration calculations. Median concentrations also were determined using three alternative methods of handling nondetects. Treatment of the 23 percent of samples that were nondetects had little effect on estimated median nitrate concentrations because method detection limits were mostly less than median values.
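    The Kaplan-Meier treatment of nondetects can be sketched by the standard "flipping" trick that converts left-censoring into right-censoring; the nitrate-like data below are synthetic and the detection limit is an assumption, so the numbers do not correspond to the New Jersey Highlands study.

```python
# Kaplan-Meier handling of nondetects (left-censored data): flip concentrations
# about a constant so left-censoring becomes right-censoring, run ordinary
# Kaplan-Meier, flip back, and read off the median. Synthetic data only.
import numpy as np

rng = np.random.default_rng(14)
true = rng.lognormal(0.2, 0.8, 200)                  # "true" nitrate, mg/L
detection_limit = 0.8
detected = true >= detection_limit
conc = np.where(detected, true, detection_limit)     # nondetects reported at the DL

flip = conc.max() + 1.0
t = flip - conc                                      # right-censored where not detected
order = np.argsort(t)
t, event = t[order], detected[order]

surv, s = [], 1.0                                    # ordinary right-censored KM
for ti in np.unique(t):
    at_risk = np.sum(t >= ti)
    deaths = np.sum((t == ti) & event)
    s *= 1.0 - deaths / at_risk
    surv.append((flip - ti, s))                      # flip back to concentration units

conc_grid, s_vals = np.array(surv).T                 # concentration and curve, both decreasing
km_median = conc_grid[np.searchsorted(-s_vals, -0.5)]   # first point with curve <= 0.5
dl2_median = np.median(np.where(detected, true, detection_limit / 2))
print(f"KM median: {km_median:.2f} mg/L  |  DL/2 substitution median: {dl2_median:.2f} mg/L")
```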

  10. A Study of Two Instructional Sequences Informed by Alternative Learning Progressions in Genetics

    NASA Astrophysics Data System (ADS)

    Duncan, Ravit Golan; Choi, Jinnie; Castro-Faix, Moraima; Cavera, Veronica L.

    2017-12-01

    Learning progressions (LPs) are hypothetical models of how learning in a domain develops over time with appropriate instruction. In the domain of genetics, there are two independently developed alternative LPs. The main difference between the two progressions hinges on their assumptions regarding the accessibility of classical (Mendelian) versus molecular genetics and the order in which they should be taught. In order to determine the relative difficulty of the different genetic ideas included in the two progressions, and to test which one is a better fit with students' actual learning, we developed two modules in classical and molecular genetics and alternated their sequence in an implementation study with 11th grade students studying biology. We developed a set of 56 ordered multiple-choice items that collectively assessed both molecular and classical genetic ideas. We found significant gains in students' learning in both molecular and classical genetics, with the largest gain relating to understanding the informational content of genes and the smallest gain in understanding modes of inheritance. Using multidimensional item response modeling, we found no statistically significant differences between the two instructional sequences. However, there was a trend of slightly higher gains for the molecular-first sequence for all genetic ideas.

  11. The search for mechanisms of change in behavioral treatments for alcohol use disorders: a commentary.

    PubMed

    Longabaugh, Richard

    2007-10-01

    Definitive results from efforts to identify mechanisms of change in behavioral treatments for alcohol use disorders have been elusive. The working hypothesis guiding this paper is that one of the reasons for this elusiveness is that the models we hypothesize to account for treatment effectiveness are unnecessarily restricted and too simple. This paper aims to accomplish 3 things. First, a typography for locating potential mediators of change will be presented. In the course of doing so, a nomenclature will be proposed with the hope that this will facilitate communication among alcohol treatment researchers studying mechanisms of change. Second, alternatives to the classic test of mediation of alcohol treatment effects will be considered and one such alternative described. Third, alternative ways of conceptualizing, constructing and analyzing variables to measure mediators will be suggested. It is hoped that this commentary will facilitate research on mechanisms of change in behavioral treatments for alcohol use disorders. Behavioral change is a complex process, and the models that we develop to account for this process need to reflect this complexity. Advances in statistical approaches for testing mediation, along with a better understanding as to how to use these tools, should help in moving toward this goal.

  12. Dependence of prevalence of contiguous pathways in proteins on structural complexity.

    PubMed

    Thayer, Kelly M; Galganov, Jesse C; Stein, Avram J

    2017-01-01

    Allostery is a regulatory mechanism in proteins where an effector molecule binds distal from an active site to modulate its activity. Allosteric signaling may occur via a continuous path of residues linking the active and allosteric sites, which has been suggested by large conformational changes evident in crystal structures. An alternate possibility is that the signal occurs in the realm of ensemble dynamics via an energy landscape change. While the latter was first proposed on theoretical grounds, increasing evidence suggests that such a control mechanism is plausible. A major difficulty for testing the two methods is the ability to definitively determine that a residue is directly involved in allosteric signal transduction. Statistical Coupling Analysis (SCA) is a method that has been successful at predicting pathways, and experimental tests involving mutagenesis or domain substitution provide the best available evidence of signaling pathways. However, ascertaining energetic pathways which need not be contiguous is far more difficult. To date, simple estimates of the statistical significance of a pathway in a protein remain to be established. The focus of this work is to estimate such benchmarks for the statistical significance of contiguous pathways for the null model of selecting residues at random. We found that when 20% of residues in proteins are randomly selected, contiguous pathways at the 6 Å cutoff level were found with success rates of 51% in PDZ, 30% in p53, and 3% in MutS. The results suggest that the significance of pathways may have system specific factors involved. Furthermore, the possible existence of false positives for contiguous pathways implies that signaling could be occurring via alternate routes including those consistent with the energetic landscape model.
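    The null model described above can be sketched directly: build a contact graph at a 6 Å cutoff, repeatedly select 20% of residues at random, and record how often the selection happens to contain a contiguous path between two chosen sites. The "protein" below is a synthetic chain-like set of coordinates, not PDZ, p53, or MutS.

```python
# Null-model estimate of chance contiguous pathways in a synthetic contact graph.
import numpy as np
import networkx as nx

rng = np.random.default_rng(15)
n_res, cutoff = 150, 6.0
coords = np.cumsum(rng.normal(0, 2.5, (n_res, 3)), axis=0)     # crude chain-like fold

G = nx.Graph()
G.add_nodes_from(range(n_res))
for i in range(n_res):
    for j in range(i + 1, n_res):
        if np.linalg.norm(coords[i] - coords[j]) <= cutoff:
            G.add_edge(i, j)

active, allosteric = 0, n_res - 1
hits, trials = 0, 2_000
for _ in range(trials):
    chosen = set(rng.choice(n_res, size=int(0.2 * n_res), replace=False))
    chosen |= {active, allosteric}                              # endpoints always included
    if nx.has_path(G.subgraph(chosen), active, allosteric):
        hits += 1

print(f"contiguous path by chance in {100 * hits / trials:.1f}% of random selections")
```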

  13. Learning Outcomes in a Laboratory Environment vs. Classroom for Statistics Instruction: An Alternative Approach Using Statistical Software

    ERIC Educational Resources Information Center

    McCulloch, Ryan Sterling

    2017-01-01

    The role of any statistics course is to increase the understanding and comprehension of statistical concepts and those goals can be achieved via both theoretical instruction and statistical software training. However, many introductory courses either forego advanced software usage, or leave its use to the student as a peripheral activity. The…

  14. Can climate variability information constrain a hydrological model for an ungauged Costa Rican catchment?

    NASA Astrophysics Data System (ADS)

    Quesada-Montano, Beatriz; Westerberg, Ida K.; Fuentes-Andino, Diana; Hidalgo-Leon, Hugo; Halldin, Sven

    2017-04-01

    Long-term hydrological data are key to understanding catchment behaviour and to decision making within water management and planning. Given the lack of observed data in many regions worldwide, hydrological models are an alternative for reproducing historical streamflow series. Additional types of information - beyond locally observed discharge - can be used to constrain model parameter uncertainty for ungauged catchments. Climate variability exerts a strong influence on streamflow variability on long and short time scales, in particular in the Central-American region. We therefore explored the use of climate variability knowledge to constrain the simulated discharge uncertainty of a conceptual hydrological model applied to a Costa Rican catchment, assumed to be ungauged. To reduce model uncertainty we first rejected parameter relationships that disagreed with our understanding of the system. We then assessed how well climate-based constraints applied at long-term, inter-annual and intra-annual time scales could constrain model uncertainty. Finally, we compared the climate-based constraints to a constraint on low-flow statistics based on information obtained from global maps. We evaluated our method in terms of the ability of the model to reproduce the observed hydrograph and the active catchment processes in terms of two efficiency measures, a statistical consistency measure, a spread measure and 17 hydrological signatures. We found that climate variability knowledge was useful for reducing model uncertainty, in particular by rejecting unrealistic representations of deep groundwater processes. The constraints based on global maps of low-flow statistics provided more constraining information than those based on climate variability, but the latter rejected slow rainfall-runoff representations that the low-flow statistics did not reject. The use of such knowledge, together with information on low-flow statistics and constraints on parameter relationships, proved useful for constraining model uncertainty in a basin treated as ungauged. This shows that our method is promising for reconstructing long-term flow data for ungauged catchments on the Pacific side of Central America, and that similar methods can be developed for ungauged basins in other regions where climate variability exerts a strong control on streamflow variability.

  15. A Technology-Based Statistical Reasoning Assessment Tool in Descriptive Statistics for Secondary School Students

    ERIC Educational Resources Information Center

    Chan, Shiau Wei; Ismail, Zaleha

    2014-01-01

    The focus of assessment in statistics has gradually shifted from traditional assessment towards alternative assessment where more attention has been paid to the core statistical concepts such as center, variability, and distribution. In spite of this, there are comparatively few assessments that combine the significant three types of statistical…

  16. An Integrated Analysis of the Physiological Effects of Space Flight: Executive Summary

    NASA Technical Reports Server (NTRS)

    Leonard, J. I.

    1985-01-01

    A large array of models was applied in a unified manner to solve problems in space flight physiology. Mathematical simulation was used as an alternative way of looking at physiological systems and maximizing the yield from previous space flight experiments. A medical data analysis system was created which consists of an automated database, a computerized biostatistical and data analysis system, and a set of simulation models of physiological systems. Five basic models were employed: (1) a pulsatile cardiovascular model; (2) a respiratory model; (3) a thermoregulatory model; (4) a circulatory, fluid, and electrolyte balance model; and (5) an erythropoiesis regulatory model. Algorithms were provided to perform routine statistical tests, multivariate analysis, nonlinear regression analysis, and autocorrelation analysis. Special purpose programs were prepared for rank correlation, factor analysis, and the integration of the metabolic balance data.

  17. Calibration and use of an interactive-accounting model to simulate dissolved solids, streamflow, and water-supply operations in the Arkansas River basin, Colorado

    USGS Publications Warehouse

    Burns, A.W.

    1989-01-01

    An interactive-accounting model was used to simulate dissolved solids, streamflow, and water supply operations in the Arkansas River basin, Colorado. Model calibration of specific conductance to streamflow relations at three sites enabled computation of dissolved-solids loads throughout the basin. To simulate streamflow only, all water supply operations were incorporated in the regression relations for streamflow. Calibration for 1940-85 resulted in coefficients of determination that ranged from 0.89 to 0.58, and values in excess of 0.80 were determined for 16 of 20 nodes. The model then incorporated 74 water users and 11 reservoirs to simulate the water supply operations for two periods, 1943-74 and 1975-85. For the 1943-74 calibration, coefficients of determination for streamflow ranged from 0.87 to 0.02. Calibration of the water supply operations resulted in coefficients of determination that ranged from 0.87 to negative values for simulated irrigation diversions of 37 selected water users. Calibration for 1975-85 was not evaluated statistically, but average values and plots of reservoir contents indicated reasonableness of the simulation. To demonstrate the utility of the model, six specific alternatives were simulated to consider effects of potential enlargement of Pueblo Reservoir. Three general major alternatives were simulated: the 1975-85 calibrated model data, the calibrated model data with an addition of 30 cu ft/sec in Fountain Creek flows, and the calibrated model data plus additional municipal water in storage. These three major alternatives considered the options of reservoir enlargement or no enlargement. A 40,000-acre-foot reservoir enlargement resulted in average increases of 2,500 acre-ft in transmountain diversions, of 800 acre-ft in storage diversions, and of 100 acre-ft in winter-water storage. (USGS)

  18. Identification of market trends with string and D2-brane maps

    NASA Astrophysics Data System (ADS)

    Bartoš, Erik; Pinčák, Richard

    2017-08-01

    The multidimensional string objects are introduced as a new alternative for the application of string models to time series forecasting in trading on financial markets. The objects are represented by an open string with 2 endpoints and a D2-brane, which are continuous enhancements of the 1-endpoint open string model. We show how the new object properties can change the statistics of the predictors, which makes them candidates for modeling a wide range of time series systems. String angular momentum is proposed as an additional tool, alongside historical volatility, for analyzing the stability of currency rates. To show the reliability of our approach to applying string models to time series forecasting, we present the results of real demo simulations for four currency exchange pairs.

  19. Error models for official mortality forecasts.

    PubMed

    Alho, J M; Spencer, B D

    1990-09-01

    "The Office of the Actuary, U.S. Social Security Administration, produces alternative forecasts of mortality to reflect uncertainty about the future.... In this article we identify the components and assumptions of the official forecasts and approximate them by stochastic parametric models. We estimate parameters of the models from past data, derive statistical intervals for the forecasts, and compare them with the official high-low intervals. We use the models to evaluate the forecasts rather than to develop different predictions of the future. Analysis of data from 1972 to 1985 shows that the official intervals for mortality forecasts for males or females aged 45-70 have approximately a 95% chance of including the true mortality rate in any year. For other ages the chances are much less than 95%." excerpt

  20. Effect of heating rate and kinetic model selection on activation energy of nonisothermal crystallization of amorphous felodipine.

    PubMed

    Chattoraj, Sayantan; Bhugra, Chandan; Li, Zheng Jane; Sun, Changquan Calvin

    2014-12-01

    The nonisothermal crystallization kinetics of amorphous materials is routinely analyzed by statistically fitting the crystallization data to kinetic models. In this work, we systematically evaluate how the model-dependent crystallization kinetics is impacted by variations in the heating rate and the selection of the kinetic model, two key factors that can lead to significant differences in the crystallization activation energy (Ea) of an amorphous material. Using amorphous felodipine, we show that the Ea decreases with increasing heating rate, irrespective of the kinetic model evaluated in this work. The model that best describes the crystallization phenomenon cannot be identified readily through the statistical fitting approach because several kinetic models yield comparable R² values. Here, we propose an alternate paired model-fitting model-free (PMFMF) approach for identifying the most suitable kinetic model, where Ea obtained from model-dependent kinetics is compared with those obtained from model-free kinetics. The most suitable kinetic model is identified as the one that yields Ea values comparable with the model-free kinetics. Through this PMFMF approach, nucleation and growth is identified as the main mechanism that controls the crystallization kinetics of felodipine. Using this PMFMF approach, we further demonstrate that the crystallization mechanism from the amorphous phase varies with heating rate. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.

  1. Implications of convection in the moon and the terrestrial planets

    NASA Technical Reports Server (NTRS)

    Turcotte, Donald L.

    1991-01-01

    A comprehensive review is made of the thermal and chemical evolution of the moon and the terrestrial planets. New results obtained for Venus by the Magellan mission are presented; efforts were concentrated on this planet. Alternative models were examined for the thermal structure of the lithosphere of Venus. The statistical distribution of the locations of the coronae on Venus was studied. Models were examined for the patterns of faulting around the coronae on Venus. A series of viscous models for the development and relaxation of elevation anomalies on Venus was considered. Rates of solidification of volcanic flows on Venus were also studied. Both radiative and convective heat transfer were considered.

  2. Accounting for multiple sources of uncertainty in impact assessments: The example of the BRACE study

    NASA Astrophysics Data System (ADS)

    O'Neill, B. C.

    2015-12-01

    Assessing climate change impacts often requires the use of multiple scenarios, types of models, and data sources, leading to a large number of potential sources of uncertainty. For example, a single study might require a choice of a forcing scenario, climate model, bias correction and/or downscaling method, societal development scenario, model (typically several) for quantifying elements of societal development such as economic and population growth, biophysical model (such as for crop yields or hydrology), and societal impact model (e.g. economic or health model). Some sources of uncertainty are reduced or eliminated by the framing of the question. For example, it may be useful to ask what an impact outcome would be conditional on a given societal development pathway, forcing scenario, or policy. However, many sources of uncertainty remain, and it is rare for all or even most of these sources to be accounted for. I use the example of a recent integrated project on the Benefits of Reduced Anthropogenic Climate changE (BRACE) to explore useful approaches to uncertainty across multiple components of an impact assessment. BRACE comprises 23 papers that assess the differences in impacts between two alternative climate futures: those associated with Representative Concentration Pathways (RCPs) 4.5 and 8.5. It quantifies differences in impacts in terms of extreme events, health, agriculture, tropical cyclones, and sea level rise. Methodologically, it includes climate modeling, statistical analysis, integrated assessment modeling, and sector-specific impact modeling. It employs alternative scenarios of both radiative forcing and societal development, but generally uses a single climate model (CESM), partially accounting for climate uncertainty by drawing heavily on large initial condition ensembles. Strengths and weaknesses of the approach to uncertainty in BRACE are assessed. Options under consideration for improving the approach include the use of perturbed physics ensembles of CESM, employing results from multiple climate models, and combining the results from single impact models with statistical representations of uncertainty across multiple models. A key consideration is the relationship between the question being addressed and the uncertainty approach.

  3. Hybrid modeling as a QbD/PAT tool in process development: an industrial E. coli case study.

    PubMed

    von Stosch, Moritz; Hamelink, Jan-Martijn; Oliveira, Rui

    2016-05-01

    Process understanding is emphasized in the process analytical technology initiative and the quality by design paradigm as essential for the manufacturing of biopharmaceutical products with consistently high quality. A typical approach to developing a process understanding is applying a combination of design of experiments with statistical data analysis. Hybrid semi-parametric modeling is investigated as an alternative method to pure statistical data analysis. The hybrid model framework provides flexibility to select model complexity based on available data and knowledge. Here, a parametric dynamic bioreactor model is integrated with a nonparametric artificial neural network that describes biomass and product formation rates as a function of varied fed-batch fermentation conditions for high cell density heterologous protein production with E. coli. Our model can accurately describe biomass growth and product formation across variations in induction temperature, pH and feed rates. The model indicates that while product expression rate is a function of early induction phase conditions, it is negatively impacted as productivity increases. This could correspond with physiological changes due to cytoplasmic product accumulation. Due to the dynamic nature of the model, rational process timing decisions can be made and the impact of temporal variations in process parameters on product formation and process performance can be assessed, which is central for process understanding.

  4. Cube search, revisited.

    PubMed

    Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth

    2015-03-16

    Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with "equivalent" 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. © 2015 ARVO.

  5. Cube search, revisited

    PubMed Central

    Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth

    2015-01-01

    Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with “equivalent” 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. PMID:25780063

  6. A statistical pixel intensity model for segmentation of confocal laser scanning microscopy images.

    PubMed

    Calapez, Alexandre; Rosa, Agostinho

    2010-09-01

    Confocal laser scanning microscopy (CLSM) has been widely used in the life sciences for the characterization of cell processes because it allows the recording of the distribution of fluorescence-tagged macromolecules on a section of the living cell. It is in fact the cornerstone of many molecular transport and interaction quantification techniques where the identification of regions of interest through image segmentation is usually a required step. In many situations, because of the complexity of the recorded cellular structures or because of the amounts of data involved, image segmentation either is too difficult or inefficient to be done by hand and automated segmentation procedures have to be considered. Given the nature of CLSM images, statistical segmentation methodologies appear as natural candidates. In this work we propose a model to be used for statistical unsupervised CLSM image segmentation. The model is derived from the CLSM image formation mechanics and its performance is compared to the existing alternatives. Results show that it provides a much better description of the data on classes characterized by their mean intensity, making it suitable not only for segmentation methodologies with known number of classes but also for use with schemes aiming at the estimation of the number of classes through the application of cluster selection criteria.

  7. 77 FR 74103 - Alternatives to the Use of Credit Ratings

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-13

    ... NPRM identified references made to nationally recognized statistical rating organization (NRSRO) \\3... or external credit risk assessments (including credit ratings), and default statistics. The preamble...); Default statistics (i.e., whether providers of credit information relating to securities express a view...

  8. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
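
    The two analyses described above can be sketched as follows with made-up counts (scipy assumed): a chi-square contingency test of medicinal versus non-medicinal species across families, followed by per-family binomial tests against the flora-wide proportion. The family names and numbers are illustrative only, not the Shuar data.

    ```python
    import numpy as np
    from scipy.stats import chi2_contingency, binomtest

    families = ["Asteraceae", "Fabaceae", "Rubiaceae", "Poaceae"]
    medicinal = np.array([60, 45, 15, 5])        # hypothetical medicinal-species counts
    total     = np.array([300, 280, 180, 240])   # hypothetical family sizes

    # Whole-flora test: medicinal vs non-medicinal counts per family
    table = np.vstack([medicinal, total - medicinal])
    chi2, p, dof, _ = chi2_contingency(table)
    print(f"contingency test: chi2={chi2:.1f}, df={dof}, p={p:.3g}")

    # Per-family binomial tests against the overall proportion of medicinal species
    p0 = medicinal.sum() / total.sum()
    for fam, k, n in zip(families, medicinal, total):
        res = binomtest(int(k), int(n), p0)
        print(f"{fam:12s} observed={k/n:.3f} expected={p0:.3f} p={res.pvalue:.3g}")
    ```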

  9. Power Law Patch Scaling and Lack of Characteristic Wavelength Suggest "Scale-Free" Processes Drive Pattern Formation in the Florida Everglades

    NASA Astrophysics Data System (ADS)

    Kaplan, D. A.; Casey, S. T.; Cohen, M. J.; Acharya, S.; Jawitz, J. W.

    2016-12-01

    A century of hydrologic modification has altered the physical and biological drivers of landscape processes in the Everglades (Florida, USA). Restoring the ridge-slough patterned landscape, a dominant feature of the historical system, is a priority, but requires an understanding of pattern genesis and degradation mechanisms. Physical experiments to evaluate alternative pattern formation mechanisms are limited by the long time scales of peat accumulation and loss, necessitating model-based comparisons, where support for a particular mechanism is based on model replication of extant patterning and trajectories of degradation. However, multiple mechanisms yield patch elongation in the direction of historical flow (a central feature of ridge-slough patterning), limiting the utility of that characteristic for discriminating among alternatives. Using data from vegetation maps, we investigated the statistical features of ridge-slough spatial patterning (ridge density, patch perimeter, elongation, patch-size distributions, and spatial periodicity) to establish more rigorous criteria for evaluating model performance and to inform controls on pattern variation across the contemporary system. Two independent analyses (2-D periodograms and patch size distributions) provide strong evidence against regular patterning, with the landscape exhibiting neither a characteristic wavelength nor a characteristic patch size, both of which are expected under conditions that produce regular patterns. Rather, landscape properties suggest robust scale-free patterning, indicating genesis from the coupled effects of local facilitation and a global negative feedback operating uniformly at the landscape-scale. This finding challenges widespread invocation of scale-dependent negative feedbacks for explaining ridge-slough pattern origins. These results help discern among genesis mechanisms and provide an improved statistical description of the landscape that can be used to compare among model outputs, as well as to assess the success of future restoration projects.

  10. Alternate Forms of the State-Trait Anxiety Inventory.

    ERIC Educational Resources Information Center

    Devito, Anthony J.; Kubis, Joseph F.

    1983-01-01

    Alternate forms of the state anxiety (A-State) and trait anxiety (A-Trait) scales of the State-Trait Anxiety Inventory (STAI) were constructed by dividing the 20 items of each scale into two briefer forms having 10 items each. The alternate forms and item statistics are presented. (Author/BW)

  11. Data mining of tree-based models to analyze freeway accident frequency.

    PubMed

    Chang, Li-Yen; Chen, Wen-Chieh

    2005-01-01

    Statistical models, such as Poisson or negative binomial regression models, have been employed to analyze vehicle accident frequency for many years. However, these models have their own model assumptions and pre-defined underlying relationship between dependent and independent variables. If these assumptions are violated, the model could lead to erroneous estimation of accident likelihood. Classification and Regression Tree (CART), one of the most widely applied data mining techniques, has been commonly employed in business administration, industry, and engineering. CART does not require any pre-defined underlying relationship between target (dependent) variable and predictors (independent variables) and has been shown to be a powerful tool, particularly for dealing with prediction and classification problems. This study collected the 2001-2002 accident data of National Freeway 1 in Taiwan. A CART model and a negative binomial regression model were developed to establish the empirical relationship between traffic accidents and highway geometric variables, traffic characteristics, and environmental factors. The CART findings indicated that the average daily traffic volume and precipitation variables were the key determinants for freeway accident frequencies. By comparing the prediction performance between the CART and the negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies.
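
    A rough sketch of the comparison described above, using synthetic accident counts rather than the Taiwanese freeway data: a regression tree (CART-style, via scikit-learn) versus a negative binomial regression (via statsmodels). Variable names, coefficients, and the in-sample error comparison are invented for illustration only.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(1)
    n = 500
    adt = rng.uniform(10, 120, n)          # average daily traffic (thousands), synthetic
    precip = rng.uniform(0, 300, n)        # precipitation (mm), synthetic
    mu = np.exp(0.4 + 0.015 * adt + 0.003 * precip)
    y = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))   # overdispersed accident counts

    X = np.column_stack([adt, precip])

    # CART-style regression tree
    tree = DecisionTreeRegressor(max_depth=4, min_samples_leaf=20).fit(X, y)

    # Negative binomial regression with an intercept
    nb = sm.NegativeBinomial(y, sm.add_constant(X)).fit(disp=False, maxiter=200)

    mae_tree = np.mean(np.abs(y - tree.predict(X)))
    mae_nb = np.mean(np.abs(y - nb.predict(sm.add_constant(X))))
    print(f"CART MAE={mae_tree:.2f}  NegBin MAE={mae_nb:.2f}")
    ```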

  12. Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation.

    PubMed

    Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram

    2018-05-01

    DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
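
    A minimal sketch of the two alternatives discussed above, using simulated heteroscedastic methylation-versus-age data and a single predictor (statsmodels assumed); the published models use multiple CpG sites and differ in detail.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n = 300
    age = rng.uniform(18, 80, n)
    meth = 0.2 + 0.006 * age + rng.normal(0, 0.01 + 0.0008 * age, n)  # noise grows with age
    df = pd.DataFrame({"age": age, "meth": meth})

    # OLS first; its squared residuals are used to model the error variance for WLS
    ols = smf.ols("age ~ meth", data=df).fit()
    var_fit = smf.ols("resid2 ~ meth", data=df.assign(resid2=ols.resid ** 2)).fit()
    weights = 1.0 / np.clip(var_fit.fittedvalues, 1e-6, None)
    wls = sm.WLS(df["age"], sm.add_constant(df["meth"]), weights=weights).fit()

    # Quantile regression: the 2.5%, 50% and 97.5% quantiles give an
    # age-dependent prediction interval without any normality assumption
    q_lo = smf.quantreg("age ~ meth", df).fit(q=0.025)
    q_md = smf.quantreg("age ~ meth", df).fit(q=0.5)
    q_hi = smf.quantreg("age ~ meth", df).fit(q=0.975)

    new = pd.DataFrame({"meth": [0.45]})
    print("WLS point estimate :", wls.predict([[1.0, 0.45]])[0])
    print("quantile median    :", q_md.predict(new).iloc[0])
    print("quantile 95% range :", q_lo.predict(new).iloc[0], "-", q_hi.predict(new).iloc[0])
    ```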

  13. A UNIFIED FRAMEWORK FOR VARIANCE COMPONENT ESTIMATION WITH SUMMARY STATISTICS IN GENOME-WIDE ASSOCIATION STUDIES.

    PubMed

    Zhou, Xiang

    2017-12-01

    Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC), into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while remaining capable of producing estimates that can be almost as accurate as if both quantities are computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, and makes use of summary statistics, while it is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
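
    The sketch below is not the MQS estimator itself; it only illustrates, on simulated individual-level data, the method-of-moments idea behind classical Haseman-Elston regression that MQS builds on: with standardized phenotypes, E[y_i y_j] = h² K_ij for i ≠ j, so regressing phenotype cross-products on the genetic relatedness matrix estimates heritability. Sample sizes and effect sizes are arbitrary, and HE estimates of this kind are noisy.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n, p, h2_true = 1000, 2000, 0.5

    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # synthetic genotypes
    Z = (G - G.mean(0)) / G.std(0)                        # standardized genotypes
    K = Z @ Z.T / p                                       # genetic relatedness matrix

    beta = rng.normal(0, np.sqrt(h2_true / p), p)
    y = Z @ beta + rng.normal(0, np.sqrt(1 - h2_true), n)
    y = (y - y.mean()) / y.std()

    iu = np.triu_indices(n, k=1)                          # off-diagonal pairs only
    cross = (y[:, None] * y[None, :])[iu]
    k_off = K[iu]

    h2_hat = np.sum(k_off * cross) / np.sum(k_off ** 2)   # regression slope through the origin
    print(f"true h2 = {h2_true}, HE estimate = {h2_hat:.3f}")
    ```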

  14. Simulation study to determine the impact of different design features on design efficiency in discrete choice experiments

    PubMed Central

    Vanniyasingam, Thuva; Cunningham, Charles E; Foster, Gary; Thabane, Lehana

    2016-01-01

    Objectives: Discrete choice experiments (DCEs) are routinely used to elicit patient preferences to improve health outcomes and healthcare services. While many fractional factorial designs can be created, some are more statistically optimal than others. The objective of this simulation study was to investigate how varying the number of (1) attributes, (2) levels within attributes, (3) alternatives and (4) choice tasks per survey will improve or compromise the statistical efficiency of an experimental design. Design and methods: A total of 3204 DCE designs were created to assess how relative design efficiency (d-efficiency) is influenced by varying the number of choice tasks (2–20), alternatives (2–5), attributes (2–20) and attribute levels (2–5) of a design. Choice tasks were created by randomly allocating attribute and attribute level combinations into alternatives. Outcome: Relative d-efficiency was used to measure the optimality of each DCE design. Results: DCE design complexity influenced statistical efficiency. Across all designs, relative d-efficiency decreased as the number of attributes and attribute levels increased. It increased for designs with more alternatives. Lastly, relative d-efficiency converges as the number of choice tasks increases, where convergence may not be at 100% statistical optimality. Conclusions: Achieving 100% d-efficiency is heavily dependent on the number of attributes, attribute levels, choice tasks and alternatives. Further exploration of overlaps and block sizes is needed. This study's results are widely applicable for researchers interested in creating optimal DCE designs to elicit individual preferences on health services, programmes, policies and products. PMID:27436671
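
    As a simplified stand-in for the relative d-efficiency used above (which for choice designs is normally based on the multinomial logit information matrix), the sketch below computes one common linear-design measure, 100 * |X'X|^(1/p) / N, for a toy full factorial, a well-chosen half fraction, and a poor fraction. The designs and the coding are illustrative assumptions, not the study's designs.

    ```python
    import numpy as np
    from itertools import product

    def d_efficiency(X):
        """Relative D-efficiency (%) of a coded design matrix: 100 * |X'X|^(1/p) / N."""
        n, p = X.shape
        det = np.linalg.det(X.T @ X)
        return 100 * det ** (1.0 / p) / n if det > 1e-9 else 0.0

    def with_intercept(A):
        return np.column_stack([np.ones(len(A)), A])

    levels = np.array(list(product([1, -1], repeat=3)), dtype=float)  # 2^3, effects coded

    print("full factorial       :", d_efficiency(with_intercept(levels)))                # 100
    print("regular half fraction:", d_efficiency(with_intercept(levels[[0, 3, 5, 6]])))   # 100
    print("poor fraction        :", d_efficiency(with_intercept(levels[:4])))             # 0 (aliased)
    ```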

  15. Comparison of Response Surface and Kriging Models in the Multidisciplinary Design of an Aerospike Nozzle

    NASA Technical Reports Server (NTRS)

    Simpson, Timothy W.

    1998-01-01

    The use of response surface models and kriging models are compared for approximating non-random, deterministic computer analyses. After discussing the traditional response surface approach for constructing polynomial models for approximation, kriging is presented as an alternative statistical-based approximation method for the design and analysis of computer experiments. Both approximation methods are applied to the multidisciplinary design and analysis of an aerospike nozzle which consists of a computational fluid dynamics model and a finite element analysis model. Error analysis of the response surface and kriging models is performed along with a graphical comparison of the approximations. Four optimization problems are formulated and solved using both approximation models. While neither approximation technique consistently outperforms the other in this example, the kriging models using only a constant for the underlying global model and a Gaussian correlation function perform as well as the second order polynomial response surface models.
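
    A compact sketch of the two approximation strategies compared above, fitted to a simple deterministic test function rather than the aerospike analyses (scikit-learn assumed): a second-order polynomial response surface and a Gaussian-process (kriging-style) surrogate with an RBF correlation function.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    def f(X):                                   # deterministic "computer experiment"
        return np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2

    rng = np.random.default_rng(4)
    X_train = rng.uniform(-1, 1, size=(25, 2))
    X_test = rng.uniform(-1, 1, size=(200, 2))
    y_train, y_test = f(X_train), f(X_test)

    # second-order polynomial response surface
    rsm = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X_train, y_train)

    # kriging-style surrogate: constant global model plus Gaussian (RBF) correlation
    gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True).fit(X_train, y_train)

    for name, model in [("response surface", rsm), ("kriging", gp)]:
        rmse = np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2))
        print(f"{name:17s} RMSE = {rmse:.4f}")
    ```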

  16. Efficient Generation and Selection of Virtual Populations in Quantitative Systems Pharmacology Models.

    PubMed

    Allen, R J; Rieger, T R; Musante, C J

    2016-03-01

    Quantitative systems pharmacology models mechanistically describe a biological system and the effect of drug treatment on system behavior. Because these models rarely are identifiable from the available data, the uncertainty in physiological parameters may be sampled to create alternative parameterizations of the model, sometimes termed "virtual patients." In order to reproduce the statistics of a clinical population, virtual patients are often weighted to form a virtual population that reflects the baseline characteristics of the clinical cohort. Here we introduce a novel technique to efficiently generate virtual patients and, from this ensemble, demonstrate how to select a virtual population that matches the observed data without the need for weighting. This approach improves confidence in model predictions by mitigating the risk that spurious virtual patients become overrepresented in virtual populations.

  17. Constraints on holographic cosmologies from strong lensing systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cárdenas, Víctor H.; Bonilla, Alexander; Motta, Verónica

    We use strongly gravitationally lensed (SGL) systems to put additional constraints on a set of holographic dark energy models. Data available in the literature (redshift and velocity dispersion) are used to obtain the Einstein radius and compare it with model predictions. We found that the ΛCDM is the best fit to the data. Although a preliminary statistical analysis seems to indicate that two of the holographic models studied show interesting agreement with observations, a stringent test led us to the result that neither of the holographic models is competitive with the ΛCDM. These results highlight the importance of strong lensing measurements to provide additional observational constraints on alternative cosmological models, which are necessary to shed some light on the dark universe.

  18. Mediation Analysis with Survival Outcomes: Accelerated Failure Time vs. Proportional Hazards Models

    PubMed Central

    Gelfand, Lois A.; MacKinnon, David P.; DeRubeis, Robert J.; Baraldi, Amanda N.

    2016-01-01

    Objective: Survival time is an important type of outcome variable in treatment research. Currently, limited guidance is available regarding performing mediation analyses with survival outcomes, which generally do not have normally distributed errors, and contain unobserved (censored) events. We present considerations for choosing an approach, using a comparison of semi-parametric proportional hazards (PH) and fully parametric accelerated failure time (AFT) approaches for illustration. Method: We compare PH and AFT models and procedures in their integration into mediation models and review their ability to produce coefficients that estimate causal effects. Using simulation studies modeling Weibull-distributed survival times, we compare statistical properties of mediation analyses incorporating PH and AFT approaches (employing SAS procedures PHREG and LIFEREG, respectively) under varied data conditions, some including censoring. A simulated data set illustrates the findings. Results: AFT models integrate more easily than PH models into mediation models. Furthermore, mediation analyses incorporating LIFEREG produce coefficients that can estimate causal effects, and demonstrate superior statistical properties. Censoring introduces bias in the coefficient estimate representing the treatment effect on outcome—underestimation in LIFEREG, and overestimation in PHREG. With LIFEREG, this bias can be addressed using an alternative estimate obtained from combining other coefficients, whereas this is not possible with PHREG. Conclusions: When Weibull assumptions are not violated, there are compelling advantages to using LIFEREG over PHREG for mediation analyses involving survival-time outcomes. Irrespective of the procedures used, the interpretation of coefficients, effects of censoring on coefficient estimates, and statistical properties should be taken into account when reporting results. PMID:27065906
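
    A deliberately simplified sketch of the AFT-based mediation idea discussed above: with Weibull (log-linear) survival times and, for simplicity, no censoring, log(T) can be regressed directly and the mediated effect taken as the product of the treatment-to-mediator and mediator-to-outcome coefficients. The SAS LIFEREG/PHREG comparison in the paper handles censoring, which this toy example omits; all data and coefficients are simulated.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 2000
    tx = rng.integers(0, 2, n)                       # randomized treatment
    m = 0.5 * tx + rng.normal(0, 1, n)               # mediator (true a-path = 0.5)
    scale = np.exp(1.0 + 0.4 * m + 0.2 * tx)         # AFT: covariates rescale survival time
    T = scale * rng.weibull(a=1.5, size=n)           # Weibull survival times, no censoring

    # a-path: treatment -> mediator
    a_path = sm.OLS(m, sm.add_constant(tx)).fit().params[1]

    # outcome model on log(T): columns are [const, tx, m]
    out = sm.OLS(np.log(T), sm.add_constant(np.column_stack([tx, m]))).fit()
    b_path, direct = out.params[2], out.params[1]

    print(f"a = {a_path:.3f}, b = {b_path:.3f}, mediated effect a*b = {a_path * b_path:.3f}")
    print(f"direct effect = {direct:.3f}")
    ```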

  19. PyEvolve: a toolkit for statistical modelling of molecular evolution.

    PubMed

    Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A

    2004-01-05

    Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences - ignoring the biological significance of sequence differences. A suite of sophisticated likelihood based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpG's, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real world performance for parameter rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter rich likelihood functions solvable within hours on multi-cpu hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.

  20. Predicting network modules of cell cycle regulators using relative protein abundance statistics.

    PubMed

    Oguz, Cihan; Watson, Layne T; Baumann, William T; Tyson, John J

    2017-02-28

    Parameter estimation in systems biology is typically done by enforcing experimental observations through an objective function as the parameter space of a model is explored by numerical simulations. Past studies have shown that one usually finds a set of "feasible" parameter vectors that fit the available experimental data equally well, and that these alternative vectors can make different predictions under novel experimental conditions. In this study, we characterize the feasible region of a complex model of the budding yeast cell cycle under a large set of discrete experimental constraints in order to test whether the statistical features of relative protein abundance predictions are influenced by the topology of the cell cycle regulatory network. Using differential evolution, we generate an ensemble of feasible parameter vectors that reproduce the phenotypes (viable or inviable) of wild-type yeast cells and 110 mutant strains. We use this ensemble to predict the phenotypes of 129 mutant strains for which experimental data is not available. We identify 86 novel mutants that are predicted to be viable and then rank the cell cycle proteins in terms of their contributions to cumulative variability of relative protein abundance predictions. Proteins involved in "regulation of cell size" and "regulation of G1/S transition" contribute most to predictive variability, whereas proteins involved in "positive regulation of transcription involved in exit from mitosis," "mitotic spindle assembly checkpoint" and "negative regulation of cyclin-dependent protein kinase by cyclin degradation" contribute the least. These results suggest that the statistics of these predictions may be generating patterns specific to individual network modules (START, S/G2/M, and EXIT). To test this hypothesis, we develop random forest models for predicting the network modules of cell cycle regulators using relative abundance statistics as model inputs. Predictive performance is assessed by the areas under receiver operating characteristics curves (AUC). Our models generate an AUC range of 0.83-0.87 as opposed to randomized models with AUC values around 0.50. By using differential evolution and random forest modeling, we show that the model prediction statistics generate distinct network module-specific patterns within the cell cycle network.
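
    A small sketch of the final modeling step described above: predicting a protein's network module from summary statistics of its relative-abundance predictions and scoring by ROC AUC (scikit-learn assumed). The features and labels are synthetic stand-ins, not the study's ensemble statistics; the shuffled-label run illustrates the ~0.5 baseline of a randomized model.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(6)
    n_proteins = 120
    # e.g. mean, coefficient of variation, and skewness of predicted relative abundance
    features = rng.normal(size=(n_proteins, 3))
    in_module = (features[:, 1] + 0.5 * rng.normal(size=n_proteins) > 0).astype(int)

    rf = RandomForestClassifier(n_estimators=500, random_state=0)
    prob = cross_val_predict(rf, features, in_module, cv=5, method="predict_proba")[:, 1]
    print(f"cross-validated AUC   = {roc_auc_score(in_module, prob):.2f}")

    shuffled = rng.permutation(in_module)
    prob0 = cross_val_predict(rf, features, shuffled, cv=5, method="predict_proba")[:, 1]
    print(f"randomized-label AUC  = {roc_auc_score(shuffled, prob0):.2f}")
    ```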

  1. Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics

    PubMed Central

    Dybowski, Richard; McKinley, Trevelyan J.; Mastroeni, Pietro; Restif, Olivier

    2013-01-01

    Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered. PMID:24376528
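
    A minimal nested-sampling sketch in the spirit of the method described above: evidence (marginal likelihood) estimation for a one-parameter Gaussian model with a uniform prior, replacing the worst live point via a short constrained random walk. Real applications use specialised samplers; the data and prior bounds here are arbitrary, and the analytic evidence for this conjugate toy problem is printed as a check.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    data = rng.normal(2.0, 1.0, size=50)       # synthetic observations
    lo, hi = -10.0, 10.0                       # uniform prior on the mean

    def loglike(mu):
        return -0.5 * np.sum((data - mu) ** 2) - 0.5 * len(data) * np.log(2 * np.pi)

    n_live, n_iter = 200, 2000
    live = rng.uniform(lo, hi, n_live)
    live_logl = np.array([loglike(m) for m in live])

    log_z = -np.inf
    for i in range(n_iter):
        worst = int(np.argmin(live_logl))
        log_w = np.log(np.exp(-i / n_live) - np.exp(-(i + 1) / n_live))  # prior shell width
        log_z = np.logaddexp(log_z, live_logl[worst] + log_w)

        # replace the worst point: constrained random walk from another live point
        threshold = live_logl[worst]
        start = rng.integers(n_live)
        while start == worst:
            start = rng.integers(n_live)
        cur, cur_logl = live[start], live_logl[start]
        step = max(np.std(live), 1e-3)
        for _ in range(25):
            prop = cur + step * rng.normal()
            if lo <= prop <= hi and loglike(prop) > threshold:
                cur, cur_logl = prop, loglike(prop)
        live[worst], live_logl[worst] = cur, cur_logl

    # contribution of the remaining live points (remaining prior volume exp(-n_iter/n_live))
    log_z = np.logaddexp(
        log_z,
        live_logl.max() + np.log(np.mean(np.exp(live_logl - live_logl.max()))) - n_iter / n_live,
    )

    # analytic evidence for this conjugate toy problem, as a sanity check
    S = np.sum((data - data.mean()) ** 2)
    log_z_true = (-0.5 * len(data) * np.log(2 * np.pi) - 0.5 * S
                  + 0.5 * np.log(2 * np.pi / len(data)) - np.log(hi - lo))
    print(f"nested sampling log Z = {log_z:.2f}, analytic log Z = {log_z_true:.2f}")
    ```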

  2. Comparing hierarchical models via the marginalized deviance information criterion.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2018-07-20

    Hierarchical models are extensively used in pharmacokinetics and longitudinal studies. When the estimation is performed from a Bayesian approach, model comparison is often based on the deviance information criterion (DIC). In hierarchical models with latent variables, there are several versions of this statistic: the conditional DIC (cDIC) that incorporates the latent variables in the focus of the analysis and the marginalized DIC (mDIC) that integrates them out. Regardless of the asymptotic and coherency difficulties of cDIC, this alternative is usually used in Markov chain Monte Carlo (MCMC) methods for hierarchical models because of practical convenience. The mDIC criterion is more appropriate in most cases but requires integration of the likelihood, which is computationally demanding and not implemented in Bayesian software. Therefore, we consider a method to compute mDIC by generating replicate samples of the latent variables that need to be integrated out. This alternative can be easily conducted from the MCMC output of Bayesian packages and is widely applicable to hierarchical models in general. Additionally, we propose some approximations in order to reduce the computational complexity for large-sample situations. The method is illustrated with simulated data sets and 2 medical studies, evidencing that cDIC may be misleading whilst mDIC appears pertinent. Copyright © 2018 John Wiley & Sons, Ltd.
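
    A sketch of the replicate-sampling idea described above for a random-intercept normal model: given draws of the hyperparameters, the group-level latent effects are integrated out by Monte Carlo to obtain the marginal likelihood entering mDIC. The "posterior draws" below are placeholders generated for illustration; in practice they would come from the MCMC output of the fitted Bayesian model.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(8)

    def log_marg_lik(y, group, mu, tau, sigma, n_rep=200):
        """Monte Carlo estimate of sum_g log p(y_g | mu, tau, sigma),
        integrating the group intercepts out by simulating replicates."""
        total = 0.0
        for g in np.unique(group):
            yg = y[group == g]
            theta = rng.normal(mu, tau, size=n_rep)                  # replicate latent effects
            logp = norm.logpdf(yg[:, None], loc=theta[None, :], scale=sigma).sum(axis=0)
            total += np.logaddexp.reduce(logp) - np.log(n_rep)       # log of the mean likelihood
        return total

    # toy data and placeholder "posterior" draws of the hyperparameters
    group = np.repeat(np.arange(10), 8)
    y = rng.normal(1.0 + rng.normal(0, 0.7, 10)[group], 0.5)
    draws = {"mu": rng.normal(1.0, 0.1, 400),
             "tau": np.abs(rng.normal(0.7, 0.1, 400)),
             "sigma": np.abs(rng.normal(0.5, 0.05, 400))}

    ll = np.array([log_marg_lik(y, group, m, t, s)
                   for m, t, s in zip(draws["mu"], draws["tau"], draws["sigma"])])
    ll_at_mean = log_marg_lik(y, group, draws["mu"].mean(),
                              draws["tau"].mean(), draws["sigma"].mean())

    mdic = -4 * ll.mean() + 2 * ll_at_mean   # DIC = Dbar + pD, with pD = Dbar - D(theta_bar)
    print(f"marginalized DIC ~ {mdic:.1f}")
    ```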

  3. 40 CFR 1065.12 - Approval of alternate procedures.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... engine meets all applicable emission standards according to specified procedures. (iii) Use statistical.... (e) We may give you specific directions regarding methods for statistical analysis, or we may approve... statistical tests. Perform the tests as follows: (1) Repeat measurements for all applicable duty cycles at...

  4. Normality of raw data in general linear models: The most widespread myth in statistics

    USGS Publications Warehouse

    Kery, Marc; Hatfield, Jeff S.

    2003-01-01

    In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
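
    A quick numerical illustration of the point above (numpy and scipy assumed): two groups with well-separated means give a clearly bimodal, non-normal response, yet the residuals from the group-means (ANOVA) model are well behaved.

    ```python
    import numpy as np
    from scipy.stats import shapiro

    rng = np.random.default_rng(9)
    g1 = rng.normal(0, 1, 200)      # group A
    g2 = rng.normal(8, 1, 200)      # group B, far away
    y = np.concatenate([g1, g2])    # raw response: bimodal
    resid = np.concatenate([g1 - g1.mean(), g2 - g2.mean()])  # residuals from the group-means model

    print("Shapiro-Wilk p, raw data  :", shapiro(y).pvalue)       # tiny -> looks "non-normal"
    print("Shapiro-Wilk p, residuals :", shapiro(resid).pvalue)   # large -> assumption satisfied
    ```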

  5. It's not that Difficult: An Interrater Reliability Study of the DSM–5 Section III Alternative Model for Personality Disorders

    DOE PAGES

    Garcia, Darren J.; Skadberg, Rebecca M.; Schmidt, Megan; ...

    2018-03-05

    The Diagnostic and Statistical Manual of Mental Disorders (5th ed. [DSM–5]; American Psychiatric Association, 2013) Section III Alternative Model for Personality Disorders (AMPD) represents a novel approach to the diagnosis of personality disorder (PD). In this model, PD diagnosis requires evaluation of level of impairment in personality functioning (Criterion A) and characterization by pathological traits (Criterion B). Questions about clinical utility, complexity, and difficulty in learning and using the AMPD have been expressed in recent scholarly literature. We examined the learnability, interrater reliability, and clinical utility of the AMPD using a vignette methodology and graduate student raters. Results showed that student clinicians can learn Criterion A of the AMPD to a high level of interrater reliability and agreement with expert ratings. Interrater reliability of the 25 trait facets of the AMPD varied but showed overall acceptable levels of agreement. Examination of severity indexes of PD impairment showed the level of personality functioning (LPF) added information beyond that of global assessment of functioning (GAF). Clinical utility ratings were generally strong. Lastly, the satisfactory interrater reliability of components of the AMPD indicates the model, including the LPF, is very learnable.

  6. Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences

    PubMed Central

    Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric

    2016-01-01

    Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566

  7. It's not that Difficult: An Interrater Reliability Study of the DSM–5 Section III Alternative Model for Personality Disorders

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia, Darren J.; Skadberg, Rebecca M.; Schmidt, Megan

    The Diagnostic and Statistical Manual of Mental Disorders (5th ed. [DSM–5]; American Psychiatric Association, 2013) Section III Alternative Model for Personality Disorders (AMPD) represents a novel approach to the diagnosis of personality disorder (PD). In this model, PD diagnosis requires evaluation of level of impairment in personality functioning (Criterion A) and characterization by pathological traits (Criterion B). Questions about clinical utility, complexity, and difficulty in learning and using the AMPD have been expressed in recent scholarly literature. We examined the learnability, interrater reliability, and clinical utility of the AMPD using a vignette methodology and graduate student raters. Results showed that student clinicians can learn Criterion A of the AMPD to a high level of interrater reliability and agreement with expert ratings. Interrater reliability of the 25 trait facets of the AMPD varied but showed overall acceptable levels of agreement. Examination of severity indexes of PD impairment showed the level of personality functioning (LPF) added information beyond that of global assessment of functioning (GAF). Clinical utility ratings were generally strong. Lastly, the satisfactory interrater reliability of components of the AMPD indicates the model, including the LPF, is very learnable.

  8. New Casemix Classification as an Alternative Method for Budget Allocation in Thai Oral Healthcare Service: A Pilot Study

    PubMed Central

    Wisaijohn, Thunthita; Pimkhaokham, Atiphan; Lapying, Phenkhae; Itthichaisri, Chumpot; Pannarunothai, Supasit; Igarashi, Isao; Kawabuchi, Koichi

    2010-01-01

    This study aimed to develop a new casemix classification system as an alternative method for the budget allocation of oral healthcare service (OHCS). Initially, the International Statistical Classification of Diseases and Related Health Problems, 10th Revision, Thai Modification (ICD-10-TM) codes related to OHCS were used for developing the software “Grouper”. This model was designed to allow the translation of dental procedures into eight-digit codes. Multiple regression analysis was used to analyze the relationship between the factors used for developing the model and the resource consumption. Furthermore, the coefficient of variance, reduction in variance, and relative weight (RW) were applied to test the validity. The results demonstrated that 1,624 OHCS classifications, according to the diagnoses and the procedures performed, showed high homogeneity within groups and heterogeneity between groups. Moreover, the RW of the OHCS could be used to predict and control the production costs. In conclusion, this new OHCS casemix classification has potential use in global decision making. PMID:20936134

  9. New casemix classification as an alternative method for budget allocation in thai oral healthcare service: a pilot study.

    PubMed

    Wisaijohn, Thunthita; Pimkhaokham, Atiphan; Lapying, Phenkhae; Itthichaisri, Chumpot; Pannarunothai, Supasit; Igarashi, Isao; Kawabuchi, Koichi

    2010-01-01

    This study aimed to develop a new casemix classification system as an alternative method for the budget allocation of oral healthcare service (OHCS). Initially, the International Statistical Classification of Diseases and Related Health Problems, 10th Revision, Thai Modification (ICD-10-TM) codes related to OHCS were used for developing the software "Grouper". This model was designed to allow the translation of dental procedures into eight-digit codes. Multiple regression analysis was used to analyze the relationship between the factors used for developing the model and the resource consumption. Furthermore, the coefficient of variance, reduction in variance, and relative weight (RW) were applied to test the validity. The results demonstrated that 1,624 OHCS classifications, according to the diagnoses and the procedures performed, showed high homogeneity within groups and heterogeneity between groups. Moreover, the RW of the OHCS could be used to predict and control the production costs. In conclusion, this new OHCS casemix classification has potential use in global decision making.

  10. The role of identity in the DSM-5 classification of personality disorders.

    PubMed

    Schmeck, Klaus; Schlüter-Müller, Susanne; Foelsch, Pamela A; Doering, Stephan

    2013-07-31

    In the revised Diagnostic and Statistical Manual (DSM-5), the definition of personality disorder diagnoses has not been changed from that in the DSM-IV-TR. However, an alternative model for diagnosing personality disorders, in which the construct "identity" is integrated as a central diagnostic criterion, has been placed in Section III of the manual. The alternative model's hybrid nature leads to the simultaneous use of diagnoses and the newly developed "Level of Personality Functioning" scale (a dimensional tool to define the severity of the disorder). Pathological personality traits are assessed in five broad domains, which are divided into 25 trait facets. With this dimensional approach, the new classification system gives both clinicians and researchers the opportunity to describe the patient in much more detail than was previously possible. The relevance of identity problems in assessing and understanding personality pathology is illustrated by applying the new classification system in two case examples of adolescents with a severe personality disorder.

  11. Activation of Methane by FeO+: Determining Reaction Pathways through Temperature-Dependent Kinetics and Statistical Modeling (Postprint)

    DTIC Science & Technology

    2014-02-25

    benchmarks for the reaction surface. There is significant interest in procuring and employing natural gas as a viable alternative to ... petroleum for both energy and chemical feedstocks. One of the primary impediments to natural gas utilization is that methane (∼90% of natural gas) ... is significant, which typically limits its use to areas where large natural gas deposits are in very close proximity, neglecting the many smaller

  12. Model averaging in the presence of structural uncertainty about treatment effects: influence on treatment decision and expected value of information.

    PubMed

    Price, Malcolm J; Welton, Nicky J; Briggs, Andrew H; Ades, A E

    2011-01-01

    Standard approaches to estimation of Markov models with data from randomized controlled trials tend either to make a judgment about which transition(s) treatments act on, or to assume that treatment has a separate effect on every transition. An alternative is to fit a series of models, each assuming that treatment acts on specific transitions. Investigators can then choose among alternative models using goodness-of-fit statistics. However, structural uncertainty about any chosen parameterization will remain, and this may have implications for the resulting decision and the need for further research. We describe a Bayesian approach to model estimation and model selection. Structural uncertainty about which parameterization to use is accounted for using model averaging, and we developed a formula for calculating the expected value of perfect information (EVPI) in averaged models. Marginal posterior distributions are generated for each of the cost-effectiveness parameters using Markov chain Monte Carlo simulation in WinBUGS, or Monte Carlo simulation in Excel (Microsoft Corp., Redmond, WA). We illustrate the approach with an example of treatments for asthma using aggregate-level data from a connected network of four treatments compared in three pair-wise randomized controlled trials. The standard errors of incremental net benefit using the structured models are reduced by up to eight- or ninefold compared to the unstructured models, and the expected loss attached to decision uncertainty by factors of several hundred. Model averaging had considerable influence on the EVPI. Alternative structural assumptions can alter the treatment decision and have an overwhelming effect on model uncertainty and expected value of information. Structural uncertainty can be accounted for by model averaging, and the EVPI can be calculated for averaged models. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  13. The clinical inadequacy of the DSM-5 classification of somatic symptom and related disorders: an alternative trans-diagnostic model.

    PubMed

    Cosci, Fiammetta; Fava, Giovanni A

    2016-08-01

    The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) somatic symptom and related disorders chapter has limited clinical utility. In addition to the problems that the single diagnostic rubrics and the deletion of the diagnosis of hypochondriasis entail, there are 2 major ambiguities: (1) the use of the term "somatic symptoms" reflects an ill-defined concept of somatization and (2) abnormal illness behavior is included in all diagnostic rubrics, but it is never conceptually defined. In the present review of the literature, we attempt to approach the clinical issue from a different angle, by introducing the trans-diagnostic viewpoint of illness behavior, and propose an alternative clinimetric classification system based on the Diagnostic Criteria for Psychosomatic Research.

  14. TAPAS: tools to assist the targeted protein quantification of human alternative splice variants.

    PubMed

    Yang, Jae-Seong; Sabidó, Eduard; Serrano, Luis; Kiel, Christina

    2014-10-15

    In proteomes of higher eukaryotes, many alternative splice variants can only be detected by their shared peptides. This makes it highly challenging to use peptide-centric mass spectrometry to distinguish and to quantify protein isoforms resulting from alternative splicing events. We have developed two complementary algorithms based on linear mathematical models to efficiently compute a minimal set of shared and unique peptides needed to quantify a set of isoforms and splice variants. Further, we developed a statistical method to estimate the splice variant abundances based on stable isotope labeled peptide quantities. The algorithms and databases are integrated in a web-based tool, and we have experimentally tested the limits of our quantification method using spiked proteins and cell extracts. The TAPAS server is available at URL http://davinci.crg.es/tapas/. luis.serrano@crg.eu or christina.kiel@crg.eu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
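    The estimation step described above can be illustrated with a small linear-model sketch: given a peptide-to-isoform mapping matrix and measured quantities of shared and unique peptides, isoform abundances can be recovered by non-negative least squares. This is only an illustration of the general idea, not the TAPAS algorithms themselves; the mapping matrix and quantities below are invented.

```python
# Minimal sketch (not the TAPAS algorithm): estimate isoform abundances from
# peptide quantities via non-negative least squares. Mapping and quantities
# are hypothetical illustrations.
import numpy as np
from scipy.optimize import nnls

# Rows = peptides, columns = isoforms; 1 if the peptide occurs in the isoform.
# Peptides 0 and 1 are shared; peptides 2 and 3 are unique to isoforms A and B.
mapping = np.array([
    [1, 1],
    [1, 1],
    [1, 0],
    [0, 1],
], dtype=float)

# Measured peptide quantities (e.g., from stable-isotope-labeled standards).
peptide_quantities = np.array([12.1, 11.8, 7.9, 4.2])

abundances, residual = nnls(mapping, peptide_quantities)
print("estimated isoform abundances:", abundances)
print("residual norm:", residual)
```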

  15. Alternative model for a pediatric trauma center: efficient use of physician manpower at a freestanding children's hospital.

    PubMed

    Vernon, Donald D; Bolte, Robert G; Scaife, Eric; Hansen, Kristine W

    2005-01-01

    Freestanding children's hospitals may lack resources, especially surgical manpower, to meet American College of Surgeons trauma center criteria, and may organize trauma care in alternative ways. At a tertiary care children's hospital, attending trauma surgeons and anesthesiologists took out-of-hospital call and directed initial care for only the most severely injured patients, whereas pediatric emergency physicians directed care for patients with less severe injuries. Survival data were analyzed using TRISS methodology. A total of 903 trauma patients were seen by the system during the period 10/1/96-6/30/01. The median Injury Severity Score was 16, and 508 patients had an Injury Severity Score ≥15. There were 83 deaths, 21 unexpected survivors, and 13 unexpected deaths. TRISS analysis showed a z-score of 4.39 and a W-statistic of 3.07. Mortality outcome from trauma in a pediatric hospital using this alternative approach to trauma care was significantly better than predicted by TRISS methodology.

  16. Beyond statistical inference: A decision theory for science

    PubMed Central

    KILLEEN, PETER R.

    2008-01-01

    Traditional null hypothesis significance testing does not yield the probability of the null or its alternative and, therefore, cannot logically ground scientific decisions. The decision theory proposed here calculates the expected utility of an effect on the basis of (1) the probability of replicating it and (2) a utility function on its size. It takes significance tests—which place all value on the replicability of an effect and none on its magnitude—as a special case, one in which the cost of a false positive is revealed to be an order of magnitude greater than the value of a true positive. More realistic utility functions credit both replicability and effect size, integrating them for a single index of merit. The analysis incorporates opportunity cost and is consistent with alternate measures of effect size, such as r2 and information transmission, and with Bayesian model selection criteria. An alternate formulation is functionally equivalent to the formal theory, transparent, and easy to compute. PMID:17201351

  17. Beyond statistical inference: a decision theory for science.

    PubMed

    Killeen, Peter R

    2006-08-01

    Traditional null hypothesis significance testing does not yield the probability of the null or its alternative and, therefore, cannot logically ground scientific decisions. The decision theory proposed here calculates the expected utility of an effect on the basis of (1) the probability of replicating it and (2) a utility function on its size. It takes significance tests--which place all value on the replicability of an effect and none on its magnitude--as a special case, one in which the cost of a false positive is revealed to be an order of magnitude greater than the value of a true positive. More realistic utility functions credit both replicability and effect size, integrating them for a single index of merit. The analysis incorporates opportunity cost and is consistent with alternate measures of effect size, such as r2 and information transmission, and with Bayesian model selection criteria. An alternate formulation is functionally equivalent to the formal theory, transparent, and easy to compute.

  18. Estimating extreme river discharges in Europe through a Bayesian network

    NASA Astrophysics Data System (ADS)

    Paprotny, Dominik; Morales-Nápoles, Oswaldo

    2017-06-01

    Large-scale hydrological modelling of flood hazards requires adequate extreme discharge data. In practice, models based on physics are applied alongside those utilizing only statistical analysis. The former require enormous computational power, while the latter are mostly limited in accuracy and spatial coverage. In this paper we introduce an alternative statistical approach based on Bayesian networks (BNs), a graphical model for dependent random variables. We use a non-parametric BN to describe the joint distribution of extreme discharges in European rivers and variables representing the geographical characteristics of their catchments. Annual maxima of daily discharges from more than 1800 river gauges (stations with catchment areas ranging from 1.4 to 807,000 km2) were collected, together with information on terrain, land use and local climate. The (conditional) correlations between the variables are modelled through copulas, with the dependency structure defined in the network. The results show that using this method, mean annual maxima and return periods of discharges could be estimated with an accuracy similar to existing studies using physical models for Europe and better than a comparable global statistical model. Performance of the model varies slightly between regions of Europe, but is consistent between different time periods, and remains the same in a split-sample validation. Though discharge prediction under climate change is not the main scope of this paper, the BN was applied, as an example, to a large domain covering rivers of all sizes in the continent for both present and future climate. Results show substantial variation in the influence of climate change on river discharges. The model can be used to provide quick estimates of extreme discharges at any location for the purpose of obtaining input information for hydraulic modelling.

  19. Cost-effectiveness analysis of mammography and clinical breast examination strategies

    PubMed Central

    Ahern, Charlotte Hsieh; Shen, Yu

    2009-01-01

    Purpose Breast cancer screening by mammography and clinical breast exam are commonly used for early tumor detection. Previous cost-effectiveness studies considered mammography alone or did not account for all relevant costs. In this study, we assessed the cost-effectiveness of screening schedules recommended by three major cancer organizations and compared them with alternative strategies. We considered costs of screening examinations, subsequent work-up, biopsy, and treatment interventions after diagnosis. Methods We used a microsimulation model to generate women’s life histories, and assessed screening and treatment impacts on survival. Using statistical models, we accounted for age-specific incidence, preclinical disease duration, and age-specific sensitivity and specificity for each screening modality. The outcomes of interest were quality-adjusted life years (QALYs) saved and total costs with a 3% annual discount rate. Incremental cost-effectiveness ratios were used to compare strategies. Sensitivity analyses were performed by varying some of the assumptions. Results Compared to guidelines from the National Cancer Institute and the U.S. Preventive Services Task Force, alternative strategies were more efficient. Mammography and clinical breast exam in alternating years from ages 40 to 79 was a cost-effective alternative compared to the guidelines, costing $35,500 per QALY saved compared with no screening. The American Cancer Society guideline was the most effective and the most expensive, costing over $680,000 for an added QALY compared to the above alternative. Conclusion Screening strategies with lower costs and benefits comparable to those currently recommended should be considered for implementation in practice and for future guidelines. PMID:19258473

  20. Statistical Analysis of Large Simulated Yield Datasets for Studying Climate Effects

    NASA Technical Reports Server (NTRS)

    Makowski, David; Asseng, Senthold; Ewert, Frank; Bassu, Simona; Durand, Jean-Louis; Martre, Pierre; Adam, Myriam; Aggarwal, Pramod K.; Angulo, Carlos; Baron, Chritian

    2015-01-01

    Many studies have been carried out during the last decade to study the effect of climate change on crop yields and other key crop characteristics. In these studies, one or several crop models were used to simulate crop growth and development for different climate scenarios that correspond to different projections of atmospheric CO2 concentration, temperature, and rainfall changes (Semenov et al., 1996; Tubiello and Ewert, 2002; White et al., 2011). The Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al., 2013) builds on these studies with the goal of using an ensemble of multiple crop models in order to assess effects of climate change scenarios for several crops in contrasting environments. These studies generate large datasets, including thousands of simulated crop yield data. They include series of yield values obtained by combining several crop models with different climate scenarios that are defined by several climatic variables (temperature, CO2, rainfall, etc.). Such datasets potentially provide useful information on the possible effects of different climate change scenarios on crop yields. However, it is sometimes difficult to analyze these datasets and to summarize them in a useful way due to their structural complexity; simulated yield data can differ among contrasting climate scenarios, sites, and crop models. Another issue is that it is not straightforward to extrapolate the results obtained for the scenarios to alternative climate change scenarios not initially included in the simulation protocols. Additional dynamic crop model simulations for new climate change scenarios are an option but this approach is costly, especially when a large number of crop models are used to generate the simulated data, as in AgMIP. Statistical models have been used to analyze responses of measured yield data to climate variables in past studies (Lobell et al., 2011), but the use of a statistical model to analyze yields simulated by complex process-based crop models is a rather new idea. We demonstrate herewith that statistical methods can play an important role in analyzing simulated yield data sets obtained from the ensembles of process-based crop models. Formal statistical analysis is helpful to estimate the effects of different climatic variables on yield, and to describe the between-model variability of these effects.
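    As an illustration of the kind of statistical analysis advocated here, the sketch below fits a simple regression emulator to synthetic "simulated yield" data as a function of climatic variables and then extrapolates to a scenario not in the original design. The data, coefficients, and variable names are invented placeholders, not AgMIP outputs.

```python
# Illustrative sketch: fit a simple statistical emulator to simulated yield data
# as a function of climate variables (temperature change, CO2, rainfall change).
import numpy as np

rng = np.random.default_rng(0)
n = 500
dT = rng.uniform(0, 5, n)            # temperature change (deg C)
co2 = rng.uniform(360, 720, n)       # CO2 concentration (ppm)
dP = rng.uniform(-30, 30, n)         # rainfall change (%)
yield_sim = 6.0 - 0.35 * dT + 0.004 * (co2 - 360) + 0.02 * dP + rng.normal(0, 0.4, n)

# Emulator: linear response surface with a temperature-by-CO2 interaction term.
X = np.column_stack([np.ones(n), dT, co2 - 360, dP, dT * (co2 - 360)])
coef, *_ = np.linalg.lstsq(X, yield_sim, rcond=None)
print("estimated effects [intercept, dT, CO2, dP, dT*CO2]:", np.round(coef, 4))

# Predict yield for a scenario not in the original protocol (+2.5 C, 550 ppm, -10% rainfall).
x_new = np.array([1.0, 2.5, 550 - 360, -10.0, 2.5 * (550 - 360)])
print("predicted yield:", float(x_new @ coef))
```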

  1. Much ado about two: reconsidering retransformation and the two-part model in health econometrics.

    PubMed

    Mullahy, J

    1998-06-01

    In health economics applications involving outcomes (y) and covariates (x), it is often the case that the central inferential problems of interest involve E[y|x] and its associated partial effects or elasticities. Many such outcomes have two fundamental statistical properties: y ≥ 0; and the outcome y = 0 is observed with sufficient frequency that the zeros cannot be ignored econometrically. This paper (1) describes circumstances where the standard two-part model with homoskedastic retransformation will fail to provide consistent inferences about important policy parameters; and (2) demonstrates some alternative approaches that are likely to prove helpful in applications.
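    A minimal sketch of the standard two-part setup discussed above, using synthetic data: a logit model for Pr(y > 0 | x) and a Gamma GLM with a log link for E[y | y > 0, x], combined as E[y | x] = Pr(y > 0 | x) * E[y | y > 0, x]. This illustrates one of the alternative approaches in this literature rather than the paper's specific estimators; the data are simulated, and the statsmodels link-class spelling (links.Log) assumes a recent statsmodels version.

```python
# Two-part model sketch on synthetic data: model E[y|x] directly on the cost
# scale via a logit part and a Gamma GLM with log link, avoiding homoskedastic
# retransformation of log-scale residuals. Illustration only.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
p_use = 1 / (1 + np.exp(-(-0.5 + 0.8 * x)))          # probability of any spending
any_use = rng.binomial(1, p_use)
mu_pos = np.exp(6.0 + 0.3 * x)                        # mean spending among users
y = any_use * rng.gamma(shape=2.0, scale=mu_pos / 2.0)
df = pd.DataFrame({"y": y, "x": x, "d": (y > 0).astype(int)})

part1 = smf.logit("d ~ x", data=df).fit(disp=False)
part2 = smf.glm("y ~ x", data=df[df.y > 0],
                family=sm.families.Gamma(link=sm.families.links.Log())).fit()

def e_y(xval):
    """E[y|x] = Pr(y>0|x) * E[y|y>0, x]."""
    new = pd.DataFrame({"x": [xval]})
    return part1.predict(new).iloc[0] * part2.predict(new).iloc[0]

print("E[y|x=0]:", round(e_y(0.0), 2))
print("dE[y|x]/dx at 0 (finite difference):", round((e_y(0.01) - e_y(-0.01)) / 0.02, 2))
```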

  2. Experimental model of traumatic ulcer in the cheek mucosa of rats.

    PubMed

    Cavalcante, Galyléia Meneses; Sousa de Paula, Renata Janaína; Souza, Leonardo Peres de; Sousa, Fabrício Bitu; Mota, Mário Rogério Lima; Alves, Ana Paula Negreiros Nunes

    2011-06-01

    To establish an experimental model of traumatic ulcer in rat cheek mucosa for use in future alternative therapy studies. A total of 60 adult male rats (250-300 g) were used. Ulceration of the left cheek mucosa was provoked by abrasion using a nº 15 scalpel blade. The animals were observed for 10 days, during which they were weighed and their ulcers were measured. The histological characteristics were analyzed and scored according to the ulcer phase. In the statistical analysis, a value of p<0.01 was considered statistically significant in all cases. During the first five days, the animals lost weight (Student t test, p<0.01). The ulcerated area receded linearly over time and was almost completely cicatrized after 10 days (ANOVA, trend post-test, p<0.0001). Groups on days 1, 2, and 3 displayed similar results, but a decrease in scores was observed after the 4th day. The proposed cheek mucosa ulcer model in rats can be considered an efficient, low-cost, reliable, and reproducible method.

  3. Nonlocal transport in the presence of transport barriers

    NASA Astrophysics Data System (ADS)

    Del-Castillo-Negrete, D.

    2013-10-01

    There is experimental, numerical, and theoretical evidence that transport in plasmas can, under certain circumstances, depart from the standard local, diffusive description. Examples include fast pulse propagation phenomena in perturbative experiments, non-diffusive scaling in L-mode plasmas, and non-Gaussian statistics of fluctuations. From the theoretical perspective, non-diffusive transport descriptions follow from the relaxation of the restrictive assumptions (locality, scale separation, and Gaussian/Markovian statistics) at the foundation of diffusive models. We discuss an alternative class of models able to capture some of the observed non-diffusive transport phenomenology. The models are based on a class of nonlocal, integro-differential operators that provide a unifying framework to describe non-Fickian scale-free transport and non-Markovian (memory) effects. We study the interplay between nonlocality and internal transport barriers (ITBs) in perturbative transport, including cold edge pulses and power modulation. Of particular interest is the nonlocal "tunnelling" of perturbations through ITBs. Also, flux-gradient diagrams are discussed as diagnostics to detect nonlocal transport processes in numerical simulations and experiments. Work supported by the US Department of Energy.

  4. Comparison of RF spectrum prediction methods for dynamic spectrum access

    NASA Astrophysics Data System (ADS)

    Kovarskiy, Jacob A.; Martone, Anthony F.; Gallagher, Kyle A.; Sherbondy, Kelly D.; Narayanan, Ram M.

    2017-05-01

    Dynamic spectrum access (DSA) refers to the adaptive utilization of today's busy electromagnetic spectrum. Cognitive radio/radar technologies require DSA to intelligently transmit and receive information in changing environments. Predicting radio frequency (RF) activity reduces sensing time and energy consumption for identifying usable spectrum. Typical spectrum prediction methods involve modeling spectral statistics with Hidden Markov Models (HMM) or various neural network structures. HMMs describe the time-varying state probabilities of Markov processes as a dynamic Bayesian network. Neural Networks model biological brain neuron connections to perform a wide range of complex and often non-linear computations. This work compares HMM, Multilayer Perceptron (MLP), and Recurrent Neural Network (RNN) algorithms and their ability to perform RF channel state prediction. Monte Carlo simulations on both measured and simulated spectrum data evaluate the performance of these algorithms. Generalizing spectrum occupancy as an alternating renewal process allows Poisson random variables to generate simulated data while energy detection determines the occupancy state of measured RF spectrum data for testing. The results suggest that neural networks achieve better prediction accuracy and prove more adaptable to changing spectral statistics than HMMs given sufficient training data.
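    The simulation setup described above can be sketched as follows: channel occupancy is generated as an alternating renewal process with exponential busy/idle durations, and a simple first-order Markov predictor is fitted as a baseline. The HMM, MLP, and RNN comparison itself is not reproduced here; all parameter values are illustrative assumptions.

```python
# Sketch: simulate a single channel's occupancy as an alternating renewal process
# (exponential busy/idle durations) and evaluate a first-order Markov predictor.
import numpy as np

rng = np.random.default_rng(7)

def simulate_occupancy(n_slots, mean_busy=5.0, mean_idle=8.0):
    """Return a binary occupancy sequence sampled at unit time slots."""
    seq, state = [], 0
    while len(seq) < n_slots:
        dur = rng.exponential(mean_busy if state else mean_idle)
        seq.extend([state] * max(1, int(round(dur))))
        state = 1 - state
    return np.array(seq[:n_slots])

occ = simulate_occupancy(20_000)
train, test = occ[:15_000], occ[15_000:]

# Estimate transition probabilities P(next = busy | current = s) from training data.
p_next_busy = np.array([train[1:][train[:-1] == s].mean() for s in (0, 1)])

# Predict the next slot as the more likely state given the current one.
pred = (p_next_busy[test[:-1]] > 0.5).astype(int)
accuracy = (pred == test[1:]).mean()
print("P(busy | idle), P(busy | busy):", np.round(p_next_busy, 3))
print("one-step prediction accuracy:", round(accuracy, 3))
```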

  5. Statistical physics of media processes: Mediaphysics

    NASA Astrophysics Data System (ADS)

    Kuznetsov, Dmitri V.; Mandel, Igor

    2007-04-01

    The processes of mass communication in complicated social or sociobiological systems such as marketing, economics, politics, animal populations, etc., treated as the subject of a special scientific subbranch, "mediaphysics", are considered in relation to sociophysics. A new statistical physics approach to analyzing these phenomena is proposed. A keystone of the approach is an analysis of the population distribution between two or many alternatives: brands, political affiliations, or opinions. Relative distances between the state of a "person's mind" and the alternatives are measures of the propensity to buy (to affiliate, or to hold a certain opinion). The distribution of the population by those relative distances is time dependent and affected by external (economic, social, marketing, natural) and internal (influential propagation of opinions, "word of mouth", etc.) factors, considered as fields. Specifically, the interaction and opinion-influence field can be generalized to incorporate important elements of Ising-spin-based sociophysical models and of kinetic-equation ones. The distributions were described by a Schrödinger-type equation in terms of Green's functions. The developed approach has been applied to a real mass-media efficiency problem for a large company and generally demonstrated very good results despite low initial correlations of the factors and the target variable.

  6. An evaluation of selected in silico models for the assessment ...

    EPA Pesticide Factsheets

    Skin sensitization remains an important endpoint for consumers, manufacturers and regulators. Although the development of alternative approaches to assess skin sensitization potential has been extremely active over many years, the implementation of regulations such as REACH and the Cosmetics Directive in the EU has provided a much stronger impetus to turn this research into practical tools for decision making. Thus there has been considerable focus on the development, evaluation, and integration of alternative approaches for skin sensitization hazard and risk assessment. This includes in silico approaches such as (Q)SARs and expert systems. This study aimed to evaluate the predictive performance of a selection of in silico models and then to explore whether combining those models led to an improvement in accuracy. A dataset of 473 substances that had been tested in the local lymph node assay (LLNA) was compiled, comprising 295 sensitizers and 178 non-sensitizers. Four freely available models were identified: two statistical models, VEGA and the MultiCASE model A33 for skin sensitization (MCASE A33) from the Danish National Food Institute, and two mechanistic models, Toxtree's skin sensitization reaction domains (Toxtree SS Rxn domains) and the OASIS v1.3 protein binding alerts for skin sensitization from the OECD Toolbox (OASIS). VEGA and MCASE A33 aim to predict sensitization as a binary score, whereas the mechanistic models identify reaction domains or structural alerts.

  7. Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques

    NASA Astrophysics Data System (ADS)

    Mishra, D.; Goyal, P.

    2014-12-01

    Urban air pollution forecasting has emerged as an acute problem in recent years because of the severe environmental degradation caused by increasing levels of harmful air pollutants in the ambient atmosphere. In this study, different statistical as well as artificial intelligence techniques are used for forecasting and analysis of air pollution over the Delhi urban area. These techniques are principal component analysis (PCA), multiple linear regression (MLR) and artificial neural networks (ANN), and the forecasts are in good agreement with the concentrations observed by the Central Pollution Control Board (CPCB) at different locations in Delhi. However, such methods suffer from disadvantages: they provide limited accuracy because they are unable to predict extreme values, i.e., pollution maxima and minima cannot be determined using such approaches, and they are relatively inefficient for producing better forecasts. With the advancement in technology and research, an alternative to these traditional methods has been proposed, namely the coupling of statistical techniques with artificial intelligence (AI) for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used for forecasting of air pollutants over the Delhi urban area. The statistical measures, e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA), of the proposed model are in better agreement than those of all the other models. Hence, the coupling of statistical and artificial intelligence techniques can be used for the forecasting of air pollutants over an urban area.
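    The evaluation statistics mentioned above (R, NMSE, FB, IOA) can be computed from paired observed and forecast concentrations using their standard definitions, as in the sketch below; the sample values are placeholders, and sign conventions for FB vary slightly across references.

```python
# Standard model-evaluation statistics for paired observed/predicted values.
import numpy as np

def evaluation_stats(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    r = np.corrcoef(obs, pred)[0, 1]                                       # correlation coefficient
    nmse = np.mean((obs - pred) ** 2) / (np.mean(obs) * np.mean(pred))     # normalized mean square error
    fb = 2.0 * (np.mean(obs) - np.mean(pred)) / (np.mean(obs) + np.mean(pred))  # fractional bias
    ioa = 1.0 - np.sum((pred - obs) ** 2) / np.sum(
        (np.abs(pred - np.mean(obs)) + np.abs(obs - np.mean(obs))) ** 2)   # index of agreement (Willmott)
    return {"R": r, "NMSE": nmse, "FB": fb, "IOA": ioa}

obs = [110, 95, 130, 160, 120, 90]      # hypothetical observed PM10 (ug/m3)
pred = [100, 102, 125, 150, 115, 97]    # hypothetical model forecasts
print(evaluation_stats(obs, pred))
```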

  8. Lattice QCD Thermodynamics and RHIC-BES Particle Production within Generic Nonextensive Statistics

    NASA Astrophysics Data System (ADS)

    Tawfik, Abdel Nasser

    2018-05-01

    The current status of implementing Tsallis (nonextensive) statistics in high-energy physics is briefly reviewed. The remarkably low freezeout temperature, which apparently fails to reproduce the first-principles lattice QCD thermodynamics and the measured particle ratios, etc., is discussed. The present work suggests a novel interpretation for the so-called "Tsallis temperature". It is proposed that the low Tsallis temperature is due to incomplete implementation of Tsallis algebra through exponential and logarithmic functions in high-energy particle production. Substituting Tsallis algebra into the grand-canonical partition function of the hadron resonance gas model does not seem to assure full incorporation of nonextensivity or correlations in that model. The statistics describing the phase-space volume, the number of states and the possible changes in the elementary cells should rather be modified due to the interacting correlated subsystems of which the phase space consists. Alternatively, two asymptotic properties, each associated with a scaling function, are utilized to classify a generalized entropy for such a system with a large ensemble (produced particles) and strong correlations. Both scaling exponents define equivalence classes for all interacting and noninteracting systems and unambiguously characterize any statistical system in its thermodynamic limit. We conclude that the nature of lattice QCD simulations is apparently extensive and accordingly Boltzmann-Gibbs statistics is fully fulfilled. Furthermore, we find that the ratios of various particle yields at extremely high and extremely low RHIC-BES energies are likely nonextensive but not necessarily of Tsallis type.
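    For reference, the Tsallis algebra referred to above is built on the q-deformed exponential and logarithm, whose standard definitions (recovering the ordinary functions in the limit q → 1) are:

```latex
\exp_q(x) \;=\; \bigl[\,1 + (1-q)\,x\,\bigr]_{+}^{\tfrac{1}{1-q}},
\qquad
\ln_q(x) \;=\; \frac{x^{\,1-q} - 1}{1-q},
\qquad
\lim_{q\to 1}\exp_q(x) \;=\; e^{x}.
```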

  9. Clinical and Economic Evaluation of Treatment Strategies for T1N0 Anal Canal Cancer.

    PubMed

    Deshmukh, Ashish A; Zhao, Hui; Das, Prajnan; Chiao, Elizabeth Y; You, Yi-Qian Nancy; Franzini, Luisa; Lairson, David R; Swartz, Michael D; Giordano, Sharon H; Cantor, Scott B

    2018-07-01

    A comparative assessment of treatment alternatives for T1N0 anal canal cancer has never been conducted. We compared the outcomes associated with three treatment alternatives for T1N0 anal canal cancer: chemoradiotherapy (CRT), radiotherapy (RT), and surgery or ablation techniques (surgery/ablation). This retrospective cohort study was conducted using the Surveillance, Epidemiology and End Results (SEER) registries linked with Medicare longitudinal data (SEER-Medicare database). Analysis included 190 patients who were treated for T1N0 anal canal cancer using surgery/ablation (n=44), RT (n=50), or CRT (n=96). The outcomes were reported in terms of survival and hazard ratios using Kaplan-Meier and Cox proportional hazards modeling, respectively; lifetime costs; and cost-effectiveness measured in terms of the incremental cost-effectiveness ratio, that is, the ratio of the difference in costs between two alternatives to the difference in effectiveness between the same two alternatives. There was no significant difference in the survival duration between the treatment groups as predicted by the Kaplan-Meier curves. After adjusting for patient characteristics and propensity score, the hazard ratio of death for the patients who received CRT compared with surgery/ablation was 1.742 (95% confidence interval, 0.793-3.829) and for RT was 2.170 (95% confidence interval, 0.923-5.101); however, the relationship did not reach statistical significance. Surgery/ablation resulted in lower lifetime cost than RT or CRT. The incremental cost-effectiveness ratio associated with CRT compared with surgery/ablation was $142,883 per life year gained. There was no statistically significant difference in survival among the treatment alternatives for T1N0 anal canal cancer. Given that surgery/ablation costs less than RT or CRT and might be cost-effective compared with RT and CRT, it is crucial to explore this finding further in this era of limited health care resources.
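    The incremental cost-effectiveness ratio defined above is simply the cost difference divided by the effectiveness difference between two strategies; a minimal sketch with hypothetical numbers (not the study's estimates):

```python
# Incremental cost-effectiveness ratio (ICER): incremental cost per unit of
# incremental effectiveness (e.g., per life-year gained). Values are placeholders.
def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Return the incremental cost per unit of effectiveness gained."""
    return (cost_new - cost_ref) / (effect_new - effect_ref)

# Hypothetical strategies: $40,000 more expensive, 0.5 life-years gained.
print(f"ICER: ${icer(150_000, 110_000, 5.5, 5.0):,.0f} per life-year gained")
```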

  10. Cosmology constraints from shear peak statistics in Dark Energy Survey Science Verification data

    DOE PAGES

    Kacprzak, T.; Kirk, D.; Friedrich, O.; ...

    2016-08-19

    Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0 < \mathcal{S}/\mathcal{N} < 4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s = 1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_{8}(\Omega_{\rm m}/0.3)^{0.6} = 0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal{S}/\mathcal{N} > 4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.

  11. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data-perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
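    A common way to apply the K-M estimator to left-censored concentration data, as described above, is to "flip" the values about a constant larger than the maximum so that nondetects become right-censored, and then transform the results back. The sketch below does this with the Python lifelines package rather than the S-language software described in the abstract; the concentrations and detection flags are hypothetical.

```python
# Kaplan-Meier for left-censored concentrations via the standard "flip" trick:
# subtract values from a constant larger than the maximum so nondetects become
# right-censored, fit K-M, then map the estimate back to the original scale.
import numpy as np
from lifelines import KaplanMeierFitter

# Hypothetical concentrations; detected=False means "< detection limit" (value = DL).
conc = np.array([0.5, 1.2, 0.8, 3.4, 0.5, 2.1, 1.0, 0.7, 5.6, 0.5])
detected = np.array([False, True, True, True, False, True, False, True, True, False])

flip = conc.max() + 1.0                 # any constant larger than the maximum
kmf = KaplanMeierFitter()
kmf.fit(durations=flip - conc, event_observed=detected)

# The median on the flipped scale maps back to the original concentration scale.
median_conc = flip - kmf.median_survival_time_
print("K-M estimate of the median concentration:", round(float(median_conc), 2))
```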

  12. Estimating current and future streamflow characteristics at ungaged sites, central and eastern Montana, with application to evaluating effects of climate change on fish populations

    USGS Publications Warehouse

    Sando, Roy; Chase, Katherine J.

    2017-03-23

    A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods. Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.
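    The modeling approach can be sketched in a few lines: a random forest regression is trained to predict a streamflow characteristic from basin characteristics, and accuracy is assessed on held-out sites. The features, response, and data below are synthetic placeholders, not the PRMS simulations or basin attributes used in the study.

```python
# Sketch of the general approach: random forest regression predicting a
# streamflow characteristic from basin characteristics (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
n = 179
drainage_area = rng.lognormal(5, 1.5, n)      # km^2
mean_precip = rng.uniform(250, 900, n)        # mm/yr
forest_frac = rng.uniform(0, 0.6, n)
mean_elev = rng.uniform(600, 2500, n)         # m
X = np.column_stack([drainage_area, mean_precip, forest_frac, mean_elev])

# Synthetic "mean annual flow" response with multiplicative noise.
y = 0.001 * drainage_area * mean_precip * (0.5 + forest_frac) * rng.lognormal(0, 0.3, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)

rmse = mean_squared_error(y_te, rf.predict(X_te)) ** 0.5
print("test RMSE:", round(float(rmse), 2))
print("feature importances:", np.round(rf.feature_importances_, 3))
```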

  13. Illustrating Sampling Distribution of a Statistic: Minitab Revisited

    ERIC Educational Resources Information Center

    Johnson, H. Dean; Evans, Marc A.

    2008-01-01

    Understanding the concept of the sampling distribution of a statistic is essential for the understanding of inferential procedures. Unfortunately, this topic proves to be a stumbling block for students in introductory statistics classes. In efforts to aid students in their understanding of this concept, alternatives to a lecture-based mode of…
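    The concept being taught can be demonstrated with a short simulation: draw repeated samples from a population, compute the statistic of interest for each sample, and examine the distribution of those values. A minimal sketch (any software, Minitab included, can do the same):

```python
# Simulate the sampling distribution of the sample mean from a skewed population.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10.0, size=100_000)   # a skewed population

n, reps = 30, 5_000
sample_means = np.array([
    rng.choice(population, size=n, replace=True).mean() for _ in range(reps)
])

print("population mean:", round(population.mean(), 2))
print("mean of sample means:", round(sample_means.mean(), 2))
print("SE from simulation:", round(sample_means.std(ddof=1), 3))
print("theoretical SE (sigma/sqrt(n)):", round(population.std() / np.sqrt(n), 3))
```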

  14. Approximations to the distribution of a test statistic in covariance structure analysis: A comprehensive study.

    PubMed

    Wu, Hao

    2018-05-01

    In structural equation modelling (SEM), a robust adjustment to the test statistic or to its reference distribution is needed when its null distribution deviates from a χ² distribution, which usually arises when data do not follow a multivariate normal distribution. Unfortunately, existing studies on this issue typically focus on only a few methods and neglect the majority of alternative methods in statistics. Existing simulation studies typically consider only non-normal distributions of data that either satisfy asymptotic robustness or lead to an asymptotic scaled χ² distribution. In this work we conduct a comprehensive study that involves both typical methods in SEM and less well-known methods from the statistics literature. We also propose the use of several novel non-normal data distributions that are qualitatively different from the non-normal distributions widely used in existing studies. We found that several under-studied methods give the best performance under specific conditions, but the Satorra-Bentler method remains the most viable method for most situations. © 2017 The British Psychological Society.

  15. Use of Tests of Statistical Significance and Other Analytic Choices in a School Psychology Journal: Review of Practices and Suggested Alternatives.

    ERIC Educational Resources Information Center

    Snyder, Patricia A.; Thompson, Bruce

    The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association…

  16. Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

    PubMed Central

    Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

    2009-01-01

    Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short-term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large-scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large-scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660
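    A schematic of this kind of nested random-effects decomposition, written in generic notation (the authors' exact parameterization and estimation procedure may differ), is:

```latex
x_{qpsf} \;=\; \mu_q \;+\; P_{qp} \;+\; S_{qps} \;+\; \varepsilon_{qpsf},
\qquad
P_{qp}\sim N(0,\sigma_P^2),\quad
S_{qps}\sim N(0,\sigma_S^2),\quad
\varepsilon_{qpsf}\sim N(0,\sigma_\varepsilon^2),
```

    where x_{qpsf} denotes the log-ratio measured for protein q, peptide p, scan s, and spectrum fit f, so that a protein-level expression change is judged against the combination of these variance components rather than a single pooled error term.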

  17. Entraining IDyOT: Timing in the Information Dynamics of Thinking

    PubMed Central

    Forth, Jamie; Agres, Kat; Purver, Matthew; Wiggins, Geraint A.

    2016-01-01

    We present a novel hypothetical account of entrainment in music and language, in context of the Information Dynamics of Thinking model, IDyOT. The extended model affords an alternative view of entrainment, and its companion term, pulse, from earlier accounts. The model is based on hierarchical, statistical prediction, modeling expectations of both what an event will be and when it will happen. As such, it constitutes a kind of predictive coding, with a particular novel hypothetical implementation. Here, we focus on the model's mechanism for predicting when a perceptual event will happen, given an existing sequence of past events, which may be musical or linguistic. We propose a range of tests to validate or falsify the model, at various different levels of abstraction, and argue that computational modeling in general, and this model in particular, can offer a means of providing limited but useful evidence for evolutionary hypotheses. PMID:27803682

  18. Transgenic mice as an alternative to monkeys for neurovirulence testing of live oral poliovirus vaccine: validation by a WHO collaborative study.

    PubMed Central

    Dragunsky, Eugenia; Nomura, Tatsuji; Karpinski, Kazimir; Furesz, John; Wood, David J.; Pervikov, Yuri; Abe, Shinobu; Kurata, Takeshi; Vanloocke, Olivier; Karganova, Galina; Taffs, Rolf; Heath, Alan; Ivshina, Anna; Levenbook, Inessa

    2003-01-01

    OBJECTIVE: Extensive WHO collaborative studies were performed to evaluate the suitability of transgenic mice susceptible to poliovirus (TgPVR mice, strain 21, bred and provided by the Central Institute for Experimental Animals, Japan) as an alternative to monkeys in the neurovirulence test (NVT) of oral poliovirus vaccine (OPV). METHODS: Nine laboratories participated in the collaborative study on testing neurovirulence of 94 preparations of OPV and vaccine derivatives of all three serotypes in TgPVR21 mice. FINDINGS: Statistical analysis of the data demonstrated that the TgPVR21 mouse NVT was of comparable sensitivity and reproducibility to the conventional WHO NVT in simians. A statistical model for acceptance/rejection of OPV lots in the mouse test was developed, validated, and shown to be suitable for all three vaccine types. The assessment of the transgenic mouse NVT is based on clinical evaluation of paralysed mice. Unlike the monkey NVT, histological examination of central nervous system tissue of each mouse offered no advantage over careful and detailed clinical observation. CONCLUSIONS: Based on data from the collaborative studies the WHO Expert Committee for Biological Standardization approved the mouse NVT as an alternative to the monkey test for all three OPV types and defined a standard implementation process for laboratories that wish to use the test. This represents the first successful introduction of transgenic animals into control of biologicals. PMID:12764491

  19. Parasol cell mosaics are unlikely to drive the formation of structured orientation maps in primary visual cortex.

    PubMed

    Hore, Victoria R A; Troy, John B; Eglen, Stephen J

    2012-11-01

    The receptive fields of on- and off-center parasol cell mosaics independently tile the retina to ensure efficient sampling of visual space. A recent theoretical model represented the on- and off-center mosaics by noisy hexagonal lattices of slightly different density. When the two lattices are overlaid, long-range Moiré interference patterns are generated. These Moiré interference patterns have been suggested to drive the formation of highly structured orientation maps in visual cortex. Here, we show that noisy hexagonal lattices do not capture the spatial statistics of parasol cell mosaics. An alternative model based upon local exclusion zones, termed the pairwise interaction point process (PIPP) model, generates patterns that are statistically indistinguishable from parasol cell mosaics. A key difference between the PIPP model and the hexagonal lattice model is that the PIPP model does not generate Moiré interference patterns, and hence simulated orientation maps do not show any hexagonal structure. Finally, we estimate the spatial extent of spatial correlations in parasol cell mosaics to be only 200-350 μm, far less than that required to generate Moiré interference. We conclude that parasol cell mosaics are too disordered to drive the formation of highly structured orientation maps in visual cortex.

  20. Pattern Adaptation and Normalization Reweighting.

    PubMed

    Westrick, Zachary M; Heeger, David J; Landy, Michael S

    2016-09-21

    Adaptation to an oriented stimulus changes both the gain and preferred orientation of neural responses in V1. Neurons tuned near the adapted orientation are suppressed, and their preferred orientations shift away from the adapter. We propose a model in which weights of divisive normalization are dynamically adjusted to homeostatically maintain response products between pairs of neurons. We demonstrate that this adjustment can be performed by a very simple learning rule. Simulations of this model closely match existing data from visual adaptation experiments. We consider several alternative models, including variants based on homeostatic maintenance of response correlations or covariance, as well as feedforward gain-control models with multiple layers, and we demonstrate that homeostatic maintenance of response products provides the best account of the physiological data. Adaptation is a phenomenon throughout the nervous system in which neural tuning properties change in response to changes in environmental statistics. We developed a model of adaptation that combines normalization (in which a neuron's gain is reduced by the summed responses of its neighbors) and Hebbian learning (in which synaptic strength, in this case divisive normalization, is increased by correlated firing). The model is shown to account for several properties of adaptation in primary visual cortex in response to changes in the statistics of contour orientation. Copyright © 2016 the authors.

  1. Prospective and participatory integrated assessment of agricultural systems from farm to regional scales: Comparison of three modeling approaches.

    PubMed

    Delmotte, Sylvestre; Lopez-Ridaura, Santiago; Barbier, Jean-Marc; Wery, Jacques

    2013-11-15

    Evaluating the impacts of the development of alternative agricultural systems, such as organic or low-input cropping systems, in the context of an agricultural region requires the use of specific tools and methodologies. They should allow a prospective (using scenarios), multi-scale (taking into account the field, farm and regional level), integrated (notably multicriteria) and participatory assessment, abbreviated PIAAS (for Participatory Integrated Assessment of Agricultural System). In this paper, we compare the possible contribution to PIAAS of three modeling approaches i.e. Bio-Economic Modeling (BEM), Agent-Based Modeling (ABM) and statistical Land-Use/Land Cover Change (LUCC) models. After a presentation of each approach, we analyze their advantages and drawbacks, and identify their possible complementarities for PIAAS. Statistical LUCC modeling is a suitable approach for multi-scale analysis of past changes and can be used to start discussion about the futures with stakeholders. BEM and ABM approaches have complementary features for scenarios assessment at different scales. While ABM has been widely used for participatory assessment, BEM has been rarely used satisfactorily in a participatory manner. On the basis of these results, we propose to combine these three approaches in a framework targeted to PIAAS. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Quantum-Like Bayesian Networks for Modeling Decision Making

    PubMed Central

    Moreira, Catarina; Wichert, Andreas

    2016-01-01

    In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists of replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive, in contrast to the current state-of-the-art models, which cannot be generalized to more complex decision scenarios and only provide an explanatory account of the observed paradoxes. In the end, the model that we propose is a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data sets from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
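    The key departure from a classical Bayesian network is that unobserved states are marginalized as probability amplitudes before squaring, which introduces an interference term governed by a phase parameter. A toy numeric illustration of that single idea (not the full quantum-like network or its similarity heuristic; all numbers are invented):

```python
# Toy illustration: marginalizing over an unobserved binary variable with
# probability amplitudes instead of probabilities adds an interference term
# controlled by a phase angle theta.
import numpy as np

# Classical path probabilities for an outcome through two unobserved states:
# P(state_i) * P(outcome | state_i).
p_paths = np.array([0.5 * 0.7, 0.5 * 0.6])
classical = p_paths.sum()

# Quantum-like: amplitudes are square roots of the path probabilities.
amp = np.sqrt(p_paths)
theta = 2.4                                  # phase; a free parameter to be fitted
quantum = (amp ** 2).sum() + 2 * amp[0] * amp[1] * np.cos(theta)

print("classical marginal:", round(classical, 3))
print("quantum-like marginal (before normalization):", round(quantum, 3))
```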

  3. Organism-level models: When mechanisms and statistics fail us

    NASA Astrophysics Data System (ADS)

    Phillips, M. H.; Meyer, J.; Smith, W. P.; Rockhill, J. K.

    2014-03-01

    Purpose: To describe the unique characteristics of models that represent the entire course of radiation therapy at the organism level and to highlight the uses to which such models can be put. Methods: At the level of an organism, traditional model-building runs into severe difficulties. We do not have sufficient knowledge to devise a complete biochemistry-based model. Statistical model-building fails due to the vast number of variables and the inability to control many of them in any meaningful way. Finally, building surrogate models, such as animal-based models, can result in excluding some of the most critical variables. Bayesian probabilistic models (Bayesian networks) provide a useful alternative that has the advantages of being mathematically rigorous, incorporating the knowledge that we do have, and being practical. Results: Bayesian networks representing radiation therapy pathways for prostate cancer and head & neck cancer were used to highlight the important aspects of such models and some techniques of model-building. A more specific model representing the treatment of occult lymph nodes in head & neck cancer was provided as an example of how such a model can inform clinical decisions. A model of the possible role of PET imaging in brain cancer was used to illustrate the means by which clinical trials can be modelled in order to come up with a trial design that will have meaningful outcomes. Conclusions: Probabilistic models are currently the most useful approach to representing the entire therapy outcome process.

  4. Knowledge, Attitude and Practice of General Practitioners toward Complementary and Alternative Medicine: a Cross-Sectional Study.

    PubMed

    Barikani, Ameneh; Beheshti, Akram; Javadi, Maryam; Yasi, Marzieh

    2015-08-01

    Orientation of the public and physicians toward complementary and alternative medicine (CAM) is one of the most prominent symbols of structural change in the health service system. The aim of this study was to determine the knowledge, attitude, and practice of general practitioners regarding complementary and alternative medicine. This cross-sectional study was conducted in Qazvin, Iran in 2013. A self-administered questionnaire was used for collecting data, comprising four parts: demographic information, physicians' attitude and knowledge, methods of obtaining information, and their practice. A total of 228 physicians in Qazvin comprised the study population according to the report of the deputy of treatment of Qazvin University of Medical Sciences. A total of 150 physicians were selected randomly, and the SPSS statistical program was used to enter the questionnaire data. Results were analyzed with descriptive statistics and statistical tests. Sixty percent of all responders were male. About 60 percent (59.4%) of participating practitioners had worked less than 10 years. 96.4 percent had a positive attitude towards complementary and alternative medicine. Knowledge of traditional medicine was good in 11 percent of practitioners, while 36.3% and 52.7% had average and little information, respectively. 17.9% of practitioners offered their patients complementary and alternative medicine for treatment. Although there was little knowledge among practitioners about traditional medicine and complementary approaches, a significant percentage of them had attitude scores above the lower limit.

  5. Simulation study to determine the impact of different design features on design efficiency in discrete choice experiments.

    PubMed

    Vanniyasingam, Thuva; Cunningham, Charles E; Foster, Gary; Thabane, Lehana

    2016-07-19

    Discrete choice experiments (DCEs) are routinely used to elicit patient preferences to improve health outcomes and healthcare services. While many fractional factorial designs can be created, some are more statistically optimal than others. The objective of this simulation study was to investigate how varying the number of (1) attributes, (2) levels within attributes, (3) alternatives and (4) choice tasks per survey will improve or compromise the statistical efficiency of an experimental design. A total of 3204 DCE designs were created to assess how relative design efficiency (d-efficiency) is influenced by varying the number of choice tasks (2-20), alternatives (2-5), attributes (2-20) and attribute levels (2-5) of a design. Choice tasks were created by randomly allocating attribute and attribute level combinations into alternatives. Relative d-efficiency was used to measure the optimality of each DCE design. DCE design complexity influenced statistical efficiency. Across all designs, relative d-efficiency decreased as the number of attributes and attribute levels increased. It increased for designs with more alternatives. Lastly, relative d-efficiency converges as the number of choice tasks increases, where convergence may not be at 100% statistical optimality. Achieving 100% d-efficiency is heavily dependent on the number of attributes, attribute levels, choice tasks and alternatives. Further exploration of overlaps and block sizes are needed. This study's results are widely applicable for researchers interested in creating optimal DCE designs to elicit individual preferences on health services, programmes, policies and products. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
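    One common way to compute relative D-efficiency for a coded design matrix X with N rows and p parameters is 100 * |X'X|^(1/p) / N, which equals 100% for an orthogonal, balanced design. The sketch below uses that definition with a hypothetical two-attribute, two-level coding; the paper's exact coding and software may differ.

```python
# Relative D-efficiency for a coded design matrix X with N runs and p parameters,
# using the common definition 100 * |X'X|^(1/p) / N. Example design is hypothetical.
import numpy as np

def d_efficiency(X):
    N, p = X.shape
    return 100.0 * np.linalg.det(X.T @ X) ** (1.0 / p) / N

# Full factorial in two effects-coded attributes (levels coded -1 / +1).
X = np.array([[-1, -1],
              [-1,  1],
              [ 1, -1],
              [ 1,  1]], dtype=float)
print("relative D-efficiency (%):", d_efficiency(X))   # 100 for an orthogonal, balanced design
```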

  6. Diagnosis of students' ability in a statistical course based on Rasch probabilistic outcome

    NASA Astrophysics Data System (ADS)

    Mahmud, Zamalia; Ramli, Wan Syahira Wan; Sapri, Shamsiah; Ahmad, Sanizah

    2017-06-01

    Measuring students' ability and performance is important in assessing how well students have learned and mastered the statistical courses. Any improvement in learning will depend on the students' approaches to learning, which are relevant to some factors of learning, namely assessment methods carried out through tasks consisting of quizzes, tests, assignments and a final examination. This study has attempted an alternative approach to measuring students' ability in an undergraduate statistical course based on the Rasch probabilistic model. Firstly, this study aims to explore the learning outcome patterns of students in a statistics course (Applied Probability and Statistics) based on an Entrance-Exit survey. This is followed by investigating students' perceived learning ability based on four Course Learning Outcomes (CLOs) and students' actual learning ability based on their final examination scores. Rasch analysis revealed that students perceived themselves as lacking the ability to understand about 95% of the statistics concepts at the beginning of the class, but eventually they had a good understanding at the end of the 14-week class. In terms of students' performance in their final examination, their ability in understanding the topics varies at different probability values, given the ability of the students and the difficulty of the questions. The majority found the probability and counting rules topic to be the most difficult to learn.
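    For reference, the dichotomous Rasch model underlying this kind of analysis gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty; the sketch below evaluates it for hypothetical ability and difficulty values (the study's actual estimates are not reproduced here).

```python
# Dichotomous Rasch model: P(correct) depends only on the difference between a
# person's ability (theta) and the item's difficulty (b), both on a logit scale.
import numpy as np

def rasch_prob(theta, b):
    """P(correct | ability theta, item difficulty b) under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

item_difficulties = np.array([-1.5, -0.5, 0.0, 0.8, 2.0])   # hypothetical item difficulties
for theta in (-1.0, 0.0, 1.5):
    probs = rasch_prob(theta, item_difficulties)
    print(f"ability {theta:+.1f}: expected score = {probs.sum():.2f}, item probs = {np.round(probs, 2)}")
```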

  7. Advances in the meta-analysis of heterogeneous clinical trials I: The inverse variance heterogeneity model.

    PubMed

    Doi, Suhail A R; Barendregt, Jan J; Khan, Shahjahan; Thalib, Lukman; Williams, Gail M

    2015-11-01

    This article examines an improved alternative to the random effects (RE) model for meta-analysis of heterogeneous studies. It is shown that the known issues of underestimation of the statistical error and spuriously overconfident estimates with the RE model can be resolved by the use of an estimator under the fixed effect model assumption with a quasi-likelihood based variance structure - the IVhet model. Extensive simulations confirm that this estimator retains a correct coverage probability and a lower observed variance than the RE model estimator, regardless of heterogeneity. When the proposed IVhet method is applied to the controversial meta-analysis of intravenous magnesium for the prevention of mortality after myocardial infarction, the pooled OR is 1.01 (95% CI 0.71-1.46) which not only favors the larger studies but also indicates more uncertainty around the point estimate. In comparison, under the RE model the pooled OR is 0.71 (95% CI 0.57-0.89) which, given the simulation results, reflects underestimation of the statistical error. Given the compelling evidence generated, we recommend that the IVhet model replace both the FE and RE models. To facilitate this, it has been implemented into free meta-analysis software called MetaXL which can be downloaded from www.epigear.com. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Acoustic Analogy and Alternative Theories for Jet Noise Prediction

    NASA Technical Reports Server (NTRS)

    Morris, Philip J.; Farassat, F.

    2002-01-01

    Several methods for the prediction of jet noise are described. All but one of the noise prediction schemes are based on Lighthill's or Lilley's acoustic analogy, whereas the other is the jet noise generation model recently proposed by Tam and Auriault. In all of the approaches, some assumptions must be made concerning the statistical properties of the turbulent sources. In each case the characteristic scales of the turbulence are obtained from a solution of the Reynolds-averaged Navier-Stokes equation using a k-epsilon turbulence model. It is shown that, for the same level of empiricism, Tam and Auriault's model yields better agreement with experimental noise measurements than the acoustic analogy. It is then shown that this result is not because of some fundamental flaw in the acoustic analogy approach, but instead is associated with the assumptions made in the approximation of the turbulent source statistics. If consistent assumptions are made, both the acoustic analogy and Tam and Auriault's model yield identical noise predictions. In conclusion, a proposal is presented for an acoustic analogy that provides a clearer identification of the equivalent source mechanisms, along with a discussion of noise prediction issues that remain to be resolved.

  9. The Acoustic Analogy and Alternative Theories for Jet Noise Prediction

    NASA Technical Reports Server (NTRS)

    Morris, Philip J.; Farassat, F.

    2002-01-01

    This paper describes several methods for the prediction of jet noise. All but one of the noise prediction schemes are based on Lighthill's or Lilley's acoustic analogy while the other is the jet noise generation model recently proposed by Tam and Auriault. In all the approaches some assumptions must be made concerning the statistical properties of the turbulent sources. In each case the characteristic scales of the turbulence are obtained from a solution of the Reynolds-averaged Navier-Stokes equation using a k-epsilon turbulence model. It is shown that, for the same level of empiricism, Tam and Auriault's model yields better agreement with experimental noise measurements than the acoustic analogy. It is then shown that this result is not because of some fundamental flaw in the acoustic analogy approach but is associated with the assumptions made in the approximation of the turbulent source statistics. If consistent assumptions are made, both the acoustic analogy and Tam and Auriault's model yield identical noise predictions. The paper concludes with a proposal for an acoustic analogy that provides a clearer identification of the equivalent source mechanisms and a discussion of noise prediction issues that remain to be resolved.

  11. A brief introduction to computer-intensive methods, with a view towards applications in spatial statistics and stereology.

    PubMed

    Mattfeldt, Torsten

    2011-04-01

    Computer-intensive methods may be defined as data analytical procedures involving a huge number of highly repetitive computations. We mention resampling methods with replacement (bootstrap methods), resampling methods without replacement (randomization tests) and simulation methods. The resampling methods are based on simple and robust principles and are largely free from distributional assumptions. Bootstrap methods may be used to compute confidence intervals for a scalar model parameter and for summary statistics from replicated planar point patterns, and for significance tests. For some simple models of planar point processes, point patterns can be simulated by elementary Monte Carlo methods. The simulation of models with more complex interaction properties usually requires more advanced computing methods. In this context, we mention simulation of Gibbs processes with Markov chain Monte Carlo methods using the Metropolis-Hastings algorithm. An alternative to simulations on the basis of a parametric model consists of stochastic reconstruction methods. The basic ideas behind the methods are briefly reviewed and illustrated by simple worked examples in order to encourage novices in the field to use computer-intensive methods. © 2010 The Authors Journal of Microscopy © 2010 Royal Microscopical Society.
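
    As a minimal illustration of the bootstrap idea discussed above (not the article's worked examples), the sketch below computes a percentile bootstrap confidence interval for a sample mean.

```python
# Percentile bootstrap confidence interval for a scalar statistic (the mean),
# on a simulated skewed sample. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50)   # illustrative skewed sample

n_boot = 10_000
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()   # resample with replacement
    for _ in range(n_boot)
])
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {data.mean():.2f}, 95% percentile bootstrap CI = ({lower:.2f}, {upper:.2f})")
```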

  12. Quantification of downscaled precipitation uncertainties via Bayesian inference

    NASA Astrophysics Data System (ADS)

    Nury, A. H.; Sharma, A.; Marshall, L. A.

    2017-12-01

    Prediction of precipitation from global climate model (GCM) outputs remains critical to decision-making in water-stressed regions. In this regard, downscaling of GCM output has been a useful tool for analysing future hydro-climatological states. Several downscaling approaches have been developed for precipitation, including dynamical and statistical downscaling methods. Frequently, outputs from dynamical downscaling are not readily transferable across regions because of significant methodological and computational difficulties. Statistical downscaling approaches provide a flexible and efficient alternative, producing hydro-climatological outputs across multiple temporal and spatial scales in many locations. However, these approaches are subject to significant uncertainty, arising from uncertainty in the downscaled model parameters and from the use of different reanalysis products for inferring appropriate model parameters. Consequently, these uncertainties affect simulation performance at the catchment scale. This study develops a Bayesian framework for modelling downscaled daily precipitation from GCM outputs and characterises downscaling uncertainties by evaluating reanalysis datasets against observed rainfall data over Australia. In this research, a consistent technique for quantifying downscaling uncertainties by means of a Bayesian downscaling framework is proposed. The results suggest that there are differences in downscaled precipitation occurrences and extremes.

  13. Applying the compound Poisson process model to the reporting of injury-related mortality rates.

    PubMed

    Kegler, Scott R

    2007-02-16

    Injury-related mortality rate estimates are often analyzed under the assumption that case counts follow a Poisson distribution. Certain types of injury incidents occasionally involve multiple fatalities, however, resulting in dependencies between cases that are not reflected in the simple Poisson model and which can affect even basic statistical analyses. This paper explores the compound Poisson process model as an alternative, emphasizing adjustments to some commonly used interval estimators for population-based rates and rate ratios. The adjusted estimators involve relatively simple closed-form computations, which in the absence of multiple-case incidents reduce to familiar estimators based on the simpler Poisson model. Summary data from the National Violent Death Reporting System are referenced in several examples demonstrating application of the proposed methodology.
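
    The sketch below illustrates the kind of closed-form adjustment described, under the assumption that the variance of the total case count is estimated by the sum of squared incident sizes (the compound Poisson result Var = lambda * E[X^2]). It is not necessarily the paper's exact estimator, and the incident sizes are hypothetical.

```python
# Rate and Wald-type interval under a simple Poisson model versus a
# compound Poisson adjustment that allows multiple cases per incident.
# With all single-case incidents the two variances coincide.
import math

cases_per_incident = [1, 1, 1, 4, 1, 2, 1, 1, 3, 1]   # hypothetical incident sizes
population = 250_000

total_cases = sum(cases_per_incident)
rate = total_cases / population

var_poisson  = total_cases / population**2                             # simple Poisson
var_compound = sum(x * x for x in cases_per_incident) / population**2  # compound Poisson

for label, var in [("Poisson", var_poisson), ("compound Poisson", var_compound)]:
    half_width = 1.96 * math.sqrt(var)
    print(f"{label}: rate per 100k = {1e5 * rate:.2f} +/- {1e5 * half_width:.2f}")
```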

  14. Improvements in sub-grid, microphysics averages using quadrature based approaches

    NASA Astrophysics Data System (ADS)

    Chowdhary, K.; Debusschere, B.; Larson, V. E.

    2013-12-01

    Sub-grid variability in microphysical processes plays a critical role in atmospheric climate models. In order to account for this sub-grid variability, Larson and Schanen (2013) propose placing a probability density function on the sub-grid cloud microphysics quantities, e.g. autoconversion rate, essentially interpreting the cloud microphysics quantities as a random variable in each grid box. Random sampling techniques, e.g. Monte Carlo and Latin Hypercube, can be used to calculate statistics, e.g. averages, on the microphysics quantities, which then feed back into the model dynamics on the coarse scale. We propose an alternate approach using numerical quadrature methods based on deterministic sampling points to compute the statistical moments of microphysics quantities in each grid box. We have performed a preliminary test on the Kessler autoconversion formula, and, upon comparison with Latin Hypercube sampling, our approach shows an increased level of accuracy with a reduction in sample size by almost two orders of magnitude. Application to other microphysics processes is the subject of ongoing research.
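
    A minimal sketch of the quadrature idea, using a generic threshold-type nonlinearity rather than the Kessler formula used in the study: Gauss-Hermite quadrature with a handful of deterministic nodes approximates the sub-grid average that random sampling estimates with a much larger sample.

```python
# Compare Gauss-Hermite quadrature with Monte Carlo for the sub-grid average
# E[f(q)], with q ~ Normal(mu, sigma) and an illustrative autoconversion-like
# threshold nonlinearity (not the Kessler formula itself).
import numpy as np

def autoconversion(q):
    """Illustrative threshold-type nonlinearity, f(q) = max(q - q_crit, 0)."""
    return np.maximum(q - 1.0, 0.0)

mu, sigma = 1.2, 0.5

# Deterministic quadrature: E[f(X)] = (1/sqrt(pi)) * sum_i w_i f(mu + sqrt(2)*sigma*x_i)
nodes, weights = np.polynomial.hermite.hermgauss(8)
quad_mean = np.sum(weights * autoconversion(mu + np.sqrt(2.0) * sigma * nodes)) / np.sqrt(np.pi)

# Random sampling needs far more evaluations for a comparable estimate.
rng = np.random.default_rng(1)
mc_mean = autoconversion(rng.normal(mu, sigma, size=100_000)).mean()

print(f"quadrature (8 nodes):       {quad_mean:.4f}")
print(f"Monte Carlo (1e5 samples):  {mc_mean:.4f}")
```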

  15. An Electrochemical Impedance Spectroscopy-Based Technique to Identify and Quantify Fermentable Sugars in Pineapple Waste Valorization for Bioethanol Production

    PubMed Central

    Conesa, Claudia; García-Breijo, Eduardo; Loeff, Edwin; Seguí, Lucía; Fito, Pedro; Laguarda-Miró, Nicolás

    2015-01-01

    Electrochemical Impedance Spectroscopy (EIS) has been used to develop a methodology able to identify and quantify fermentable sugars present in the enzymatic hydrolysis phase of second-generation bioethanol production from pineapple waste. Thus, a low-cost, non-destructive system consisting of a stainless-steel double-needle electrode connected to electronic equipment that implements EIS was developed. In order to validate the system, different concentrations of glucose, fructose and sucrose were added to the pineapple waste and analyzed both individually and in combination. Next, statistical data treatment enabled the design of specific Artificial Neural Network-based mathematical models for each of the studied sugars and their respective combinations. The obtained prediction models are robust and reliable, and they are considered statistically valid (CCR% > 93.443%). These results allow us to introduce this EIS-based technique as an easy, fast, non-destructive, and in-situ alternative to traditional laboratory methods for enzymatic hydrolysis monitoring. PMID:26378537

  16. Introduction to statistical modelling 2: categorical variables and interactions in linear regression.

    PubMed

    Lunt, Mark

    2015-07-01

    In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing the hypothesis that the outcome differs between groups to be tested. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing whether the effect of a given predictor differs between groups, namely measuring the effect in each group separately and checking whether the statistical significance differs between the groups, is shown to be misleading. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
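
    A minimal sketch of the approach described, using simulated data rather than the article's examples: the categorical variable enters through dummy coding and the interaction term provides a single test of whether the slope differs between groups, rather than comparing per-group significance.

```python
# Linear regression with a categorical variable and an interaction term,
# fitted to simulated data (not the article's data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(20, 70, n),
    "group": rng.choice(["control", "treated"], n),
})
# Simulate an outcome whose slope on age differs between the two groups.
slope = np.where(df["group"] == "treated", 0.5, 0.2)
df["outcome"] = 10 + slope * df["age"] + rng.normal(0, 2, n)

# C(group) adds dummy coding; the age:C(group) interaction tests whether the
# age effect differs between groups in a single model.
model = smf.ols("outcome ~ age * C(group)", data=df).fit()
print(model.summary())
```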

  17. Extracting multistage screening rules from online dating activity data.

    PubMed

    Bruch, Elizabeth; Feinberg, Fred; Lee, Kee Yeun

    2016-09-20

    This paper presents a statistical framework for harnessing online activity data to better understand how people make decisions. Building on insights from cognitive science and decision theory, we develop a discrete choice model that allows for exploratory behavior and multiple stages of decision making, with different rules enacted at each stage. Critically, the approach can identify if and when people invoke noncompensatory screeners that eliminate large swaths of alternatives from detailed consideration. The model is estimated using deidentified activity data on 1.1 million browsing and writing decisions observed on an online dating site. We find that mate seekers enact screeners ("deal breakers") that encode acceptability cutoffs. A nonparametric account of heterogeneity reveals that, even after controlling for a host of observable attributes, mate evaluation differs across decision stages as well as across identified groupings of men and women. Our statistical framework can be widely applied in analyzing large-scale data on multistage choices, which typify searches for "big ticket" items.

  18. Extracting multistage screening rules from online dating activity data

    PubMed Central

    Bruch, Elizabeth; Feinberg, Fred; Lee, Kee Yeun

    2016-01-01

    This paper presents a statistical framework for harnessing online activity data to better understand how people make decisions. Building on insights from cognitive science and decision theory, we develop a discrete choice model that allows for exploratory behavior and multiple stages of decision making, with different rules enacted at each stage. Critically, the approach can identify if and when people invoke noncompensatory screeners that eliminate large swaths of alternatives from detailed consideration. The model is estimated using deidentified activity data on 1.1 million browsing and writing decisions observed on an online dating site. We find that mate seekers enact screeners (“deal breakers”) that encode acceptability cutoffs. A nonparametric account of heterogeneity reveals that, even after controlling for a host of observable attributes, mate evaluation differs across decision stages as well as across identified groupings of men and women. Our statistical framework can be widely applied in analyzing large-scale data on multistage choices, which typify searches for “big ticket” items. PMID:27578870

  19. Editorial: Bayesian benefits for child psychology and psychiatry researchers.

    PubMed

    Oldehinkel, Albertine J

    2016-09-01

    For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has placed it well within reach now. Because Bayesian analyses require a more refined definition of hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparking the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.

  20. Application of a hybrid model to reduce bias and improve precision in population estimates for elk (Cervus elaphus) inhabiting a cold desert ecosystem

    USGS Publications Warehouse

    Schoenecker, Kathryn A.; Lubow, Bruce C.

    2016-01-01

    Accurately estimating the size of wildlife populations is critical to wildlife management and conservation of species. Raw counts or “minimum counts” are still used as a basis for wildlife management decisions. Uncorrected raw counts are not only negatively biased due to failure to account for undetected animals, but also provide no estimate of precision on which to judge the utility of counts. We applied a hybrid population estimation technique that combined sightability modeling, radio collar-based mark-resight, and simultaneous double count (double-observer) modeling to estimate the population size of elk in a high elevation desert ecosystem. Combining several models maximizes the strengths of each individual model while minimizing their singular weaknesses. We collected data with aerial helicopter surveys of the elk population in the San Luis Valley and adjacent mountains in Colorado, USA in 2005 and 2007. We present estimates from 7 alternative analyses: 3 based on different methods for obtaining a raw count and 4 based on different statistical models to correct for sighting probability bias. The most reliable of these approaches is a hybrid double-observer sightability model (model MH), which uses detection patterns of 2 independent observers in a helicopter plus telemetry-based detections of radio collared elk groups. Data were fit to customized mark-resight models with individual sighting covariates. Error estimates were obtained by a bootstrapping procedure. The hybrid method was an improvement over commonly used alternatives, with improved precision compared to sightability modeling and reduced bias compared to double-observer modeling. The resulting population estimate corrected for multiple sources of undercount bias that, if left uncorrected, would have underestimated the true population size by as much as 22.9%. Our comparison of these alternative methods demonstrates how each component of our method contributes to improving the final estimate and why each is necessary.
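
    As a simplified illustration of one ingredient of the hybrid approach (the double-observer correction only; the published model also incorporates telemetry-based sightability and individual covariates), the sketch below shows how detection probabilities estimated from two independent observers inflate a raw count. All counts are hypothetical.

```python
# Petersen-type double-observer correction: detection probabilities are
# estimated from groups seen by one, the other, or both observers, and the
# raw count is divided by the probability that at least one observer detects a group.
n1 = 180          # groups detected by observer 1
n2 = 165          # groups detected by observer 2
both = 140        # groups detected by both observers

p1 = both / n2                      # observer 1 detection probability
p2 = both / n1                      # observer 2 detection probability
p_either = 1 - (1 - p1) * (1 - p2)  # probability at least one observer detects a group

groups_seen = n1 + n2 - both
estimated_groups = groups_seen / p_either
print(f"p1={p1:.2f}, p2={p2:.2f}, combined detection={p_either:.2f}")
print(f"raw count={groups_seen}, corrected estimate={estimated_groups:.0f}")
```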

  1. A Sub-filter Scale Noise Equation for Hybrid LES Simulations

    NASA Technical Reports Server (NTRS)

    Goldstein, Marvin E.

    2006-01-01

    Hybrid LES/subscale modeling approaches have an important advantage over the current noise prediction methods in that they only involve modeling of the relatively universal subscale motion and not the configuration-dependent larger-scale turbulence. Previous hybrid approaches use approximate statistical techniques or extrapolation methods to obtain the requisite information about the sub-filter scale motion. An alternative approach would be to adopt the modeling techniques used in the current noise prediction methods and determine the unknown stresses from experimental data. The present paper derives an equation for predicting the sub-filter scale sound from information that can be obtained with currently available experimental procedures. The resulting prediction method would then be intermediate between the current noise prediction codes and previously proposed hybrid techniques.

  2. Benefit-cost evaluation of an intra-regional air service in the Bay area

    NASA Technical Reports Server (NTRS)

    Haefner, L. E.

    1977-01-01

    Utilization of an iterative statistical model is presented to evaluate combinations of commuter airport sites and surface transportation facilities in conjunction with service by a given commuter aircraft type in light of Bay Area regional growth alternatives and peak and off-peak regional travel patterns. The model evaluates such transportation options with respect to criteria of airline profitability, public acceptance, and public and private nonuser costs. It incorporates information on modal split, peak and off-peak use of the air commuter fleet, terminal and airport costs, development costs and uses of land in proximity to the airport sites, regional population shifts, and induced zonal shifts in travel demand. The model is multimodal in its analytical capability and performs exhaustive sensitivity analysis.

  3. Spectral embedding based active contour (SEAC) for lesion segmentation on breast dynamic contrast enhanced magnetic resonance imaging.

    PubMed

    Agner, Shannon C; Xu, Jun; Madabhushi, Anant

    2013-03-01

    Segmentation of breast lesions on dynamic contrast enhanced (DCE) magnetic resonance imaging (MRI) is the first step in lesion diagnosis in a computer-aided diagnosis framework. Because manual segmentation of such lesions is both time consuming and highly susceptible to human error and issues of reproducibility, an automated lesion segmentation method is highly desirable. Traditional automated image segmentation methods such as boundary-based active contour (AC) models require a strong gradient at the lesion boundary. Even when region-based terms are introduced to an AC model, grayscale image intensities often do not allow for clear definition of foreground and background region statistics. Thus, there is a need to find alternative image representations that might provide (1) strong gradients at the margin of the object of interest (OOI); and (2) larger separation between intensity distributions and region statistics for the foreground and background, which are necessary to halt evolution of the AC model upon reaching the border of the OOI. In this paper, the authors introduce a spectral embedding (SE) based AC (SEAC) for lesion segmentation on breast DCE-MRI. SE, a nonlinear dimensionality reduction scheme, is applied to the DCE time series in a voxelwise fashion to reduce several time point images to a single parametric image where every voxel is characterized by the three dominant eigenvectors. This parametric eigenvector image (PrEIm) representation allows for better capture of image region statistics and stronger gradients for use with a hybrid AC model, which is driven by both boundary and region information. They compare SEAC to ACs that employ fuzzy c-means (FCM) and principal component analysis (PCA) as alternative image representations. Segmentation performance was evaluated by boundary and region metrics as well as comparing lesion classification using morphological features from SEAC, PCA+AC, and FCM+AC. On a cohort of 50 breast DCE-MRI studies, PrEIm yielded overall better region and boundary-based statistics compared to the original DCE-MR image, FCM, and PCA based image representations. Additionally, SEAC outperformed a hybrid AC applied to both PCA and FCM image representations. Mean dice similarity coefficient (DSC) for SEAC was significantly better (DSC = 0.74 ± 0.21) than FCM+AC (DSC = 0.50 ± 0.32) and similar to PCA+AC (DSC = 0.73 ± 0.22). Boundary-based metrics of mean absolute difference and Hausdorff distance followed the same trends. Of the automated segmentation methods, breast lesion classification based on morphologic features derived from SEAC segmentation using a support vector machine classifier also performed better (AUC = 0.67 ± 0.05; p < 0.05) than FCM+AC (AUC = 0.50 ± 0.07), and PCA+AC (AUC = 0.49 ± 0.07). In this work, we presented SEAC, an accurate, general purpose AC segmentation tool that could be applied to any imaging domain that employs time series data. SE allows for projection of time series data into a PrEIm representation so that every voxel is characterized by the dominant eigenvectors, capturing the global and local time-intensity curve similarities in the data. This PrEIm allows for the calculation of strong tensor gradients and better region statistics than the original image intensities or alternative image representations such as PCA and FCM. The PrEIm also allows for building a more accurate hybrid AC scheme.
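
    A conceptual sketch of the dimensionality-reduction step described above, not the authors' implementation: each voxel's DCE time series is mapped to three spectral-embedding components, giving the kind of parametric eigenvector image (PrEIm) on which the hybrid active contour operates.

```python
# Reduce per-voxel DCE time series (rows = voxels, columns = time points)
# to three spectral-embedding components. Data here are random placeholders.
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
n_voxels, n_timepoints = 500, 12
time_series = rng.random((n_voxels, n_timepoints))   # placeholder DCE intensities

embedding = SpectralEmbedding(n_components=3, random_state=0)
preim = embedding.fit_transform(time_series)          # shape (n_voxels, 3)
print(preim.shape)  # each voxel is now described by 3 eigenvector coordinates
```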

  4. Assessment of NHTSA’s Report “Relationships Between Fatality Risk, Mass, and Footprint in Model Year 2003-2010 Passenger Cars and LTVs”

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wenzel, Tom

    NHTSA recently completed a logistic regression analysis updating its 2003, 2010, and 2012 studies of the relationship between vehicle mass and US fatality risk per vehicle mile traveled (VMT; Kahane 2010, Kahane 2012, Puckett 2016). The new study updates the 2012 analysis using FARS data from 2005 to 2011 for model year 2003 to 2010. Using the updated databases, NHTSA estimates that reducing vehicle mass by 100 pounds while holding footprint fixed would increase fatality risk per VMT by 1.49% for lighter-than-average cars and by 0.50% for heavier-than-average cars, but reduce risk by 0.10% for lighter-than-average light-duty trucks, by 0.71% for heavier-than-average light-duty trucks, and by 0.99% for CUVs/minivans. Using a jack-knife method to estimate the statistical uncertainty of these point estimates, NHTSA finds that none of these estimates are statistically significant at the 95% confidence level; however, the 1.49% increase in risk associated with mass reduction in lighter-than-average cars, and the 0.71% and 0.99% decreases in risk associated with mass reduction in heavier-than-average light trucks and CUVs/minivans, are statistically significant at the 90% confidence level. The effect of mass reduction on risk that NHTSA estimated in 2016 is more beneficial than in its 2012 study, particularly for light trucks and CUVs/minivans. The 2016 NHTSA analysis estimates that reducing vehicle footprint by one square foot while holding mass constant would increase fatality risk per VMT by 0.28% in cars, by 0.38% in light trucks, and by 1.18% in CUVs and minivans. This report replicates the 2016 NHTSA analysis and reproduces their main results. This report uses the confidence intervals output by the logistic regression models, which are smaller than the intervals NHTSA estimated using a jack-knife technique that accounts for the sampling error in the FARS fatality and state crash data. In addition to reproducing the NHTSA results, this report also examines the NHTSA data in slightly different ways to get a deeper understanding of the relationship between vehicle weight, footprint, and safety. The NHTSA baseline results and these alternative analyses are summarized in Table ES.1; statistically significant estimates, based on the confidence intervals output by the logistic regression models, are shown in red in the tables. We found that NHTSA’s reasonable assumption that all vehicles will have ESC installed by 2017 in its baseline regression model slightly increases the estimated increase in risk from mass reduction in cars, but substantially decreases the estimated increase in risk from footprint reduction in all three vehicle types (Alternative 1 in Table ES.1; explained in more detail in Section 2.1 of this report). This is because NHTSA projects ESC to substantially reduce the number of fatalities in rollovers and crashes with stationary objects, and mass reduction appears to reduce risk, while footprint reduction appears to increase risk, in these types of crashes, particularly in cars and CUVs/minivans. A single regression model including all crash types results in slightly different estimates of the relationship between decreasing mass and risk, as shown in Alternative 2 in Table ES.1.

  5. Limited data tomographic image reconstruction via dual formulation of total variation minimization

    NASA Astrophysics Data System (ADS)

    Jang, Kwang Eun; Sung, Younghun; Lee, Kangeui; Lee, Jongha; Cho, Seungryong

    2011-03-01

    X-ray mammography is the primary imaging modality for breast cancer screening. For the dense breast, however, the mammogram is usually difficult to read due to the tissue overlap problem caused by the superposition of normal tissues. Digital breast tomosynthesis (DBT), which measures several low-dose projections over a limited angle range, may be an alternative modality for breast imaging, since it allows visualization of the cross-sectional information of the breast. DBT, however, may suffer from aliasing artifacts and severe noise corruption. To overcome these problems, a total variation (TV) regularized statistical reconstruction algorithm is presented. Inspired by the dual formulation of TV minimization in denoising and deblurring problems, we derived a gradient-type algorithm based on the statistical model of X-ray tomography. The objective function is comprised of a data fidelity term derived from the statistical model and a TV regularization term. The gradient of the objective function can be easily calculated using simple operations in terms of auxiliary variables. After a descent step, the data fidelity term is renewed in each iteration. Since the proposed algorithm can be implemented without sophisticated operations such as matrix inversion, it provides an efficient way to include the TV regularization in the statistical reconstruction method, which results in a fast and robust estimation for low-dose projections over the limited angle range. Initial tests with an experimental DBT system confirmed our findings.

  6. Fast maximum likelihood estimation using continuous-time neural point process models.

    PubMed

    Lepage, Kyle Q; MacDonald, Christopher J

    2015-06-01

    A recent report estimates that the number of simultaneously recorded neurons is growing exponentially. A commonly employed statistical paradigm using discrete-time point process models of neural activity involves the computation of a maximum-likelihood estimate. The time to compute this estimate, per neuron, is proportional to the number of bins in a finely spaced discretization of time. By using continuous-time models of neural activity and the optimally efficient Gaussian quadrature, memory requirements and computation times are dramatically decreased in the commonly encountered situation where the number of parameters p is much less than the number of time-bins n. In this regime, with q equal to the quadrature order, memory requirements are decreased from O(np) to O(qp), and the number of floating-point operations is decreased from O(np^2) to O(qp^2). Accuracy of the proposed estimates is assessed based upon physiological considerations, error bounds, and mathematical results describing the relation between numerical integration error and numerical error affecting both parameter estimates and the observed Fisher information. A check is provided which is used to adapt the order of numerical integration. The procedure is verified in simulation and for hippocampal recordings. It is found that in 95% of hippocampal recordings a q of 60 yields numerical error negligible with respect to parameter estimate standard error. Statistical inference using the proposed methodology is a fast and convenient alternative to statistical inference performed using a discrete-time point process model of neural activity. It enables the employment of the statistical methodology available with discrete-time inference, but is faster, uses less memory, and avoids any error due to discretization.
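
    In the same spirit (though not the authors' code), the sketch below fits a continuous-time inhomogeneous Poisson intensity by maximum likelihood, approximating the integral term of the log-likelihood with Gauss-Legendre quadrature instead of a fine time discretization; the log-linear intensity and the spike train are illustrative.

```python
# Continuous-time point-process ML: the log-likelihood
#   L(a, b) = sum_i log lambda(t_i) - integral_0^T lambda(t) dt
# is evaluated with q quadrature nodes rather than a fine time grid.
# The intensity model lambda(t) = exp(a + b * t / T) is illustrative only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T = 100.0
spike_times = np.sort(rng.uniform(0.0, T, size=300))   # placeholder spike train

def neg_log_likelihood(params, q=60):
    a, b = params
    nodes, weights = np.polynomial.legendre.leggauss(q)   # nodes/weights on [-1, 1]
    t_nodes = 0.5 * T * (nodes + 1.0)                     # map to [0, T]
    integral = 0.5 * T * np.sum(weights * np.exp(a + b * t_nodes / T))
    log_intensity_at_spikes = np.sum(a + b * spike_times / T)
    return -(log_intensity_at_spikes - integral)

fit = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]), method="BFGS")
print(fit.x)   # estimated (a, b); roughly (log 3, 0) for this flat-rate train
```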

  7. An introduction to using Bayesian linear regression with clinical data.

    PubMed

    Baldwin, Scott A; Larson, Michael J

    2017-11-01

    Statistical training in psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well as how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
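
    The article's examples use R; purely as a language-agnostic illustration of the underlying idea, the sketch below fits a conjugate Bayesian linear regression (Gaussian prior on the coefficients, noise variance treated as known) to simulated data. It is not the authors' model of the ERN-anxiety relationship.

```python
# Conjugate Bayesian linear regression with a Gaussian prior and known noise
# variance: the posterior over coefficients is Gaussian in closed form.
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.4 * x + rng.normal(scale=1.0, size=n)   # simulated outcome

X = np.column_stack([np.ones(n), x])   # intercept + predictor
sigma2, tau2 = 1.0, 10.0               # known noise variance, prior variance

# Posterior for the coefficients: Normal(mu_post, Sigma_post)
Sigma_post = np.linalg.inv(X.T @ X / sigma2 + np.eye(2) / tau2)
mu_post = Sigma_post @ X.T @ y / sigma2

sd_post = np.sqrt(np.diag(Sigma_post))
for name, m, s in zip(["intercept", "slope"], mu_post, sd_post):
    print(f"{name}: posterior mean {m:.2f}, 95% credible interval "
          f"({m - 1.96 * s:.2f}, {m + 1.96 * s:.2f})")
```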

  8. Improving Education in Medical Statistics: Implementing a Blended Learning Model in the Existing Curriculum

    PubMed Central

    Milic, Natasa M.; Trajkovic, Goran Z.; Bukumiric, Zoran M.; Cirkovic, Andja; Nikolic, Ivan M.; Milin, Jelena S.; Milic, Nikola V.; Savic, Marko D.; Corac, Aleksandar M.; Marinkovic, Jelena M.; Stanisavljevic, Dejana M.

    2016-01-01

    Background Although recent studies report on the benefits of blended learning in improving medical student education, there is still no empirical evidence on the relative effectiveness of blended over traditional learning approaches in medical statistics. We implemented blended along with on-site (i.e. face-to-face) learning to further assess the potential value of web-based learning in medical statistics. Methods This was a prospective study conducted with third year medical undergraduate students attending the Faculty of Medicine, University of Belgrade, who passed (440 of 545) the final exam of the obligatory introductory statistics course during 2013–14. Student statistics achievements were stratified based on the two methods of education delivery: blended learning and on-site learning. Blended learning included a combination of face-to-face and distance learning methodologies integrated into a single course. Results Mean exam scores for the blended learning student group were higher than for the on-site student group for both final statistics score (89.36±6.60 vs. 86.06±8.48; p = 0.001) and knowledge test score (7.88±1.30 vs. 7.51±1.36; p = 0.023) with a medium effect size. There were no differences in sex or study duration between the groups. Current grade point average (GPA) was higher in the blended group. In a multivariable regression model, current GPA and knowledge test scores were associated with the final statistics score after adjusting for study duration and learning modality (p<0.001). Conclusion This study provides empirical evidence to support educator decisions to implement different learning environments for teaching medical statistics to undergraduate medical students. Blended and on-site training formats led to similar knowledge acquisition; however, students with higher GPA preferred the technology assisted learning format. Implementation of blended learning approaches can be considered an attractive, cost-effective, and efficient alternative to traditional classroom training in medical statistics. PMID:26859832

  9. Improving Education in Medical Statistics: Implementing a Blended Learning Model in the Existing Curriculum.

    PubMed

    Milic, Natasa M; Trajkovic, Goran Z; Bukumiric, Zoran M; Cirkovic, Andja; Nikolic, Ivan M; Milin, Jelena S; Milic, Nikola V; Savic, Marko D; Corac, Aleksandar M; Marinkovic, Jelena M; Stanisavljevic, Dejana M

    2016-01-01

    Although recent studies report on the benefits of blended learning in improving medical student education, there is still no empirical evidence on the relative effectiveness of blended over traditional learning approaches in medical statistics. We implemented blended along with on-site (i.e. face-to-face) learning to further assess the potential value of web-based learning in medical statistics. This was a prospective study conducted with third year medical undergraduate students attending the Faculty of Medicine, University of Belgrade, who passed (440 of 545) the final exam of the obligatory introductory statistics course during 2013-14. Student statistics achievements were stratified based on the two methods of education delivery: blended learning and on-site learning. Blended learning included a combination of face-to-face and distance learning methodologies integrated into a single course. Mean exam scores for the blended learning student group were higher than for the on-site student group for both final statistics score (89.36±6.60 vs. 86.06±8.48; p = 0.001) and knowledge test score (7.88±1.30 vs. 7.51±1.36; p = 0.023) with a medium effect size. There were no differences in sex or study duration between the groups. Current grade point average (GPA) was higher in the blended group. In a multivariable regression model, current GPA and knowledge test scores were associated with the final statistics score after adjusting for study duration and learning modality (p<0.001). This study provides empirical evidence to support educator decisions to implement different learning environments for teaching medical statistics to undergraduate medical students. Blended and on-site training formats led to similar knowledge acquisition; however, students with higher GPA preferred the technology assisted learning format. Implementation of blended learning approaches can be considered an attractive, cost-effective, and efficient alternative to traditional classroom training in medical statistics.

  10. Isotherm ranking and selection using thirteen literature datasets involving hydrophobic organic compounds.

    PubMed

    Matott, L Shawn; Jiang, Zhengzheng; Rabideau, Alan J; Allen-King, Richelle M

    2015-01-01

    Numerous isotherm expressions have been developed for describing sorption of hydrophobic organic compounds (HOCs), including "dual-mode" approaches that combine nonlinear behavior with a linear partitioning component. Choosing among these alternative expressions for describing a given dataset is an important task that can significantly influence subsequent transport modeling and/or mechanistic interpretation. In this study, a series of numerical experiments were undertaken to identify "best-in-class" isotherms by refitting 10 alternative models to a suite of 13 previously published literature datasets. The corrected Akaike Information Criterion (AICc) was used for ranking these alternative fits and distinguishing between plausible and implausible isotherms for each dataset. The occurrence of multiple plausible isotherms was inversely correlated with dataset "richness", such that datasets with fewer observations and/or a narrow range of aqueous concentrations resulted in a greater number of plausible isotherms. Overall, only the Polanyi-partition dual-mode isotherm was classified as "plausible" across all 13 of the considered datasets, indicating substantial statistical support consistent with current advances in sorption theory. However, these findings are predicated on the use of the AICc measure as an unbiased ranking metric and the adoption of a subjective, but defensible, threshold for separating plausible and implausible isotherms. Copyright © 2015 Elsevier B.V. All rights reserved.
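
    A minimal sketch of AICc-based ranking as described, assuming least-squares fits so that AIC = n ln(RSS/n) + 2k and AICc = AIC + 2k(k+1)/(n-k-1); the residual sums of squares and parameter counts below are purely illustrative, not the study's fits.

```python
# Rank alternative isotherm fits by AICc; smaller is better, and dAICc gives
# the gap to the best model. All numbers are hypothetical.
import math

n = 20  # number of observations in the (hypothetical) sorption dataset
fits = {                      # model name: (residual sum of squares, number of parameters)
    "Freundlich":             (4.2, 2),
    "Langmuir":               (5.1, 2),
    "Polanyi-partition":      (2.9, 3),
    "dual-mode (Langmuir+K)": (3.0, 3),
}

def aicc(rss, k, n):
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

scores = {name: aicc(rss, k, n) for name, (rss, k) in fits.items()}
best = min(scores.values())
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:24s} AICc = {score:6.2f}  dAICc = {score - best:5.2f}")
```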

  11. Efficient Generation and Selection of Virtual Populations in Quantitative Systems Pharmacology Models

    PubMed Central

    Rieger, TR; Musante, CJ

    2016-01-01

    Quantitative systems pharmacology models mechanistically describe a biological system and the effect of drug treatment on system behavior. Because these models rarely are identifiable from the available data, the uncertainty in physiological parameters may be sampled to create alternative parameterizations of the model, sometimes termed “virtual patients.” In order to reproduce the statistics of a clinical population, virtual patients are often weighted to form a virtual population that reflects the baseline characteristics of the clinical cohort. Here we introduce a novel technique to efficiently generate virtual patients and, from this ensemble, demonstrate how to select a virtual population that matches the observed data without the need for weighting. This approach improves confidence in model predictions by mitigating the risk that spurious virtual patients become overrepresented in virtual populations. PMID:27069777

  12. Characterizing uncertainty and variability in physiologically based pharmacokinetic models: state of the science and needs for research and implementation.

    PubMed

    Barton, Hugh A; Chiu, Weihsueh A; Setzer, R Woodrow; Andersen, Melvin E; Bailer, A John; Bois, Frédéric Y; Dewoskin, Robert S; Hays, Sean; Johanson, Gunnar; Jones, Nancy; Loizou, George; Macphail, Robert C; Portier, Christopher J; Spendiff, Martin; Tan, Yu-Mei

    2007-10-01

    Physiologically based pharmacokinetic (PBPK) models are used in mode-of-action based risk and safety assessments to estimate internal dosimetry in animals and humans. When used in risk assessment, these models can provide a basis for extrapolating between species, doses, and exposure routes or for justifying nondefault values for uncertainty factors. Characterization of uncertainty and variability is increasingly recognized as important for risk assessment; this represents a continuing challenge for both PBPK modelers and users. Current practices show significant progress in specifying deterministic biological models and nondeterministic (often statistical) models, estimating parameters using diverse data sets from multiple sources, using them to make predictions, and characterizing uncertainty and variability of model parameters and predictions. The International Workshop on Uncertainty and Variability in PBPK Models, held 31 Oct-2 Nov 2006, identified the state-of-the-science, needed changes in practice and implementation, and research priorities. For the short term, these include (1) multidisciplinary teams to integrate deterministic and nondeterministic/statistical models; (2) broader use of sensitivity analyses, including for structural and global (rather than local) parameter changes; and (3) enhanced transparency and reproducibility through improved documentation of model structure(s), parameter values, sensitivity and other analyses, and supporting, discrepant, or excluded data. Longer-term needs include (1) theoretical and practical methodological improvements for nondeterministic/statistical modeling; (2) better methods for evaluating alternative model structures; (3) peer-reviewed databases of parameters and covariates, and their distributions; (4) expanded coverage of PBPK models across chemicals with different properties; and (5) training and reference materials, such as cases studies, bibliographies/glossaries, model repositories, and enhanced software. The multidisciplinary dialogue initiated by this Workshop will foster the collaboration, research, data collection, and training necessary to make characterizing uncertainty and variability a standard practice in PBPK modeling and risk assessment.

  13. Insight into model mechanisms through automatic parameter fitting: a new methodological framework for model development

    PubMed Central

    2014-01-01

    Background Striking a balance between the degree of model complexity and parameter identifiability, while still producing biologically feasible simulations using modelling is a major challenge in computational biology. While these two elements of model development are closely coupled, parameter fitting from measured data and analysis of model mechanisms have traditionally been performed separately and sequentially. This process produces potential mismatches between model and data complexities that can compromise the ability of computational frameworks to reveal mechanistic insights or predict new behaviour. In this study we address this issue by presenting a generic framework for combined model parameterisation, comparison of model alternatives and analysis of model mechanisms. Results The presented methodology is based on a combination of multivariate metamodelling (statistical approximation of the input–output relationships of deterministic models) and a systematic zooming into biologically feasible regions of the parameter space by iterative generation of new experimental designs and look-up of simulations in the proximity of the measured data. The parameter fitting pipeline includes an implicit sensitivity analysis and analysis of parameter identifiability, making it suitable for testing hypotheses for model reduction. Using this approach, under-constrained model parameters, as well as the coupling between parameters within the model are identified. The methodology is demonstrated by refitting the parameters of a published model of cardiac cellular mechanics using a combination of measured data and synthetic data from an alternative model of the same system. Using this approach, reduced models with simplified expressions for the tropomyosin/crossbridge kinetics were found by identification of model components that can be omitted without affecting the fit to the parameterising data. Our analysis revealed that model parameters could be constrained to a standard deviation of on average 15% of the mean values over the succeeding parameter sets. Conclusions Our results indicate that the presented approach is effective for comparing model alternatives and reducing models to the minimum complexity replicating measured data. We therefore believe that this approach has significant potential for reparameterising existing frameworks, for identification of redundant model components of large biophysical models and to increase their predictive capacity. PMID:24886522

  14. In-Use Emissions and Estimated Impacts of Traditional, Natural- and Forced-Draft Cookstoves in Rural Malawi

    PubMed Central

    2017-01-01

    Emissions from traditional cooking practices in low- and middle-income countries have detrimental health and climate effects; cleaner-burning cookstoves may provide “co-benefits”. Here we assess this potential via in-home measurements of fuel-use and emissions and real-time optical properties of pollutants from traditional and alternative cookstoves in rural Malawi. Alternative cookstove models were distributed by existing initiatives and include a low-cost ceramic model, two forced-draft cookstoves (FDCS; Philips HD4012LS and ACE-1), and three institutional cookstoves. Among household cookstoves, emission factors (EF; g (kg wood)−1) were lowest for the Philips, with statistically significant reductions relative to baseline of 45% and 47% for fine particulate matter (PM2.5) and carbon monoxide (CO), respectively. The Philips was the only cookstove tested that showed significant reductions in elemental carbon (EC) emission rate. Estimated health and climate cobenefits of alternative cookstoves were smaller than predicted from laboratory tests due to the effects of real-world conditions including fuel variability and nonideal operation. For example, estimated daily PM intake and field-measurement-based global warming commitment (GWC) for the Philips FDCS were a factor of 8.6 and 2.8 times higher, respectively, than those based on lab measurements. In-field measurements provide an assessment of alternative cookstoves under real-world conditions and as such likely provide more realistic estimates of their potential health and climate benefits than laboratory tests. PMID:28060518

  15. National Forum on Education Statistics History

    ERIC Educational Resources Information Center

    Hoffman, Lee

    2004-01-01

    The first task force meeting, co-organized by the Center for Education Statistics and CCSSO, was convened in Alexandria, Virginia, on March 13–15, 1988. The purpose of this meeting was to explore alternative strategies for a cooperative federal-state education statistics program that would be broad in scope, encompassing the Common Core of Data…

  16. Introducing Statistical Inference to Biology Students through Bootstrapping and Randomization

    ERIC Educational Resources Information Center

    Lock, Robin H.; Lock, Patti Frazer

    2008-01-01

    Bootstrap methods and randomization tests are increasingly being used as alternatives to standard statistical procedures in biology. They also serve as an effective introduction to the key ideas of statistical inference in introductory courses for biology students. We discuss the use of such simulation based procedures in an integrated curriculum…

  17. A new statistic for the analysis of circular data in gamma-ray astronomy

    NASA Technical Reports Server (NTRS)

    Protheroe, R. J.

    1985-01-01

    A new statistic is proposed for the analysis of circular data. The statistic is designed specifically for situations where a test of uniformity is required which is powerful against alternatives in which a small fraction of the observations is grouped in a small range of directions, or phases.

  18. An Integrated, Statistical Molecular Approach to the Physical Chemistry Curriculum

    ERIC Educational Resources Information Center

    Cartier, Stephen F.

    2009-01-01

    As an alternative to the "thermodynamics first" or "quantum first" approaches to the physical chemistry curriculum, the statistical definition of entropy and the Boltzmann distribution are introduced in the first days of the course and the entire two-semester curriculum is then developed from these concepts. Once the tools of statistical mechanics…

  19. NASA thesaurus combined file postings statistics

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The NASA Thesaurus Combined File Postings Statistics is published semiannually (January and July). This alphabetical listing of postable subject terms contained in the NASA Thesaurus is used to display the number of postings (documents) indexed by each subject term from 1968 to date. The postings totals per item are separated by announcement of other media into STAR, IAA, COSMIC, and OTHER, columnar entries covering the NASA document collection (1968 to date). This is a cumulative publication, and except for special cases, no reference is needed to previous issuances. Retention of the January 1992 issue could be helpful for book information. With the July 1992 issue, NALNET book statistics have been replaced by COSMIC statistics for NASA funded software. File postings statistics for the Alternate Data Base covering NASA collection from 1962 through 1967 were published on a one-time basis in September 1975. Subject terms for the Alternate Data Base are derived from the subject Authority List, reprinted 1985, which is available upon request. The distribution of 19,697,748 postings among the 17,446 NASA Thesaurus terms is tabulated on the last page of the NASA Thesaurus Combined File Postings Statistics.

  20. The Hanford Thyroid Disease Study: an alternative view of the findings.

    PubMed

    Hoffman, F Owen; Ruttenber, A James; Apostoaei, A Iulian; Carroll, Raymond J; Greenland, Sander

    2007-02-01

    The Hanford Thyroid Disease Study (HTDS) is one of the largest and most complex epidemiologic studies of the relation between environmental exposures to ¹³¹I and thyroid disease. The study detected no dose-response relation using a 0.05 level for statistical significance. The results for thyroid cancer appear inconsistent with those from other studies of populations with similar exposures, and either reflect inadequate statistical power, bias, or unique relations between exposure and disease risk. In this paper, we explore these possibilities, and present evidence that the HTDS statistical power was inadequate due to complex uncertainties associated with the mathematical models and assumptions used to reconstruct individual doses. We conclude that, at the very least, the confidence intervals reported by the HTDS for thyroid cancer and other thyroid diseases are too narrow because they fail to reflect key uncertainties in the measurement-error structure. We recommend that the HTDS results be interpreted as inconclusive rather than as evidence for little or no disease risk from Hanford exposures.

  1. scoringRules - A software package for probabilistic model evaluation

    NASA Astrophysics Data System (ADS)

    Lerch, Sebastian; Jordan, Alexander; Krüger, Fabian

    2016-04-01

    Models in the geosciences are generally surrounded by uncertainty, and being able to quantify this uncertainty is key to good decision making. Accordingly, probabilistic forecasts in the form of predictive distributions have become popular over the last decades. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way. Various scoring rules have been developed over the past decades to address this demand. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. As such, they allow the comparison of alternative models, a crucial ability given the variety of theories, data sources and statistical specifications that is available in many situations. This poster presents the software package scoringRules for the statistical programming language R, which contains functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. Two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, Bayesian forecasts produced via Markov Chain Monte Carlo take this form. In this way, the scoringRules package provides a framework for generalized model evaluation that covers both Bayesian and classical parametric models. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state-of-the-art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices.
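
    The scoringRules package itself targets R; purely to illustrate what such a proper scoring rule computes, the sketch below evaluates the well-known closed form of the continuous ranked probability score (CRPS) for a Gaussian predictive distribution.

```python
# Closed-form CRPS for a Normal(mu, sigma) forecast given an observed outcome y:
#   CRPS = sigma * ( z * (2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ),  z = (y - mu)/sigma.
# Lower scores are better: a sharp, well-calibrated forecast beats a vague one.
import math

def crps_gaussian(y: float, mu: float, sigma: float) -> float:
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

print(crps_gaussian(y=1.2, mu=1.0, sigma=0.5))   # about 0.15 (sharper forecast, better score)
print(crps_gaussian(y=1.2, mu=1.0, sigma=2.0))   # about 0.48 (vaguer forecast, worse score)
```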

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kacprzak, T.; Kirk, D.; Friedrich, O.

    Shear peak statistics has gained a lot of attention recently as a practical alternative to the two-point statistics for constraining cosmological parameters. We perform a shear peak statistics analysis of the Dark Energy Survey (DES) Science Verification (SV) data, using weak gravitational lensing measurements from a 139 deg$^2$ field. We measure the abundance of peaks identified in aperture mass maps, as a function of their signal-to-noise ratio, in the signal-to-noise range $0 < \mathcal{S}/\mathcal{N} < 4$. To predict the peak counts as a function of cosmological parameters we use a suite of $N$-body simulations spanning 158 models with varying $\Omega_{\rm m}$ and $\sigma_8$, fixing $w = -1$, $\Omega_{\rm b} = 0.04$, $h = 0.7$ and $n_s = 1$, to which we have applied the DES SV mask and redshift distribution. In our fiducial analysis we measure $\sigma_8(\Omega_{\rm m}/0.3)^{0.6} = 0.77 \pm 0.07$, after marginalising over the shear multiplicative bias and the error on the mean redshift of the galaxy sample. We introduce models of intrinsic alignments, blending, and source contamination by cluster members. These models indicate that peaks with $\mathcal{S}/\mathcal{N} > 4$ would require significant corrections, which is why we do not include them in our analysis. We compare our results to the cosmological constraints from the two-point analysis on the SV field and find them to be in good agreement in both the central value and its uncertainty. As a result, we discuss prospects for future peak statistics analysis with upcoming DES data.

  3. Meta‐analysis using individual participant data: one‐stage and two‐stage approaches, and why they may differ

    PubMed Central

    Ensor, Joie; Riley, Richard D.

    2016-01-01

    Meta‐analysis using individual participant data (IPD) obtains and synthesises the raw, participant‐level data from a set of relevant studies. The IPD approach is becoming an increasingly popular tool as an alternative to traditional aggregate data meta‐analysis, especially as it avoids reliance on published results and provides an opportunity to investigate individual‐level interactions, such as treatment‐effect modifiers. There are two statistical approaches for conducting an IPD meta‐analysis: one‐stage and two‐stage. The one‐stage approach analyses the IPD from all studies simultaneously, for example, in a hierarchical regression model with random effects. The two‐stage approach derives aggregate data (such as effect estimates) in each study separately and then combines these in a traditional meta‐analysis model. There have been numerous comparisons of the one‐stage and two‐stage approaches via theoretical consideration, simulation and empirical examples, yet there remains confusion regarding when each approach should be adopted, and indeed why they may differ. In this tutorial paper, we outline the key statistical methods for one‐stage and two‐stage IPD meta‐analyses, and provide 10 key reasons why they may produce different summary results. We explain that most differences arise because of different modelling assumptions, rather than the choice of one‐stage or two‐stage itself. We illustrate the concepts with recently published IPD meta‐analyses, summarise key statistical software and provide recommendations for future IPD meta‐analyses. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:27747915
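
    To make the two-stage route concrete, here is a minimal, hedged Python sketch (the paper itself works with standard meta-analysis software): stage 1 fits an ordinary least-squares treatment effect separately in each study, and stage 2 pools the study-level estimates with a DerSimonian-Laird random-effects model. The data layout and function names are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    def stage_one(studies):
        """studies: list of (y, treat) arrays, one pair per study -> estimates, variances."""
        est, var = [], []
        for y, treat in studies:
            fit = sm.OLS(y, sm.add_constant(treat)).fit()
            est.append(fit.params[1])      # treatment-effect estimate in this study
            var.append(fit.bse[1] ** 2)    # squared standard error
        return np.array(est), np.array(var)

    def dersimonian_laird(est, var):
        """Stage two: random-effects pooling of the study-level estimates."""
        w = 1.0 / var
        theta_fixed = np.sum(w * est) / np.sum(w)
        q = np.sum(w * (est - theta_fixed) ** 2)
        tau2 = max(0.0, (q - (len(est) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
        w_star = 1.0 / (var + tau2)                    # weights including heterogeneity
        pooled = np.sum(w_star * est) / np.sum(w_star)
        return pooled, np.sqrt(1.0 / np.sum(w_star)), tau2
    ```

    A one-stage analysis would instead fit a single hierarchical model to the stacked IPD; as the abstract notes, most discrepancies between the two routes come from differing modelling assumptions rather than the staging itself.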

  4. A novel generalized normal distribution for human longevity and other negatively skewed data.

    PubMed

    Robertson, Henry T; Allison, David B

    2012-01-01

    Negatively skewed data arise occasionally in statistical practice; perhaps the most familiar example is the distribution of human longevity. Although other generalizations of the normal distribution exist, we demonstrate a new alternative that apparently fits human longevity data better. We propose a normal distribution whose scale parameter is conditioned on attained age. This approach is consistent with previous findings that longevity conditioned on survival to the modal age behaves like a normal distribution. We derive such a distribution and demonstrate its accuracy in modeling human longevity data from life tables. The new distribution is characterized by (1) an intuitively straightforward genesis; (2) closed forms for the pdf, cdf, mode, quantile, and hazard functions; and (3) accessibility to non-statisticians, based on its close relationship to the normal distribution.

  5. A Novel Generalized Normal Distribution for Human Longevity and other Negatively Skewed Data

    PubMed Central

    Robertson, Henry T.; Allison, David B.

    2012-01-01

    Negatively skewed data arise occasionally in statistical practice; perhaps the most familiar example is the distribution of human longevity. Although other generalizations of the normal distribution exist, we demonstrate a new alternative that apparently fits human longevity data better. We propose a normal distribution whose scale parameter is conditioned on attained age. This approach is consistent with previous findings that longevity conditioned on survival to the modal age behaves like a normal distribution. We derive such a distribution and demonstrate its accuracy in modeling human longevity data from life tables. The new distribution is characterized by (1) an intuitively straightforward genesis; (2) closed forms for the pdf, cdf, mode, quantile, and hazard functions; and (3) accessibility to non-statisticians, based on its close relationship to the normal distribution. PMID:22623974

  6. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm.

    PubMed

    Raykov, Yordan P; Boukouvalas, Alexis; Baig, Fahd; Little, Max A

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
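
    The following sketch is not the authors' MAP-DP algorithm; it is a simplified, closely related hard-assignment scheme in the spirit of DP-means, shown only to illustrate the key idea that the number of clusters is inferred from the data via a penalty for opening new clusters rather than fixed a priori. The penalty parameter lam and the helper name are ours.

    ```python
    import numpy as np

    def dp_means(X, lam, n_iter=50):
        """Hard-assignment clustering where a new cluster opens when no centroid is close."""
        centroids = [X.mean(axis=0)]
        labels = np.zeros(len(X), dtype=int)
        for _ in range(n_iter):
            labels = []
            for x in X:                                    # assignment step
                d2 = [np.sum((x - c) ** 2) for c in centroids]
                if min(d2) > lam:                          # too far from every centroid
                    centroids.append(x.copy())             # open a new cluster
                    labels.append(len(centroids) - 1)
                else:
                    labels.append(int(np.argmin(d2)))
            labels = np.array(labels)
            kept = [k for k in range(len(centroids)) if np.any(labels == k)]
            centroids = [X[labels == k].mean(axis=0) for k in kept]   # update step
            labels = np.array([kept.index(k) for k in labels])
        return labels, np.array(centroids)
    ```

    Roughly speaking, MAP-DP replaces this ad hoc distance penalty with quantities derived from a Dirichlet process mixture model, which is what gives it a probabilistic interpretation and its ability to handle non-continuous data and missing values.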

  7. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm

    PubMed Central

    Baig, Fahd; Little, Max A.

    2016-01-01

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism. PMID:27669525

  8. Modeling time-to-event (survival) data using classification tree analysis.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2017-12-01

    Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.

  9. Prediction of Hydrologic Characteristics for Ungauged Catchments to Support Hydroecological Modeling

    NASA Astrophysics Data System (ADS)

    Bond, Nick R.; Kennard, Mark J.

    2017-11-01

    Hydrologic variability is a fundamental driver of ecological processes and species distribution patterns within river systems, yet the paucity of gauges in many catchments means that streamflow data are often unavailable for ecological survey sites. Filling this data gap is an important challenge in hydroecological research. To address this gap, we first test the ability to spatially extrapolate hydrologic metrics calculated from gauged streamflow data to ungauged sites as a function of stream distance and catchment area. Second, we examine the ability of statistical models to predict flow regime metrics based on climate and catchment physiographic variables. Our assessment focused on Australia's largest catchment, the Murray-Darling Basin (MDB). We found that hydrologic metrics were predictable only between sites within ˜25 km of one another. Beyond this, correlations between sites declined quickly. We found less than 40% of fish survey sites from a recent basin-wide monitoring program (n = 777 sites) to fall within this 25 km range, thereby greatly limiting the ability to utilize gauge data for direct spatial transposition of hydrologic metrics to biological survey sites. In contrast, statistical model-based transposition proved effective in predicting ecologically relevant aspects of the flow regime (including metrics describing central tendency, high- and low-flows intermittency, seasonality, and variability) across the entire gauge network (median R2 ˜ 0.54, range 0.39-0.94). Modeled hydrologic metrics thus offer a useful alternative to empirical data when examining biological survey data from ungauged sites. More widespread use of these statistical tools and modeled metrics could expand our understanding of flow-ecology relationships.

  10. How I Teach the Second Law of Thermodynamics

    ERIC Educational Resources Information Center

    Kincanon, Eric

    2013-01-01

    An alternative method of presenting the second law of thermodynamics in introductory courses is presented. The emphasis is on statistical approaches as developed by Atkins. This has the benefit of stressing the statistical nature of the law.

  11. An analysis of urban collisions using an artificial intelligence model.

    PubMed

    Mussone, L; Ferrari, A; Oneta, M

    1999-11-01

    Traditional studies of road accidents estimate the effect of variables (such as vehicular flows, road geometry, and vehicle characteristics) on the number of accidents. A descriptive statistical analysis of the accidents used in the model, covering the period 1992-1995, is presented. The paper then describes an alternative method based on artificial neural networks (ANN) for building a model of vehicular accidents in Milan. The ANN model quantifies the degree of danger of urban intersections under different scenarios. The first result is methodological: ANNs can be used in an innovative way to model urban vehicular accidents. Other results concern the model outputs: intersection complexity may lead to a higher accident index, depending on how the intersection is regulated, and the highest index for pedestrians being run over occurs at non-signalised intersections at night-time.

  12. Model selection for multi-component frailty models.

    PubMed

    Ha, Il Do; Lee, Youngjo; MacKenzie, Gilbert

    2007-11-20

    Various frailty models have been developed and are now widely used for analysing multivariate survival data. It is therefore important to develop an information criterion for model selection. However, in frailty models there are several alternative ways of forming a criterion and the particular criterion chosen may not be uniformly best. In this paper, we study an Akaike information criterion (AIC) for selecting a frailty structure from a set of (possibly) non-nested frailty models. We propose two new AIC criteria, based on a conditional likelihood and an extended restricted likelihood (ERL) given by Lee and Nelder (J. R. Statist. Soc. B 1996; 58:619-678). We compare their performance using well-known practical examples and demonstrate that the two criteria may yield rather different results. A simulation study shows that the AIC based on the ERL is recommended when attention is focussed on selecting the frailty structure rather than the fixed effects.
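
    As a purely generic illustration of the bookkeeping involved (not the paper's conditional- or restricted-likelihood constructions, which require frailty-model fits), the snippet below ranks candidate frailty structures by AIC from supplied log-likelihoods and parameter counts; all numbers are hypothetical.

    ```python
    def aic(loglik, n_params):
        """Akaike information criterion: smaller is better."""
        return -2.0 * loglik + 2.0 * n_params

    # hypothetical log-likelihoods and parameter counts from fitted frailty models
    candidates = {
        "shared frailty": {"loglik": -1042.7, "n_params": 5},
        "nested frailty": {"loglik": -1039.9, "n_params": 7},
        "no frailty":     {"loglik": -1051.3, "n_params": 4},
    }
    ranked = sorted(candidates, key=lambda m: aic(**candidates[m]))
    print("preferred structure:", ranked[0])
    ```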

  13. Model selection and parameter estimation in structural dynamics using approximate Bayesian computation

    NASA Astrophysics Data System (ADS)

    Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith

    2018-01-01

    This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
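
    A minimal, hedged sketch of the ABC rejection idea follows; the damped-oscillation forward model, priors, summary statistics and tolerance are illustrative stand-ins, not the paper's case studies.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def forward_model(theta, t):
        """Toy forward model: damped oscillation with unknown frequency and damping."""
        omega, zeta = theta
        return np.exp(-zeta * t) * np.cos(omega * t)

    def summaries(y):
        """Summary statistics used to compare simulated and observed responses."""
        return np.array([y.mean(), y.std(), y[-1]])

    t = np.linspace(0.0, 10.0, 200)
    y_obs = forward_model((2.0, 0.3), t) + 0.05 * rng.normal(size=t.size)
    s_obs = summaries(y_obs)

    accepted = []
    for _ in range(20000):
        theta = (rng.uniform(0.5, 5.0), rng.uniform(0.0, 1.0))   # draw from the prior
        if np.linalg.norm(summaries(forward_model(theta, t)) - s_obs) < 0.1:
            accepted.append(theta)                               # keep near-matching draws

    posterior_sample = np.array(accepted)   # approximate ABC posterior; the tolerance
                                            # trades accuracy against acceptance rate
    ```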

  14. Alternative Schools and Programs for Public School Students at Risk of Educational Failure: 2007-08. First Look. NCES 2010-026

    ERIC Educational Resources Information Center

    Carver, Priscilla Rouse; Lewis, Laurie; Tice, Peter

    2010-01-01

    This report provides national estimates on the availability of alternative schools and programs for students at risk of educational failure in public school districts during the 2007-08 school year. The National Center for Education Statistics (NCES) previously reported results from a similar survey of alternative schools and programs conducted…

  15. Alternative Derivations of the Statistical Mechanical Distribution Laws

    PubMed Central

    Wall, Frederick T.

    1971-01-01

    A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems. PMID:16578712

  16. Alternative derivations of the statistical mechanical distribution laws.

    PubMed

    Wall, F T

    1971-08-01

    A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems.

  17. Meta-analysis of diagnostic accuracy studies accounting for disease prevalence: alternative parameterizations and model selection.

    PubMed

    Chu, Haitao; Nie, Lei; Cole, Stephen R; Poole, Charles

    2009-08-15

    In a meta-analysis of diagnostic accuracy studies, the sensitivities and specificities of a diagnostic test may depend on the disease prevalence since the severity and definition of disease may differ from study to study due to the design and the population considered. In this paper, we extend the bivariate nonlinear random effects model on sensitivities and specificities to jointly model the disease prevalence, sensitivities and specificities using trivariate nonlinear random-effects models. Furthermore, as an alternative parameterization, we also propose jointly modeling the test prevalence and the predictive values, which reflect the clinical utility of a diagnostic test. These models allow investigators to study the complex relationship among the disease prevalence, sensitivities and specificities; or among test prevalence and the predictive values, which can reveal hidden information about test performance. We illustrate the proposed two approaches by reanalyzing the data from a meta-analysis of radiological evaluation of lymph node metastases in patients with cervical cancer and a simulation study. The latter illustrates the importance of carefully choosing an appropriate normality assumption for the disease prevalence, sensitivities and specificities, or the test prevalence and the predictive values. In practice, it is recommended to use model selection techniques to identify a best-fitting model for making statistical inference. In summary, the proposed trivariate random effects models are novel and can be very useful in practice for meta-analysis of diagnostic accuracy studies. Copyright 2009 John Wiley & Sons, Ltd.

  18. 8760-Based Method for Representing Variable Generation Capacity Value in Capacity Expansion Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frew, Bethany A

    Capacity expansion models (CEMs) are widely used to evaluate the least-cost portfolio of electricity generators, transmission, and storage needed to reliably serve load over many years or decades. CEMs can be computationally complex and are often forced to estimate key parameters using simplified methods to achieve acceptable solve times or for other reasons. In this paper, we discuss one of these parameters -- capacity value (CV). We first provide a high-level motivation for and overview of CV. We next describe existing modeling simplifications and an alternate approach for estimating CV that utilizes hourly '8760' data of load and variable generation (VG) resources. We then apply this 8760 method to an established CEM, the National Renewable Energy Laboratory's (NREL's) Regional Energy Deployment System (ReEDS) model (Eurek et al. 2016). While this alternative approach for CV is not itself novel, it contributes to the broader CEM community by (1) demonstrating how a simplified 8760 hourly method, which can be easily implemented in other power sector models when data are available, more accurately captures CV trends than a statistical method within the ReEDS CEM, and (2) providing a flexible modeling framework from which other 8760-based system elements (e.g., demand response, storage, and transmission) can be added to further capture important dynamic interactions, such as curtailment.
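
    As a hedged illustration of what an hourly '8760' calculation can look like (one common approximation, not necessarily the exact method implemented in ReEDS), the snippet below estimates capacity value as the average variable-generation output during the top net-load hours; the function name and the choice of 100 hours are ours.

    ```python
    import numpy as np

    def approx_capacity_value(load, vg_gen, vg_capacity, top_hours=100):
        """load, vg_gen: 8760-element hourly MW arrays; vg_capacity: nameplate MW."""
        net_load = load - vg_gen
        top = np.argsort(net_load)[-top_hours:]     # hours with the highest net load
        return vg_gen[top].mean() / vg_capacity     # capacity value as a fraction of nameplate
    ```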

  19. Statistical tests of simple earthquake cycle models

    NASA Astrophysics Data System (ADS)

    DeVries, Phoebe M. R.; Evans, Eileen L.

    2016-12-01

    A central goal of observing and modeling the earthquake cycle is to forecast when a particular fault may generate an earthquake: a fault late in its earthquake cycle may be more likely to generate an earthquake than a fault early in its earthquake cycle. Models that can explain geodetic observations throughout the entire earthquake cycle may be required to gain a more complete understanding of relevant physics and phenomenology. Previous efforts to develop unified earthquake models for strike-slip faults have largely focused on explaining both preseismic and postseismic geodetic observations available across a few faults in California, Turkey, and Tibet. An alternative approach leverages the global distribution of geodetic and geologic slip rate estimates on strike-slip faults worldwide. Here we use the Kolmogorov-Smirnov test for similarity of distributions to infer, in a statistically rigorous manner, viscoelastic earthquake cycle models that are inconsistent with 15 sets of observations across major strike-slip faults. We reject a large subset of two-layer models incorporating Burgers rheologies at a significance level of α = 0.05 (those with long-term Maxwell viscosities η_M < 4.0 × 10^19 Pa s and η_M > 4.6 × 10^20 Pa s) but cannot reject models on the basis of transient Kelvin viscosity η_K. Finally, we examine the implications of these results for the predicted earthquake cycle timing of the 15 faults considered and compare these predictions to the geologic and historical record.
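
    The hypothesis-testing step can be illustrated with a short, hedged sketch using synthetic numbers: a two-sample Kolmogorov-Smirnov comparison of observed quantities against values drawn from a candidate earthquake-cycle model, rejected at α = 0.05.

    ```python
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(42)
    observed = rng.normal(10.0, 2.0, size=15)           # stand-in for observed rate data
    model_predicted = rng.normal(12.0, 2.0, size=500)   # draws from a candidate model

    stat, p_value = ks_2samp(observed, model_predicted)
    print(f"KS statistic = {stat:.3f}, p = {p_value:.3f}, reject: {p_value < 0.05}")
    ```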

  20. Ranking Theory and Conditional Reasoning.

    PubMed

    Skovgaard-Olsen, Niels

    2016-05-01

    Ranking theory is a formal epistemology that has been developed in over 600 pages in Spohn's recent book The Laws of Belief, which aims to provide a normative account of the dynamics of beliefs that presents an alternative to current probabilistic approaches. It has long been well received in the AI community, but it has not yet found application in experimental psychology. The purpose of this paper is to derive clear, quantitative predictions by exploiting a parallel between ranking theory and a statistical model called logistic regression. This approach is illustrated by the development of a model for the conditional inference task using Spohn's (2013) ranking theoretic approach to conditionals. Copyright © 2015 Cognitive Science Society, Inc.

  1. Model-based clustering for RNA-seq data.

    PubMed

    Si, Yaqing; Liu, Peng; Li, Pinghua; Brutnell, Thomas P

    2014-01-15

    RNA-seq technology has been widely adopted as an attractive alternative to microarray-based methods to study global gene expression. However, robust statistical tools to analyze these complex datasets are still lacking. By grouping genes with similar expression profiles across treatments, cluster analysis provides insight into gene functions and networks, and hence is an important technique for RNA-seq data analysis. In this manuscript, we derive clustering algorithms based on appropriate probability models for RNA-seq data. An expectation-maximization algorithm and another two stochastic versions of expectation-maximization algorithms are described. In addition, a strategy for initialization based on likelihood is proposed to improve the clustering algorithms. Moreover, we present a model-based hybrid-hierarchical clustering method to generate a tree structure that allows visualization of relationships among clusters as well as flexibility of choosing the number of clusters. Results from both simulation studies and analysis of a maize RNA-seq dataset show that our proposed methods provide better clustering results than alternative methods such as the K-means algorithm and hierarchical clustering methods that are not based on probability models. An R package, MBCluster.Seq, has been developed to implement our proposed algorithms. This R package provides fast computation and is publicly available at http://www.r-project.org
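
    For intuition, here is a hedged sketch of an expectation-maximization algorithm for a plain Poisson mixture over a genes-by-treatments count matrix; the MBCluster.Seq models additionally handle library-size normalization and other RNA-seq specifics omitted here, and the function name is ours.

    ```python
    import numpy as np
    from scipy.special import gammaln, logsumexp

    def poisson_mixture_em(counts, K, n_iter=100, seed=0):
        """counts: (genes, treatments) array of nonnegative integers; returns labels."""
        rng = np.random.default_rng(seed)
        G, T = counts.shape
        lam = counts[rng.choice(G, size=K, replace=False)] + 0.5   # initial cluster rates
        pi = np.full(K, 1.0 / K)                                   # mixing proportions
        for _ in range(n_iter):
            # E-step: posterior responsibility of each cluster for each gene (log space)
            log_pois = (counts[:, None, :] * np.log(lam[None]) - lam[None]
                        - gammaln(counts[:, None, :] + 1)).sum(axis=2)
            log_r = np.log(pi)[None, :] + log_pois
            log_r -= logsumexp(log_r, axis=1, keepdims=True)
            r = np.exp(log_r)
            # M-step: update mixing proportions and cluster-specific Poisson rates
            pi = r.mean(axis=0)
            lam = (r.T @ counts) / r.sum(axis=0)[:, None]
        return r.argmax(axis=1), lam, pi
    ```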

  2. Sunspot random walk and 22-year variation

    USGS Publications Warehouse

    Love, Jeffrey J.; Rigler, E. Joshua

    2012-01-01

    We examine two stochastic models for consistency with observed long-term secular trends in sunspot number and a faint, but semi-persistent, 22-yr signal: (1) a null hypothesis, a simple one-parameter random-walk model of sunspot-number cycle-to-cycle change, and, (2) an alternative hypothesis, a two-parameter random-walk model with an imposed 22-yr alternating amplitude. The observed secular trend in sunspots, seen from solar cycle 5 to 23, would not be an unlikely result of the accumulation of multiple random-walk steps. Statistical tests show that a 22-yr signal can be resolved in historical sunspot data; that is, the probability is low that it would be realized from random data. On the other hand, the 22-yr signal has a small amplitude compared to random variation, and so it has a relatively small effect on sunspot predictions. Many published predictions for cycle 24 sunspots fall within the dispersion of previous cycle-to-cycle sunspot differences. The probability is low that the Sun will, with the accumulation of random steps over the next few cycles, walk down to a Dalton-like minimum. Our models support published interpretations of sunspot secular variation and 22-yr variation resulting from cycle-to-cycle accumulation of dynamo-generated magnetic energy.
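
    The null-hypothesis calculation can be sketched with placeholder numbers: simulate many one-parameter random walks of cycle-to-cycle amplitude changes and ask how often the accumulated drift is at least as large as the observed secular change. The step size and observed drift below are illustrative values, not the paper's estimates.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n_cycles = 19           # e.g., solar cycles 5 through 23
    step_std = 35.0         # random-walk step size (placeholder, sunspot-number units)
    observed_drift = 80.0   # observed amplitude change over the record (placeholder)

    steps = rng.normal(0.0, step_std, size=(100_000, n_cycles - 1))
    drift = steps.sum(axis=1)                     # accumulated cycle-to-cycle changes
    frac = np.mean(np.abs(drift) >= observed_drift)
    print(f"fraction of random walks drifting at least as far as observed: {frac:.3f}")
    ```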

  3. Persons with allergy symptoms use alternative medicine more often.

    PubMed

    Kłak, Anna; Raciborski, Filip; Krzych-Fałta, Edyta; Opoczyńska-Świeżewska, Dagmara; Szymański, Jakub; Lipiec, Agnieszka; Piekarska, Barbara; Sybilski, Adam; Tomaszewska, Aneta; Samoliński, Bolesław

    2016-01-01

    The aim of the study is to examine the relation between the use of alternative medicine and the occurrence of allergic diseases in the Polish population of adults aged 20-44 years. An additional aim is to describe the relation between sex, age and place of residence and the use of alternative medicine. Data from the project Epidemiology of Allergic Diseases in Poland (ECAP), a continuation of the European Community Respiratory Health Survey II, were used for the analysis. Questions on alternative medicine were asked of a group of 4671 respondents aged 20-44 years. Additionally, outpatient tests were performed to confirm the diagnosis of allergic diseases. In total, 22.2% of the respondents had ever used alternative medicine (n = 4621). A statistically significant relation between the use of alternative medicine and the declaration of allergic disease and asthma symptoms was demonstrated (p < 0.001). No statistically significant relation was demonstrated between the use of alternative medicine and a doctor's diagnosis of any form of asthma or seasonal allergic rhinitis (p > 0.05). The occurrence of allergic diseases and asthma influences the frequency of alternative medicine use; however, this frequency does not depend on the allergic disease or asthma being confirmed by a doctor.

  4. A comparison of methods of fitting several models to nutritional response data.

    PubMed

    Vedenov, D; Pesti, G M

    2008-02-01

    A variety of models have been proposed to fit nutritional input-output response data. The models are typically nonlinear; therefore, fitting the models usually requires sophisticated statistical software and training to use it. An alternative tool for fitting nutritional response models was developed by using widely available and easier-to-use Microsoft Excel software. The tool, implemented as an Excel workbook (NRM.xls), allows simultaneous fitting and side-by-side comparisons of several popular models. This study compared the results produced by the tool we developed and PROC NLIN of SAS. The models compared were the broken line (ascending linear and quadratic segments), saturation kinetics, 4-parameter logistics, sigmoidal, and exponential models. The NRM.xls workbook provided results nearly identical to those of PROC NLIN. Furthermore, the workbook successfully fit several models that failed to converge in PROC NLIN. Two data sets were used as examples to compare fits by the different models. The results suggest that no particular nonlinear model is necessarily best for all nutritional response data.
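
    As a hedged illustration of one of the compared models (outside both Excel and SAS), the broken-line model with an ascending linear segment and a plateau can be fitted by nonlinear least squares as sketched below; the dose-response numbers are hypothetical.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def broken_line(x, intercept, slope, breakpoint):
        """Linear increase up to the breakpoint, then a plateau."""
        return intercept + slope * np.minimum(x, breakpoint)

    dose = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])           # hypothetical
    gain = np.array([0.30, 0.42, 0.55, 0.64, 0.70, 0.71, 0.72, 0.71])   # hypothetical

    params, cov = curve_fit(broken_line, dose, gain, p0=[0.2, 1.0, 0.5])
    intercept, slope, breakpoint = params
    print(f"estimated breakpoint (nutrient requirement) = {breakpoint:.3f}")
    ```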

  5. The effects of sampling frequency on the climate statistics of the European Centre for Medium-Range Weather Forecasts

    NASA Astrophysics Data System (ADS)

    Phillips, Thomas J.; Gates, W. Lawrence; Arpe, Klaus

    1992-12-01

    The effects of sampling frequency on the first- and second-moment statistics of selected European Centre for Medium-Range Weather Forecasts (ECMWF) model variables are investigated in a simulation of "perpetual July" with a diurnal cycle included and with surface and atmospheric fields saved at hourly intervals. The shortest characteristic time scales (as determined by the e-folding time of lagged autocorrelation functions) are those of ground heat fluxes and temperatures, precipitation and runoff, convective processes, cloud properties, and atmospheric vertical motion, while the longest time scales are exhibited by soil temperature and moisture, surface pressure, and atmospheric specific humidity, temperature, and wind. The time scales of surface heat and momentum fluxes and of convective processes are substantially shorter over land than over oceans. An appropriate sampling frequency for each model variable is obtained by comparing the estimates of first- and second-moment statistics determined at intervals ranging from 2 to 24 hours with the "best" estimates obtained from hourly sampling. Relatively accurate estimation of first- and second-moment climate statistics (10% errors in means, 20% errors in variances) can be achieved by sampling a model variable at intervals that usually are longer than the bandwidth of its time series but that often are shorter than its characteristic time scale. For the surface variables, sampling at intervals that are nonintegral divisors of a 24-hour day yields relatively more accurate time-mean statistics because of a reduction in errors associated with aliasing of the diurnal cycle and higher-frequency harmonics. The superior estimates of first-moment statistics are accompanied by inferior estimates of the variance of the daily means due to the presence of systematic biases, but these probably can be avoided by defining a different measure of low-frequency variability. Estimates of the intradiurnal variance of accumulated precipitation and surface runoff also are strongly impacted by the length of the storage interval. In light of these results, several alternative strategies for storage of the ECMWF model variables are recommended.

  6. The impact of mother's literacy on child dental caries: Individual data or aggregate data analysis?

    PubMed

    Haghdoost, Ali-Akbar; Hessari, Hossein; Baneshi, Mohammad Reza; Rad, Maryam; Shahravan, Arash

    2017-01-01

    To evaluate the impact of mothers' literacy on child dental caries based on a national oral health survey in Iran, and to investigate the possibility of ecological fallacy in aggregate data analysis. Existing data were from the second national oral health survey, carried out in 2004, which included 8725 six-year-old participants. The association of a mother's literacy with the caries occurrence (DMF (Decayed, Missing, Filled) total score > 0) of her child was assessed using individual data in a logistic regression model. The association between the percentage of literate mothers and the percentage of decayed teeth in each of the 30 provinces of Iran was then assessed using aggregated data, retrieved from the second national oral health survey and, alternatively, from the census of the Statistical Center of Iran, using a linear regression model. The significance level was set at 0.05 for all analyses. Individual data analysis showed a statistically significant association between mothers' literacy and children's decayed teeth (P = 0.02, odds ratio = 0.83). No statistically significant association between mothers' literacy and child dental caries was found in the aggregate data analysis, whether based on the oral health survey (P = 0.79, B = 0.03) or the census of the Statistical Center of Iran (P = 0.60, B = 0.14). Mothers' literacy has a preventive effect on the occurrence of dental caries in children. Given the high percentage of illiterate parents in Iran, it is reasonable to consider methods of oral health education that do not require reading or writing. Aggregate data analysis and individual data analysis yielded completely different results in this study.

  7. Order-Constrained Reference Priors with Implications for Bayesian Isotonic Regression, Analysis of Covariance and Spatial Models

    NASA Astrophysics Data System (ADS)

    Gong, Maozhen

    Selecting an appropriate prior distribution is a fundamental issue in Bayesian Statistics. In this dissertation, under the framework provided by Berger and Bernardo, I derive the reference priors for several models which include: Analysis of Variance (ANOVA)/Analysis of Covariance (ANCOVA) models with a categorical variable under common ordering constraints, the conditionally autoregressive (CAR) models and the simultaneous autoregressive (SAR) models with a spatial autoregression parameter rho considered. The performances of reference priors for ANOVA/ANCOVA models are evaluated by simulation studies with comparisons to Jeffreys' prior and Least Squares Estimation (LSE). The priors are then illustrated in a Bayesian model of the "Risk of Type 2 Diabetes in New Mexico" data, where the relationship between the type 2 diabetes risk (through Hemoglobin A1c) and different smoking levels is investigated. In both simulation studies and real data set modeling, the reference priors that incorporate internal order information show good performances and can be used as default priors. The reference priors for the CAR and SAR models are also illustrated in the "1999 SAT State Average Verbal Scores" data with a comparison to a Uniform prior distribution. Due to the complexity of the reference priors for both CAR and SAR models, only a portion (12 states in the Midwest) of the original data set is considered. The reference priors can give a different marginal posterior distribution compared to a Uniform prior, which provides an alternative for prior specifications for areal data in Spatial statistics.

  8. Estimation of rates-across-sites distributions in phylogenetic substitution models.

    PubMed

    Susko, Edward; Field, Chris; Blouin, Christian; Roger, Andrew J

    2003-10-01

    Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.
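
    For reference, a hedged sketch of the equal-probability discrete gamma approximation mentioned in the abstract (the common median-of-category variant, rescaled to a mean rate of one) is given below; the shape value and category count are arbitrary examples.

    ```python
    import numpy as np
    from scipy.stats import gamma

    def discrete_gamma_rates(alpha, K):
        """K equal-probability rate categories from a mean-one gamma with shape alpha."""
        quantiles = (np.arange(K) + 0.5) / K                  # category midpoints
        rates = gamma.ppf(quantiles, a=alpha, scale=1.0 / alpha)
        return rates / rates.mean()                           # rescale to mean rate 1

    print(discrete_gamma_rates(alpha=0.5, K=4))
    ```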

  9. Incorporating covariates into fisheries stock assessment models with application to Pacific herring.

    PubMed

    Deriso, Richard B; Maunder, Mark N; Pearson, Walter H

    2008-07-01

    We present a framework for evaluating the cause of fishery declines by integrating covariates into a fisheries stock assessment model. This allows the evaluation of fisheries' effects vs. natural and other human impacts. The analyses presented are based on integrating ecological science and statistics and form the basis for environmental decision-making advice. Hypothesis tests are described to rank hypotheses and determine the size of a multiple covariate model. We extend recent developments in integrated analysis and use novel methods to produce effect size estimates that are relevant to policy makers and include estimates of uncertainty. Results can be directly applied to evaluate trade-offs among alternative management decisions. The methods and results are also broadly applicable outside fisheries stock assessment. We show that multiple factors influence populations and that analysis of factors in isolation can be misleading. We illustrate the framework by applying it to Pacific herring of Prince William Sound, Alaska (USA). The Pacific herring stock that spawns in Prince William Sound is a stock that has collapsed, but there are several competing or alternative hypotheses to account for the initial collapse and subsequent lack of recovery. Factors failing the initial screening tests for statistical significance included indicators of the 1989 Exxon Valdez oil spill, coho salmon predation, sea lion predation, Pacific Decadal Oscillation, Northern Oscillation Index, and effects of containment in the herring egg-on-kelp pound fishery. The overall results indicate that the most statistically significant factors related to the lack of recovery of the herring stock involve competition or predation by juvenile hatchery pink salmon on herring juveniles. Secondary factors identified in the analysis were poor nutrition in the winter, ocean (Gulf of Alaska) temperature in the winter, the viral hemorrhagic septicemia virus, and the pathogen Ichthyophonus hoferi. The implication of this result to fisheries management in Prince William Sound is that it may well be difficult to simultaneously increase the production of pink salmon and maintain a viable Pacific herring fishery. The impact can be extended to other commercially important fisheries, and a whole ecosystem approach may be needed to evaluate the costs and benefits of salmon hatcheries.

  10. Longitudinal data analyses using linear mixed models in SPSS: concepts, procedures and illustrations.

    PubMed

    Shek, Daniel T L; Ma, Cecilia M S

    2011-01-05

    Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM) are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM) are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models) are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong are presented.

  11. Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

    PubMed Central

    Shek, Daniel T. L.; Ma, Cecilia M. S.

    2011-01-01

    Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM) are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM) are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models) are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong are presented. PMID:21218263

  12. Using spatiotemporal statistical models to estimate animal abundance and infer ecological dynamics from survey counts

    USGS Publications Warehouse

    Conn, Paul B.; Johnson, Devin S.; Ver Hoef, Jay M.; Hooten, Mevin B.; London, Joshua M.; Boveng, Peter L.

    2015-01-01

    Ecologists often fit models to survey data to estimate and explain variation in animal abundance. Such models typically require that animal density remains constant across the landscape where sampling is being conducted, a potentially problematic assumption for animals inhabiting dynamic landscapes or otherwise exhibiting considerable spatiotemporal variation in density. We review several concepts from the burgeoning literature on spatiotemporal statistical models, including the nature of the temporal structure (i.e., descriptive or dynamical) and strategies for dimension reduction to promote computational tractability. We also review several features as they specifically relate to abundance estimation, including boundary conditions, population closure, choice of link function, and extrapolation of predicted relationships to unsampled areas. We then compare a suite of novel and existing spatiotemporal hierarchical models for animal count data that permit animal density to vary over space and time, including formulations motivated by resource selection and allowing for closed populations. We gauge the relative performance (bias, precision, computational demands) of alternative spatiotemporal models when confronted with simulated and real data sets from dynamic animal populations. For the latter, we analyze spotted seal (Phoca largha) counts from an aerial survey of the Bering Sea where the quantity and quality of suitable habitat (sea ice) changed dramatically while surveys were being conducted. Simulation analyses suggested that multiple types of spatiotemporal models provide reasonable inference (low positive bias, high precision) about animal abundance, but have potential for overestimating precision. Analysis of spotted seal data indicated that several model formulations, including those based on a log-Gaussian Cox process, had a tendency to overestimate abundance. By contrast, a model that included a population closure assumption and a scale prior on total abundance produced estimates that largely conformed to our a priori expectation. Although care must be taken to tailor models to match the study population and survey data available, we argue that hierarchical spatiotemporal statistical models represent a powerful way forward for estimating abundance and explaining variation in the distribution of dynamical populations.

  13. Use of Vendedores (Mobile Food Vendors), Pulgas (Flea Markets), and Vecinos o Amigos (Neighbors or Friends) as Alternative Sources of Food for Purchase among Mexican-origin Households in Texas Border Colonias

    PubMed Central

    Sharkey, Joseph R.; Dean, Wesley R.; Johnson, Cassandra M.

    2012-01-01

    There is a paucity of studies acknowledging the existence of alternative food sources, and factors associated with food purchasing from three common alternative sources: vendedores (mobile food vendors), pulgas (flea markets), and vecinos/amigos (neighbors/friends). This analysis aims to examine the use of alternative food sources by Mexican-origin women from Texas-border colonias and determine factors associated with their use. The design was cross-sectional. Promotora-researchers (promotoras de salud trained in research methods) recruited 610 Mexican-origin women from 44 colonias and conducted in-person surveys. Surveys included participant characteristics and measures of food environment use and household food security. Statistical analyses included separate logistic regressions, modeled for food purchase from mobile food vendors, pulgas, or neighbors/friends (NFs). Child food insecurity was associated with purchasing food from mobile food vendors, while household food security was associated with using pulgas or NFs. School nutrition program participants were more likely to live in households that depend on alternative food sources. Efforts to increase healthful food consumption such as fruits and vegetables should acknowledge all potential food sources (traditional, convenience, non-traditional, and alternative), especially those preferred by colonia residents. Current findings support the conceptual broadening of the retail food environment, and the importance of linking use with spatial access (proximity) to more accurately depict access to food sources. PMID:22709775

  14. Clinical utility of the DSM-5 alternative model for borderline personality disorder: Differential diagnostic accuracy of the BFI, SCID-II-PQ, and PID-5.

    PubMed

    Fowler, J Christopher; Madan, Alok; Allen, Jon G; Patriquin, Michelle; Sharp, Carla; Oldham, John M; Frueh, B Christopher

    2018-01-01

    With the publication of the DSM-5 alternative model for personality disorders, it is critical to assess the components of the model against evidence-based models such as the five-factor model and the DSM-IV-TR categorical model. This study explored the relative clinical utility of these models in screening for borderline personality disorder (BPD). Receiver operating characteristics and diagnostic efficiency statistics were calculated for three personality measures to ascertain the relative diagnostic efficiency of each measure. A total of 1653 adult inpatients at a specialist psychiatric hospital completed SCID-II interviews. Sample 1 (n=653) completed the SCID-II interviews, the SCID-II Questionnaire (SCID-II-PQ) and the Big Five Inventory (BFI), while Sample 2 (n=1,000) completed the SCID-II interviews, the Personality Inventory for DSM-5 (PID-5) and the BFI. The BFI evidenced moderate accuracy for two composites: a high-neuroticism + low-agreeableness composite (AUC=0.72, SE=0.01, p<0.001) and a high-neuroticism + low-agreeableness + low-conscientiousness composite (AUC=0.73, SE=0.01, p<0.0001). The SCID-II-PQ evidenced moderate-to-excellent accuracy (AUC=0.86, SE=0.02, p<0.0001) with a good balance of specificity (SP=0.80) and sensitivity (SN=0.78). The PID-5 BPD algorithm (consisting of elevated emotional lability, anxiousness, separation insecurity, hostility, depressivity, impulsivity, and risk taking) evidenced moderate-to-excellent accuracy (AUC=0.87, SE=0.01, p<0.0001) with a good balance of specificity (SP=0.76) and sensitivity (SN=0.81). Findings generally support the use of the SCID-II-PQ and the PID-5 BPD algorithm for screening purposes. Furthermore, the findings support the accuracy of the DSM-5 alternative model Criterion B trait constellation for diagnosing BPD. Limitations of the study include the single inpatient setting and the use of two discrete samples to assess the PID-5 and the SCID-II-PQ. Copyright © 2017 Elsevier Inc. All rights reserved.
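
    The kind of diagnostic-efficiency summary reported here can be sketched, with entirely synthetic scores and diagnoses, as an AUC plus sensitivity and specificity at a chosen cutoff; the cutoff and simulated effect size below are illustrative only.

    ```python
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(3)
    diagnosis = rng.integers(0, 2, size=500)           # 1 = interview-based BPD diagnosis
    score = 1.2 * diagnosis + rng.normal(size=500)     # synthetic screening score

    auc = roc_auc_score(diagnosis, score)
    cutoff = 0.6                                       # illustrative screening cutoff
    predicted = score >= cutoff
    sensitivity = np.mean(predicted[diagnosis == 1])
    specificity = np.mean(~predicted[diagnosis == 0])
    print(f"AUC = {auc:.2f}, SN = {sensitivity:.2f}, SP = {specificity:.2f}")
    ```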

  15. Nonlinear Schrödinger approach to European option pricing

    NASA Astrophysics Data System (ADS)

    Wróblewski, Marcin

    2017-05-01

    This paper deals with numerical option pricing methods based on a Schrödinger model rather than the Black-Scholes model. Nonlinear Schrödinger boundary value problems appear to be alternatives to linear models that better reflect the complexity and behavior of real markets. Therefore, based on the nonlinear Schrödinger option pricing model proposed in the literature, in this paper a model augmented by external atomic potentials is proposed and numerically tested. In terms of statistical physics, the developed model describes the option in analogy to a pair of identical quantum particles occupying the same state. The proposed model is used to price European call options on a stock index. The model is calibrated using the Levenberg-Marquardt algorithm based on market data. A Runge-Kutta method is used to solve the discretized boundary value problem numerically. Numerical results are provided and discussed. It seems that our proposal more accurately models phenomena observed in the real market than do linear models.

  16. Finding Bounded Rational Equilibria. Part 2; Alternative Lagrangians and Uncountable Move Spaces

    NASA Technical Reports Server (NTRS)

    Wolpert, David H.

    2004-01-01

    A long-running difficulty with conventional game theory has been how to modify it to accommodate the bounded rationality characterizing all real-world players. A recurring issue in statistical physics is how best to approximate joint probability distributions with decoupled (and therefore far more tractable) distributions. It has recently been shown that the same information theoretic mathematical structure, known as Probability Collectives (PC), underlies both issues. This relationship between statistical physics and game theory allows techniques and insights from the one field to be applied to the other. In particular, PC provides a formal model-independent definition of the degree of rationality of a player and of bounded rationality equilibria. This pair of papers extends previous work on PC by introducing new computational approaches to effectively find bounded rationality equilibria of common-interest (team) games.

  17. Can PC-9 Zhong chong replace K-1 Yong quan for the acupunctural resuscitation of a bilateral double-amputee? Stating the “random criterion problem” in its statistical analysis

    PubMed Central

    Inchauspe, Adrián Angel

    2016-01-01

    AIM: To present an inclusion criterion for patients who have suffered bilateral amputation in order to be treated with the supplementary resuscitation treatment which is hereby proposed by the author. METHODS: This work is based on a Retrospective Cohort model so that a certainly lethal risk to the control group is avoided. RESULTS: This paper presents a hypothesis on acupunctural PC-9 Zhong chong point, further supported by previous statistical work recorded for the K-1 Yong quan resuscitation point. CONCLUSION: Thanks to the application of the resuscitation maneuver herein proposed on the previously mentioned point, patients with bilateral amputation would have another alternative treatment available in case basic and advanced CPR should fail. PMID:27152257

  18. Meta-analysis using Dirichlet process.

    PubMed

    Muthukumarana, Saman; Tiwari, Ram C

    2016-02-01

    This article develops a Bayesian approach for meta-analysis using the Dirichlet process. The key aspect of the Dirichlet process in meta-analysis is the ability to assess evidence of statistical heterogeneity or variation in the underlying effects across study while relaxing the distributional assumptions. We assume that the study effects are generated from a Dirichlet process. Under a Dirichlet process model, the study effects parameters have support on a discrete space and enable borrowing of information across studies while facilitating clustering among studies. We illustrate the proposed method by applying it to a dataset on the Program for International Student Assessment on 30 countries. Results from the data analysis, simulation studies, and the log pseudo-marginal likelihood model selection procedure indicate that the Dirichlet process model performs better than conventional alternative methods. © The Author(s) 2012.

  19. Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways

    NASA Astrophysics Data System (ADS)

    Kohlhoff, Kai J.; Shukla, Diwakar; Lawrenz, Morgan; Bowman, Gregory R.; Konerding, David E.; Belov, Dan; Altman, Russ B.; Pande, Vijay S.

    2014-01-01

    Simulations can provide tremendous insight into the atomistic details of biological mechanisms, but micro- to millisecond timescales are historically only accessible on dedicated supercomputers. We demonstrate that cloud computing is a viable alternative that brings long-timescale processes within reach of a broader community. We used Google's Exacycle cloud-computing platform to simulate two milliseconds of dynamics of a major drug target, the G-protein-coupled receptor β2AR. Markov state models aggregate independent simulations into a single statistical model that is validated by previous computational and experimental results. Moreover, our models provide an atomistic description of the activation of a G-protein-coupled receptor and reveal multiple activation pathways. Agonists and inverse agonists interact differentially with these pathways, with profound implications for drug design.
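
    The aggregation step behind a Markov state model can be sketched in a few lines: pool discretized state trajectories from many independent simulations into a row-normalized transition-count matrix at a chosen lag time. The toy trajectories below are illustrative, and state assignment is assumed to have been done beforehand.

    ```python
    import numpy as np

    def transition_matrix(trajectories, n_states, lag=1):
        """Row-normalized transition counts pooled over discretized trajectories."""
        counts = np.zeros((n_states, n_states))
        for traj in trajectories:                 # each traj: 1-D array of state indices
            for i, j in zip(traj[:-lag], traj[lag:]):
                counts[i, j] += 1
        rows = counts.sum(axis=1, keepdims=True)
        return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

    trajs = [np.array([0, 0, 1, 1, 2, 2, 1, 0]), np.array([2, 2, 1, 0, 0, 1])]
    print(transition_matrix(trajs, n_states=3))
    ```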

  20. Statistical and engineering methods for model enhancement

    NASA Astrophysics Data System (ADS)

    Chang, Chia-Jung

    Models which describe the performance of physical process are essential for quality prediction, experimental planning, process control and optimization. Engineering models developed based on the underlying physics/mechanics of the process such as analytic models or finite element models are widely used to capture the deterministic trend of the process. However, there usually exists stochastic randomness in the system which may introduce the discrepancy between physics-based model predictions and observations in reality. Alternatively, statistical models can be used to develop models to obtain predictions purely based on the data generated from the process. However, such models tend to perform poorly when predictions are made away from the observed data points. This dissertation contributes to model enhancement research by integrating physics-based model and statistical model to mitigate the individual drawbacks and provide models with better accuracy by combining the strengths of both models. The proposed model enhancement methodologies including the following two streams: (1) data-driven enhancement approach and (2) engineering-driven enhancement approach. Through these efforts, more adequate models are obtained, which leads to better performance in system forecasting, process monitoring and decision optimization. Among different data-driven enhancement approaches, Gaussian Process (GP) model provides a powerful methodology for calibrating a physical model in the presence of model uncertainties. However, if the data contain systematic experimental errors, the GP model can lead to an unnecessarily complex adjustment of the physical model. In Chapter 2, we proposed a novel enhancement procedure, named as “Minimal Adjustment”, which brings the physical model closer to the data by making minimal changes to it. This is achieved by approximating the GP model by a linear regression model and then applying a simultaneous variable selection of the model and experimental bias terms. Two real examples and simulations are presented to demonstrate the advantages of the proposed approach. Different from enhancing the model based on data-driven perspective, an alternative approach is to focus on adjusting the model by incorporating the additional domain or engineering knowledge when available. This often leads to models that are very simple and easy to interpret. The concepts of engineering-driven enhancement are carried out through two applications to demonstrate the proposed methodologies. In the first application where polymer composite quality is focused, nanoparticle dispersion has been identified as a crucial factor affecting the mechanical properties. Transmission Electron Microscopy (TEM) images are commonly used to represent nanoparticle dispersion without further quantifications on its characteristics. In Chapter 3, we developed the engineering-driven nonhomogeneous Poisson random field modeling strategy to characterize nanoparticle dispersion status of nanocomposite polymer, which quantitatively represents the nanomaterial quality presented through image data. The model parameters are estimated through the Bayesian MCMC technique to overcome the challenge of limited amount of accessible data due to the time consuming sampling schemes. The second application is to calibrate the engineering-driven force models of laser-assisted micro milling (LAMM) process statistically, which facilitates a systematic understanding and optimization of targeted processes. 
In Chapter 4, a force prediction interval is derived by incorporating the variability in the runout parameters as well as the variability in the measured cutting forces. The experimental results indicate that the model predicts the cutting force profile with good accuracy at the 95% confidence level. In summary, this dissertation draws attention to model enhancement, which has considerable impact on the modeling, design, and optimization of various processes and systems. The fundamental methodologies of model enhancement are developed and applied to several applications. These efforts produce engineering-compliant models that deliver adequate system predictions from observational data with complex variable relationships and uncertainty, facilitating process planning, monitoring, and real-time control.
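
    The "Minimal Adjustment" idea above, approximating the GP adjustment by a linear regression and keeping only the correction terms the data demand, can be sketched with an off-the-shelf penalised regression. The snippet below is a hedged illustration only, not the dissertation's actual procedure: the physics prediction, the candidate adjustment terms and the penalty level are all assumptions made for the example.

    ```python
    # Hedged sketch: simultaneous variable selection over candidate model-adjustment and
    # experimental-bias terms, in the spirit of a minimal adjustment to a physics-based
    # prediction. Data, physics model and penalty level are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n = 80
    x = rng.uniform(0, 1, size=n)
    physics_pred = 2.0 * x                             # physics-based model output (assumed form)
    y = physics_pred + 0.3 + rng.normal(0, 0.1, n)     # observations with a systematic bias

    # Candidate corrections: a constant bias term plus low-order polynomial adjustments.
    adjust = np.column_stack([np.ones(n), x, x**2])

    # Penalised regression on the residuals keeps only the corrections the data support.
    fit = Lasso(alpha=0.01, fit_intercept=False).fit(adjust, y - physics_pred)
    print("selected adjustment coefficients [const, x, x^2]:", np.round(fit.coef_, 3))
    ```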

  1. Machine learning techniques for medical diagnosis of diabetes using iris images.

    PubMed

    Samant, Piyush; Agarwal, Ravinder

    2018-04-01

Complementary and alternative medicine techniques have shown their potential for the treatment and diagnosis of chronic diseases such as diabetes and arthritis. At the same time, digital image processing for disease diagnosis is a reliable and rapidly growing field in biomedical engineering. The proposed model attempts to evaluate the diagnostic validity of an old complementary and alternative medicine technique, iridology, for the diagnosis of type-2 diabetes using soft computing methods. The investigation was performed on a closed group of 338 subjects (180 diabetic and 158 non-diabetic). Infra-red images of both eyes were captured simultaneously. The region of interest was cropped from the iris image as the zone corresponding to the position of the pancreas according to the iridology chart. Statistical, texture and discrete wavelet transform features were extracted from the region of interest. The results show a best classification accuracy of 89.63%, obtained with the RF classifier. Maximum specificity and sensitivity were observed as 0.9687 and 0.988, respectively. The results reveal the effectiveness and diagnostic significance of the proposed model for non-invasive, automatic diabetes diagnosis. Copyright © 2018 Elsevier B.V. All rights reserved.
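
    As a rough illustration of the classification step described above, the sketch below trains a random forest on placeholder feature vectors and reports sensitivity and specificity. The feature extraction from iris images is not reproduced; the data, labels and settings are synthetic assumptions.

    ```python
    # Hedged sketch: random-forest classification of image-derived feature vectors with
    # sensitivity/specificity reporting. The "features" here are random placeholders for
    # the statistical/texture/wavelet features extracted from the iris region of interest.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix

    rng = np.random.default_rng(1)
    X = rng.normal(size=(338, 20))                        # placeholder feature vectors
    y = (X[:, 0] + rng.normal(0, 1, 338) > 0).astype(int)  # synthetic labels (1 = diabetic)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    print("sensitivity:", round(tp / (tp + fn), 3), "specificity:", round(tn / (tn + fp), 3))
    ```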

  2. Valuing improved wetland quality using choice modeling

    NASA Astrophysics Data System (ADS)

    Morrison, Mark; Bennett, Jeff; Blamey, Russell

    1999-09-01

The main stated preference technique used for estimating environmental values is the contingent valuation method. In this paper the results of an application of an alternative technique, choice modeling, are reported. Choice modeling was developed in marketing and transport applications but has only been used in a handful of environmental applications, most of which have focused on use values. The case study presented here involves the estimation of the nonuse environmental values provided by the Macquarie Marshes, a major wetland in New South Wales, Australia. Estimates of the nonuse value the community places on preventing job losses are also presented. The reported models are robust, having high explanatory power and variables that are statistically significant and consistent with expectations. These results provide support for the hypothesis that choice modeling can be used to estimate nonuse values for both environmental and social consequences of resource use changes.

  3. Measuring individual differences in responses to date-rape vignettes using latent variable models.

    PubMed

    Tuliao, Antover P; Hoffman, Lesa; McChargue, Dennis E

    2017-01-01

Vignette methodology can be a flexible and powerful way to examine individual differences in response to dangerous real-life scenarios. However, most studies underutilize the usefulness of such methodology by analyzing only one outcome, which limits the ability to track event-related changes (e.g., vacillation in risk perception). The current study was designed to illustrate the dynamic influence of risk perception on exit point from a date-rape vignette. Our primary goal was to provide an illustrative example of how to use latent variable models for vignette methodology, including latent growth curve modeling with piecewise slopes, as well as latent variable measurement models. Through the combination of a step-by-step exposition in this text and corresponding model syntax available electronically, we detail an alternative statistical "blueprint" to enhance future violence research efforts using vignette methodology. Aggr. Behav. 43:60-73, 2017. © 2016 Wiley Periodicals, Inc.

  4. Modeling and forecasting the distribution of Vibrio vulnificus in Chesapeake Bay.

    PubMed

    Jacobs, J M; Rhodes, M; Brown, C W; Hood, R R; Leight, A; Long, W; Wood, R

    2014-11-01

To construct statistical models to predict the presence, abundance and potential virulence of Vibrio vulnificus in surface waters of Chesapeake Bay for implementation in ecological forecasting systems. We evaluated and applied previously published qPCR assays to water samples (n = 1636) collected from Chesapeake Bay from 2007 to 2010 in conjunction with State water quality monitoring programmes. A variety of statistical techniques were used in concert to identify water quality parameters associated with V. vulnificus presence, abundance and virulence markers in the interest of developing strong predictive models for use in regional oceanographic modeling systems. A suite of models is provided to represent the best model fit and alternatives using environmental variables, allowing them to be put to immediate use in current ecological forecasting efforts. Environmental parameters such as temperature, salinity and turbidity are capable of accurately predicting abundance and distribution of V. vulnificus in Chesapeake Bay. Forcing these empirical models with output from ocean modeling systems allows for spatially explicit forecasts for up to 48 h in the future. This study uses one of the largest data sets compiled to model Vibrio in an estuary, enhances our understanding of environmental correlates with abundance, distribution and presence of potentially virulent strains and offers a method to forecast these pathogens that may be replicated in other regions. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
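
    A minimal sketch of the kind of empirical presence/absence model described above is given below, assuming a logistic regression on temperature, salinity and turbidity. The data and coefficients are synthetic; the published models are certainly richer than this.

    ```python
    # Hedged sketch: a presence/absence model driven by water-quality covariates, loosely
    # in the spirit of the models above. Data are synthetic; the covariate names
    # (temperature, salinity, turbidity) follow the abstract.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 500
    temperature = rng.uniform(5, 30, n)
    salinity = rng.uniform(0, 25, n)
    turbidity = rng.uniform(1, 50, n)
    logit_p = -6 + 0.25 * temperature + 0.1 * salinity          # assumed true relationship
    presence = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))      # 1 = V. vulnificus detected

    X = sm.add_constant(np.column_stack([temperature, salinity, turbidity]))
    model = sm.Logit(presence, X).fit(disp=False)
    print(model.params)            # fitted coefficients
    print(model.predict(X[:5]))    # predicted probabilities of presence
    ```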

  5. Research Designs for Intervention Research with Small Samples II: Stepped Wedge and Interrupted Time-Series Designs.

    PubMed

    Fok, Carlotta Ching Ting; Henry, David; Allen, James

    2015-10-01

    The stepped wedge design (SWD) and the interrupted time-series design (ITSD) are two alternative research designs that maximize efficiency and statistical power with small samples when contrasted to the operating characteristics of conventional randomized controlled trials (RCT). This paper provides an overview and introduction to previous work with these designs and compares and contrasts them with the dynamic wait-list design (DWLD) and the regression point displacement design (RPDD), which were presented in a previous article (Wyman, Henry, Knoblauch, and Brown, Prevention Science. 2015) in this special section. The SWD and the DWLD are similar in that both are intervention implementation roll-out designs. We discuss similarities and differences between the SWD and DWLD in their historical origin and application, along with differences in the statistical modeling of each design. Next, we describe the main design characteristics of the ITSD, along with some of its strengths and limitations. We provide a critical comparative review of strengths and weaknesses in application of the ITSD, SWD, DWLD, and RPDD as small sample alternatives to application of the RCT, concluding with a discussion of the types of contextual factors that influence selection of an optimal research design by prevention researchers working with small samples.

  6. Research Designs for Intervention Research with Small Samples II: Stepped Wedge and Interrupted Time-Series Designs

    PubMed Central

    Ting Fok, Carlotta Ching; Henry, David; Allen, James

    2015-01-01

The stepped wedge design (SWD) and the interrupted time-series design (ITSD) are two alternative research designs that maximize efficiency and statistical power with small samples when contrasted to the operating characteristics of conventional randomized controlled trials (RCT). This paper provides an overview and introduction to previous work with these designs, and compares and contrasts them with the dynamic wait-list design (DWLD) and the regression point displacement design (RPDD), which were presented in a previous article (Wyman, Henry, Knoblauch, and Brown, 2015) in this Special Section. The SWD and the DWLD are similar in that both are intervention implementation roll-out designs. We discuss similarities and differences between the SWD and DWLD in their historical origin and application, along with differences in the statistical modeling of each design. Next, we describe the main design characteristics of the ITSD, along with some of its strengths and limitations. We provide a critical comparative review of strengths and weaknesses in application of the ITSD, SWD, DWLD, and RPDD as small sample alternatives to application of the RCT, concluding with a discussion of the types of contextual factors that influence selection of an optimal research design by prevention researchers working with small samples. PMID:26017633

  7. Comparison of different synthetic 5-min rainfall time series regarding their suitability for urban drainage modelling

    NASA Astrophysics Data System (ADS)

    van der Heijden, Sven; Callau Poduje, Ana; Müller, Hannes; Shehu, Bora; Haberlandt, Uwe; Lorenz, Manuel; Wagner, Sven; Kunstmann, Harald; Müller, Thomas; Mosthaf, Tobias; Bárdossy, András

    2015-04-01

For the design and operation of urban drainage systems with numerical simulation models, long, continuous precipitation time series with high temporal resolution are necessary. Suitable observed time series are rare. As a result, intelligent design concepts often use uncertain or unsuitable precipitation data, which renders them uneconomic or unsustainable. An expedient alternative to observed data is the use of long, synthetic rainfall time series as input for the simulation models. Within the project SYNOPSE, several different methods to generate synthetic precipitation data for urban drainage modelling are advanced, tested, and compared. The presented study compares four different precipitation modelling approaches regarding their ability to reproduce rainfall and runoff characteristics. These include one parametric stochastic model (alternating renewal approach), one non-parametric stochastic model (resampling approach), one downscaling approach from a regional climate model, and one disaggregation approach based on daily precipitation measurements. All four models produce long precipitation time series with a temporal resolution of five minutes. The synthetic time series are first compared to observed rainfall reference time series. Comparison criteria include event based statistics like mean dry spell and wet spell duration, wet spell amount and intensity, long term means of precipitation sum and number of events, and extreme value distributions for different durations. Then they are compared regarding simulated discharge characteristics using an urban hydrological model on a fictitious sewage network. First results show that all rainfall models are in principle suitable, but with different strengths and weaknesses regarding the rainfall and runoff characteristics considered.
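
    The event-based comparison criteria mentioned above (wet and dry spell durations, wet-spell amounts) can be computed from a 5-minute series as in the hedged sketch below; the rainfall series and the wet/dry threshold are synthetic assumptions.

    ```python
    # Hedged sketch: event-based statistics (wet/dry spell durations, wet-spell amounts)
    # for a 5-minute rainfall series, of the kind used to compare synthetic and observed
    # series. The series here is synthetic and the wet threshold is illustrative.
    import numpy as np

    rng = np.random.default_rng(3)
    rain = np.where(rng.random(10000) < 0.05, rng.exponential(0.4, 10000), 0.0)  # mm per 5 min

    wet = rain > 0.0
    # Split the series into runs of consecutive wet / dry time steps.
    change = np.flatnonzero(np.diff(wet.astype(int))) + 1
    runs = np.split(np.arange(wet.size), change)

    wet_durations = [len(r) * 5 for r in runs if wet[r[0]]]      # minutes
    dry_durations = [len(r) * 5 for r in runs if not wet[r[0]]]  # minutes
    wet_amounts = [rain[r].sum() for r in runs if wet[r[0]]]     # mm per event

    print("mean wet spell (min):", np.mean(wet_durations))
    print("mean dry spell (min):", np.mean(dry_durations))
    print("mean wet spell amount (mm):", np.mean(wet_amounts))
    ```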

  8. A Robust Alternative to the Normal Distribution.

    DTIC Science & Technology

    1982-07-07

for any Purpose of the United States Government. DEPARTMENT OF STATISTICS, STANFORD UNIVERSITY, STANFORD, CALIFORNIA. A Robust Alternative to the...Stanford University Technical Report No. 3. [5] Bhattacharya, S. K. (1966). A Modified Bessel Function Model in Life Testing. Metrika 10, 133-144.

  9. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-08-18

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. Copyright © 2015 Montesinos-López et al.
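
    For orientation, the sketch below fits an ordinal model with a logit link by maximum likelihood on synthetic marker data using statsmodels' OrderedModel. It is not the paper's Bayesian Pólya-Gamma Gibbs sampler, only a simple frequentist analogue of the logit-link ordinal model being discussed; the data, category cut points and number of markers are assumptions.

    ```python
    # Hedged sketch: ordinal regression with a logit link on synthetic "genomic" features.
    # This is a maximum-likelihood analogue, NOT the paper's Bayesian Polya-Gamma sampler.
    import numpy as np
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    rng = np.random.default_rng(4)
    n, p = 300, 5                                 # n observations, p markers (kept small here)
    X = rng.normal(size=(n, p))
    latent = X @ rng.normal(size=p) + rng.logistic(size=n)
    y = np.digitize(latent, [-1.0, 1.0])          # ordinal phenotype with 3 categories

    res = OrderedModel(y, X, distr="logit").fit(method="bfgs", disp=False)
    print(res.params)                 # slopes followed by threshold parameters
    print(res.predict(X[:3]))         # category probabilities for the first rows
    ```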

  10. Improving stability of regional numerical ocean models

    NASA Astrophysics Data System (ADS)

    Herzfeld, Mike

    2009-02-01

    An operational limited-area ocean modelling system was developed to supply forecasts of ocean state out to 3 days. This system is designed to allow non-specialist users to locate the model domain anywhere within the Australasian region with minimum user input. The model is required to produce a stable simulation every time it is invoked. This paper outlines the methodology used to ensure the model remains stable over the wide range of circumstances it might encounter. Central to the model configuration is an alternative approach to implementing open boundary conditions in a one-way nesting environment. Approximately 170 simulations were performed on limited areas in the Australasian region to assess the model stability; of these, 130 ran successfully with a static model parameterisation allowing a statistical estimate of the model’s approach toward instability to be determined. Based on this, when the model was deemed to be approaching instability a strategy of adaptive intervention in the form of constraint on velocity and elevation was invoked to maintain stability.

  11. Statistical modeling, detection, and segmentation of stains in digitized fabric images

    NASA Astrophysics Data System (ADS)

    Gururajan, Arunkumar; Sari-Sarraf, Hamed; Hequet, Eric F.

    2007-02-01

This paper will describe a novel and automated system based on a computer vision approach for objective evaluation of stain release on cotton fabrics. Digitized color images of the stained fabrics are obtained, and the pixel values in the color and intensity planes of these images are probabilistically modeled as a Gaussian Mixture Model (GMM). Stain detection is posed as a decision theoretic problem, where the null hypothesis corresponds to absence of a stain. The null hypothesis and the alternate hypothesis mathematically translate into a first-order GMM and a second-order GMM, respectively. The parameters of the GMM are estimated using a modified Expectation-Maximization (EM) algorithm. Minimum Description Length (MDL) is then used as the test statistic to decide the verity of the null hypothesis. The stain is then segmented by a decision rule based on the probability map generated by the EM algorithm. The proposed approach was tested on a dataset of 48 fabric images soiled with stains of ketchup, corn oil, mustard, Ragu sauce, Revlon makeup and grape juice. The decision theoretic part of the algorithm produced a correct detection rate (true positive) of 93% and a false alarm rate of 5% on this set of images.
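
    The hypothesis test described above, a one-component versus a two-component mixture, can be sketched with an information criterion standing in for MDL. The sketch below uses BIC from scikit-learn on synthetic pixel intensities, so both the data and the criterion are assumptions made for illustration.

    ```python
    # Hedged sketch: deciding between a one-component and a two-component Gaussian mixture
    # for pixel values, with BIC as a stand-in for the MDL criterion (both penalise model
    # complexity). Pixel intensities here are synthetic.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(5)
    background = rng.normal(0.7, 0.05, size=(5000, 1))     # unstained fabric intensities
    stain = rng.normal(0.45, 0.04, size=(500, 1))          # darker stained pixels
    pixels = np.vstack([background, stain])

    bic = {}
    for k in (1, 2):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(pixels)
        bic[k] = gmm.bic(pixels)
    print(bic)   # a lower BIC for k=2 plays the role of rejecting the "no stain" hypothesis

    if bic[2] < bic[1]:
        gmm2 = GaussianMixture(n_components=2, random_state=0).fit(pixels)
        stain_component = np.argmin(gmm2.means_.ravel())        # darker component
        stain_mask = gmm2.predict(pixels) == stain_component    # crude segmentation
        print("estimated stain fraction:", round(stain_mask.mean(), 3))
    ```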

  12. Model for interevent times with long tails and multifractality in human communications: An application to financial trading

    NASA Astrophysics Data System (ADS)

    Perelló, Josep; Masoliver, Jaume; Kasprzak, Andrzej; Kutner, Ryszard

    2008-09-01

    Social, technological, and economic time series are divided by events which are usually assumed to be random, albeit with some hierarchical structure. It is well known that the interevent statistics observed in these contexts differs from the Poissonian profile by being long-tailed distributed with resting and active periods interwoven. Understanding mechanisms generating consistent statistics has therefore become a central issue. The approach we present is taken from the continuous-time random-walk formalism and represents an analytical alternative to models of nontrivial priority that have been recently proposed. Our analysis also goes one step further by looking at the multifractal structure of the interevent times of human decisions. We here analyze the intertransaction time intervals of several financial markets. We observe that empirical data describe a subtle multifractal behavior. Our model explains this structure by taking the pausing-time density in the form of a superstatistics where the integral kernel quantifies the heterogeneous nature of the executed tasks. A stretched exponential kernel provides a multifractal profile valid for a certain limited range. A suggested heuristic analytical profile is capable of covering a broader region.

  13. Artificial neural networks in gynaecological diseases: current and potential future applications.

    PubMed

    Siristatidis, Charalampos S; Chrelias, Charalampos; Pouliakis, Abraham; Katsimanis, Evangelos; Kassanos, Dimitrios

    2010-10-01

    Current (and probably future) practice of medicine is mostly associated with prediction and accurate diagnosis. Especially in clinical practice, there is an increasing interest in constructing and using valid models of diagnosis and prediction. Artificial neural networks (ANNs) are mathematical systems being used as a prospective tool for reliable, flexible and quick assessment. They demonstrate high power in evaluating multifactorial data, assimilating information from multiple sources and detecting subtle and complex patterns. Their capability and difference from other statistical techniques lies in performing nonlinear statistical modelling. They represent a new alternative to logistic regression, which is the most commonly used method for developing predictive models for outcomes resulting from partitioning in medicine. In combination with the other non-algorithmic artificial intelligence techniques, they provide useful software engineering tools for the development of systems in quantitative medicine. Our paper first presents a brief introduction to ANNs, then, using what we consider the best available evidence through paradigms, we evaluate the ability of these networks to serve as first-line detection and prediction techniques in some of the most crucial fields in gynaecology. Finally, through the analysis of their current application, we explore their dynamics for future use.

  14. Does RAIM with Correct Exclusion Produce Unbiased Positions?

    PubMed Central

    Teunissen, Peter J. G.; Imparato, Davide; Tiberius, Christian C. J. M.

    2017-01-01

    As the navigation solution of exclusion-based RAIM follows from a combination of least-squares estimation and a statistically based exclusion-process, the computation of the integrity of the navigation solution has to take the propagated uncertainty of the combined estimation-testing procedure into account. In this contribution, we analyse, theoretically as well as empirically, the effect that this combination has on the first statistical moment, i.e., the mean, of the computed navigation solution. It will be shown, although statistical testing is intended to remove biases from the data, that biases will always remain under the alternative hypothesis, even when the correct alternative hypothesis is properly identified. The a posteriori exclusion of a biased satellite range from the position solution will therefore never remove the bias in the position solution completely. PMID:28672862

  15. Convergence between DSM-IV-TR and DSM-5 diagnostic models for personality disorder: evaluation of strategies for establishing diagnostic thresholds.

    PubMed

    Morey, Leslie C; Skodol, Andrew E

    2013-05-01

The Personality and Personality Disorders Work Group for the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) recommended substantial revisions to the personality disorders (PDs) section of DSM-IV-TR, proposing a hybrid categorical-dimensional model that represented PDs as combinations of core personality dysfunctions and various configurations of maladaptive personality traits. Although the DSM-5 Task Force endorsed the proposal, the Board of Trustees of the American Psychiatric Association (APA) did not, placing the Work Group's model in DSM-5 Section III ("Emerging Measures and Models") with other concepts thought to be in need of additional research. This paper documents the impact of using this alternative model in a national sample of 337 patients as described by clinicians familiar with their cases. In particular, the analyses focus on alternative strategies considered by the Work Group for deriving decision rules, or diagnostic thresholds, with which to assign categorical diagnoses. Results demonstrate that diagnostic rules could be derived that yielded appreciable correspondence between DSM-IV-TR and proposed DSM-5 PD diagnoses, a correspondence greater than that observed in the transition between DSM-III and DSM-III-R PDs. The approach also represents the most comprehensive attempt to date to provide conceptual and empirical justification for diagnostic thresholds utilized within the DSM PDs.

  16. On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve.

    PubMed

    Eng, Kevin H; Schiller, Emily; Morrell, Kayla

    2015-11-03

    Researchers developing biomarkers for cancer prognosis from quantitative gene expression data are often faced with an odd methodological discrepancy: while Cox's proportional hazards model, the appropriate and popular technique, produces a continuous and relative risk score, it is hard to cast the estimate in clear clinical terms like median months of survival and percent of patients affected. To produce a familiar Kaplan-Meier plot, researchers commonly make the decision to dichotomize a continuous (often unimodal and symmetric) score. It is well known in the statistical literature that this procedure induces significant bias. We illustrate the liabilities of common techniques for categorizing a risk score and discuss alternative approaches. We promote the use of the restricted mean survival (RMS) and the corresponding RMS curve that may be thought of as an analog to the best fit line from simple linear regression. Continuous biomarker workflows should be modified to include the more rigorous statistical techniques and descriptive plots described in this article. All statistics discussed can be computed via standard functions in the Survival package of the R statistical programming language. Example R language code for the RMS curve is presented in the appendix.
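
    The article's own examples use the R Survival package. Purely as an illustration of the restricted mean survival idea, the hedged Python sketch below fits a Kaplan-Meier curve with lifelines on synthetic data and integrates it up to an assumed horizon tau.

    ```python
    # Hedged sketch: a restricted mean survival (RMS) estimate from a Kaplan-Meier fit,
    # obtained by numerically integrating S(t) up to a horizon tau. Durations, censoring
    # pattern and tau are all made up for illustration.
    import numpy as np
    from lifelines import KaplanMeierFitter

    rng = np.random.default_rng(6)
    durations = rng.exponential(24, size=200)            # months (synthetic)
    events = rng.binomial(1, 0.7, size=200)              # 1 = event observed, 0 = censored

    km = KaplanMeierFitter().fit(durations, event_observed=events)

    tau = 60.0                                            # restrict to 60 months
    t = np.linspace(0, tau, 601)
    surv = km.survival_function_at_times(t).to_numpy()    # S(t) on a fine grid
    rms = np.trapz(surv, t)                               # area under S(t) up to tau
    print("restricted mean survival up to", tau, "months:", round(rms, 1))
    ```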

  17. Adaptive filtering in biological signal processing.

    PubMed

    Iyer, V K; Ploysongsang, Y; Ramamoorthy, P A

    1990-01-01

The high dependence of conventional optimal filtering methods on the a priori knowledge of the signal and noise statistics renders them ineffective in dealing with signals whose statistics cannot be predetermined accurately. Adaptive filtering methods offer a better alternative, since the a priori knowledge of statistics is less critical, real time processing is possible, and the computations are less expensive for this approach. Adaptive filtering methods compute the filter coefficients "on-line", converging to the optimal values in the least-mean square (LMS) error sense. Adaptive filtering is therefore apt for dealing with the "unknown" statistics situation and has been applied extensively in areas like communication, speech, radar, sonar, seismology, and biological signal processing and analysis for channel equalization, interference and echo canceling, line enhancement, signal detection, system identification, spectral analysis, beamforming, modeling, control, etc. In this review article adaptive filtering in the context of biological signals is reviewed. An intuitive approach to the underlying theory of adaptive filters and its applicability are presented. Applications of the principles in biological signal processing are discussed in a manner that brings out the key ideas involved. Current and potential future directions in adaptive biological signal processing are also discussed.
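
    A minimal LMS noise-cancellation sketch is given below to make the "on-line" coefficient update concrete; the signal, the interference, the reference channel and the step size are all illustrative assumptions rather than a real biological recording.

    ```python
    # Hedged sketch: a least-mean-squares (LMS) adaptive filter cancelling a sinusoidal
    # interference from a signal, illustrating the on-line coefficient update described
    # above. Signal, noise and step size are illustrative.
    import numpy as np

    rng = np.random.default_rng(7)
    n, taps, mu = 2000, 8, 0.01
    t = np.arange(n)
    signal = np.sin(2 * np.pi * 0.003 * t)                 # slow "physiological" component
    interference = 0.8 * np.sin(2 * np.pi * 0.05 * t)      # e.g. power-line-like noise
    primary = signal + interference                        # corrupted measurement
    reference = np.sin(2 * np.pi * 0.05 * t + 0.5)         # correlated noise reference

    w = np.zeros(taps)
    cleaned = np.zeros(n)
    for k in range(taps, n):
        x = reference[k - taps:k][::-1]       # most recent reference samples
        y = w @ x                             # filter output = estimated interference
        e = primary[k] - y                    # error = cleaned signal sample
        w += 2 * mu * e * x                   # LMS coefficient update
        cleaned[k] = e

    print("residual interference power:", round(np.var(cleaned[500:] - signal[500:]), 4))
    ```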

  18. Normal Approximations to the Distributions of the Wilcoxon Statistics: Accurate to What "N"? Graphical Insights

    ERIC Educational Resources Information Center

    Bellera, Carine A.; Julien, Marilyse; Hanley, James A.

    2010-01-01

    The Wilcoxon statistics are usually taught as nonparametric alternatives for the 1- and 2-sample Student-"t" statistics in situations where the data appear to arise from non-normal distributions, or where sample sizes are so small that we cannot check whether they do. In the past, critical values, based on exact tail areas, were…
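
    The contrast the article examines can be reproduced numerically by comparing exact and normal-approximation p-values at a small sample size, as in the hedged sketch below. The data are synthetic, and the method argument of scipy.stats.mannwhitneyu requires a reasonably recent SciPy.

    ```python
    # Hedged sketch: exact versus normal-approximation p-values for the
    # Wilcoxon-Mann-Whitney rank-sum statistic with small, non-normal samples.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    x = rng.exponential(1.0, size=8)          # small non-normal samples
    y = rng.exponential(1.5, size=9)

    exact = stats.mannwhitneyu(x, y, alternative="two-sided", method="exact")
    approx = stats.mannwhitneyu(x, y, alternative="two-sided", method="asymptotic")
    print("exact p-value:     ", round(exact.pvalue, 4))
    print("asymptotic p-value:", round(approx.pvalue, 4))
    ```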

  19. Novel Kalman filter algorithm for statistical monitoring of extensive landscapes with synoptic sensor data

    Treesearch

    Raymond L. Czaplewski

    2015-01-01

    Wall-to-wall remotely sensed data are increasingly available to monitor landscape dynamics over large geographic areas. However, statistical monitoring programs that use post-stratification cannot fully utilize those sensor data. The Kalman filter (KF) is an alternative statistical estimator. I develop a new KF algorithm that is numerically robust with large numbers of...
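
    The sketch below is a generic scalar Kalman filter applied to a made-up landscape proportion observed with sensor noise. It is not the article's algorithm, only a minimal illustration of the predict/update cycle, with all variances assumed known.

    ```python
    # Hedged sketch: a scalar Kalman filter tracking a slowly changing landscape proportion
    # (e.g. forest cover) from noisy annual sensor-based estimates. All numbers are made up.
    import numpy as np

    rng = np.random.default_rng(9)
    true_state = 0.40 + np.cumsum(rng.normal(0, 0.005, 20))   # true proportion by year
    obs = true_state + rng.normal(0, 0.03, 20)                # noisy remote-sensing estimate

    q, r = 0.005 ** 2, 0.03 ** 2     # process and measurement variances (assumed known)
    x, p = obs[0], r                 # initialise with the first observation
    estimates = [x]
    for z in obs[1:]:
        p = p + q                    # predict: state persists, uncertainty grows
        k = p / (p + r)              # Kalman gain
        x = x + k * (z - x)          # update with the new observation
        p = (1 - k) * p
        estimates.append(x)

    print("final estimate:", round(estimates[-1], 3), "vs truth:", round(true_state[-1], 3))
    ```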

  20. Noncentral Chi-Square versus Normal Distributions in Describing the Likelihood Ratio Statistic: The Univariate Case and Its Multivariate Implication

    ERIC Educational Resources Information Center

    Yuan, Ke-Hai

    2008-01-01

    In the literature of mean and covariance structure analysis, noncentral chi-square distribution is commonly used to describe the behavior of the likelihood ratio (LR) statistic under alternative hypothesis. Due to the inaccessibility of the rather technical literature for the distribution of the LR statistic, it is widely believed that the…
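
    As a quick numerical illustration of the comparison discussed above, the sketch below tabulates tail probabilities of a noncentral chi-square and of a moment-matched normal; the degrees of freedom and noncentrality are arbitrary assumptions.

    ```python
    # Hedged sketch: contrasting the noncentral chi-square and a moment-matched normal as
    # candidate descriptions of a likelihood ratio statistic under the alternative.
    import numpy as np
    from scipy import stats

    df, nc = 10, 15.0                          # degrees of freedom and noncentrality (assumed)
    ncx2 = stats.ncx2(df, nc)
    normal = stats.norm(loc=ncx2.mean(), scale=ncx2.std())   # moment-matched normal

    for q in np.linspace(0, 80, 9):
        print(f"T={q:5.1f}  ncx2 tail={ncx2.sf(q):.3f}  normal tail={normal.sf(q):.3f}")
    ```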

  1. Predicting Acquisition of Learning Outcomes: A Comparison of Traditional and Activity-Based Instruction in an Introductory Statistics Course.

    ERIC Educational Resources Information Center

    Geske, Jenenne A.; Mickelson, William T.; Bandalos, Deborah L.; Jonson, Jessica; Smith, Russell W.

    The bulk of experimental research related to reforms in the teaching of statistics concentrates on the effects of alternative teaching methods on statistics achievement. This study expands on that research by including an examination of the effects of instructor and the interaction between instructor and method on achievement as well as attitudes,…

  2. Hydrologic controls on aperiodic spatial organization of the ridge-slough patterned landscape

    NASA Astrophysics Data System (ADS)

    Casey, Stephen T.; Cohen, Matthew J.; Acharya, Subodh; Kaplan, David A.; Jawitz, James W.

    2016-11-01

    A century of hydrologic modification has altered the physical and biological drivers of landscape processes in the Everglades (Florida, USA). Restoring the ridge-slough patterned landscape, a dominant feature of the historical system, is a priority but requires an understanding of pattern genesis and degradation mechanisms. Physical experiments to evaluate alternative pattern formation mechanisms are limited by the long timescales of peat accumulation and loss, necessitating model-based comparisons, where support for a particular mechanism is based on model replication of extant patterning and trajectories of degradation. However, multiple mechanisms yield a central feature of ridge-slough patterning (patch elongation in the direction of historical flow), limiting the utility of that characteristic for discriminating among alternatives. Using data from vegetation maps, we investigated the statistical features of ridge-slough spatial patterning (ridge density, patch perimeter, elongation, patch size distributions, and spatial periodicity) to establish more rigorous criteria for evaluating model performance and to inform controls on pattern variation across the contemporary system. Mean water depth explained significant variation in ridge density, total perimeter, and length : width ratios, illustrating an important pattern response to existing hydrologic gradients. Two independent analyses (2-D periodograms and patch size distributions) provide strong evidence against regular patterning, with the landscape exhibiting neither a characteristic wavelength nor a characteristic patch size, both of which are expected under conditions that produce regular patterns. Rather, landscape properties suggest robust scale-free patterning, indicating genesis from the coupled effects of local facilitation and a global negative feedback operating uniformly at the landscape scale. Critically, this challenges widespread invocation of scale-dependent negative feedbacks for explaining ridge-slough pattern origins. These results help discern among genesis mechanisms and provide an improved statistical description of the landscape that can be used to compare among model outputs, as well as to assess the success of future restoration projects.

  3. Anatomy of the Higgs fits: A first guide to statistical treatments of the theoretical uncertainties

    NASA Astrophysics Data System (ADS)

    Fichet, Sylvain; Moreau, Grégory

    2016-04-01

    The studies of the Higgs boson couplings based on the recent and upcoming LHC data open up a new window on physics beyond the Standard Model. In this paper, we propose a statistical guide to the consistent treatment of the theoretical uncertainties entering the Higgs rate fits. Both the Bayesian and frequentist approaches are systematically analysed in a unified formalism. We present analytical expressions for the marginal likelihoods, useful to implement simultaneously the experimental and theoretical uncertainties. We review the various origins of the theoretical errors (QCD, EFT, PDF, production mode contamination…). All these individual uncertainties are thoroughly combined with the help of moment-based considerations. The theoretical correlations among Higgs detection channels appear to affect the location and size of the best-fit regions in the space of Higgs couplings. We discuss the recurrent question of the shape of the prior distributions for the individual theoretical errors and find that a nearly Gaussian prior arises from the error combinations. We also develop the bias approach, which is an alternative to marginalisation providing more conservative results. The statistical framework to apply the bias principle is introduced and two realisations of the bias are proposed. Finally, depending on the statistical treatment, the Standard Model prediction for the Higgs signal strengths is found to lie within either the 68% or 95% confidence level region obtained from the latest analyses of the 7 and 8 TeV LHC datasets.

  4. Late paleozoic fusulinoidean gigantism driven by atmospheric hyperoxia.

    PubMed

    Payne, Jonathan L; Groves, John R; Jost, Adam B; Nguyen, Thienan; Moffitt, Sarah E; Hill, Tessa M; Skotheim, Jan M

    2012-09-01

    Atmospheric hyperoxia, with pO(2) in excess of 30%, has long been hypothesized to account for late Paleozoic (360-250 million years ago) gigantism in numerous higher taxa. However, this hypothesis has not been evaluated statistically because comprehensive size data have not been compiled previously at sufficient temporal resolution to permit quantitative analysis. In this study, we test the hyperoxia-gigantism hypothesis by examining the fossil record of fusulinoidean foraminifers, a dramatic example of protistan gigantism with some individuals exceeding 10 cm in length and exceeding their relatives by six orders of magnitude in biovolume. We assembled and examined comprehensive regional and global, species-level datasets containing 270 and 1823 species, respectively. A statistical model of size evolution forced by atmospheric pO(2) is conclusively favored over alternative models based on random walks or a constant tendency toward size increase. Moreover, the ratios of volume to surface area in the largest fusulinoideans are consistent in magnitude and trend with a mathematical model based on oxygen transport limitation. We further validate the hyperoxia-gigantism model through an examination of modern foraminiferal species living along a measured gradient in oxygen concentration. These findings provide the first quantitative confirmation of a direct connection between Paleozoic gigantism and atmospheric hyperoxia. © 2012 The Author(s). Evolution© 2012 The Society for the Study of Evolution.

  5. 49 CFR Appendix B to Part 222 - Alternative Safety Measures

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... statistically valid baseline violation rate must be established through automated or systematic manual... enforcement, a program of public education and awareness directed at motor vehicle drivers, pedestrians and..., a statistically valid baseline violation rate must be established through automated or systematic...

  6. 49 CFR Appendix B to Part 222 - Alternative Safety Measures

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... statistically valid baseline violation rate must be established through automated or systematic manual... enforcement, a program of public education and awareness directed at motor vehicle drivers, pedestrians and..., a statistically valid baseline violation rate must be established through automated or systematic...

  7. On approaches to analyze the sensitivity of simulated hydrologic fluxes to model parameters in the community land model

    DOE PAGES

    Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...

    2015-12-04

Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field observed values. Four sensitivity analysis (SA) approaches, including analysis of variance based on the generalized linear model, generalized cross validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on support vector machine, are investigated. Results suggest that these approaches show consistent measurement of the impacts of major hydrologic parameters on response variables, but with differences in the relative contributions, particularly for the secondary parameters. The convergence behaviors of the SA with respect to the number of sampling points are also examined with different combinations of input parameter sets and output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for the calibration of the Community Land Model parameters to improve the model simulations of land surface fluxes, and approximates the magnitudes to be adjusted in the parameter values during parametric model optimization.
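
    One of the four SA approaches named above, standardized regression coefficients from a linear model, is easy to sketch. Below is a hedged example on synthetic parameter-response samples, with the parameter names and the response metric standing in for the Community Land Model quantities.

    ```python
    # Hedged sketch: standardized regression coefficients as a sensitivity measure on a
    # synthetic parameter-response sample. Parameter names are placeholders.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(10)
    n = 400
    params = rng.uniform(0, 1, size=(n, 3))                  # sampled hydrologic parameters
    response = 2.0 * params[:, 0] - 0.5 * params[:, 1] + rng.normal(0, 0.2, n)  # e.g. a runoff metric

    # Standardize inputs and output so coefficients are directly comparable.
    Z = (params - params.mean(0)) / params.std(0)
    r = (response - response.mean()) / response.std()
    src = LinearRegression().fit(Z, r).coef_
    for name, s in zip(["param_1", "param_2", "param_3"], src):
        print(f"{name}: standardized coefficient = {s:+.3f}")
    ```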

  8. Comment on Pisarenko et al., "Characterization of the Tail of the Distribution of Earthquake Magnitudes by Combining the GEV and GPD Descriptions of Extreme Value Theory"

    NASA Astrophysics Data System (ADS)

    Raschke, Mathias

    2016-02-01

In this short note, I comment on the research of Pisarenko et al. (Pure Appl. Geophys 171:1599-1624, 2014) regarding the extreme value theory and statistics in the case of earthquake magnitudes. The link between the generalized extreme value distribution (GEVD) as an asymptotic model for the block maxima of a random variable and the generalized Pareto distribution (GPD) as a model for the peaks over threshold (POT) of the same random variable is presented more clearly. Inappropriately, Pisarenko et al. (Pure Appl. Geophys 171:1599-1624, 2014) have neglected to note that the approximations by GEVD and GPD work only asymptotically in most cases. This is particularly the case with truncated exponential distribution (TED), a popular distribution model for earthquake magnitudes. I explain why the classical models and methods of the extreme value theory and statistics do not work well for truncated exponential distributions. Consequently, these classical methods should be applied with caution when estimating the upper bound magnitude and the corresponding parameters. Furthermore, I comment on various issues of statistical inference in Pisarenko et al. and propose alternatives. I argue why GPD and GEVD would work for various types of stochastic earthquake processes in time, and not only for the homogeneous (stationary) Poisson process as assumed by Pisarenko et al. (Pure Appl. Geophys 171:1599-1624, 2014). The crucial point of earthquake magnitudes is the poor convergence of their tail distribution to the GPD, and not the earthquake process over time.

  9. Investigation of Error Patterns in Geographical Databases

    NASA Technical Reports Server (NTRS)

    Dryer, David; Jacobs, Derya A.; Karayaz, Gamze; Gronbech, Chris; Jones, Denise R. (Technical Monitor)

    2002-01-01

The objective of the research conducted in this project is to develop a methodology to investigate the accuracy of Airport Safety Modeling Data (ASMD) using statistical, visualization, and Artificial Neural Network (ANN) techniques. Such a methodology can contribute to answering the following research questions: Over a representative sampling of ASMD databases, can statistical error analysis techniques be accurately learned and replicated by ANN modeling techniques? This representative ASMD sample should include numerous airports and a variety of terrain characterizations. Is it possible to identify and automate the recognition of patterns of error related to geographical features? Do such patterns of error relate to specific geographical features, such as elevation or terrain slope? Is it possible to combine the errors in small regions into an error prediction for a larger region? What are the data density reduction implications of this work? ASMD may be used as the source of terrain data for a synthetic visual system to be used in the cockpit of aircraft when visual reference to ground features is not possible during conditions of marginal weather or reduced visibility. In this research, United States Geologic Survey (USGS) digital elevation model (DEM) data has been selected as the benchmark. Artificial Neural Networks (ANNs) have been used and tested as alternate methods in place of the statistical methods in similar problems. They often perform better in pattern recognition, prediction and classification and categorization problems. Many studies show that when the data is complex and noisy, the accuracy of ANN models is generally higher than that of comparable traditional methods.

  10. Performance Metrics, Error Modeling, and Uncertainty Quantification

    NASA Technical Reports Server (NTRS)

    Tian, Yudong; Nearing, Grey S.; Peters-Lidard, Christa D.; Harrison, Kenneth W.; Tang, Ling

    2016-01-01

A common set of statistical metrics has been used to summarize the performance of models or measurements, the most widely used ones being bias, mean square error, and linear correlation coefficient. They assume linear, additive, Gaussian errors, and they are interdependent, incomplete, and incapable of directly quantifying uncertainty. The authors demonstrate that these metrics can be directly derived from the parameters of the simple linear error model. Since a correct error model captures the full error information, it is argued that the specification of a parametric error model should be an alternative to the metrics-based approach. The error-modeling methodology is applicable to both linear and nonlinear errors, while the metrics are only meaningful for linear errors. In addition, the error model expresses the error structure more naturally, and directly quantifies uncertainty. This argument is further explained by highlighting the intrinsic connections between the performance metrics, the error model, and the joint distribution between the data and the reference.
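
    The claim that the familiar metrics follow from the parameters of a simple linear error model can be checked numerically. The sketch below, on synthetic data, estimates the error-model parameters (a, b, sigma) and recovers bias, MSE and correlation from them; the data-generating settings are assumptions.

    ```python
    # Hedged sketch: estimate the parameters of a linear error model, data = a + b*truth + noise,
    # then recover bias, MSE and correlation from (a, b, sigma) and compare with the
    # directly computed metrics. Data are synthetic.
    import numpy as np

    rng = np.random.default_rng(11)
    truth = rng.gamma(2.0, 2.0, size=1000)                  # reference values
    data = 0.5 + 0.9 * truth + rng.normal(0, 1.0, 1000)     # measurements with additive error

    b, a = np.polyfit(truth, data, 1)                       # slope and intercept of error model
    sigma = np.std(data - (a + b * truth))                  # residual error scale

    mu, var = truth.mean(), truth.var()
    bias = a + (b - 1) * mu
    mse = bias ** 2 + (b - 1) ** 2 * var + sigma ** 2
    corr = b * np.sqrt(var) / np.sqrt(b ** 2 * var + sigma ** 2)

    print(round(bias, 3), round(np.mean(data - truth), 3))            # derived vs direct bias
    print(round(mse, 3), round(np.mean((data - truth) ** 2), 3))      # derived vs direct MSE
    print(round(corr, 3), round(np.corrcoef(truth, data)[0, 1], 3))   # derived vs direct correlation
    ```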

  11. Modeling of correlated data with informative cluster sizes: An evaluation of joint modeling and within-cluster resampling approaches.

    PubMed

    Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S

    2017-08-01

    Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
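
    A hedged sketch of the within-cluster resampling idea for a cluster-specific covariate is shown below: one observation is drawn per cluster, an ordinary logistic GLM is fitted, and the estimates are averaged over resamples. The cluster-size mechanism, effect sizes and number of resamples are illustrative assumptions.

    ```python
    # Hedged sketch: within-cluster resampling for a cluster-specific binary exposure with
    # informative cluster sizes. Data are synthetic.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12)
    n_clusters, B = 100, 200
    dose = rng.binomial(1, 0.5, n_clusters)                       # cluster-specific exposure
    sizes = 2 + rng.poisson(3 * (1 - dose))                       # exposed clusters are smaller
    cluster_ids = np.repeat(np.arange(n_clusters), sizes)
    x = dose[cluster_ids]
    y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * x))))      # binary outcome per observation

    betas = []
    for _ in range(B):
        # draw one random observation per cluster, then fit a standard logistic GLM
        idx = [rng.choice(np.flatnonzero(cluster_ids == c)) for c in range(n_clusters)]
        X = sm.add_constant(x[idx])
        betas.append(sm.GLM(y[idx], X, family=sm.families.Binomial()).fit().params[1])

    print("within-cluster-resampling log-odds ratio:", round(float(np.mean(betas)), 3))
    ```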

  12. Accounting for heterogeneity in meta-analysis using a multiplicative model-an empirical study.

    PubMed

    Mawdsley, David; Higgins, Julian P T; Sutton, Alex J; Abrams, Keith R

    2017-03-01

In meta-analysis, the random-effects model is often used to account for heterogeneity. The model assumes that heterogeneity has an additive effect on the variance of effect sizes. An alternative model, which assumes multiplicative heterogeneity, has been little used in the medical statistics community, but is widely used by particle physicists. In this paper, we compare the two models using a random sample of 448 meta-analyses drawn from the Cochrane Database of Systematic Reviews. In general, differences in goodness of fit are modest. The multiplicative model tends to give results that are closer to the null, with a narrower confidence interval. Both approaches make different assumptions about the outcome of the meta-analysis. In our opinion, the selection of the more appropriate model will often be guided by whether the multiplicative model's assumption of a single effect size is plausible. Copyright © 2016 John Wiley & Sons, Ltd.
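
    The additive and multiplicative treatments of heterogeneity can be contrasted on a toy example, as in the sketch below. The effect sizes and standard errors are made up, and flooring the multiplicative scale factor at 1 is only one common convention, not necessarily the paper's choice.

    ```python
    # Hedged sketch: the same toy meta-analysis summarised under an additive
    # (DerSimonian-Laird random-effects) and a multiplicative heterogeneity model.
    import numpy as np

    yi = np.array([0.30, 0.10, 0.45, -0.05, 0.25, 0.60])   # study effect estimates (synthetic)
    se = np.array([0.12, 0.20, 0.15, 0.18, 0.10, 0.25])    # their standard errors
    w = 1 / se**2
    k = len(yi)

    # Fixed-effect pieces shared by both models
    mu_fe = np.sum(w * yi) / np.sum(w)
    Q = np.sum(w * (yi - mu_fe) ** 2)

    # Additive model: DerSimonian-Laird tau^2 inflates each study variance.
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_add = 1 / (se**2 + tau2)
    mu_add = np.sum(w_add * yi) / np.sum(w_add)
    se_add = np.sqrt(1 / np.sum(w_add))

    # Multiplicative model: the fixed-effect variance is scaled by phi = Q / (k - 1).
    phi = max(1.0, Q / (k - 1))
    se_mult = np.sqrt(phi / np.sum(w))

    print(f"additive:       {mu_add:.3f} (SE {se_add:.3f})")
    print(f"multiplicative: {mu_fe:.3f} (SE {se_mult:.3f})")
    ```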

  13. Alternatives for the Disruptive and Delinquent: New Systems or New Teachers?

    ERIC Educational Resources Information Center

    Bell, Raymond

    1975-01-01

    No one would disagree that delinquency and more violent crimes are increasing in the nation's schools. To combat the grim statistics, this author has some concrete suggestions. If your school is considering alternative programs for the alienated, here are some pitfalls to avoid. (Editor)

  14. Effect of Alternate Nostril Breathing Exercise on Experimentally Induced Anxiety in Healthy Volunteers Using the Simulated Public Speaking Model: A Randomized Controlled Pilot Study.

    PubMed

    Kamath, Ashwin; Urval, Rathnakar P; Shenoy, Ashok K

    2017-01-01

    A randomized controlled pilot study was carried out to determine the effect of a 15-minute practice of ANB exercise on experimentally induced anxiety using the simulated public speaking model in yoga-naïve healthy young adults. Thirty consenting medical students were equally divided into test and control groups. The test group performed alternate nostril breathing exercise for 15 minutes, while the control group sat in a quiet room before participating in the simulated public speaking test (SPST). Visual Analog Mood Scale and Self-Statements during Public Speaking scale were used to measure the mood state at different phases of the SPST. The psychometric scores of both groups were comparable at baseline. Repeated-measures ANOVA showed a significant effect of phase ( p < 0.05), but group and gender did not have statistically significant influence on the mean anxiety scores. However, the test group showed a trend towards lower mean scores for the anxiety factor when compared with the control group. Considering the limitations of this pilot study and the trend seen towards lower anxiety in the test group, alternate nostril breathing may have potential anxiolytic effect in acute stressful situations. A study with larger sample size is therefore warranted. This trial is registered with CTRI/2014/03/004460.

  15. Effect of Alternate Nostril Breathing Exercise on Experimentally Induced Anxiety in Healthy Volunteers Using the Simulated Public Speaking Model: A Randomized Controlled Pilot Study

    PubMed Central

    Urval, Rathnakar P.; Shenoy, Ashok K.

    2017-01-01

    A randomized controlled pilot study was carried out to determine the effect of a 15-minute practice of ANB exercise on experimentally induced anxiety using the simulated public speaking model in yoga-naïve healthy young adults. Thirty consenting medical students were equally divided into test and control groups. The test group performed alternate nostril breathing exercise for 15 minutes, while the control group sat in a quiet room before participating in the simulated public speaking test (SPST). Visual Analog Mood Scale and Self-Statements during Public Speaking scale were used to measure the mood state at different phases of the SPST. The psychometric scores of both groups were comparable at baseline. Repeated-measures ANOVA showed a significant effect of phase (p < 0.05), but group and gender did not have statistically significant influence on the mean anxiety scores. However, the test group showed a trend towards lower mean scores for the anxiety factor when compared with the control group. Considering the limitations of this pilot study and the trend seen towards lower anxiety in the test group, alternate nostril breathing may have potential anxiolytic effect in acute stressful situations. A study with larger sample size is therefore warranted. This trial is registered with CTRI/2014/03/004460. PMID:29159176

  16. Sample size estimation for alternating logistic regressions analysis of multilevel randomized community trials of under-age drinking.

    PubMed

    Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark

    2012-07-01

    Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure.
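
    For orientation only, the sketch below performs a standard design-effect sample-size calculation for a two-arm cluster-randomized trial with a binary outcome. It is a simpler approximation than the alternating-logistic-regressions formulae derived in the paper, and the prevalences, intracluster correlation and cluster size are illustrative assumptions.

    ```python
    # Hedged sketch: design-effect sample size for a two-arm community-randomized trial
    # with a binary outcome. All inputs are illustrative.
    import math
    from scipy import stats

    p1, p2 = 0.30, 0.22        # expected drinking prevalence, control vs intervention (assumed)
    alpha, power = 0.05, 0.80
    icc, m = 0.02, 50          # intracluster correlation and youths sampled per community (assumed)

    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)

    # Individually randomized sample size per arm, then inflate by the design effect.
    n_ind = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    deff = 1 + (m - 1) * icc
    n_cluster = n_ind * deff
    print("individuals per arm:", math.ceil(n_cluster),
          "-> communities per arm:", math.ceil(n_cluster / m))
    ```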

  17. Cypriot nurses' knowledge and attitudes towards alternative medicine.

    PubMed

    Zoe, Roupa; Charalambous, Charalambos; Popi, Sotiropoulou; Maria, Rekleiti; Aris, Vasilopoulos; Agoritsa, Koulouri; Evangelia, Kotrotsiou

    2014-02-01

To investigate Cypriot nurses' knowledge and attitude towards alternative treatments. Two hundred randomly selected registered nurses from public hospitals in Cyprus were administered an anonymous self-report questionnaire with closed-type questions. The particular questionnaire has previously been used in similar surveys. Six questions referred to demographic data and 14 questions to attitudes and knowledge towards alternative medicine. One hundred and thirty-eight questionnaires were adequately completed and evaluated. Descriptive and inferential statistics were performed using SPSS 17.0. Statistical significance was set at p < 0.05. Over one-third of the sampled nurses reported that they had turned to some form of alternative treatment at some point in their lives in order to deal with a certain medical situation. Most of the nurses who reported some knowledge of specific alternative treatment methods (75.9%) also reported using such methods within their clinical practice. The nurses who had received some form of alternative treatment reported using it more often in their clinical practice, in comparison to those who had never received such treatment (Mann-Whitney U = 1137, p = 0.006). The more frequently nurses used alternative treatment in their clinical practice, the more interested they were in expanding their knowledge on the subject (Pearson's r = 0.250, p = 0.006). Most nurses are familiar with alternative medicine and interested in expanding their knowledge on the subject, despite the fact that they do not usually practice it. Special education and training as well as legislative actions are necessary for alternative medicine to be broadly accepted. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. 4D-Fingerprint Categorical QSAR Models for Skin Sensitization Based on Classification Local Lymph Node Assay Measures

    PubMed Central

    Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.

    2008-01-01

Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least squares coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ^2_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
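
    The goodness-of-fit statistic used above can be computed by hand once a categorical model is fitted. The hedged sketch below fits an ordinary logistic regression on synthetic descriptors and evaluates a decile-based Hosmer-Lemeshow chi-square; the descriptors and outcomes are made up, and the PLS-coupled variant is not reproduced.

    ```python
    # Hedged sketch: sensitizer/non-sensitizer logistic regression on synthetic descriptors,
    # with a decile-based Hosmer-Lemeshow chi-square computed by hand.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(13)
    n, p = 132, 4                                   # training-set size from the abstract, 4 toy descriptors
    X = rng.normal(size=(n, p))
    y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1]))))

    fit = sm.Logit(y, sm.add_constant(X)).fit(disp=False)
    prob = fit.predict(sm.add_constant(X))

    # Hosmer-Lemeshow: group by deciles of predicted risk, compare observed vs expected counts.
    deciles = np.quantile(prob, np.linspace(0, 1, 11))
    groups = np.clip(np.digitize(prob, deciles[1:-1]), 0, 9)
    chi2_hl = 0.0
    for g in range(10):
        idx = groups == g
        obs, exp, nn = y[idx].sum(), prob[idx].sum(), idx.sum()
        chi2_hl += (obs - exp) ** 2 / (exp * (1 - exp / nn) + 1e-12)
    print("Hosmer-Lemeshow chi-square (8 df):", round(chi2_hl, 2))
    ```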

  19. Automated Statistical Forecast Method to 36-48H ahead of Storm Wind and Dangerous Precipitation at the Mediterranean Region

    NASA Astrophysics Data System (ADS)

    Perekhodtseva, E. V.

    2009-09-01

Development of a successful method for forecasting storm winds, including squalls and tornadoes, and heavy rainfall, which often result in human and material losses, would allow proper measures to be taken to protect people and prevent the destruction of buildings. A successful forecast well in advance (from 12 to 48 hours) makes it possible to reduce the losses. Until recently, prediction of these phenomena was a very difficult problem for forecasters, and the existing graphical and calculation methods still depend on the subjective decision of an operator. At present there is no hydrodynamic model in Russia for forecasting maximal precipitation and wind velocity V > 25 m/s, so the main tools of objective forecasting are statistical methods that exploit the dependence of these phenomena on a number of atmospheric parameters (predictors). A statistical decision rule for the alternative and probabilistic forecast of these events was obtained in accordance with the "perfect prognosis" concept using objective analysis data. For this purpose, separate training samples for the presence and absence of storm wind and heavy rainfall were assembled automatically, each containing the values of forty physically substantiated potential predictors. An empirical statistical method was then applied that involves diagonalization of the mean correlation matrix R of the predictors and extraction of diagonal blocks of strongly correlated predictors; in this way the most informative predictors were selected for these phenomena without losing information. The statistical decision rules U(X) for diagnosis and prognosis of the phenomena were calculated for the chosen informative vector of predictors, using the Mahalanobis distance criterion and the Vapnik-Chervonenkis minimum-entropy criterion for predictor selection. The successful development of hydrodynamic models for short-term forecasting and the improvement of 36-48 h forecasts of pressure, temperature and other parameters allowed the prognostic fields of those models to be used to calculate the discriminant functions at the nodes of a 150x150 km grid and the probabilities P of dangerous wind, yielding fully automated forecasts. To convert to an alternative (yes/no) forecast, empirical threshold values are proposed for this phenomenon and a lead time of 36 hours. According to the Peirce-Obukhov skill criterion T, the success of these automated statistical forecasts of squalls and tornadoes 36-48 hours ahead and of heavy rainfall in the warm season over the territory of Italy, Spain and the Balkan countries is T = 1 - a - b = 0.54 to 0.78 in the author's experiments. Many examples of very successful forecasts of summer storm wind and heavy rainfall over Italy and Spain are presented in this report. The same decision rules were also applied to the forecast of these phenomena during the cold period of this year, when heavy snowfalls and storm winds were observed very often in Spain and Italy, and these forecasts were successful as well.
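
    The two ingredients of the alternative forecast described above, a discriminant decision rule and the Peirce-Obukhov skill score T = 1 - a - b, can be sketched as below. The predictors, event definition and train/test split are synthetic assumptions, and scikit-learn's linear discriminant stands in for the author's empirical discriminant functions.

    ```python
    # Hedged sketch: a linear (Mahalanobis-type) discriminant decision rule for a yes/no
    # forecast, scored with the Peirce-Obukhov skill T = hit rate - false-alarm rate.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(14)
    n = 1000
    X = rng.normal(size=(n, 5))                                     # atmospheric predictors (synthetic)
    event = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 1, n)) > 1.5   # storm yes/no (synthetic)

    lda = LinearDiscriminantAnalysis().fit(X[:700], event[:700])
    forecast = lda.predict(X[700:])
    obs = event[700:]

    hit_rate = np.mean(forecast[obs])            # fraction of observed events that were forecast
    false_alarm_rate = np.mean(forecast[~obs])   # fraction of non-events that were forecast
    T = hit_rate - false_alarm_rate              # Peirce-Obukhov skill score
    print("T =", round(T, 2))
    ```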

  20. Low-complexity stochastic modeling of wall-bounded shear flows

    NASA Astrophysics Data System (ADS)

    Zare, Armin

    Turbulent flows are ubiquitous in nature and they appear in many engineering applications. Transition to turbulence, in general, increases skin-friction drag in air/water vehicles compromising their fuel-efficiency and reduces the efficiency and longevity of wind turbines. While traditional flow control techniques combine physical intuition with costly experiments, their effectiveness can be significantly enhanced by control design based on low-complexity models and optimization. In this dissertation, we develop a theoretical and computational framework for the low-complexity stochastic modeling of wall-bounded shear flows. Part I of the dissertation is devoted to the development of a modeling framework which incorporates data-driven techniques to refine physics-based models. We consider the problem of completing partially known sample statistics in a way that is consistent with underlying stochastically driven linear dynamics. Neither the statistics nor the dynamics are precisely known. Thus, our objective is to reconcile the two in a parsimonious manner. To this end, we formulate optimization problems to identify the dynamics and directionality of input excitation in order to explain and complete available covariance data. For problem sizes that general-purpose solvers cannot handle, we develop customized optimization algorithms based on alternating direction methods. The solution to the optimization problem provides information about critical directions that have maximal effect in bringing model and statistics in agreement. In Part II, we employ our modeling framework to account for statistical signatures of turbulent channel flow using low-complexity stochastic dynamical models. We demonstrate that white-in-time stochastic forcing is not sufficient to explain turbulent flow statistics and develop models for colored-in-time forcing of the linearized Navier-Stokes equations. We also examine the efficacy of stochastically forced linearized NS equations and their parabolized equivalents in the receptivity analysis of velocity fluctuations to external sources of excitation as well as capturing the effect of the slowly-varying base flow on streamwise streaks and Tollmien-Schlichting waves. In Part III, we develop a model-based approach to design surface actuation of turbulent channel flow in the form of streamwise traveling waves. This approach is capable of identifying the drag reducing trends of traveling waves in a simulation-free manner. We also use the stochastically forced linearized NS equations to examine the Reynolds number independent effects of spanwise wall oscillations on drag reduction in turbulent channel flows. This allows us to extend the predictive capability of our simulation-free approach to high Reynolds numbers.
