Sample records for statistical error analysis

  1. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670

  2. Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain

    PubMed Central

    Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young

    2010-01-01

    Background Statistical analysis is essential for obtaining objective reliability in medical research. However, medical researchers do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention to improve the statistical quality of the journal. Methods All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only to applying statistical procedures but also to the reviewing process, in order to improve the value of the articles. PMID:20552071

  3. Linearised and non-linearised isotherm models optimization analysis by error functions and statistical means

    PubMed Central

    2014-01-01

    In adsorption studies, describing the sorption process and identifying the best-fitting isotherm model is a key analysis for investigating the theoretical hypothesis. Hence, numerous statistical analyses have been used extensively to compare the experimental equilibrium adsorption values with the predicted equilibrium values. In the present study, the following statistical analyses were used to evaluate the fitness of the adsorption isotherm models: the Pearson correlation, the coefficient of determination, and the Chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was evaluated for the linearised and non-linearised models. The adsorption of phenol onto natural soil (local name: Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To obtain a holistic view of the analysis when estimating the isotherm parameters, the linear and non-linear isotherm models were compared. The results revealed which of the above-mentioned error functions and statistical measures best identified the best-fitting isotherm. PMID:25018878
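
    A minimal sketch of the kind of comparison this record describes, assuming a Langmuir isotherm and invented equilibrium data (these are not the phenol/Kalathur-soil measurements): the model is fitted in both non-linearised and linearised form, and each fit is scored with a chi-square error function and the coefficient of determination.

    ```python
    # Hypothetical illustration: linearised vs. non-linearised Langmuir fits
    # scored by error functions. All data and parameter values are invented.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import pearsonr

    Ce = np.array([5.0, 10.0, 20.0, 40.0, 80.0])   # equilibrium concentration (mg/L)
    qe = np.array([1.8, 3.1, 4.6, 5.9, 6.8])       # amount adsorbed (mg/g)

    def langmuir(C, qmax, KL):
        return qmax * KL * C / (1.0 + KL * C)

    # Non-linearised fit
    popt, _ = curve_fit(langmuir, Ce, qe, p0=[7.0, 0.05])
    q_nl = langmuir(Ce, *popt)

    # Linearised fit: Ce/qe = Ce/qmax + 1/(qmax*KL)
    slope, intercept = np.polyfit(Ce, Ce / qe, 1)
    q_lin = langmuir(Ce, 1.0 / slope, slope / intercept)

    for name, pred in [("non-linear", q_nl), ("linearised", q_lin)]:
        chi2 = np.sum((qe - pred) ** 2 / pred)     # chi-square error function
        r2 = pearsonr(qe, pred)[0] ** 2            # coefficient of determination
        print(f"{name}: chi2={chi2:.4f}, R2={r2:.4f}")
    ```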

  4. Normality Tests for Statistical Analysis: A Guide for Non-Statisticians

    PubMed Central

    Ghasemi, Asghar; Zahediasl, Saleh

    2012-01-01

    Statistical errors are common in the scientific literature, and about 50% of published articles contain at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to provide an overview of how to check for normality in statistical analysis using SPSS. PMID:23843808
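
    The commentary demonstrates these checks in SPSS; as a rough scripted equivalent, the sketch below runs two common normality tests on invented data using scipy.

    ```python
    # Normality checks on a hypothetical sample (illustrative only).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=100, scale=15, size=40)   # invented measurements

    w, p_sw = stats.shapiro(sample)        # Shapiro-Wilk, suited to small samples
    k2, p_dp = stats.normaltest(sample)    # D'Agostino-Pearson omnibus K^2

    print(f"Shapiro-Wilk: W={w:.3f}, p={p_sw:.3f}")
    print(f"D'Agostino-Pearson: K2={k2:.3f}, p={p_dp:.3f}")
    # If both p-values exceed 0.05, there is no evidence against normality and
    # parametric tests that assume it remain defensible for these data.
    ```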

  5. Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: a Monte Carlo study.

    PubMed

    Chou, C P; Bentler, P M; Satorra, A

    1991-11-01

    Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.

  6. Analysis of uncertainties and convergence of the statistical quantities in turbulent wall-bounded flows by means of a physically based criterion

    NASA Astrophysics Data System (ADS)

    Andrade, João Rodrigo; Martins, Ramon Silva; Thompson, Roney Leon; Mompean, Gilmar; da Silveira Neto, Aristeu

    2018-04-01

    The present paper provides an analysis of the statistical uncertainties associated with direct numerical simulation (DNS) results and experimental data for turbulent channel and pipe flows, showing a new physically based quantification of these errors, to improve the determination of the statistical deviations between DNSs and experiments. The analysis is carried out using a recently proposed criterion by Thompson et al. ["A methodology to evaluate statistical errors in DNS data of plane channel flows," Comput. Fluids 130, 1-7 (2016)] for fully turbulent plane channel flows, where the mean velocity error is estimated by considering the Reynolds stress tensor, and using the balance of the mean force equation. It also presents how the residual error evolves in time for a DNS of a plane channel flow, and the influence of the Reynolds number on its convergence rate. The root mean square of the residual error is shown in order to capture a single quantitative value of the error associated with the dimensionless averaging time. The evolution in time of the error norm is compared with the final error provided by DNS data of similar Reynolds numbers available in the literature. A direct consequence of this approach is that it was possible to compare different numerical results and experimental data, providing an improved understanding of the convergence of the statistical quantities in turbulent wall-bounded flows.

  7. [Character of refractive errors in population study performed by the Area Military Medical Commission in Lodz].

    PubMed

    Nowak, Michał S; Goś, Roman; Smigielski, Janusz

    2008-01-01

    To determine the prevalence of refractive errors in a population. A retrospective review of medical examinations for entry to the military service from The Area Military Medical Commission in Lodz. Ophthalmic examinations were performed. We used statistical analysis to review the results. Statistical analysis revealed that refractive errors occurred in 21.68% of the population. The most common refractive error was myopia. 1) The most common ocular diseases are refractive errors, especially myopia (21.68% in total). 2) Refractive surgery and contact lenses should be allowed as possible corrections of refractive errors for military service.

  8. Evaluation of assumptions in soil moisture triple collocation analysis

    USDA-ARS?s Scientific Manuscript database

    Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually-independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and ort...
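
    The record is truncated, but the classical triple collocation estimator it refers to can be sketched. Assuming three products with mutually independent, zero-mean errors, each product's error variance follows from the pairwise covariances; the data below are synthetic stand-ins, not real soil moisture retrievals.

    ```python
    # Triple collocation error variances from pairwise covariances (toy data).
    import numpy as np

    rng = np.random.default_rng(1)
    truth = rng.normal(0.25, 0.05, 5000)            # "true" soil moisture
    x = truth + rng.normal(0, 0.02, truth.size)     # e.g. satellite retrieval
    y = truth + rng.normal(0, 0.03, truth.size)     # e.g. land-surface model
    z = truth + rng.normal(0, 0.04, truth.size)     # e.g. in-situ product

    C = np.cov(np.vstack([x, y, z]))
    ev_x = C[0, 0] - C[0, 1] * C[0, 2] / C[1, 2]
    ev_y = C[1, 1] - C[0, 1] * C[1, 2] / C[0, 2]
    ev_z = C[2, 2] - C[0, 2] * C[1, 2] / C[0, 1]
    print(np.sqrt([ev_x, ev_y, ev_z]))   # recovers roughly [0.02, 0.03, 0.04]
    ```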

  9. Statistical model to perform error analysis of curve fits of wind tunnel test data using the techniques of analysis of variance and regression analysis

    NASA Technical Reports Server (NTRS)

    Alston, D. W.

    1981-01-01

    The objective of this research was to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the evolved true statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to remove the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.

  10. Effects of measurement errors on psychometric measurements in ergonomics studies: Implications for correlations, ANOVA, linear regression, factor analysis, and linear discriminant analysis.

    PubMed

    Liu, Yan; Salvendy, Gavriel

    2009-05-01

    This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies, and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools have been discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis, and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experiment results, which the authors believe is critical to research progress in theory development and cumulative knowledge in the ergonomics field.
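
    The attenuation effect the abstract describes for correlation has a simple closed form: the observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities. A sketch with invented reliabilities:

    ```python
    # Attenuation of correlation by random measurement error (invented values).
    import numpy as np

    rng = np.random.default_rng(2)
    n = 10000
    true_x = rng.normal(size=n)
    true_y = 0.6 * true_x + np.sqrt(1 - 0.6**2) * rng.normal(size=n)  # r_true = 0.6

    obs_x = true_x + rng.normal(scale=0.5, size=n)   # reliability 1/(1+0.25) = 0.8
    obs_y = true_y + rng.normal(scale=0.5, size=n)

    r_obs = np.corrcoef(obs_x, obs_y)[0, 1]
    print(r_obs)   # about 0.6 * sqrt(0.8 * 0.8) = 0.48, attenuated from 0.6
    ```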

  11. Microscopic saw mark analysis: an empirical approach.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2015-01-01

    Microscopic saw mark analysis is a well-published and generally accepted qualitative analytical method. However, little research has focused on identifying and mitigating potential sources of error associated with the method. The presented study proposes the use of classification trees and random forest classifiers as an optimal, statistically sound approach to mitigating potential variability and outcome error in microscopic saw mark analysis. The statistical model was applied to 58 experimental saw marks created with four types of saws. The saw marks were made in fresh human femurs obtained through anatomical gift and were analyzed using a Keyence digital microscope. The statistical approach weighed the variables based on discriminatory value and produced decision trees with an associated outcome error rate of 8.62-17.82%. © 2014 American Academy of Forensic Sciences.
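
    A hypothetical re-creation of the statistical approach named in the record: a random forest classifier over saw-mark variables, using the out-of-bag (OOB) error as the outcome error rate. The features and labels below are random placeholders, so the printed error rate only demonstrates the computation, not the 8.62-17.82% reported for real saw marks.

    ```python
    # Random forest with out-of-bag error for a four-saw classification task.
    # X and y are invented stand-ins for measured saw-mark variables.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(3)
    n_marks, n_features = 58, 6                 # 58 experimental saw marks
    X = rng.normal(size=(n_marks, n_features))  # e.g. kerf width, striae spacing
    y = rng.integers(0, 4, size=n_marks)        # four saw types

    clf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
    clf.fit(X, y)
    print("OOB error rate:", 1.0 - clf.oob_score_)
    print("variable weights:", clf.feature_importances_)  # discriminatory value
    ```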

  12. Statistical Analysis Experiment for Freshman Chemistry Lab.

    ERIC Educational Resources Information Center

    Salzsieder, John C.

    1995-01-01

    Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…

  13. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.

    PubMed

    Lin, Johnny; Bentler, Peter M

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra-Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra-Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.

  14. Meta-analysis inside and outside particle physics: two traditions that should converge?

    PubMed

    Baker, Rose D; Jackson, Dan

    2013-06-01

    The use of meta-analysis in medicine and epidemiology really took off in the 1970s. However, in high-energy physics, the Particle Data Group has been carrying out meta-analyses of measurements of particle masses and other properties since 1957. Curiously, there has been virtually no interaction between those working inside and outside particle physics. In this paper, we use statistical models to study two major differences in practice. The first is the usefulness of systematic errors, which physicists are now beginning to quote in addition to statistical errors. The second is whether it is better to treat heterogeneity by scaling up errors as do the Particle Data Group or by adding a random effect as does the rest of the community. Besides fitting models, we derive and use an exact test of the error-scaling hypothesis. We also discuss the other methodological differences between the two streams of meta-analysis. Our conclusion is that systematic errors are not currently very useful and that the conventional random effects model, as routinely used in meta-analysis, has a useful role to play in particle physics. The moral we draw for statisticians is that we should be more willing to explore 'grassroots' areas of statistical application, so that good statistical practice can flow both from and back to the statistical mainstream. Copyright © 2012 John Wiley & Sons, Ltd.
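
    The two heterogeneity treatments the paper compares can be put side by side in a few lines. Under invented measurements, the sketch below pools estimates with the Particle Data Group's error-scaling rule and with a DerSimonian-Laird random-effects model.

    ```python
    # PDG error scaling vs. DerSimonian-Laird random effects (invented data).
    import numpy as np

    y = np.array([1.12, 1.05, 1.21, 0.98, 1.15])    # study estimates
    se = np.array([0.05, 0.04, 0.06, 0.05, 0.07])   # quoted standard errors
    w = 1.0 / se**2
    k = y.size

    mean_fe = np.sum(w * y) / np.sum(w)             # fixed-effect pooled mean
    Q = np.sum(w * (y - mean_fe) ** 2)              # heterogeneity statistic

    # PDG rule: scale all errors so that chi2 per degree of freedom is one.
    scale = max(1.0, np.sqrt(Q / (k - 1)))
    se_pdg = scale / np.sqrt(np.sum(w))

    # Random effects: add a between-study variance tau^2 (DerSimonian-Laird).
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1.0 / (se**2 + tau2)
    mean_re, se_re = np.sum(w_re * y) / np.sum(w_re), 1.0 / np.sqrt(np.sum(w_re))

    print(f"PDG:           {mean_fe:.3f} +/- {se_pdg:.3f}")
    print(f"random effect: {mean_re:.3f} +/- {se_re:.3f}")
    ```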

  15. Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization

    NASA Technical Reports Server (NTRS)

    Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred

    2014-01-01

    In order to provide a complete description of a material's thermoelectric power factor, an uncertainty interval is required in addition to the measured nominal value. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. The work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters, in addition to uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.
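
    As a sketch of the bookkeeping the abstract describes, the snippet below combines a t-based statistical (precision) term with root-sum-squared systematic (bias) terms into one uncertainty interval. The numbers are invented and are not ZEM-3 specifications.

    ```python
    # Combining statistical and systematic uncertainty in quadrature (toy values).
    import numpy as np
    from scipy import stats

    seebeck = np.array([152.1, 153.4, 151.8, 152.9, 153.0])   # repeats, uV/K

    n = seebeck.size
    u_stat = stats.t.ppf(0.975, n - 1) * seebeck.std(ddof=1) / np.sqrt(n)

    # Assumed bias bounds, e.g. probe placement and sample geometry (invented):
    u_sys = np.sqrt(1.2**2 + 0.8**2)            # root-sum-square of bias terms

    u_total = np.sqrt(u_stat**2 + u_sys**2)     # combined in quadrature
    print(f"S = {seebeck.mean():.1f} +/- {u_total:.1f} uV/K")
    ```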

  16. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis

    PubMed Central

    Lin, Johnny; Bentler, Peter M.

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne’s asymptotically distribution-free method and Satorra-Bentler’s mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra-Bentler’s statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby’s study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic. PMID:23144511

  17. Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science.

    PubMed

    Veldkamp, Coosje L S; Nuijten, Michèle B; Dominguez-Alvarez, Linda; van Assen, Marcel A L M; Wicherts, Jelte M

    2014-01-01

    Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this 'co-piloting' currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
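
    The automated consistency check described above can be sketched in a few lines: recompute the p-value implied by a reported test statistic and its degrees of freedom, then compare it with the reported p. The reported triple below is invented.

    ```python
    # Recomputing a p-value from a reported t statistic and df (statcheck-style).
    from scipy import stats

    def p_consistent(t, df, p_reported, tol=0.01):
        """Two-tailed p implied by t(df), compared with the reported value."""
        p = 2 * stats.t.sf(abs(t), df)
        return p, abs(p - p_reported) <= tol

    p, ok = p_consistent(t=2.10, df=28, p_reported=0.03)   # invented report
    print(f"recomputed p = {p:.4f}, consistent with report: {ok}")
    # Here the recomputed p is about 0.045, so the reported 0.03 is flagged.
    ```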

  18. Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science

    PubMed Central

    Veldkamp, Coosje L. S.; Nuijten, Michèle B.; Dominguez-Alvarez, Linda; van Assen, Marcel A. L. M.; Wicherts, Jelte M.

    2014-01-01

    Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this ‘co-piloting’ currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors. PMID:25493918

  19. Scout trajectory error propagation computer program

    NASA Technical Reports Server (NTRS)

    Myler, T. R.

    1982-01-01

    Since 1969, flight experience has been used as the basis for predicting Scout orbital accuracy. The data used for calculating the accuracy consists of errors in the trajectory parameters (altitude, velocity, etc.) at stage burnout as observed on Scout flights. Approximately 50 sets of errors are used in Monte Carlo analysis to generate error statistics in the trajectory parameters. A covariance matrix is formed which may be propagated in time. The mechanization of this process resulted in computer program Scout Trajectory Error Propagation (STEP) and is described herein. Computer program STEP may be used in conjunction with the Statistical Orbital Analysis Routine to generate accuracy in the orbit parameters (apogee, perigee, inclination, etc.) based upon flight experience.
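
    A minimal sketch of the STEP idea under synthetic inputs: take a set of observed burnout-error vectors, form their covariance matrix, and summarise the one-sigma trajectory error statistics. The 50 error sets below are random stand-ins for the flight-derived data.

    ```python
    # Monte Carlo error statistics and covariance matrix from ~50 error sets.
    import numpy as np

    rng = np.random.default_rng(4)
    # Columns: altitude error (m), velocity error (m/s), flight path angle (deg).
    flight_errors = rng.multivariate_normal(
        mean=[0.0, 0.0, 0.0],
        cov=[[400.0, 30.0, 0.1], [30.0, 25.0, 0.05], [0.1, 0.05, 0.01]],
        size=50,
    )

    cov = np.cov(flight_errors, rowvar=False)   # covariance matrix to propagate
    print("1-sigma errors:", np.sqrt(np.diag(cov)))
    ```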

  20. Statistical analysis of modeling error in structural dynamic systems

    NASA Technical Reports Server (NTRS)

    Hasselman, T. K.; Chrostowski, J. D.

    1990-01-01

    The paper presents a generic statistical model of the (total) modeling error for conventional space structures in their launch configuration. Modeling error is defined as the difference between analytical prediction and experimental measurement. It is represented by the differences between predicted and measured real eigenvalues and eigenvectors. Comparisons are made between pre-test and post-test models. Total modeling error is then subdivided into measurement error, experimental error and 'pure' modeling error, and comparisons made between measurement error and total modeling error. The generic statistical model presented in this paper is based on the first four global (primary structure) modes of four different structures belonging to the generic category of Conventional Space Structures (specifically excluding large truss-type space structures). As such, it may be used to evaluate the uncertainty of predicted mode shapes and frequencies, sinusoidal response, or the transient response of other structures belonging to the same generic category.

  1. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as the critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.

  2. Trial Sequential Analysis in systematic reviews with meta-analysis.

    PubMed

    Wetterslev, Jørn; Jakobsen, Janus Christian; Gluud, Christian

    2017-03-06

    Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentistic approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D²) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that the Trial Sequential Analysis provides better control of type I errors and of type II errors than the traditional naïve meta-analysis. Trial Sequential Analysis represents analysis of meta-analytic data, with transparent assumptions, and better control of type I and type II errors than the traditional meta-analysis using naïve unadjusted confidence intervals.
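
    A rough sketch of the boundary idea, assuming an O'Brien-Fleming-shaped Lan-DeMets boundary and ignoring the diversity adjustment and discrete-look corrections that the full Trial Sequential Analysis methodology applies: the significance threshold tightens when the accrued information is below the required information size.

    ```python
    # Approximate O'Brien-Fleming-type monitoring boundaries (illustrative only).
    from scipy import stats

    alpha = 0.05
    z_final = stats.norm.ppf(1 - alpha / 2)          # 1.96 at full information

    for frac in (0.25, 0.50, 0.75, 1.00):            # accrued / required information
        z_bound = z_final / frac**0.5                # boundary inflates early on
        p_thresh = 2 * (1 - stats.norm.cdf(z_bound)) # restricted p threshold
        print(f"information fraction {frac:.2f}: |z| >= {z_bound:.2f} "
              f"(nominal p <= {p_thresh:.5f})")
    ```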

  3. The GEOS Ozone Data Assimilation System: Specification of Error Statistics

    NASA Technical Reports Server (NTRS)

    Stajner, Ivanka; Riishojgaard, Lars Peter; Rood, Richard B.

    2000-01-01

    A global three-dimensional ozone data assimilation system has been developed at the Data Assimilation Office of the NASA/Goddard Space Flight Center. The Total Ozone Mapping Spectrometer (TOMS) total ozone and the Solar Backscatter Ultraviolet (SBUV) or (SBUV/2) partial ozone profile observations are assimilated. The assimilation, into an off-line ozone transport model, is done using the global Physical-space Statistical Analysis Scheme (PSAS). This system became operational in December 1999. A detailed description of the statistical analysis scheme, and in particular, the forecast and observation error covariance models is given. A new global anisotropic horizontal forecast error correlation model accounts for a varying distribution of observations with latitude. Correlations are largest in the zonal direction in the tropics where data are sparse. The forecast error variance model is proportional to the ozone field. The forecast error covariance parameters were determined by maximum likelihood estimation. The error covariance models are validated using χ² statistics. The analyzed ozone fields in the winter of 1992 are validated against independent observations from ozone sondes and HALOE. There is better than 10% agreement between mean Halogen Occultation Experiment (HALOE) and analysis fields between 70 and 0.2 hPa. The global root-mean-square (RMS) difference between TOMS observed and forecast values is less than 4%. The global RMS difference between SBUV observed and analyzed ozone between 50 and 3 hPa is less than 15%.

  4. Water quality management using statistical analysis and time-series prediction model

    NASA Astrophysics Data System (ADS)

    Parmar, Kulwinder Singh; Bhardwaj, Rashmi

    2014-12-01

    This paper deals with water quality management using statistical analysis and a time-series prediction model. The monthly variation of water quality standards has been used to compare the statistical mean, median, mode, standard deviation, kurtosis, skewness, and coefficient of variation at the Yamuna River. The model was validated using R-squared, root mean square error, mean absolute percentage error, maximum absolute percentage error, mean absolute error, maximum absolute error, normalized Bayesian information criterion, Ljung-Box analysis, predicted values and confidence limits. Using an autoregressive integrated moving average (ARIMA) model, future water quality parameter values have been estimated. It is observed that the predictive model is useful at 95 % confidence limits and the curve is platykurtic for potential of hydrogen (pH), free ammonia, total Kjeldahl nitrogen, dissolved oxygen, and water temperature (WT), and leptokurtic for chemical oxygen demand and biochemical oxygen demand. Also, it is observed that the predicted series is close to the original series, which provides a perfect fit. All parameters except pH and WT cross the prescribed limits of the World Health Organization/United States Environmental Protection Agency, and thus the water is not fit for drinking, agricultural or industrial use.
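
    The workflow in the abstract (ARIMA fitting plus error-based validation) can be sketched as below; the monthly series is synthetic, standing in for a Yamuna River parameter such as pH.

    ```python
    # ARIMA fit with RMSE and MAPE validation on a synthetic monthly series.
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(5)
    months = 96
    series = (7.5 + 0.5 * np.sin(2 * np.pi * np.arange(months) / 12)
              + rng.normal(0, 0.15, months))       # invented monthly pH values

    train, test = series[:84], series[84:]
    fit = ARIMA(train, order=(1, 0, 1)).fit()      # order chosen for illustration
    forecast = fit.forecast(steps=test.size)

    rmse = np.sqrt(np.mean((test - forecast) ** 2))
    mape = 100 * np.mean(np.abs((test - forecast) / test))
    print(f"RMSE={rmse:.3f}, MAPE={mape:.2f}%")
    ```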

  5. Application of Statistics in Engineering Technology Programs

    ERIC Educational Resources Information Center

    Zhan, Wei; Fink, Rainer; Fang, Alex

    2010-01-01

    Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. Traditionally, however, statistics is not extensively used in undergraduate engineering technology (ET) programs, resulting in a major disconnect from industry…

  6. Error Analysis for RADAR Neighbor Matching Localization in Linear Logarithmic Strength Varying Wi-Fi Environment

    PubMed Central

    Tian, Zengshan; Xu, Kunjie; Yu, Xiang

    2014-01-01

    This paper studies the statistical errors of fingerprint-based RADAR neighbor matching localization with linearly calibrated reference points (RPs) in a logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve efficient and reliable location-based services (LBSs) as well as ubiquitous context-awareness in Wi-Fi environments, much attention has to be paid to highly accurate and cost-efficient localization systems. To this end, the statistical errors of the widely used neighbor matching localization are discussed in detail in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs, using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors of RADAR neighbor matching localization can be an effective tool for exploring alternative deployments of fingerprint-based neighbor matching localization systems in the future. PMID:24683349

  7. Error analysis for RADAR neighbor matching localization in linear logarithmic strength varying Wi-Fi environment.

    PubMed

    Zhou, Mu; Tian, Zengshan; Xu, Kunjie; Yu, Xiang; Wu, Haibo

    2014-01-01

    This paper studies the statistical errors of fingerprint-based RADAR neighbor matching localization with linearly calibrated reference points (RPs) in a logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve efficient and reliable location-based services (LBSs) as well as ubiquitous context-awareness in Wi-Fi environments, much attention has to be paid to highly accurate and cost-efficient localization systems. To this end, the statistical errors of the widely used neighbor matching localization are discussed in detail in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs, using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors of RADAR neighbor matching localization can be an effective tool for exploring alternative deployments of fingerprint-based neighbor matching localization systems in the future.
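
    The localization scheme both records analyze can be sketched end to end: reference-point fingerprints generated by a log-distance path-loss model, then a k-nearest-neighbor match against a noisy measurement. Geometry, the path-loss exponent, and the noise level below are all invented for illustration.

    ```python
    # RSS fingerprint neighbor matching with a log-distance path-loss model.
    import numpy as np

    rng = np.random.default_rng(6)

    def rss(d, p0=-40.0, n=3.0):
        """Received signal strength at distance d (log-distance model)."""
        return p0 - 10.0 * n * np.log10(np.maximum(d, 0.1))

    aps = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])    # access points
    rps = np.array([[x, y] for x in range(0, 11, 2) for y in range(0, 11, 2)])
    fps = np.array([[rss(np.linalg.norm(rp - ap)) for ap in aps] for rp in rps])

    true_pos = np.array([4.3, 6.7])
    measured = (np.array([rss(np.linalg.norm(true_pos - ap)) for ap in aps])
                + rng.normal(0, 2.0, len(aps)))               # shadowing noise

    k = 3
    nearest = np.argsort(np.sum((fps - measured) ** 2, axis=1))[:k]
    estimate = rps[nearest].mean(axis=0)          # k-nearest-neighbor estimate
    print("localization error (m):", np.linalg.norm(estimate - true_pos))
    ```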

  8. WASP (Write a Scientific Paper) using Excel - 6: Standard error and confidence interval.

    PubMed

    Grech, Victor

    2018-03-01

    The calculation of descriptive statistics includes the calculation of standard error and confidence interval, an inevitable component of data analysis in inferential statistics. This paper provides pointers as to how to do this in Microsoft Excel™. Copyright © 2018 Elsevier B.V. All rights reserved.
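
    For consistency with the other examples this sketch uses Python rather than Excel, but the quantities are the ones the paper computes; the Excel equivalents are noted in the comments and the data are invented.

    ```python
    # Standard error of the mean and a t-based 95% confidence interval.
    import numpy as np
    from scipy import stats

    data = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9])
    n = data.size
    se = data.std(ddof=1) / np.sqrt(n)       # Excel: =STDEV.S(range)/SQRT(n)
    t_crit = stats.t.ppf(0.975, n - 1)       # Excel: =T.INV.2T(0.05, n-1)
    lo, hi = data.mean() - t_crit * se, data.mean() + t_crit * se
    print(f"mean={data.mean():.3f}, SE={se:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
    ```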

  9. Common Scientific and Statistical Errors in Obesity Research

    PubMed Central

    George, Brandon J.; Beasley, T. Mark; Brown, Andrew W.; Dawson, John; Dimova, Rositsa; Divers, Jasmin; Goldsby, TaShauna U.; Heo, Moonseong; Kaiser, Kathryn A.; Keith, Scott; Kim, Mimi Y.; Li, Peng; Mehta, Tapan; Oakes, J. Michael; Skinner, Asheley; Stuart, Elizabeth; Allison, David B.

    2015-01-01

    We identify 10 common errors and problems in the statistical analysis, design, interpretation, and reporting of obesity research and discuss how they can be avoided. The 10 topics are: 1) misinterpretation of statistical significance, 2) inappropriate testing against baseline values, 3) excessive and undisclosed multiple testing and “p-value hacking,” 4) mishandling of clustering in cluster randomized trials, 5) misconceptions about nonparametric tests, 6) mishandling of missing data, 7) miscalculation of effect sizes, 8) ignoring regression to the mean, 9) ignoring confirmation bias, and 10) insufficient statistical reporting. We hope that discussion of these errors can improve the quality of obesity research by helping researchers to implement proper statistical practice and to know when to seek the help of a statistician. PMID:27028280

  10. The Effects of Measurement Error on Statistical Models for Analyzing Change. Final Report.

    ERIC Educational Resources Information Center

    Dunivant, Noel

    The results of six major projects are discussed including a comprehensive mathematical and statistical analysis of the problems caused by errors of measurement in linear models for assessing change. In a general matrix representation of the problem, several new analytic results are proved concerning the parameters which affect bias in…

  11. Student Distractor Choices on the Mathematics Virginia Standards of Learning Middle School Assessments

    ERIC Educational Resources Information Center

    Lewis, Virginia Vimpeny

    2011-01-01

    Number Concepts; Measurement; Geometry; Probability; Statistics; and Patterns, Functions and Algebra. Procedural Errors were further categorized into the following content categories: Computation; Measurement; Statistics; and Patterns, Functions, and Algebra. The results of the analysis showed the main sources of error for 6th, 7th, and 8th…

  12. The role of ensemble-based statistics in variational assimilation of cloud-affected observations from infrared imagers

    NASA Astrophysics Data System (ADS)

    Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris

    2017-04-01

    Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to the static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and an operational variational data assimilation system to provide a basis for understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions and to allow more observations to be assimilated without the need for strict background checks that would eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. In this presentation we provide details that explain the apparent benefit of using ensembles for cloudy radiance assimilation in an EnVar context.

  13. Visual Data Analysis for Satellites

    NASA Technical Reports Server (NTRS)

    Lau, Yee; Bhate, Sachin; Fitzpatrick, Patrick

    2008-01-01

    The Visual Data Analysis Package is a collection of programs and scripts that facilitate visual analysis of data available from NASA and NOAA satellites, as well as dropsonde, buoy, and conventional in-situ observations. The package features utilities for data extraction, data quality control, statistical analysis, and data visualization. The Hierarchical Data Format (HDF) satellite data extraction routines from NASA's Jet Propulsion Laboratory were customized for specific spatial coverage and file input/output. Statistical analysis includes the calculation of the relative error, the absolute error, and the root mean square error. Other capabilities include curve fitting through the data points to fill in missing data points between satellite passes or where clouds obscure satellite data. For data visualization, the software provides customizable Generic Mapping Tool (GMT) scripts to generate difference maps, scatter plots, line plots, vector plots, histograms, time series, and color fill images.

  14. BaTMAn: Bayesian Technique for Multi-image Analysis

    NASA Astrophysics Data System (ADS)

    Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.

    2016-12-01

    Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.

  15. Empirical performance of interpolation techniques in risk-neutral density (RND) estimation

    NASA Astrophysics Data System (ADS)

    Bahaludin, H.; Abdullah, M. H.

    2017-03-01

    The objective of this study is to evaluate the empirical performance of interpolation techniques in risk-neutral density (RND) estimation. Firstly, the empirical performance is evaluated using statistical analysis based on the implied mean and the implied variance of the RND. Secondly, the interpolation performance is measured based on pricing error. We propose using the leave-one-out cross-validation (LOOCV) pricing error for interpolation selection purposes. The statistical analyses indicate that there are statistical differences between the interpolation techniques: second-order polynomial, fourth-order polynomial and smoothing spline. The results for the LOOCV pricing error show that interpolation using a fourth-order polynomial provides the best fit to option prices, with the lowest error value.
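
    The LOOCV selection criterion proposed above can be sketched for polynomial interpolation: refit with each observation held out, predict it, and take the root mean square of the held-out errors. The volatility-smile data below are invented.

    ```python
    # Leave-one-out cross-validation error for competing polynomial fits.
    import numpy as np

    rng = np.random.default_rng(7)
    strikes = np.linspace(90, 110, 11)
    iv = (0.2 + 0.0005 * (strikes - 100) ** 2
          + rng.normal(0, 0.004, strikes.size))     # invented smile + noise

    def loocv_rmse(degree):
        errors = []
        for i in range(strikes.size):
            mask = np.arange(strikes.size) != i     # hold out observation i
            coef = np.polyfit(strikes[mask], iv[mask], degree)
            errors.append(iv[i] - np.polyval(coef, strikes[i]))
        return np.sqrt(np.mean(np.square(errors)))

    for deg in (2, 4):
        print(f"degree {deg}: LOOCV RMSE = {loocv_rmse(deg):.5f}")
    ```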

  16. Statistical process control methods allow the analysis and improvement of anesthesia care.

    PubMed

    Fasting, Sigurd; Gisvold, Sven E

    2003-10-01

    Quality aspects of the anesthetic process are reflected in the rate of intraoperative adverse events. The purpose of this report is to illustrate how the quality of the anesthesia process can be analyzed using statistical process control methods, and exemplify how this analysis can be used for quality improvement. We prospectively recorded anesthesia-related data from all anesthetics for five years. The data included intraoperative adverse events, which were graded into four levels, according to severity. We selected four adverse events, representing important quality and safety aspects, for statistical process control analysis. These were: inadequate regional anesthesia, difficult emergence from general anesthesia, intubation difficulties and drug errors. We analyzed the underlying process using 'p-charts' for statistical process control. In 65,170 anesthetics we recorded adverse events in 18.3%; mostly of lesser severity. Control charts were used to define statistically the predictable normal variation in problem rate, and then used as a basis for analysis of the selected problems with the following results: Inadequate plexus anesthesia: stable process, but unacceptably high failure rate; Difficult emergence: unstable process, because of quality improvement efforts; Intubation difficulties: stable process, rate acceptable; Medication errors: methodology not suited because of low rate of errors. By applying statistical process control methods to the analysis of adverse events, we have exemplified how this allows us to determine if a process is stable, whether an intervention is required, and if quality improvement efforts have the desired effect.
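
    A minimal p-chart in the spirit of the analysis above: monthly adverse-event proportions plotted against three-sigma binomial control limits around the pooled rate. The counts are invented, not the study's 65,170 anesthetics.

    ```python
    # p-chart: adverse-event proportions vs. 3-sigma control limits (toy counts).
    import numpy as np

    events = np.array([190, 203, 185, 250, 178, 195])    # adverse events / month
    cases = np.array([1050, 1100, 1010, 1080, 990, 1060])

    p_bar = events.sum() / cases.sum()                   # centre line
    sigma = np.sqrt(p_bar * (1 - p_bar) / cases)         # per-month sigma
    ucl, lcl = p_bar + 3 * sigma, np.maximum(p_bar - 3 * sigma, 0.0)

    for m, (r, lo, hi) in enumerate(zip(events / cases, lcl, ucl), start=1):
        state = "in control" if lo <= r <= hi else "OUT OF CONTROL"
        print(f"month {m}: p={r:.3f} limits=[{lo:.3f}, {hi:.3f}] {state}")
    ```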

  17. Impact and quantification of the sources of error in DNA pooling designs.

    PubMed

    Jawaid, A; Sham, P

    2009-01-01

    The analysis of genome wide variation offers the possibility of unravelling the genes involved in the pathogenesis of disease. Genome wide association studies are also particularly useful for identifying and validating targets for therapeutic intervention as well as for detecting markers for drug efficacy and side effects. The cost of such large-scale genetic association studies may be reduced substantially by the analysis of pooled DNA from multiple individuals. However, experimental errors inherent in pooling studies lead to a potential increase in the false positive rate and a loss in power compared to individual genotyping. Here we quantify various sources of experimental error using empirical data from typical pooling experiments and corresponding individual genotyping counts using two statistical methods. We provide analytical formulas for calculating these different errors in the absence of complete information, such as replicate pool formation, and for adjusting for the errors in the statistical analysis. We demonstrate that DNA pooling has the potential of estimating allele frequencies accurately, and adjusting the pooled allele frequency estimates for differential allelic amplification considerably improves accuracy. Estimates of the components of error show that differential allelic amplification is the most important contributor to the error variance in absolute allele frequency estimation, followed by allele frequency measurement and pool formation errors. Our results emphasise the importance of minimising experimental errors and obtaining correct error estimates in genetic association studies.

  18. Visual Survey of Infantry Troops. Part 1. Visual Acuity, Refractive Status, Interpupillary Distance and Visual Skills

    DTIC Science & Technology

    1989-06-01

    letters on one line and several letters on the next line, there is no accurate way to credit these extra letters for statistical analysis. The decimal and...contains the descriptive statistics of the objective refractive error components of infantrymen. Figures 8-11 show the frequency distributions for sphere...equivalents. Nonspectacle wearers: Table 12 contains the descriptive statistics for non-spectacle wearers. Based on these refractive error data, about 30

  19. Implementation of an experimental program to investigate the performance characteristics of OMEGA navigation

    NASA Technical Reports Server (NTRS)

    Baxa, E. G., Jr.

    1974-01-01

    A theoretical formulation of differential and composite OMEGA error is presented to establish hypotheses about the functional relationships between various parameters and OMEGA navigational errors. Computer software developed to provide for extensive statistical analysis of the phase data is described. Results from the regression analysis used to conduct parameter sensitivity studies on differential OMEGA error tend to validate the theoretically based hypothesis concerning the relationship between uncorrected differential OMEGA error and receiver separation range and azimuth. Limited results of measurement of receiver repeatability error and line of position measurement error are also presented.

  20. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study.

    PubMed

    Nour-Eldein, Hebatallah

    2016-01-01

    Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. To determine the statistical methods used and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of the FM articles. Inferential methods were recorded in 62/66 (93.9%) of the FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-test 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, there was no prior sample size calculation in 19/60 (31.7%), application of wrong statistical tests in 17/60 (28.3%), incomplete documentation of statistics in 59/60 (98.3%), reporting of P values without test statistics in 32/60 (53.3%), no reporting of confidence intervals with effect size measures in 12/60 (20.0%), and use of the mean (standard deviation) to describe ordinal/nonnormal data in 8/60 (13.3%); errors related to interpretation were mainly conclusions without support by the study data, 5/60 (8.3%). Inferential statistics were used in the majority of FM articles. Data analysis and reporting of statistics are areas for improvement in FM research articles.

  1. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study

    PubMed Central

    Nour-Eldein, Hebatallah

    2016-01-01

    Background: Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. Objectives: To determine the statistical methods used and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. Methods: This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of FM articles. Results: Inferential methods were recorded in 62/66 (93.9%) of the FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-test 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, there was no prior sample size calculation in 19/60 (31.7%), application of wrong statistical tests in 17/60 (28.3%), incomplete documentation of statistics in 59/60 (98.3%), reporting of P values without test statistics in 32/60 (53.3%), no reporting of confidence intervals with effect size measures in 12/60 (20.0%), and use of the mean (standard deviation) to describe ordinal/nonnormal data in 8/60 (13.3%); errors related to interpretation were mainly conclusions without support by the study data, 5/60 (8.3%). Conclusion: Inferential statistics were used in the majority of FM articles. Data analysis and reporting of statistics are areas for improvement in FM research articles. PMID:27453839

  2. A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

    ERIC Educational Resources Information Center

    Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.

    2010-01-01

    This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

  3. The Importance of Statistical Modeling in Data Analysis and Inference

    ERIC Educational Resources Information Center

    Rollins, Derrick, Sr.

    2017-01-01

    Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…

  4. Analysis of S-box in Image Encryption Using Root Mean Square Error Method

    NASA Astrophysics Data System (ADS)

    Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan

    2012-07-01

    The use of substitution boxes (S-boxes) in encryption applications has proven to be an effective nonlinear component in creating confusion and randomness. The S-box is evolving and many variants appear in the literature, including the advanced encryption standard (AES) S-box, affine power affine (APA) S-box, Skipjack S-box, Gray S-box, Lui J S-box, residue prime number S-box, Xyi S-box, and S8 S-box. These S-boxes have algebraic and statistical properties which distinguish them from each other in terms of encryption strength. In some circumstances, the parameters from algebraic and statistical analysis yield results which do not provide clear evidence in distinguishing an S-box for an application to a particular set of data. In image encryption applications, the use of S-boxes needs special care because the visual analysis and perception of a viewer can sometimes identify artifacts embedded in the image. In addition to the existing algebraic and statistical analyses already used for image encryption applications, we propose an application of the root mean square error technique, which further elaborates the results and enables the analyst to vividly distinguish between the performances of various S-boxes. While the use of root mean square error analysis in statistics has proven to be effective in determining the difference between original data and processed data, its use in image encryption has shown promising results in estimating the strength of the encryption method. In this paper, we show the application of the root mean square error analysis to S-box image encryption. The parameters from this analysis are used in determining the strength of S-boxes.
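
    As a sketch of the proposed measure, the snippet below computes the RMSE between a plaintext image and its encrypted counterpart; a toy XOR keystream stands in for an S-box substitution cipher, since the point here is only the error computation.

    ```python
    # RMSE between a toy image and its (stand-in) encrypted version.
    import numpy as np

    rng = np.random.default_rng(8)
    image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # toy image
    key = rng.integers(0, 256, size=image.shape, dtype=np.uint8)
    encrypted = image ^ key      # XOR stand-in, not a real S-box transformation

    diff = image.astype(float) - encrypted.astype(float)
    print(f"RMSE(plain, encrypted) = {np.sqrt(np.mean(diff ** 2)):.1f} grey levels")
    ```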

  5. Accounting for measurement error: a critical but often overlooked process.

    PubMed

    Harris, Edward F; Smith, Richard N

    2009-12-01

    Due to instrument imprecision and human inconsistencies, measurements are not free of error. Technical error of measurement (TEM) is the variability encountered between dimensions when the same specimens are measured at multiple sessions. A goal of a data collection regimen is to minimise TEM. The few studies that actually quantify TEM, regardless of discipline, report that it is substantial and can affect results and inferences. This paper reviews some statistical approaches for identifying and controlling TEM. Statistically, TEM is part of the residual ('unexplained') variance in a statistical test, so accounting for TEM, which requires repeated measurements, enhances the chances of finding a statistically significant difference if one exists. The aim of this paper was to review and discuss common statistical designs relating to types of error and statistical approaches to error accountability. This paper addresses issues of landmark location, validity, technical and systematic error, analysis of variance, scaled measures and correlation coefficients in order to guide the reader towards correct identification of true experimental differences. Researchers commonly infer characteristics about populations from comparatively restricted study samples. Most inferences are statistical and, aside from concerns about adequate accounting for known sources of variation with the research design, an important source of variability is measurement error. Variability in locating landmarks that define variables is obvious in odontometrics, cephalometrics and anthropometry, but the same concerns about measurement accuracy and precision extend to all disciplines. With increasing accessibility to computer-assisted methods of data collection, the ease of incorporating repeated measures into statistical designs has improved. Accounting for this technical source of variation increases the chance of finding biologically true differences when they exist.
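
    For two measurement sessions on the same specimens, TEM is commonly computed with Dahlberg's formula, TEM = sqrt(sum(d_i^2) / 2N), where d_i is the between-session difference for specimen i; the abstract does not commit to a specific estimator, so take this as one standard choice. A minimal sketch (the measurement values are invented):

    ```python
    import numpy as np

    def technical_error_of_measurement(s1, s2):
        """Dahlberg's TEM for two sessions on the same specimens:
        TEM = sqrt(sum(d_i^2) / (2 * N))."""
        s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
        d = s1 - s2
        return np.sqrt(np.sum(d ** 2) / (2 * len(d)))

    # Hypothetical repeated measurements (mm) of the same 6 specimens:
    session1 = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0]
    session2 = [10.4, 11.3, 9.9, 12.4, 10.7, 11.1]
    tem = technical_error_of_measurement(session1, session2)
    rel_tem = 100 * tem / np.mean(session1 + session2)  # relative TEM, in %
    print(f"TEM = {tem:.3f} mm, %TEM = {rel_tem:.1f}%")
    ```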

  6. Review of U.S. Army Unmanned Aerial Systems Accident Reports: Analysis of Human Error Contributions

    DTIC Science & Technology

    2018-03-20

    USAARL Report No. 2018-08, Review of U.S. Army Unmanned Aerial Systems Accident Reports: Analysis of Human Error Contributions, by Kathryn A... The success of unmanned aerial systems (UAS) operations relies upon a variety of factors, including, but not limited to

  7. The Influence of Observation Errors on Analysis Error and Forecast Skill Investigated with an Observing System Simulation Experiment

    NASA Technical Reports Server (NTRS)

    Prive, N. C.; Errico, R. M.; Tai, K.-S.

    2013-01-01

    The Global Modeling and Assimilation Office (GMAO) observing system simulation experiment (OSSE) framework is used to explore the response of analysis error and forecast skill to observation quality. In an OSSE, synthetic observations may be created that have much smaller error than real observations, and precisely quantified error may be applied to these synthetic observations. Three experiments are performed in which synthetic observations, with magnitudes of applied observation error varying from zero to twice the estimated realistic error, are ingested into the Goddard Earth Observing System Model (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation for a one-month period representing July. The analysis increment and observation innovation are strongly impacted by observation error, with much larger variances for increased observation error. The analysis quality is degraded by increased observation error, but the change in root-mean-square error of the analysis state is small relative to the total analysis error. Surprisingly, in the 120-hour forecast, increased observation error yields only a slight decline in forecast skill in the extratropics and no discernible degradation of forecast skill in the tropics.

  8. Analysis of basic clustering algorithms for numerical estimation of statistical averages in biomolecules.

    PubMed

    Anandakrishnan, Ramu; Onufriev, Alexey

    2008-03-01

    In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable, since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive, error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between the error bound and the root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms in practical applications. An example of error analysis for one such application, computation of the average charge of ionizable amino acids in proteins, is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
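
    The basic scheme is easy to illustrate on a toy system small enough that the exact average is still computable. In the sketch below (the energies, temperature, and two-cluster split are all invented for illustration), intra-cluster interactions are treated exactly and inter-cluster terms are dropped:

    ```python
    import itertools
    import numpy as np

    def boltzmann_average(E, sites, kT=0.593):
        """Exact thermal average occupancy over all 2^n microstates of `sites`,
        with pairwise interaction energies E[i][j] (site energies on the diagonal)."""
        n = len(sites)
        weights, occupancy = [], []
        for s in itertools.product([0, 1], repeat=n):
            energy = sum(E[sites[i]][sites[i]] * s[i] for i in range(n))
            energy += sum(E[sites[i]][sites[j]] * s[i] * s[j]
                          for i in range(n) for j in range(i + 1, n))
            w = np.exp(-energy / kT)
            weights.append(w)
            occupancy.append(np.array(s) * w)
        return sum(occupancy) / sum(weights)

    rng = np.random.default_rng(0)
    n = 12
    E = rng.normal(0.0, 1.0, (n, n)); E = (E + E.T) / 2  # symmetric toy energies

    exact = boltzmann_average(E, list(range(n)))          # 2^12 states, feasible
    clusters = [list(range(0, 6)), list(range(6, 12))]    # ignore cross-cluster terms
    approx = np.concatenate([boltzmann_average(E, c) for c in clusters])
    print("max |error| of cluster approximation:", np.abs(exact - approx).max())
    ```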

  9. Robustness of Type I Error and Power in Set Correlation Analysis of Contingency Tables.

    ERIC Educational Resources Information Center

    Cohen, Jacob; Nee, John C. M.

    1990-01-01

    The analysis of contingency tables via set correlation allows the assessment of subhypotheses involving contrast functions of the categories of the nominal scales. The robustness of such methods with regard to Type I error and statistical power was studied via a Monte Carlo experiment. (TJH)

  10. Systematic review of statistical approaches to quantify, or correct for, measurement error in a continuous exposure in nutritional epidemiology.

    PubMed

    Bennett, Derrick A; Landry, Denise; Little, Julian; Minelli, Cosetta

    2017-09-19

    Several statistical approaches have been proposed to assess and correct for exposure measurement error. We aimed to provide a critical overview of the most common approaches used in nutritional epidemiology. MEDLINE, EMBASE, BIOSIS and CINAHL were searched for reports published in English up to May 2016 in order to ascertain studies that described methods aimed to quantify and/or correct for measurement error for a continuous exposure in nutritional epidemiology using a calibration study. We identified 126 studies, 43 of which described statistical methods and 83 that applied any of these methods to a real dataset. The statistical approaches in the eligible studies were grouped into: a) approaches to quantify the relationship between different dietary assessment instruments and "true intake", which were mostly based on correlation analysis and the method of triads; b) approaches to adjust point and interval estimates of diet-disease associations for measurement error, mostly based on regression calibration analysis and its extensions. Two approaches (multiple imputation and moment reconstruction) were identified that can deal with differential measurement error. For regression calibration, the most common approach to correct for measurement error used in nutritional epidemiology, it is crucial to ensure that its assumptions and requirements are fully met. Analyses that investigate the impact of departures from the classical measurement error model on regression calibration estimates can be helpful to researchers in interpreting their findings. With regard to the possible use of alternative methods when regression calibration is not appropriate, the choice of method should depend on the measurement error model assumed, the availability of suitable calibration study data and the potential for bias due to violation of the classical measurement error model assumptions. On the basis of this review, we provide some practical advice for the use of methods to assess and adjust for measurement error in nutritional epidemiology.
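
    For classical measurement error, the core of regression calibration can be sketched in a few lines: the naive slope is attenuated by the factor lambda = Cov(W, X)/Var(W), which a calibration substudy estimates and divides out. A minimal simulation (all values invented; real applications must check the classical-error assumptions discussed above):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n = 2000
    x = rng.normal(0, 1, n)            # true exposure (unobserved in the main study)
    w = x + rng.normal(0, 0.8, n)      # error-prone measurement (classical error)
    y = 0.5 * x + rng.normal(0, 1, n)  # outcome; true slope is 0.5

    beta_naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)  # attenuated slope

    # Calibration study: regress the gold-standard measure on the error-prone one
    # (here we pretend a 200-person subset has a gold-standard measurement).
    idx = rng.choice(n, 200, replace=False)
    lam = np.cov(w[idx], x[idx])[0, 1] / np.var(w[idx], ddof=1)  # attenuation factor

    beta_corrected = beta_naive / lam
    print(f"naive = {beta_naive:.3f}, corrected = {beta_corrected:.3f} (true = 0.5)")
    ```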

  11. Local indicators of geocoding accuracy (LIGA): theory and application

    PubMed Central

    Jacquez, Geoffrey M; Rommel, Robert

    2009-01-01

    Background Although sources of positional error in geographic locations (e.g. geocoding error) used for describing and modeling spatial patterns are widely acknowledged, research on how such error impacts the statistical results has been limited. In this paper we explore techniques for quantifying the perturbability of spatial weights to different specifications of positional error. Results We find that a family of curves describes the relationship between perturbability and positional error, and use these curves to evaluate sensitivity of alternative spatial weight specifications to positional error both globally (when all locations are considered simultaneously) and locally (to identify those locations that would benefit most from increased geocoding accuracy). We evaluate the approach in simulation studies, and demonstrate it using a case-control study of bladder cancer in south-eastern Michigan. Conclusion Three results are significant. First, the shape of the probability distributions of positional error (e.g. circular, elliptical, cross) has little impact on the perturbability of spatial weights, which instead depends on the mean positional error. Second, our methodology allows researchers to evaluate the sensitivity of spatial statistics to positional accuracy for specific geographies. This has substantial practical implications since it makes possible routine sensitivity analysis of spatial statistics to positional error arising in geocoded street addresses, global positioning systems, LIDAR and other geographic data. Third, those locations with high perturbability (most sensitive to positional error) and high leverage (that contribute the most to the spatial weight being considered) will benefit the most from increased positional accuracy. These are rapidly identified using a new visualization tool we call the LIGA scatterplot. Herein lies a paradox for spatial analysis: for a given level of positional error, increasing sample density to more accurately follow the underlying population distribution increases perturbability and introduces error into the spatial weights matrix. In some studies positional error may not impact the statistical results, and in others it might invalidate the results. We therefore must understand the relationships between positional accuracy and the perturbability of the spatial weights in order to have confidence in a study's results. PMID:19863795

  12. Phase error statistics of a phase-locked loop synchronized direct detection optical PPM communication system

    NASA Technical Reports Server (NTRS)

    Natarajan, Suresh; Gardner, C. S.

    1987-01-01

    Receiver timing synchronization of an optical Pulse-Position Modulation (PPM) communication system can be achieved using a phase-locked loop (PLL), provided the photodetector output is suitably processed. The magnitude of the PLL phase error is a good indicator of the timing error at the receiver decoder. The statistics of the phase error are investigated while varying several key system parameters such as PPM order, signal and background strengths, and PLL bandwidth. A practical optical communication system utilizing a laser diode transmitter and an avalanche photodiode in the receiver is described, and the sampled phase error data are presented. A linear regression analysis is applied to the data to obtain estimates of the relational constants involving the phase error variance and incident signal power.

  13. Statistical model specification and power: recommendations on the use of test-qualified pooling in analysis of experimental data

    PubMed Central

    Colegrave, Nick

    2017-01-01

    A common approach to the analysis of experimental data across much of the biological sciences is test-qualified pooling. Here non-significant terms are dropped from a statistical model, effectively pooling the variation associated with each removed term with the error term used to test hypotheses (or estimate effect sizes). This pooling is only carried out if statistical testing on the basis of applying that data to a previous more complicated model provides motivation for this model simplification; hence the pooling is test-qualified. In pooling, the researcher increases the degrees of freedom of the error term with the aim of increasing statistical power to test their hypotheses of interest. Despite this approach being widely adopted and explicitly recommended by some of the most widely cited statistical textbooks aimed at biologists, here we argue that (except in highly specialized circumstances that we can identify) the hoped-for improvement in statistical power will be small or non-existent, and there is likely to be much reduced reliability of the statistical procedures through deviation of type I error rates from nominal levels. We thus call for greatly reduced use of test-qualified pooling across experimental biology, more careful justification of any use that continues, and a different philosophy for initial selection of statistical models in the light of this change in procedure. PMID:28330912
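
    The type I error behaviour described above can be checked by simulation: generate null data, drop the interaction term only when it tests non-significant, then test a main effect. A compact sketch using statsmodels (the design sizes and 0.05 pooling threshold are arbitrary choices; any single configuration may show only a small deviation from the nominal level):

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(2)
    rejections, n_sim, alpha = 0, 500, 0.05

    for _ in range(n_sim):
        # Null world: neither factor nor their interaction affects y.
        df = pd.DataFrame({
            "A": np.repeat(["a1", "a2"], 20),
            "B": np.tile(np.repeat(["b1", "b2"], 10), 2),
            "y": rng.normal(size=40),
        })
        full = smf.ols("y ~ A * B", data=df).fit()
        p_int = anova_lm(full).loc["A:B", "PR(>F)"]
        if p_int > 0.05:  # test-qualified pooling: drop the interaction term
            model = smf.ols("y ~ A + B", data=df).fit()
        else:
            model = full
        if anova_lm(model).loc["A", "PR(>F)"] < alpha:
            rejections += 1

    print(f"empirical type I error for factor A: {rejections / n_sim:.3f} "
          f"(nominal {alpha})")
    ```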

  14. Analysis of measured data of human body based on error correcting frequency

    NASA Astrophysics Data System (ADS)

    Jin, Aiyan; Peipei, Gao; Shang, Xiaomei

    2014-04-01

    Anthropometry measures all parts of the human body surface, and the measured data form the basis for analysis and study of the human body, for the establishment and modification of garment sizes, and for the design and operation of online clothing stores. In this paper, several groups of measured data are collected, and the data errors are analyzed by examining the error frequency and applying the analysis-of-variance method of mathematical statistics. The paper also addresses determination of the accuracy of the measured data, the difficulty of measuring particular parts of the human body, further study of the causes of data errors, and a summary of the key points for minimizing errors. By analyzing the measured data on the basis of error frequency, the paper provides reference material to support the development of the garment industry.

  15. Investigating the Relationship between Conceptual and Procedural Errors in the Domain of Probability Problem-Solving.

    ERIC Educational Resources Information Center

    O'Connell, Ann Aileen

    The relationships among types of errors observed during probability problem solving were studied. Subjects were 50 graduate students in an introductory probability and statistics course. Errors were classified as text comprehension, conceptual, procedural, and arithmetic. Canonical correlation analysis was conducted on the frequencies of specific…

  16. A Unified Approach to Measurement Error and Missing Data: Overview and Applications

    ERIC Educational Resources Information Center

    Blackwell, Matthew; Honaker, James; King, Gary

    2017-01-01

    Although social scientists devote considerable effort to mitigating measurement error during data collection, they often ignore the issue during data analysis. And although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model…

  17. Patient safety in the clinical laboratory: a longitudinal analysis of specimen identification errors.

    PubMed

    Wagar, Elizabeth A; Tamashiro, Lorraine; Yasin, Bushra; Hilborne, Lee; Bruckner, David A

    2006-11-01

    Patient safety is an increasingly visible and important mission for clinical laboratories. Attention to improving processes related to patient identification and specimen labeling is being paid by accreditation and regulatory organizations because errors in these areas that jeopardize patient safety are common and avoidable through improvement in the total testing process. To assess patient identification and specimen labeling improvement after multiple implementation projects using longitudinal statistical tools. Specimen errors were categorized by a multidisciplinary health care team. Patient identification errors were grouped into 3 categories: (1) specimen/requisition mismatch, (2) unlabeled specimens, and (3) mislabeled specimens. Specimens with these types of identification errors were compared preimplementation and postimplementation for 3 patient safety projects: (1) reorganization of phlebotomy (4 months); (2) introduction of an electronic event reporting system (10 months); and (3) activation of an automated processing system (14 months) for a 24-month period, using trend analysis and Student t test statistics. Of 16,632 total specimen errors, mislabeled specimens, requisition mismatches, and unlabeled specimens represented 1.0%, 6.3%, and 4.6% of errors, respectively. Student t test showed a significant decrease in the most serious error, mislabeled specimens (P < .001) when compared to before implementation of the 3 patient safety projects. Trend analysis demonstrated decreases in all 3 error types for 26 months. Applying performance-improvement strategies that focus longitudinally on specimen labeling errors can significantly reduce errors, therefore improving patient safety. This is an important area in which laboratory professionals, working in interdisciplinary teams, can improve safety and outcomes of care.

  18. Bootstrap Methods: A Very Leisurely Look.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Winstead, Wayland H.

    The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
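
    The record mentions a SAS routine; as a language-neutral illustration, the resampling idea fits in a few lines of Python (the sample data are invented):

    ```python
    import numpy as np

    def bootstrap_se(data, stat, n_boot=2000, seed=0):
        """Bootstrap standard error of `stat`: resample with replacement,
        recompute the statistic, take the standard deviation of the replicates."""
        rng = np.random.default_rng(seed)
        data = np.asarray(data)
        reps = [stat(rng.choice(data, size=len(data), replace=True))
                for _ in range(n_boot)]
        return np.std(reps, ddof=1)

    sample = np.array([2.1, 3.4, 2.8, 5.6, 4.2, 3.9, 2.2, 4.8, 3.1, 3.7])
    print("bootstrap SE of the median:", bootstrap_se(sample, np.median))
    ```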

  19. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
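
    The mechanics of a random-walk Metropolis sampler are independent of the application, so a toy linear source-estimation problem is enough to show them. In this sketch the transport matrix, noise levels, and priors are all invented:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Toy linear "transport" problem: observations d = G s + noise, where s holds
    # two unknown source strengths. All numbers here are invented for illustration.
    G = np.array([[1.0, 0.4], [0.3, 1.2], [0.8, 0.8]])
    s_true = np.array([2.0, -1.0])
    d = G @ s_true + rng.normal(0, 0.2, 3)

    def log_post(s, sigma_obs=0.2, sigma_prior=5.0):
        """Gaussian likelihood x Gaussian prior; needed only up to a constant."""
        r = d - G @ s
        return -0.5 * (r @ r) / sigma_obs**2 - 0.5 * (s @ s) / sigma_prior**2

    # Random-walk Metropolis sampler.
    s, lp, chain = np.zeros(2), log_post(np.zeros(2)), []
    for _ in range(20000):
        prop = s + rng.normal(0, 0.1, 2)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept with prob min(1, ratio)
            s, lp = prop, lp_prop
        chain.append(s)

    chain = np.array(chain[5000:])  # drop burn-in
    print("posterior mean:", chain.mean(axis=0), " true:", s_true)
    ```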

  20. Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

    PubMed

    Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

    2005-01-01

    Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
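
    The PACE idea, comparing achieved cross-validated performance against the same classifier run on label-permuted data, can be sketched generically. This is not the authors' software; scikit-learn and a logistic-regression classifier are arbitrary stand-ins, and the data are synthetic:

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression

    def permutation_pvalue(X, y, n_perm=200, seed=0):
        """Is the achieved CV accuracy consistent with the accuracy obtained
        after randomly reassigning the class labels?"""
        rng = np.random.default_rng(seed)
        clf = LogisticRegression(max_iter=1000)
        observed = cross_val_score(clf, X, y, cv=5).mean()
        null = [cross_val_score(clf, X, rng.permutation(y), cv=5).mean()
                for _ in range(n_perm)]
        p = (1 + sum(a >= observed for a in null)) / (n_perm + 1)
        return observed, p

    rng = np.random.default_rng(1)
    X = rng.normal(size=(80, 20))
    y = (X[:, 0] + 0.5 * rng.normal(size=80) > 0).astype(int)  # weak real signal
    acc, p = permutation_pvalue(X, y)
    print(f"CV accuracy = {acc:.2f}, permutation p = {p:.3f}")
    ```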

  1. Analysis of variance to assess statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes.

    PubMed

    Makeyev, Oleksandr; Joe, Cody; Lee, Colin; Besio, Walter G

    2017-07-01

    Concentric ring electrodes have shown promise in non-invasive electrophysiological measurement demonstrating their superiority to conventional disc electrodes, in particular, in accuracy of Laplacian estimation. Recently, we have proposed novel variable inter-ring distances concentric ring electrodes. Analytic and finite element method modeling results for linearly increasing distances electrode configurations suggested they may decrease the truncation error resulting in more accurate Laplacian estimates compared to currently used constant inter-ring distances configurations. This study assesses statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes. Full factorial design of analysis of variance was used with one categorical and two numerical factors: the inter-ring distances, the electrode diameter, and the number of concentric rings in the electrode. The response variables were the Relative Error and the Maximum Error of Laplacian estimation computed using a finite element method model for each of the combinations of levels of three factors. Effects of the main factors and their interactions on Relative Error and Maximum Error were assessed and the obtained results suggest that all three factors have statistically significant effects in the model confirming the potential of using inter-ring distances as a means of improving accuracy of Laplacian estimation.

  2. The Statistical Power of Planned Comparisons.

    ERIC Educational Resources Information Center

    Benton, Roberta L.

    Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…

  3. A simple, objective analysis scheme for scatterometer data. [Seasat A satellite observation of wind over ocean

    NASA Technical Reports Server (NTRS)

    Levy, G.; Brown, R. A.

    1986-01-01

    A simple economical objective analysis scheme is devised and tested on real scatterometer data. It is designed to treat dense data such as those of the Seasat A Satellite Scatterometer (SASS) for individual or multiple passes, and preserves subsynoptic scale features. Errors are evaluated with the aid of sampling ('bootstrap') statistical methods. In addition, sensitivity tests have been performed which establish qualitative confidence in calculated fields of divergence and vorticity. The SASS wind algorithm could be improved; however, the data at this point are limited by instrument errors rather than analysis errors. The analysis error is typically negligible in comparison with the instrument error, but amounts to 30 percent of the instrument error in areas of strong wind shear. The scheme is very economical, and thus suitable for large volumes of dense data such as SASS data.

  4. Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents.

    PubMed

    Papadelis, Christos; Chen, Zhe; Kourtidou-Papadeli, Chrysoula; Bamidis, Panagiotis D; Chouvarda, Ioanna; Bekiaris, Evangelos; Maglaveras, Nikos

    2007-09-01

    The objective of this study is the development and evaluation of efficient neurophysiological signal statistics, which may assess the driver's alertness level and serve as potential indicators of sleepiness in the design of an on-board countermeasure system. Multichannel EEG, EOG, EMG, and ECG were recorded from sleep-deprived subjects exposed to real field driving conditions. A number of severe driving errors occurred during the experiments. The analysis was performed in two main dimensions: the macroscopic analysis that estimates the on-going temporal evolution of physiological measurements during the driving task, and the microscopic event analysis that focuses on the physiological measurements' alterations just before, during, and after the driving errors. Two independent neurophysiologists visually interpreted the measurements. The EEG data were analyzed by using both linear and non-linear analysis tools. We observed the occurrence of brief paroxysmal bursts of alpha activity and an increased synchrony among EEG channels before the driving errors. The alpha relative band ratio (RBR) significantly increased, and the Cross Approximate Entropy that quantifies the synchrony among channels also significantly decreased before the driving errors. Quantitative EEG analysis revealed significant variations of RBR by driving time in the frequency bands of delta, alpha, beta, and gamma. Most of the estimated EEG statistics, such as the Shannon Entropy, Kullback-Leibler Entropy, Coherence, and Cross-Approximate Entropy, were significantly affected by driving time. We also observed an alteration of eye-blinking duration with increased driving time and a significant increase in the number and duration of eye blinks before driving errors. EEG and EOG are promising neurophysiological indicators of driver sleepiness and have the potential to monitor sleepiness in occupational settings when incorporated in a sleepiness countermeasure device. The occurrence of brief paroxysmal bursts of alpha activity before severe driving errors is described in detail for the first time. Clear evidence is presented that eye-blinking statistics are sensitive to the driver's sleepiness and should be considered in the design of an efficient and driver-friendly sleepiness detection countermeasure device.

  5. Living systematic reviews: 3. Statistical methods for updating meta-analyses.

    PubMed

    Simmonds, Mark; Salanti, Georgia; McKenzie, Joanne; Elliott, Julian

    2017-11-01

    A living systematic review (LSR) should keep the review current as new research evidence emerges. Any meta-analyses included in the review will also need updating as new material is identified. If the aim of the review is solely to present the best current evidence, standard meta-analysis may be sufficient, provided reviewers are aware that results may change at later updates. If the review is used in a decision-making context, more caution may be needed. When using standard meta-analysis methods, the chance of incorrectly concluding that any updated meta-analysis is statistically significant when there is no effect (the type I error) increases rapidly as more updates are performed. Inaccurate estimation of any heterogeneity across studies may also lead to inappropriate conclusions. This paper considers four methods to avoid some of these statistical problems when updating meta-analyses: two methods, the law of the iterated logarithm and the Shuster method, control primarily for inflation of type I error, while two other methods, trial sequential analysis and sequential meta-analysis, control for type I and II errors (failing to detect a genuine effect) and take account of heterogeneity. This paper compares the methods and considers how they could be applied to LSRs. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Statistical image quantification toward optimal scan fusion and change quantification

    NASA Astrophysics Data System (ADS)

    Potesil, Vaclav; Zhou, Xiang Sean

    2007-03-01

    Recent advances in imaging technology have brought new challenges and opportunities for automatic and quantitative analysis of medical images. With broader accessibility of more imaging modalities for more patients, fusion of modalities/scans from one time point and longitudinal analysis of changes across time points have become the two most critical differentiators to support more informed, more reliable and more reproducible diagnosis and therapy decisions. Unfortunately, scan fusion and longitudinal analysis are both inherently plagued with increased levels of statistical errors. A lack of comprehensive analysis by imaging scientists and a lack of full awareness by physicians pose potential risks in clinical practice. In this paper, we discuss several key error factors affecting imaging quantification, study their interactions, and introduce a simulation strategy to establish general error bounds for change quantification across time. We quantitatively show that image resolution, voxel anisotropy, lesion size, eccentricity, and orientation are all contributing factors to quantification error, and that there is an intricate relationship between voxel anisotropy and lesion shape in affecting quantification error. Specifically, when two or more scans are to be fused at the feature level, optimal linear fusion analysis reveals that scans with voxel anisotropy aligned with lesion elongation should receive a higher weight than other scans. As a result of such optimal linear fusion, we will achieve a lower variance than naïve averaging. Simulated experiments are used to validate theoretical predictions. Future work based on the proposed simulation methods may lead to general guidelines and error lower bounds for quantitative image analysis and change detection.
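
    The fusion rule suggested by optimal linear analysis of unbiased estimates is the familiar inverse-variance weighting, under which the fused variance never exceeds that of naive averaging. A minimal sketch (the measurement values and variances are invented):

    ```python
    import numpy as np

    def optimal_linear_fusion(estimates, variances):
        """Minimum-variance linear combination of unbiased estimates:
        weights proportional to 1/variance; fused variance is
        1 / sum(1/variances)."""
        v = np.asarray(variances, float)
        w = (1.0 / v) / np.sum(1.0 / v)
        return np.dot(w, estimates), 1.0 / np.sum(1.0 / v)

    # Two hypothetical lesion-volume measurements (mL) from scans with different
    # voxel anisotropy, hence different quantification variance:
    vol, var = optimal_linear_fusion([10.4, 11.0], [0.25, 1.00])
    naive_var = (0.25 + 1.00) / 4  # variance of the plain average of the two
    print(f"fused = {vol:.2f} mL, var = {var:.3f} vs naive var = {naive_var:.3f}")
    ```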

  7. Detailed Uncertainty Analysis of the ZEM-3 Measurement System

    NASA Technical Reports Server (NTRS)

    Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred

    2014-01-01

    The measurement of Seebeck coefficient and electrical resistivity is critical to the investigation of all thermoelectric systems. It follows that the measurement uncertainty must be well understood in order to report ZT values which are accurate and trustworthy. A detailed uncertainty analysis of the ZEM-3 measurement system has been performed. The uncertainty analysis calculates error in the electrical resistivity measurement as a result of sample geometry tolerance, probe geometry tolerance, statistical error, and multimeter uncertainty. The uncertainty on the Seebeck coefficient includes probe wire correction factors, statistical error, multimeter uncertainty, and, most importantly, the cold-finger effect. The cold-finger effect plagues all potentiometric (four-probe) Seebeck measurement systems, as heat parasitically transfers through the thermocouple probes. The effect leads to an asymmetric over-estimation of the Seebeck coefficient. A thermal finite element analysis allows for quantification of the phenomenon and provides an estimate of the uncertainty of the Seebeck coefficient. The thermoelectric power factor has been found to have an uncertainty of +9-14 at high temperature and 9 near room temperature.
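
    For the geometry-related part of the resistivity error, first-order root-sum-square propagation over rho = R*w*t/L is the standard starting point. A minimal sketch (the sample dimensions and tolerances are invented, and the paper's full analysis also covers probe geometry, statistical, and multimeter terms not modeled here):

    ```python
    import numpy as np

    def resistivity_uncertainty(R, dR, w, dw, t, dt, L, dL):
        """First-order (root-sum-square) propagation for rho = R * w * t / L,
        a bar sample of width w, thickness t, and probe spacing L."""
        rho = R * w * t / L
        rel = np.sqrt((dR / R) ** 2 + (dw / w) ** 2 + (dt / t) ** 2 + (dL / L) ** 2)
        return rho, rho * rel

    # Hypothetical bar sample: 5 mOhm resistance, 2 x 2 mm cross-section,
    # 8 mm probe spacing, with loose machining and probe-placement tolerances.
    rho, drho = resistivity_uncertainty(5e-3, 5e-5, 2e-3, 2e-5, 2e-3, 2e-5, 8e-3, 1e-4)
    print(f"rho = {rho * 1e6:.2f} +/- {drho * 1e6:.2f} uOhm*m")
    ```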

  8. On Statistical Analysis of Neuroimages with Imperfect Registration

    PubMed Central

    Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas

    2016-01-01

    A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic, they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yields strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168

  9. Statistical and systematic errors in the measurement of weak-lensing Minkowski functionals: Application to the Canada-France-Hawaii Lensing Survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shirasaki, Masato; Yoshida, Naoki, E-mail: masato.shirasaki@utap.phys.s.u-tokyo.ac.jp

    2014-05-01

    The measurement of cosmic shear using weak gravitational lensing is a challenging task that involves a number of complicated procedures. We study in detail the systematic errors in the measurement of weak-lensing Minkowski Functionals (MFs). Specifically, we focus on systematics associated with galaxy shape measurements, photometric redshift errors, and shear calibration correction. We first generate mock weak-lensing catalogs that directly incorporate the actual observational characteristics of the Canada-France-Hawaii Lensing Survey (CFHTLenS). We then perform a Fisher analysis using the large set of mock catalogs for various cosmological models. We find that the statistical error associated with the observational effects degrades the cosmological parameter constraints by a factor of a few. The Subaru Hyper Suprime-Cam (HSC) survey with a sky coverage of ∼1400 deg² will constrain the dark energy equation of state parameter with an error of Δw_0 ∼ 0.25 by the lensing MFs alone, but biases induced by the systematics can be comparable to the 1σ error. We conclude that the lensing MFs are powerful statistics beyond the two-point statistics only if well-calibrated measurement of both the redshifts and the shapes of source galaxies is performed. Finally, we analyze the CFHTLenS data to explore the ability of the MFs to break degeneracies between a few cosmological parameters. Using a combined analysis of the MFs and the shear correlation function, we derive the matter density Ω_m0 = 0.256 (+0.054/−0.046).

  10. On the statistical assessment of classifiers using DNA microarray data

    PubMed Central

    Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

    2006-01-01

    Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia, Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer, such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with error rates of e = 19% (p = 0.035) and e = 18% (p = 0.037), respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover, the performances of the RLS and SVM classifiers do not change when 74% of the genes are used. They degrade progressively, to e = 16% (p < 0.05), when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis are discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171

  11. Analyzing thematic maps and mapping for accuracy

    USGS Publications Warehouse

    Rosenfield, G.H.

    1982-01-01

    Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map, or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices with the methods of discrete multivariate analysis or of multivariate analysis of variance.
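
    The conventions described above map directly onto code: the diagonal counts the correct classifications, row remainders give commission error, and column remainders give omission error. A minimal sketch with an invented three-category error matrix:

    ```python
    import numpy as np

    # Rows: interpretation (map), columns: verification (reference).
    # The counts are invented for illustration.
    error_matrix = np.array([
        [50,  3,  2],   # mapped as forest
        [ 4, 40,  6],   # mapped as crop
        [ 1,  5, 39],   # mapped as urban
    ])

    overall_accuracy = np.trace(error_matrix) / error_matrix.sum()

    # Errors of commission (row-wise) and omission (column-wise) per category:
    commission = 1 - np.diag(error_matrix) / error_matrix.sum(axis=1)
    omission = 1 - np.diag(error_matrix) / error_matrix.sum(axis=0)
    print(f"overall accuracy = {overall_accuracy:.2%}")
    print("commission error by category:", np.round(commission, 3))
    print("omission error by category:", np.round(omission, 3))
    ```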

  12. Evaluation of Satellite and Model Precipitation Products Over Turkey

    NASA Astrophysics Data System (ADS)

    Yilmaz, M. T.; Amjad, M.

    2017-12-01

    Satellite-based remote sensing, gauge stations, and models are the three major platforms for acquiring precipitation datasets. Among them, satellites and models have the advantage of retrieving spatially and temporally continuous and consistent datasets, while the uncertainty estimates of these retrievals are often required for many hydrological studies to understand the source and the magnitude of the uncertainty in hydrological response parameters. In this study, satellite and model precipitation data products are validated over various temporal scales (daily, 3-daily, 7-daily, 10-daily and monthly) using in-situ measured precipitation observations from a network of 733 gauges from all over Turkey. Tropical Rainfall Measurement Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42 version 7 and European Center of Medium-Range Weather Forecast (ECMWF) model estimates (daily, 3-daily, 7-daily and 10-daily accumulated forecast) are used in this study. Retrievals are evaluated for their mean and standard deviation, and their accuracies are evaluated via bias, root mean square error, error standard deviation and correlation coefficient statistics. Intensity vs frequency analysis and some contingency table statistics, like percent correct, probability of detection, false alarm ratio and critical success index, are determined using daily time-series. Both ECMWF forecasts and TRMM observations, on average, overestimate the precipitation compared to gauge estimates; wet biases are 10.26 mm/month and 8.65 mm/month, respectively, for ECMWF and TRMM. RMSE values of ECMWF forecasts and TRMM estimates are 39.69 mm/month and 41.55 mm/month, respectively. Monthly correlations between Gauges-ECMWF, Gauges-TRMM and ECMWF-TRMM are 0.76, 0.73 and 0.81, respectively. The model and the satellite error statistics are further compared against the gauges' error statistics based on inverse distance weighting (IWD) analysis. Both the model and satellite data have smaller IWD errors (14.72 mm/month and 10.75 mm/month, respectively) than the gauges' IWD error (21.58 mm/month). These results show that, on average, ECMWF forecast data have higher skill than TRMM observations. Overall, both ECMWF forecast data and TRMM observations show good potential for catchment-scale hydrological analysis.
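
    The contingency-table skill scores named above have simple closed forms; here is a sketch on synthetic daily series (the 1 mm/day wet-day threshold and all data are invented, not the study's values):

    ```python
    import numpy as np

    def daily_skill_scores(obs, est, threshold=1.0):
        """2x2 contingency-table statistics for daily rain occurrence (mm/day)."""
        o = np.asarray(obs) >= threshold
        e = np.asarray(est) >= threshold
        hits = np.sum(o & e)
        misses = np.sum(o & ~e)
        false_alarms = np.sum(~o & e)
        pod = hits / (hits + misses)                  # probability of detection
        far = false_alarms / (hits + false_alarms)    # false alarm ratio
        csi = hits / (hits + misses + false_alarms)   # critical success index
        return pod, far, csi

    rng = np.random.default_rng(4)
    gauge = rng.gamma(0.5, 4.0, 365)                # synthetic daily gauge series
    satellite = gauge * rng.lognormal(0, 0.5, 365)  # noisy synthetic "retrieval"
    pod, far, csi = daily_skill_scores(gauge, satellite)
    print(f"POD={pod:.2f}  FAR={far:.2f}  CSI={csi:.2f}")
    ```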

  13. Ephemeris data and error analysis in support of a Comet Encke intercept mission

    NASA Technical Reports Server (NTRS)

    Yeomans, D. K.

    1974-01-01

    Utilizing an orbit determination based upon 65 observations over the 1961 - 1973 interval, ephemeris data were generated for the 1976-77, 1980-81 and 1983-84 apparitions of short period comet Encke. For the 1980-81 apparition, results from a statistical error analysis are outlined. All ephemeris and error analysis computations include the effects of planetary perturbations as well as the nongravitational accelerations introduced by the outgassing cometary nucleus. In 1980, excellent observing conditions and a close approach of comet Encke to the earth permit relatively small uncertainties in the cometary position errors and provide an excellent opportunity for a close flyby of a physically interesting comet.

  14. The Application of Strength of Association Statistics to the Item Analysis of an In-Training Examination in Diagnostic Radiology.

    ERIC Educational Resources Information Center

    Diamond, James J.; McCormick, Janet

    1986-01-01

    Using item responses from an in-training examination in diagnostic radiology, the application of a strength of association statistic to the general problem of item analysis is illustrated. Criteria for item selection, general issues of reliability, and error of measurement are discussed. (Author/LMO)

  15. The Use of Meta-Analytic Statistical Significance Testing

    ERIC Educational Resources Information Center

    Polanin, Joshua R.; Pigott, Terri D.

    2015-01-01

    Meta-analysis multiplicity, the concept of conducting multiple tests of statistical significance within one review, is an underdeveloped literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…

  16. Is a shift from research on individual medical error to research on health information technology underway? A 40-year analysis of publication trends in medical journals.

    PubMed

    Erlewein, Daniel; Bruni, Tommaso; Gadebusch Bondio, Mariacarla

    2018-06-07

    In 1983, McIntyre and Popper underscored the need for more openness in dealing with errors in medicine. Since then, much has been written on individual medical errors. Furthermore, at the beginning of the 21st century, researchers and medical practitioners increasingly approached individual medical errors through health information technology. Hence, the question arises whether the attention of biomedical researchers shifted from individual medical errors to health information technology. We ran a study to determine publication trends concerning individual medical errors and health information technology in medical journals over the last 40 years. We used the Medical Subject Headings (MeSH) taxonomy in the database MEDLINE. Each year, we analyzed the percentage of relevant publications to the total number of publications in MEDLINE. The trends identified were tested for statistical significance. Our analysis showed that the percentage of publications dealing with individual medical errors increased from 1976 until the beginning of the 21st century but began to drop in 2003. Both the upward and the downward trends were statistically significant (P < 0.001). A breakdown by country revealed that it was the weight of the US and British publications that determined the overall downward trend after 2003. On the other hand, the percentage of publications dealing with health information technology doubled between 2003 and 2015. The upward trend was statistically significant (P < 0.001). The identified trends suggest that the attention of biomedical researchers partially shifted from individual medical errors to health information technology in the USA and the UK. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.

  17. Probabilistic performance estimators for computational chemistry methods: The empirical cumulative distribution function of absolute errors

    NASA Astrophysics Data System (ADS)

    Pernot, Pascal; Savin, Andreas

    2018-06-01

    Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, the distributions of model errors being neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
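
    Both advocated statistics fall out of the empirical CDF of absolute errors with no distributional assumptions. A minimal sketch (the threshold, confidence level, and skewed synthetic errors are arbitrary choices made for illustration):

    ```python
    import numpy as np

    def ecdf_statistics(errors, threshold=1.0, confidence=0.95):
        """From the empirical distribution of absolute errors, return
        (1) P(|error| < threshold) and (2) the error amplitude not
        exceeded at the chosen confidence level."""
        abs_err = np.abs(np.asarray(errors))
        p_below = np.mean(abs_err < threshold)
        q_conf = np.quantile(abs_err, confidence)
        return p_below, q_conf

    # Synthetic benchmark errors (kcal/mol), deliberately skewed and off-center
    # to mimic the non-normal, non-zero-centered distributions described above:
    rng = np.random.default_rng(5)
    errors = rng.gamma(2.0, 0.8, 500) - 0.5
    p, q95 = ecdf_statistics(errors, threshold=1.0)
    print(f"P(|err| < 1.0) = {p:.2f}; 95% of |err| below {q95:.2f}")
    ```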

  18. Development of an errorable car-following driver model

    NASA Astrophysics Data System (ADS)

    Yang, H.-H.; Peng, H.

    2010-06-01

    An errorable car-following driver model is presented in this paper. An errorable driver model is one that emulates a human driver's functions and can generate both nominal (error-free) and devious (with error) behaviours. This model was developed for the evaluation and design of active safety systems. The car-following data used for developing and validating the model were obtained from a large-scale naturalistic driving database. The stochastic car-following behaviour was first analysed and modelled as a random process. Three error-inducing behaviours were then introduced. First, human perceptual limitation was studied and implemented. Distraction due to non-driving tasks was then identified based on the statistical analysis of the driving data. Finally, the time delay of human drivers was estimated through a recursive least-squares identification process. By including these three error-inducing behaviours, rear-end collisions with the lead vehicle could occur. The simulated crash rate was found to be similar to, but somewhat higher than, that reported in traffic statistics.

  19. Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study

    PubMed Central

    Hou, Lin; Sun, Ning; Mane, Shrikant; Sayward, Fred; Rajeevan, Nallakkandi; Cheung, Kei-Hoi; Cho, Kelly; Pyarajan, Saiju; Aslan, Mihaela; Miller, Perry; Harvey, Philip D.; Gaziano, J. Michael; Concato, John; Zhao, Hongyu

    2017-01-01

    A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant’s DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/). PMID:28019059

  20. Robust Linear Models for Cis-eQTL Analysis.

    PubMed

    Rantalainen, Mattias; Lindgren, Cecilia M; Holmes, Christopher C

    2015-01-01

    Expression Quantitative Trait Loci (eQTL) analysis enables characterisation of functional genetic variation influencing expression levels of individual genes. In outbred populations, including humans, eQTLs are commonly analysed using the conventional linear model, adjusting for relevant covariates, assuming an allelic dosage model and a Gaussian error term. However, gene expression data generally have noise that induces heavy-tailed errors relative to the Gaussian distribution and often include atypical observations, or outliers. Such departures from modelling assumptions can lead to an increased rate of type II errors (false negatives), and to some extent also type I errors (false positives). Careful model checking can reduce the risk of type I errors but often not type II errors, since it is generally too time-consuming to carefully check all models with a non-significant effect in large-scale and genome-wide studies. Here we propose the application of a robust linear model for eQTL analysis to reduce adverse effects of deviations from the assumption of Gaussian residuals. We present results from a simulation study as well as results from the analysis of real eQTL data sets. Our findings suggest that in many situations robust models have the potential to provide more reliable eQTL results compared to conventional linear models, particularly in respect to reducing type II errors due to non-Gaussian noise. Post-genomic data, such as that generated in genome-wide eQTL studies, are often noisy and frequently contain atypical observations. Robust statistical models have the potential to provide more reliable results and increased statistical power under non-Gaussian conditions. The results presented here suggest that robust models should be considered routinely alongside other commonly used methodologies for eQTL analysis.
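
    In Python, the comparison described above can be sketched with statsmodels, whose RLM class implements M-estimation with a Huber loss. This is a generic illustration, not the authors' pipeline; the simulated dosages, effect size, and t-distributed noise are invented:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 300
    dosage = rng.integers(0, 3, n).astype(float)   # allelic dosage 0/1/2
    # Heavy-tailed expression noise with a genuine eQTL effect of 0.3:
    expr = 0.3 * dosage + rng.standard_t(df=3, size=n)

    X = sm.add_constant(dosage)
    ols_fit = sm.OLS(expr, X).fit()
    rlm_fit = sm.RLM(expr, X, M=sm.robust.norms.HuberT()).fit()

    print(f"OLS slope   = {ols_fit.params[1]:.3f} (SE {ols_fit.bse[1]:.3f})")
    print(f"Huber slope = {rlm_fit.params[1]:.3f} (SE {rlm_fit.bse[1]:.3f})")
    ```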

  1. Noise removing in encrypted color images by statistical analysis

    NASA Astrophysics Data System (ADS)

    Islam, N.; Puech, W.

    2012-03-01

    Cryptographic techniques are used to secure confidential data from unauthorized access, but these techniques are very sensitive to noise: a single bit change in encrypted data can have a catastrophic impact on the decrypted data. This paper addresses the problem of removing bit errors in visual data encrypted with the AES algorithm in CBC mode. In order to remove the noise, a method is proposed which is based on the statistical analysis of each block during decryption. The proposed method exploits local statistics of the visual data and the confusion/diffusion properties of the encryption algorithm to remove the errors. Experimental results show that the proposed method can be used at the receiving end as a possible solution for noise removal from visual data in the encrypted domain.
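
    The sensitivity to noise that motivates this work is easy to reproduce. A small demonstration with PyCryptodome (an assumed dependency; the authors' statistical reconstruction method itself is not shown here) flips one ciphertext bit in AES-CBC and reports how each decrypted block is affected:

        from Crypto.Cipher import AES
        from Crypto.Random import get_random_bytes

        key, iv = get_random_bytes(16), get_random_bytes(16)
        plain = bytes(range(64))                    # four 16-byte blocks

        ct = bytearray(AES.new(key, AES.MODE_CBC, iv).encrypt(plain))
        ct[20] ^= 0x01                              # one bit error in block 1

        dec = AES.new(key, AES.MODE_CBC, iv).decrypt(bytes(ct))
        for b in range(4):
            s = slice(16 * b, 16 * b + 16)
            diff = sum(x != y for x, y in zip(plain[s], dec[s]))
            print(f"block {b}: {diff} corrupted bytes")
        # block 1 decrypts to garbage (confusion/diffusion inside the block),
        # block 2 gets exactly the one flipped bit, blocks 0 and 3 are intact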

  2. Statistical Analyses of Scatterplots to Identify Important Factors in Large-Scale Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kleijnen, J.P.C.; Helton, J.C.

    1999-04-01

    The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (1) linear relationships with correlation coefficients, (2) monotonic relationships with rank correlation coefficients, (3) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (4) trends in variability as defined by variances and interquartile ranges, and (5) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from analysis include: (1) Type I errors are unavoidable, (2) Type II errors can occur when inappropriate analysis procedures are used, (3) physical explanations should always be sought for why statistical procedures identify variables as being important, and (4) the identification of important variables tends to be stable for independent Latin hypercube samples.
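
    For readers who want to try the pattern-detection sequence on their own scatterplots, a compact sketch with SciPy (synthetic data; class boundaries chosen arbitrarily at sample quintiles) applies tests (1)-(4) from the abstract:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(7)
        x = rng.uniform(0, 1, 200)                    # sampled input factor
        y = np.exp(3 * x) + rng.normal(0, 2, 200)     # monotonic, non-linear

        r, p_r = stats.pearsonr(x, y)                 # (1) linear relationship
        rho, p_rho = stats.spearmanr(x, y)            # (2) monotonic relationship
        # (3) trend in central tendency across x-quintile classes
        classes = np.digitize(x, np.quantile(x, [0.2, 0.4, 0.6, 0.8]))
        groups = [y[classes == c] for c in range(5)]
        h, p_h = stats.kruskal(*groups)
        # (4) trend in variability: interquartile range per class
        iqrs = [stats.iqr(g) for g in groups]
        print(f"r={r:.2f} (p={p_r:.1e})  rho={rho:.2f} (p={p_rho:.1e})")
        print(f"Kruskal-Wallis H={h:.1f} (p={p_h:.1e})  IQRs={np.round(iqrs, 1)}")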

  3. Multi-reader ROC studies with split-plot designs: a comparison of statistical methods.

    PubMed

    Obuchowski, Nancy A; Gallas, Brandon D; Hillis, Stephen L

    2012-12-01

    Multireader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of this design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper, the authors compare three methods of analysis for the split-plot design. Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean analysis-of-variance approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% confidence intervals falls close to the nominal coverage for small and large sample sizes. The split-plot multireader, multicase study design can be statistically efficient compared to the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal confidence interval coverage, are available for this study design. Copyright © 2012 AUR. All rights reserved.

  4. A comparison of different statistical methods analyzing hypoglycemia data using bootstrap simulations.

    PubMed

    Jiang, Honghua; Ni, Xiao; Huster, William; Heilmann, Cory

    2015-01-01

    Hypoglycemia has long been recognized as a major barrier to achieving normoglycemia with intensive diabetic therapies. It is a common safety concern for diabetes patients. Therefore, it is important to apply appropriate statistical methods when analyzing hypoglycemia data. Here, we carried out bootstrap simulations to investigate the performance of four commonly used statistical models (Poisson, negative binomial, analysis of covariance [ANCOVA], and rank ANCOVA) based on the data from a diabetes clinical trial. A zero-inflated Poisson (ZIP) model and a zero-inflated negative binomial (ZINB) model were also evaluated. Simulation results showed that the Poisson model inflated type I error, while the negative binomial model was overly conservative. However, after adjusting for dispersion, both Poisson and negative binomial models yielded type I error rates that were slightly inflated but close to the nominal level, with reasonable power. The ANCOVA model provided reasonable control of type I error. The rank ANCOVA model was associated with the greatest power and with reasonable control of type I error. Inflated type I error was observed with the ZIP and ZINB models.
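
    A minimal sketch of the model comparison in Python with statsmodels (not the authors' code; the data-generating mechanism and all parameters are invented): overdispersed, zero-heavy counts with no true treatment effect are fit with Poisson and negative binomial models:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)
        n = 400
        treat = np.repeat([0, 1], n // 2)
        # gamma-mixed Poisson counts: overdispersed and zero-heavy,
        # with no true treatment effect (a null dataset)
        events = rng.poisson(rng.gamma(0.5, 4.0, n))

        X = sm.add_constant(treat)
        p_pois = sm.GLM(events, X, family=sm.families.Poisson()).fit().pvalues[1]
        p_nb = sm.GLM(events, X,
                      family=sm.families.NegativeBinomial(alpha=1.0)).fit().pvalues[1]
        print(f"Poisson p={p_pois:.3f}  NegBin p={p_nb:.3f}")
        # repeating this over many bootstrap resamples and recording how often
        # p < 0.05 under the null estimates each model's type I error rate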

  5. Linear and Order Statistics Combiners for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

    2001-01-01

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and, in general, the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
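
    The 1/N variance-reduction argument is easy to verify numerically. A small sketch under idealized assumptions (unbiased classifiers with independent Gaussian output noise, which is the setting in which the factor-of-N result holds):

        import numpy as np

        rng = np.random.default_rng(0)
        N, n_points = 10, 100_000
        true_posterior = 0.6        # P(class 1 | x) near the decision boundary

        # each classifier estimates the posterior with independent zero-mean noise
        outputs = true_posterior + rng.normal(0, 0.15, (N, n_points))

        single_err = np.mean(outputs[0] <= 0.5)         # wrong side of threshold
        avg_err = np.mean(outputs.mean(axis=0) <= 0.5)  # simple averaging
        print(f"single: {single_err:.3f}   average of {N}: {avg_err:.3f}")
        # averaging shrinks the variance of the boundary estimate by ~1/N for
        # unbiased, uncorrelated errors, which is what reduces the 'added' error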

  6. Assessment of the knowledge and attitudes of intern doctors to medication prescribing errors in a Nigeria tertiary hospital

    PubMed Central

    Ajemigbitse, Adetutu A.; Omole, Moses Kayode; Ezike, Nnamdi Chika; Erhun, Wilson O.

    2013-01-01

    Context: Junior doctors are reported to make most of the prescribing errors in the hospital setting. Aims: The aim of the following study is to determine the knowledge intern doctors have about prescribing errors and circumstances contributing to making them. Settings and Design: A structured questionnaire was distributed to intern doctors in National Hospital Abuja Nigeria. Subjects and Methods: Respondents gave information about their experience with prescribing medicines, the extent to which they agreed with the definition of a clinically meaningful prescribing error and events that constituted such. Their experience with prescribing certain categories of medicines was also sought. Statistical Analysis Used: Data was analyzed with Statistical Package for the Social Sciences (SPSS) software version 17 (SPSS Inc Chicago, Ill, USA). Chi-squared analysis contrasted differences in proportions; P < 0.05 was considered to be statistically significant. Results: The response rate was 90.9% and 27 (90%) had <1 year of prescribing experience. 17 (56.7%) respondents totally agreed with the definition of a clinically meaningful prescribing error. Most common reasons for prescribing mistakes were a failure to check prescriptions with a reference source (14, 25.5%) and failure to check for adverse drug interactions (14, 25.5%). Omitting some essential information such as duration of therapy (13, 20%), patient age (14, 21.5%) and dosage errors (14, 21.5%) were the most common types of prescribing errors made. Respondents considered workload (23, 76.7%), multitasking (19, 63.3%), rushing (18, 60.0%) and tiredness/stress (16, 53.3%) as important factors contributing to prescribing errors. Interns were least confident prescribing antibiotics (12, 25.5%), opioid analgesics (12, 25.5%) cytotoxics (10, 21.3%) and antipsychotics (9, 19.1%) unsupervised. Conclusions: Respondents seemed to have a low awareness of making prescribing errors. Principles of rational prescribing and events that constitute prescribing errors should be taught in the practice setting. PMID:24808682

  7. Laser Velocimeter Measurements and Analysis in Turbulent Flows with Combustion. Part 2.

    DTIC Science & Technology

    1983-07-01

    sampling error for this sample size. Mean velocities and turbulence intensities were found to be statistically accurate to ±1% and ±13%, respectively. ... Although the statistical error was found to be rather small (±1% for mean velocities and ±13% for turbulence intensities), there can be additional ...

  8. Measurement and Prediction Errors in Body Composition Assessment and the Search for the Perfect Prediction Equation.

    ERIC Educational Resources Information Center

    Katch, Frank I.; Katch, Victor L.

    1980-01-01

    Sources of error in body composition assessment by laboratory and field methods can be found in hydrostatic weighing, residual air volume, skinfolds, and circumferences. Statistical analysis can and should be used in the measurement of body composition. (CJ)

  9. BATMAN: Bayesian Technique for Multi-image Analysis

    NASA Astrophysics Data System (ADS)

    Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.

    2017-04-01

    This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical data set containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e., identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real integral-field spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in the presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of 2. Our analysis reveals that the algorithm prioritizes conservation of all the statistically significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BaTMAn is not to be used as a 'black box' to improve the signal-to-noise ratio, but as a new approach to characterize spatially resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.

  10. Estimation of sensible and latent heat flux from natural sparse vegetation surfaces using surface renewal

    NASA Astrophysics Data System (ADS)

    Zapata, N.; Martínez-Cob, A.

    2001-12-01

    This paper reports a study undertaken to evaluate the feasibility of the surface renewal method to accurately estimate long-term evaporation from the playa and margins of an endorheic salty lagoon (Gallocanta lagoon, Spain) under semiarid conditions. High-frequency temperature readings were taken for two time lags (r) and three measurement heights (z) in order to obtain surface renewal sensible heat flux (HSR) values. These values were compared against eddy covariance sensible heat flux (HEC) values for a calibration period (25-30 July 2000). Error analysis statistics (index of agreement, IA; root mean square error, RMSE; and systematic mean square error, MSEs) showed that the agreement between HSR and HEC improved as measurement height decreased and time lag increased. Calibration factors α were obtained for all analyzed cases. The best results were obtained for the z=0.9 m (r=0.75 s) case, for which α=1.0 was observed. In this case, uncertainty was about 10% in terms of relative error (RE). Latent heat flux values were obtained by solving the energy balance equation for both the surface renewal (LESR) and the eddy covariance (LEEC) methods, using HSR and HEC, respectively, together with measurements of net radiation and soil heat flux. For the calibration period, error analysis statistics for LESR were quite similar to those for HSR, although errors were mostly random. LESR uncertainty was less than 9%. The calibration factors were applied to a validation data subset (30 July-4 August 2000) for which meteorological conditions were somewhat different (higher temperatures and wind speeds and lower solar and net radiation). Error analysis statistics for both HSR and LESR were quite good for all cases, showing the goodness of the calibration factors. Nevertheless, the results obtained for the z=0.9 m (r=0.75 s) case were still the best.
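
    The agreement statistics used in this study are standard and easy to compute. A small helper (a sketch, assuming the systematic error component is taken from an ordinary least-squares fit of model on observations, one common convention):

        import numpy as np

        def error_stats(model, obs):
            """Index of agreement (IA), RMSE, and the systematic part of the
            mean square error (MSEs), the latter taken from an OLS fit."""
            model, obs = np.asarray(model, float), np.asarray(obs, float)
            rmse = np.sqrt(np.mean((model - obs) ** 2))
            # Willmott's index of agreement
            denom = np.sum((np.abs(model - obs.mean())
                            + np.abs(obs - obs.mean())) ** 2)
            ia = 1.0 - np.sum((model - obs) ** 2) / denom
            slope, icept = np.polyfit(obs, model, 1)   # model ≈ icept + slope*obs
            mse_s = np.mean((icept + slope * obs - obs) ** 2)
            return ia, rmse, mse_s

        print(error_stats(model=[110.0, 150.0, 210.0], obs=[100.0, 160.0, 200.0]))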

  11. Evaluation of errors in quantitative determination of asbestos in rock

    NASA Astrophysics Data System (ADS)

    Baietto, Oliviero; Marini, Paola; Vitaliti, Martina

    2016-04-01

    The quantitative determination of the asbestos content of rock matrices is a complex operation which is susceptible to important errors. The principal methodologies for the analysis are Scanning Electron Microscopy (SEM) and Phase Contrast Optical Microscopy (PCOM). Although the resolution of PCOM is inferior to that of SEM, PCOM analysis has several advantages, including better representativity of the analyzed sample, more effective recognition of chrysotile, and a lower cost. The DIATI LAA internal methodology for PCOM analysis is based on mild grinding of a rock sample, its subdivision into 5-6 grain size classes smaller than 2 mm, and a subsequent microscopic analysis of a portion of each class. PCOM relies on the optical properties of asbestos and of the liquids of known refractive index in which the particles under analysis are immersed. The error evaluation in the analysis of rock samples, contrary to the analysis of airborne filters, cannot be based on a statistical distribution. For airborne filters, a binomial (Poisson) distribution, which theoretically defines the variation in the count of fibers resulting from the observation of analysis fields chosen randomly on the filter, can be applied. The analysis of rock matrices, instead, cannot rely on any statistical distribution, because the most important object of the analysis is the size of the asbestiform fibers and fiber bundles observed, and the resulting ratio between the weight of the fibrous component and that of the granular one. The error estimates generally provided by public and private institutions vary between 50 and 150 percent, but there are no specific studies that discuss the origin of the error or that link it to the asbestos content. Our work aims to provide a reliable estimate of the error in relation to the applied methodologies and to the total asbestos content, especially for values close to the legal limits. The error assessment must be made through repetition of the same analysis on the same sample, to estimate both the error on the representativeness of the sample and the error related to the sensitivity of the operator, in order to provide a sufficiently reliable uncertainty for the method. We used about 30 natural rock samples with different asbestos contents, performing 3 analyses on each sample to obtain a trend sufficiently representative of the percentage. Furthermore, on one chosen sample we performed 10 repetitions of the analysis to define more specifically the error of the methodology.

  12. A method for automatic feature points extraction of human vertebrae three-dimensional model

    NASA Astrophysics Data System (ADS)

    Wu, Zhen; Wu, Junsheng

    2017-05-01

    A method for automatic extraction of the feature points of the human vertebrae three-dimensional model is presented. Firstly, the statistical model of vertebrae feature points is established based on the results of manual vertebrae feature points extraction. Then anatomical axial analysis of the vertebrae model is performed according to the physiological and morphological characteristics of the vertebrae. Using the axial information obtained from the analysis, a projection relationship between the statistical model and the vertebrae model to be extracted is established. According to the projection relationship, the statistical model is matched with the vertebrae model to get the estimated position of the feature point. Finally, by analyzing the curvature in the spherical neighborhood with the estimated position of feature points, the final position of the feature points is obtained. According to the benchmark result on multiple test models, the mean relative errors of feature point positions are less than 5.98%. At more than half of the positions, the error rate is less than 3% and the minimum mean relative error is 0.19%, which verifies the effectiveness of the method.

  13. Generalized Background Error covariance matrix model (GEN_BE v2.0)

    NASA Astrophysics Data System (ADS)

    Descombes, G.; Auligné, T.; Vandenberghe, F.; Barker, D. M.

    2014-07-01

    The specification of state background error statistics is a key component of data assimilation since it affects the impact observations will have on the analysis. In the variational data assimilation approach, applied in geophysical sciences, the dimensions of the background error covariance matrix (B) are usually too large to be explicitly determined and B needs to be modeled. Recent efforts to include new variables in the analysis such as cloud parameters and chemical species have required the development of the code to GENerate the Background Errors (GEN_BE) version 2.0 for the Weather Research and Forecasting (WRF) community model to allow for a simpler, flexible, robust, and community-oriented framework that gathers methods used by meteorological operational centers and researchers. We present the advantages of this new design for the data assimilation community by performing benchmarks and showing some of the new features on data assimilation test cases. As data assimilation for clouds remains a challenge, we present a multivariate approach that includes hydrometeors in the control variables and new correlated errors. In addition, the GEN_BE v2.0 code is employed to diagnose error parameter statistics for chemical species, which shows that it is a tool flexible enough to involve new control variables. While the generation of the background errors statistics code has been first developed for atmospheric research, the new version (GEN_BE v2.0) can be easily extended to other domains of science and be chosen as a testbed for diagnostic and new modeling of B. Initially developed for variational data assimilation, the model of the B matrix may be useful for variational ensemble hybrid methods as well.

  14. Generalized background error covariance matrix model (GEN_BE v2.0)

    NASA Astrophysics Data System (ADS)

    Descombes, G.; Auligné, T.; Vandenberghe, F.; Barker, D. M.; Barré, J.

    2015-03-01

    The specification of state background error statistics is a key component of data assimilation since it affects the impact observations will have on the analysis. In the variational data assimilation approach, applied in geophysical sciences, the dimensions of the background error covariance matrix (B) are usually too large to be explicitly determined and B needs to be modeled. Recent efforts to include new variables in the analysis such as cloud parameters and chemical species have required the development of the code to GENerate the Background Errors (GEN_BE) version 2.0 for the Weather Research and Forecasting (WRF) community model. GEN_BE allows for a simpler, flexible, robust, and community-oriented framework that gathers methods used by some meteorological operational centers and researchers. We present the advantages of this new design for the data assimilation community by performing benchmarks of different modeling of B and showing some of the new features in data assimilation test cases. As data assimilation for clouds remains a challenge, we present a multivariate approach that includes hydrometeors in the control variables and new correlated errors. In addition, the GEN_BE v2.0 code is employed to diagnose error parameter statistics for chemical species, which shows that it is a tool flexible enough to implement new control variables. While the generation of the background errors statistics code was first developed for atmospheric research, the new version (GEN_BE v2.0) can be easily applied to other domains of science and chosen to diagnose and model B. Initially developed for variational data assimilation, the model of the B matrix may be useful for variational ensemble hybrid methods as well.

  15. Relevant reduction effect with a modified thermoplastic mask of rotational error for glottic cancer in IMRT

    NASA Astrophysics Data System (ADS)

    Jung, Jae Hong; Jung, Joo-Young; Cho, Kwang Hwan; Ryu, Mi Ryeong; Bae, Sun Hyun; Moon, Seong Kwon; Kim, Yong Ho; Choe, Bo-Young; Suh, Tae Suk

    2017-02-01

    The purpose of this study was to analyze the glottis rotational error (GRE) by using a thermoplastic mask for patients with glottic cancer undergoing intensity-modulated radiation therapy (IMRT). We selected 20 patients with glottic cancer who had received IMRT by using tomotherapy. Both kilovoltage computed tomography (planning kVCT) and megavoltage CT (daily MVCT) images were used for evaluating the error. Six anatomical landmarks in the image were defined to evaluate the correlation between the absolute GRE (°) and the length of contact between the mask and the underlying skin of the patient (mask, mm). We also statistically analyzed the results by using Pearson's correlation coefficient and a linear regression analysis (P < 0.05). The mask and the absolute GRE were verified to have a statistical correlation (P < 0.01). We found statistical significance for each parameter in the linear regression analysis (mask versus absolute roll: P = 0.004 [P < 0.05]; mask versus 3D-error: P = 0.000 [P < 0.05]). The range of the 3D-errors with contact by the mask was 1.2%-39.7% between the maximum- and no-contact cases in this study. A thermoplastic mask with a tight, increased contact area may contribute to the uncertainty of reproducibility as a variation of the absolute GRE. Thus, we suggest that a modified mask, such as one that covers only the glottis area, can significantly reduce patients' setup errors during treatment.

  16. Particle simulation of Coulomb collisions: Comparing the methods of Takizuka and Abe and Nanbu

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang Chiaming; Lin, Tungyou; Caflisch, Russel

    2008-04-20

    The interactions of charged particles in a plasma are governed by long-range Coulomb collisions. We compare two widely used Monte Carlo models for Coulomb collisions: one developed by Takizuka and Abe in 1977, the other developed by Nanbu in 1997. We perform deterministic and statistical error analysis with respect to particle number and time step. The two models produce similar stochastic errors, but Nanbu's model gives smaller time-step errors. Error comparisons between these two methods are presented.

  17. Spectral Analysis of Forecast Error Investigated with an Observing System Simulation Experiment

    NASA Technical Reports Server (NTRS)

    Prive, N. C.; Errico, Ronald M.

    2015-01-01

    The spectra of analysis and forecast error are examined using the observing system simulation experiment (OSSE) framework developed at the National Aeronautics and Space Administration Global Modeling and Assimilation Office (NASA GMAO). A global numerical weather prediction model, the Global Earth Observing System version 5 (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation, is cycled for two months with once-daily forecasts to 336 hours to generate a control case. Verification of forecast errors using the Nature Run as truth is compared with verification of forecast errors using self-analysis; significant underestimation of forecast errors is seen using self-analysis verification for up to 48 hours. Likewise, self-analysis verification significantly overestimates the error growth rates of the early forecast, as well as mischaracterizing the spatial scales at which the strongest growth occurs. The Nature Run-verified error variances exhibit a complicated progression of growth, particularly for low-wavenumber errors. In a second experiment, cycling of the model and data assimilation over the same period is repeated, but using synthetic observations with different explicitly added observation errors having the same error variances as the control experiment, thus creating a different realization of the control. The forecast errors of the two experiments become more correlated during the early forecast period, with correlations increasing for up to 72 hours before beginning to decrease.

  18. Experimental toxicology: Issues of statistics, experimental design, and replication.

    PubMed

    Briner, Wayne; Kirwan, Jeral

    2017-01-01

    The difficulty of replicating experiments has drawn considerable attention. Issues with replication occur for a variety of reasons ranging from experimental design to laboratory errors to inappropriate statistical analysis. Here we review a variety of guidelines for statistical analysis, design, and execution of experiments in toxicology. In general, replication can be improved by using hypothesis driven experiments with adequate sample sizes, randomization, and blind data collection techniques. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

    USGS Publications Warehouse

    Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

    2015-09-28

    Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentages of annual exceedance probability, and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.

  20. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

    PubMed

    Willis, Brian H; Riley, Richard D

    2017-09-20

    An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
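
    The leave-one-out idea behind Vn can be sketched in a few lines. The following simplified fixed-effect version (not the authors' statistic, whose random-effects form and null distribution are derived in the paper; the example numbers are invented) standardizes each left-out study effect against the summary of the remaining studies:

        import numpy as np

        def loo_z(effects, variances):
            """Standardize each left-out study effect against the fixed-effect
            summary of the remaining studies."""
            effects = np.asarray(effects, float)
            w = 1.0 / np.asarray(variances, float)
            z = np.empty_like(effects)
            for i in range(len(effects)):
                keep = np.arange(len(effects)) != i
                mu = np.sum(w[keep] * effects[keep]) / np.sum(w[keep])
                var = 1.0 / np.sum(w[keep])        # variance of the summary
                z[i] = (effects[i] - mu) / np.sqrt(1.0 / w[i] + var)
            return z                               # large |z| flags poor validity

        print(np.round(loo_z([0.30, 0.25, 0.41, 0.28, 1.10],
                             [0.02, 0.03, 0.04, 0.02, 0.05]), 2))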

  1. Evaluation of The Operational Benefits Versus Costs of An Automated Cargo Mover

    DTIC Science & Technology

    2016-12-01

    logistics footprint and life-cycle cost are presented as part of this report. Analysis of modeling and simulation results identified statistically significant differences ...

  2. Why Are People Bad at Detecting Randomness? A Statistical Argument

    ERIC Educational Resources Information Center

    Williams, Joseph J.; Griffiths, Thomas L.

    2013-01-01

    Errors in detecting randomness are often explained in terms of biases and misconceptions. We propose and provide evidence for an account that characterizes the contribution of the inherent statistical difficulty of the task. Our account is based on a Bayesian statistical analysis, focusing on the fact that a random process is a special case of…

  3. On the Statistical Errors of RADAR Location Sensor Networks with Built-In Wi-Fi Gaussian Linear Fingerprints

    PubMed Central

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal- and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, the system parameters, and the number and interval of the RPs, let alone calculation of the corresponding analytical expressions. Therefore, in response to this problem, under a simple linear distribution model, much attention is paid to the mathematical relations between the linear expected errors and the number of neighbors, the number and interval of RPs, the parameters of the logarithmic attenuation model, and the variations of radio signal strength (RSS) at the test point (TP), with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and guaranteeing the accuracy requirements of location-based services in future ubiquitous context-awareness environments. Moreover, numerical results and some real experimental evaluations of the error theories addressed in this paper are also presented for our future extended analysis. PMID:22737027

  4. On the statistical errors of RADAR location sensor networks with built-in Wi-Fi Gaussian linear fingerprints.

    PubMed

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal- and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, the system parameters, and the number and interval of the RPs, let alone calculation of the corresponding analytical expressions. Therefore, in response to this problem, under a simple linear distribution model, much attention is paid to the mathematical relations between the linear expected errors and the number of neighbors, the number and interval of RPs, the parameters of the logarithmic attenuation model, and the variations of radio signal strength (RSS) at the test point (TP), with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and guaranteeing the accuracy requirements of location-based services in future ubiquitous context-awareness environments. Moreover, numerical results and some real experimental evaluations of the error theories addressed in this paper are also presented for our future extended analysis.

  5. Consequences of common data analysis inaccuracies in CNS trauma injury basic research.

    PubMed

    Burke, Darlene A; Whittemore, Scott R; Magnuson, David S K

    2013-05-15

    The development of successful treatments for humans after traumatic brain or spinal cord injuries (TBI and SCI, respectively) requires animal research. This effort can be hampered when promising experimental results cannot be replicated because of incorrect data analysis procedures. To identify and hopefully avoid these errors in future studies, the articles in the seven journals with the highest number of basic science central nervous system TBI and SCI animal research studies published in 2010 (N=125 articles) were reviewed for their data analysis procedures. After identifying the most common statistical errors, the implications of those findings were demonstrated by reanalyzing previously published data from our laboratories using the identified inappropriate statistical procedures and then comparing the two sets of results. Overall, 70% of the articles contained at least one type of inappropriate statistical procedure. The highest percentage involved incorrect post hoc t-tests (56.4%), followed by inappropriate parametric statistics (analysis of variance and t-test; 37.6%). Repeated-measures analysis was inappropriately missing in 52.0% of all articles and, among those with behavioral assessments, 58% were analyzed incorrectly. Reanalysis of our published data using the most common inappropriate statistical procedures resulted in a 14.1% average increase in significant effects compared to the original results. Specifically, an increase of 15.5% occurred with independent t-tests and 11.1% after incorrect post hoc t-tests. Utilizing proper statistical procedures can allow more definitive conclusions, facilitate replicability of research results, and enable more accurate translation of those results to the clinic.
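
    The most common error found here, uncorrected post hoc t-tests, is straightforward to contrast with an appropriate procedure. A sketch with SciPy and statsmodels (simulated null data; the group names and sizes are invented):

        import numpy as np
        from itertools import combinations
        from scipy import stats
        from statsmodels.stats.multicomp import pairwise_tukeyhsd

        rng = np.random.default_rng(5)
        names = ["sham", "injury", "treated", "vehicle"]
        data = {g: rng.normal(0, 1, 12) for g in names}  # null: no group differences

        # inappropriate: six uncorrected pairwise t-tests inflate type I error
        for a, b in combinations(names, 2):
            p = stats.ttest_ind(data[a], data[b]).pvalue
            print(f"{a} vs {b}: uncorrected p={p:.3f}")

        # appropriate: one ANOVA-style family with a corrected post hoc procedure
        values = np.concatenate([data[g] for g in names])
        labels = np.repeat(names, 12)
        print(pairwise_tukeyhsd(values, labels))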

  6. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed

    West, Brady T; Sakshaug, Joseph W; Aurelien, Guy Alain S

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.

  7. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed Central

    West, Brady T.; Sakshaug, Joseph W.; Aurelien, Guy Alain S.

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data. PMID:27355817
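
    A toy example of the analytic error discussed in these two records (synthetic data; the weighting scheme and the with-replacement linearization below are a simplification of real design-based variance estimation, which would also account for strata and clusters): ignoring the survey weights biases the point estimate and misstates its standard error:

        import numpy as np

        rng = np.random.default_rng(2)
        n = 1_000
        w = rng.uniform(1, 10, n)              # unequal selection weights
        y = rng.normal(50, 10, n) + 0.5 * w    # outcome correlated with weight

        naive_mean = y.mean()
        naive_se = y.std(ddof=1) / np.sqrt(n)

        wmean = np.sum(w * y) / np.sum(w)
        # Taylor (with-replacement) linearization SE for the weighted mean
        u = w * (y - wmean) / np.sum(w)
        design_se = np.sqrt(n / (n - 1) * np.sum(u ** 2))

        print(f"naive:    {naive_mean:.2f} +/- {naive_se:.2f}")
        print(f"weighted: {wmean:.2f} +/- {design_se:.2f}")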

  8. Flux control coefficients determined by inhibitor titration: the design and analysis of experiments to minimize errors.

    PubMed Central

    Small, J R

    1993-01-01

    This paper is a study of the effects of experimental error on the estimated values of flux control coefficients obtained using specific inhibitors. Two possible techniques for analysing the experimental data are compared: a simple extrapolation method (the so-called graph method) and a non-linear function fitting method. For these techniques, the sources of systematic errors are identified and the effects of systematic and random errors are quantified, using both statistical analysis and numerical computation. It is shown that the graph method is very sensitive to random errors and that, under all conditions studied, the fitting method outperformed the graph method, even under conditions where the assumptions underlying the fitted function did not hold. Possible ways of designing experiments to minimize the effects of experimental errors are analysed and discussed. PMID:8257434

  9. Non-linear matter power spectrum covariance matrix errors and cosmological parameter uncertainties

    NASA Astrophysics Data System (ADS)

    Blot, L.; Corasaniti, P. S.; Amendola, L.; Kitching, T. D.

    2016-06-01

    The covariance of the matter power spectrum is a key element of the analysis of galaxy clustering data. Independent realizations of observational measurements can be used to sample the covariance; nevertheless, statistical sampling errors will propagate into the cosmological parameter inference, potentially limiting the capabilities of the upcoming generation of galaxy surveys. The impact of these errors as a function of the number of realizations has been previously evaluated for Gaussian-distributed data. However, non-linearities in the late-time clustering of matter cause departures from Gaussian statistics. Here, we address the impact of non-Gaussian errors on the sample covariance and precision matrix errors using a large ensemble of N-body simulations. In the range of modes where finite-volume effects are negligible (0.1 ≲ k [h Mpc-1] ≲ 1.2), we find deviations of the variance of the sample covariance with respect to Gaussian predictions above ~10 per cent at k > 0.3 h Mpc-1. Over the entire range these reduce to about ~5 per cent for the precision matrix. Finally, we perform a Fisher analysis to estimate the effect of covariance errors on the cosmological parameter constraints. In particular, assuming Euclid-like survey characteristics, we find that a number of independent realizations larger than 5000 is necessary to reduce the contribution of sampling errors to the cosmological parameter uncertainties to the subpercent level. We also show that restricting the analysis to large scales, k ≲ 0.2 h Mpc-1, results in a considerable loss of constraining power, while using the linear covariance to include smaller scales leads to an underestimation of the errors on the cosmological parameters.
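
    A simple baseline for the sampling-noise effect studied here is the Gaussian case, where the bias of the inverted sample covariance is known in closed form. A sketch (the mock dimensions and covariance are invented; the paper's non-Gaussian corrections are not reproduced):

        import numpy as np

        rng = np.random.default_rng(4)
        p, n_real = 50, 200            # number of k-bins, number of realizations
        true_cov = np.diag(np.linspace(1.0, 2.0, p))

        # estimate the covariance from a finite ensemble of mock spectra
        mocks = rng.multivariate_normal(np.zeros(p), true_cov, size=n_real)
        sample_cov = np.cov(mocks, rowvar=False)

        # inverting a sample covariance gives a biased precision matrix;
        # for Gaussian data the Hartlap et al. (2007) factor debiases it
        hartlap = (n_real - p - 2) / (n_real - 1)
        precision = hartlap * np.linalg.inv(sample_cov)
        print(f"Hartlap factor: {hartlap:.3f}")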

  10. Observations of geographically correlated orbit errors for TOPEX/Poseidon using the global positioning system

    NASA Technical Reports Server (NTRS)

    Christensen, E. J.; Haines, B. J.; Mccoll, K. C.; Nerem, R. S.

    1994-01-01

    We have compared Global Positioning System (GPS)-based dynamic and reduced-dynamic TOPEX/Poseidon orbits over three 10-day repeat cycles of the ground-track. The results suggest that the prelaunch joint gravity model (JGM-1) introduces geographically correlated errors (GCEs) which have a strong meridional dependence. The global distribution and magnitude of these GCEs are consistent with a prelaunch covariance analysis, with estimated and predicted global rms error statistics of 2.3 and 2.4 cm, respectively. Repeating the analysis with the post-launch joint gravity model (JGM-2) suggests that a portion of the meridional dependence observed in JGM-1 still remains, with a global rms error of 1.2 cm.

  11. On-line estimation of error covariance parameters for atmospheric data assimilation

    NASA Technical Reports Server (NTRS)

    Dee, Dick P.

    1995-01-01

    A simple scheme is presented for on-line estimation of covariance parameters in statistical data assimilation systems. The scheme is based on a maximum-likelihood approach in which estimates are produced on the basis of a single batch of simultaneous observations. Single-sample covariance estimation is reasonable as long as the number of available observations exceeds the number of tunable parameters by two or three orders of magnitude. Not much is known at present about model error associated with actual forecast systems. Our scheme can be used to estimate some important statistical model error parameters such as regionally averaged variances or characteristic correlation length scales. The advantage of the single-sample approach is that it does not rely on any assumptions about the temporal behavior of the covariance parameters: time-dependent parameter estimates can be continuously adjusted on the basis of current observations. This is of practical importance, since it is likely that both model error and observation error strongly depend on the actual state of the atmosphere. The single-sample estimation scheme can be incorporated into any four-dimensional statistical data assimilation system that involves explicit calculation of forecast error covariances, including optimal interpolation (OI) and the simplified Kalman filter (SKF). The computational cost of the scheme is high but not prohibitive; on-line estimation of one or two covariance parameters in each analysis box of an operational boxed-OI system is currently feasible. A number of numerical experiments performed with an adaptive SKF and an adaptive version of OI, using a linear two-dimensional shallow-water model and artificially generated model error, are described. The performance of the nonadaptive versions of these methods turns out to depend rather strongly on correct specification of model error parameters. These parameters are estimated under a variety of conditions, including uniformly distributed model error and time-dependent model error statistics.

  12. Statistical analysis of the determinations of the Sun's Galactocentric distance

    NASA Astrophysics Data System (ADS)

    Malkin, Zinovy

    2013-02-01

    Based on several tens of R0 measurements made during the past two decades, several studies have been performed to derive the best estimate of R0. Some used just simple averaging to derive a result, whereas others provided comprehensive analyses of possible errors in published results. In either case, detailed statistical analyses of data used were not performed. However, a computation of the best estimates of the Galactic rotation constants is not only an astronomical but also a metrological task. Here we perform an analysis of 53 R0 measurements (published in the past 20 years) to assess the consistency of the data. Our analysis shows that they are internally consistent. It is also shown that any trend in the R0 estimates from the last 20 years is statistically negligible, which renders the presence of a bandwagon effect doubtful. On the other hand, the formal errors in the published R0 estimates improve significantly with time.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labby, Z.

    Physicists are often expected to have a solid grounding in experimental design and statistical analysis, sometimes filling in when biostatisticians or other experts are not available for consultation. Unfortunately, graduate education on these topics is seldom emphasized and few opportunities for continuing education exist. Clinical physicists incorporate new technology and methods into their practice based on published literature. A poor understanding of experimental design and analysis could result in inappropriate use of new techniques. Clinical physicists also improve current practice through quality initiatives that require sound experimental design and analysis. Academic physicists with a poor understanding of design and analysis may produce ambiguous (or misleading) results. This can result in unnecessary rewrites, publication rejection, and experimental redesign (wasting time, money, and effort). This symposium will provide a practical review of error and uncertainty, common study designs, and statistical tests. Instruction will primarily focus on practical implementation through examples and will answer questions such as: where would you typically apply the test/design, and where is the test/design typically misapplied (i.e., common pitfalls)? An analysis of error and uncertainty will also be explored using biological studies and associated modeling as a specific use case. Learning Objectives: Understand common experimental testing and clinical trial designs, what questions they can answer, and how to interpret the results. Determine where specific statistical tests are appropriate and identify common pitfalls. Understand how uncertainty and error are addressed in biological testing and associated biological modeling.

  14. Radar error statistics for the space shuttle

    NASA Technical Reports Server (NTRS)

    Lear, W. M.

    1979-01-01

    Radar error statistics for the C-band and S-band radars recommended for use with the ground-tracking programs that process space shuttle tracking data are presented. The statistics are divided into two parts: bias error statistics, using the subscript B, and high-frequency error statistics, using the subscript q. Bias errors may be slowly varying to constant. High-frequency random errors (noise) are rapidly varying and may or may not be correlated from sample to sample. Bias errors were mainly due to hardware defects and to errors in correction for atmospheric refraction effects. High-frequency noise was mainly due to hardware and to atmospheric scintillation. Three types of atmospheric scintillation were identified: horizontal, vertical, and line of sight. This was the first time that horizontal and line-of-sight scintillations were identified.

  15. A method to estimate the effect of deformable image registration uncertainties on daily dose mapping

    PubMed Central

    Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin

    2012-01-01

    Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766
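
    The PCA-based sampling procedure described in this record can be sketched compactly (synthetic stand-in data; the real inputs would be measured DVF error maps):

        import numpy as np

        rng = np.random.default_rng(6)
        n_maps, n_voxels = 30, 500     # observed DVF error maps, flattened

        # stand-in for measured, spatially-correlated registration error maps
        maps = rng.normal(0, 1, (n_maps, 5)) @ rng.normal(0, 1, (5, n_voxels))

        mean = maps.mean(axis=0)
        U, s, Vt = np.linalg.svd(maps - mean, full_matrices=False)
        modes = Vt                            # decorrelated principal modes
        sdevs = s / np.sqrt(n_maps - 1)       # standard deviation of each mode

        # sample the mode coefficients independently, then reconstruct one
        # synthetic error map with the observed spatial correlations
        synthetic = mean + rng.normal(0, sdevs) @ modes
        print(synthetic.shape)                # (500,)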

  16. Identifying Autocorrelation Generated by Various Error Processes in Interrupted Time-Series Regression Designs: A Comparison of AR1 and Portmanteau Tests

    ERIC Educational Resources Information Center

    Huitema, Bradley E.; McKean, Joseph W.

    2007-01-01

    Regression models used in the analysis of interrupted time-series designs assume statistically independent errors. Four methods of evaluating this assumption are the Durbin-Watson (D-W), Huitema-McKean (H-M), Box-Pierce (B-P), and Ljung-Box (L-B) tests. These tests were compared with respect to Type I error and power under a wide variety of error…
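
    Both families of tests compared in this record are available in statsmodels. A short sketch applies the Durbin-Watson statistic and the Ljung-Box portmanteau test to simulated AR(1) errors (in practice one would test the residuals of the fitted interrupted time-series regression):

        import numpy as np
        from statsmodels.stats.stattools import durbin_watson
        from statsmodels.stats.diagnostic import acorr_ljungbox

        rng = np.random.default_rng(8)
        n = 120
        e = np.zeros(n)
        for t in range(1, n):                 # AR(1) errors with rho = 0.6
            e[t] = 0.6 * e[t - 1] + rng.normal()

        print(f"Durbin-Watson: {durbin_watson(e):.2f}")   # ~2 under independence
        print(acorr_ljungbox(e, lags=[5]))                # Ljung-Box portmanteau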

  17. Modification of Poisson Distribution in Radioactive Particle Counting.

    ERIC Educational Resources Information Center

    Drotter, Michael T.

    This paper focuses on radioactive particle counting statistics in laboratory and field applications, intended to aid the Health Physics technician's understanding of the effect of indeterminate errors on radioactive particle counting. It indicates that although the statistical analysis of radioactive disintegration is best described by a Poisson…
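
    The practical content of Poisson counting statistics is that a single measurement of N counts has a standard deviation of about √N. A tiny numerical check (simulated counts; the rate is chosen arbitrarily):

        import numpy as np

        rng = np.random.default_rng(9)
        counts = rng.poisson(12.0, 10_000)    # repeated counting intervals

        # for Poisson counting the variance equals the mean, so a single
        # measurement of N counts carries an indeterminate error of ~sqrt(N)
        print(f"mean={counts.mean():.2f}  var={counts.var():.2f}")
        n0 = counts[0]
        print(f"single count: {n0} +/- {np.sqrt(n0):.1f}")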

  18. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis of combining data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  19. Measuring the statistical validity of summary meta‐analysis and meta‐regression results for use in clinical practice

    PubMed Central

    Riley, Richard D.

    2017-01-01

    An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
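
    The 'leave-one-out' idea is easy to demonstrate. The numpy sketch below pools each reduced meta-analysis by inverse-variance weighting and standardizes the discrepancy of the left-out study; it illustrates the cross-validation logic only and does not reproduce the Vn statistic or its derived distribution, and the study data are invented.

        import numpy as np

        # Toy meta-analysis data: per-study effect estimates and within-study variances.
        effects = np.array([0.32, 0.41, 0.18, 0.55, 0.27, 0.39])
        variances = np.array([0.020, 0.015, 0.030, 0.025, 0.018, 0.022])

        w = 1.0 / variances
        for i in range(len(effects)):
            keep = np.arange(len(effects)) != i
            pooled = np.sum(w[keep] * effects[keep]) / np.sum(w[keep])   # leave-one-out pool
            pooled_var = 1.0 / np.sum(w[keep])
            # Standardized discrepancy between the left-out study and the pooled estimate:
            z = (effects[i] - pooled) / np.sqrt(variances[i] + pooled_var)
            print(f"study {i}: z = {z:+.2f}")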

  20. An analysis of pilot error-related aircraft accidents

    NASA Technical Reports Server (NTRS)

    Kowalsky, N. B.; Masters, R. L.; Stone, R. B.; Babcock, G. L.; Rypka, E. W.

    1974-01-01

    A multidisciplinary team approach to pilot error-related U.S. air carrier jet aircraft accident investigation records successfully reclaimed hidden human error information not shown in statistical studies. New analytic techniques were developed and applied to the data to discover and identify multiple elements of commonality and shared characteristics within this group of accidents. Three techniques of analysis were used. Critical element analysis demonstrated the importance of a subjective qualitative approach to raw accident data and surfaced information heretofore unavailable. Cluster analysis was an exploratory research tool that will lead to increased understanding and improved organization of facts, the discovery of new meaning in large data sets, and the generation of explanatory hypotheses. Pattern recognition allows accidents to be categorized by pattern conformity after critical element identification by cluster analysis.

  1. Putting Meaning Back Into the Mean: A Comment on the Misuse of Elementary Statistics in a Sample of Manuscripts Submitted to Clinical Therapeutics.

    PubMed

    Forrester, Janet E

    2015-12-01

    Errors in the statistical presentation and analyses of data in the medical literature remain common despite efforts to improve the review process, including the creation of guidelines for authors and the use of statistical reviewers. This article discusses common elementary statistical errors seen in manuscripts recently submitted to Clinical Therapeutics and describes some ways in which authors and reviewers can identify errors and thus correct them before publication. A nonsystematic sample of manuscripts submitted to Clinical Therapeutics over the past year was examined for elementary statistical errors. Clinical Therapeutics has many of the same errors that reportedly exist in other journals. Authors require additional guidance to avoid elementary statistical errors and incentives to use the guidance. Implementation of reporting guidelines for authors and reviewers by journals such as Clinical Therapeutics may be a good approach to reduce the rate of statistical errors. Copyright © 2015 Elsevier HS Journals, Inc. All rights reserved.

  2. A Modeling Framework for Optimal Computational Resource Allocation Estimation: Considering the Trade-offs between Physical Resolutions, Uncertainty and Computational Costs

    NASA Astrophysics Data System (ADS)

    Moslehi, M.; de Barros, F.; Rajagopal, R.

    2014-12-01

    Hydrogeological models that represent flow and transport in subsurface domains are usually large-scale with excessive computational complexity and uncertain characteristics. Uncertainty quantification for predicting flow and transport in heterogeneous formations often entails utilizing a numerical Monte Carlo framework, which repeatedly simulates the model according to a random field representing hydrogeological characteristics of the field. The physical resolution (e.g. grid resolution associated with the physical space) for the simulation is customarily chosen based on recommendations in the literature, independent of the number of Monte Carlo realizations. This practice may lead to either excessive computational burden or inaccurate solutions. We propose an optimization-based methodology that considers the trade-off between the following conflicting objectives: time associated with computational costs, statistical convergence of the model predictions and physical errors corresponding to numerical grid resolution. In this research, we optimally allocate computational resources by developing a modeling framework for the overall error based on a joint statistical and numerical analysis and optimizing the error model subject to a given computational constraint. The derived expression for the overall error explicitly takes into account the joint dependence between the discretization error of the physical space and the statistical error associated with Monte Carlo realizations. The accuracy of the proposed framework is verified in this study by applying it to several computationally intensive examples. Having this framework at hand helps hydrogeologists achieve the optimum physical and statistical resolutions that minimize the error for a given computational budget. Moreover, the influence of the available computational resources and the geometric properties of the contaminant source zone on the optimum resolutions is investigated. We conclude that the computational cost associated with optimal allocation can be substantially reduced compared with prevalent recommendations in the literature.
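
    A toy version of the trade-off can be written in a few lines. The sketch below assumes an illustrative error model, total error = C1*h**p + C2/sqrt(N) with cost N*(L/h)**d, and grid-searches the grid spacing h that minimizes the overall error under a fixed budget; all coefficients are assumptions, not values from the study.

        import numpy as np

        # Illustrative error model (coefficients are assumptions, not from the paper):
        # total_error(h, N) = C1 * h**p + C2 / sqrt(N)
        # cost(h, N)        = N * (L / h)**d   (N Monte Carlo runs on a grid of spacing h)
        C1, C2, p, d, L = 1.0, 2.0, 2, 3, 1.0
        budget = 1e7

        hs = np.logspace(-3, -1, 200)               # candidate grid spacings
        Ns = np.maximum(1, budget / (L / hs) ** d)  # realizations affordable at each h
        total_error = C1 * hs ** p + C2 / np.sqrt(Ns)

        best = np.argmin(total_error)
        print(f"optimal h ~ {hs[best]:.4f}, N ~ {Ns[best]:.0f}, error ~ {total_error[best]:.4f}")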

  3. Hydrological modelling of the Chaohe Basin in China: Statistical model formulation and Bayesian inference

    NASA Astrophysics Data System (ADS)

    Yang, Jing; Reichert, Peter; Abbaspour, Karim C.; Yang, Hong

    2007-07-01

    Calibration of hydrologic models is very difficult because of measurement errors in input and response, errors in model structure, and the large number of non-identifiable parameters of distributed models. The difficulties even increase in arid regions with high seasonal variation of precipitation, where the modelled residuals often exhibit high heteroscedasticity and autocorrelation. On the other hand, support of water management by hydrologic models is important in arid regions, particularly if there is increasing water demand due to urbanization. The use and assessment of model results for this purpose require a careful calibration and uncertainty analysis. Extending earlier work in this field, we developed a procedure to overcome (i) the problem of non-identifiability of distributed parameters by introducing aggregate parameters and using Bayesian inference, (ii) the problem of heteroscedasticity of errors by combining a Box-Cox transformation of results and data with seasonally dependent error variances, (iii) the problems of autocorrelated errors, missing data and outlier omission with a continuous-time autoregressive error model, and (iv) the problem of the seasonal variation of error correlations with seasonally dependent characteristic correlation times. The technique was tested with the calibration of the hydrologic sub-model of the Soil and Water Assessment Tool (SWAT) in the Chaohe Basin in North China. The results demonstrated the good performance of this approach to uncertainty analysis, particularly with respect to the fulfilment of statistical assumptions of the error model. A comparison with an independent error model and with error models that only considered a subset of the suggested techniques clearly showed the superiority of the approach based on all the features (i)-(iv) mentioned above.
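
    Steps (ii) and (iii) can be illustrated in miniature with scipy: a Box-Cox transformation of a skewed, positive series followed by the lag-1 autocorrelation check that motivates an autoregressive error model. The data below are synthetic stand-ins; the actual SWAT calibration is far more involved.

        import numpy as np
        from scipy import stats

        # Synthetic skewed, autocorrelated "streamflow" series (an AR(1) in log space),
        # standing in for residual diagnostics in the calibration workflow.
        rng = np.random.default_rng(1)
        log_flow = np.zeros(365)
        for t in range(1, 365):
            log_flow[t] = 0.7 * log_flow[t - 1] + rng.normal(0.0, 0.5)
        flow = np.exp(log_flow) + 0.1                       # positive, right-skewed

        transformed, lam = stats.boxcox(flow)               # (ii) variance-stabilizing transform
        resid = transformed - transformed.mean()

        # (iii) the lag-1 autocorrelation that motivates an autoregressive error model
        phi = np.corrcoef(resid[:-1], resid[1:])[0, 1]
        print(f"Box-Cox lambda = {lam:.2f}, lag-1 autocorrelation = {phi:.2f}")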

  4. Modeling error distributions of growth curve models through Bayesian methods.

    PubMed

    Zhang, Zhiyong

    2016-06-01

    Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows that when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the MCMC procedure of SAS are provided.

  5. Evaluation of the prediction precision capability of partial least squares regression approach for analysis of high alloy steel by laser induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.

    2015-06-01

    Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR) with an objective to evaluate the analytical performance of this multivariate approach. The optimization of the number of principal components for minimizing error in the PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical parameters (standard error of prediction, percentage relative error of prediction, etc.). The pre-treatment with the "NORM" parameter gave the optimum statistical results. The analytical performance of the PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to an appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with a relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~2% (2σ).
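
    The component-number optimization is the standard cross-validation exercise, sketched below with scikit-learn; the synthetic "spectra", true weights, and component grid are placeholders for the LIBS data, not values from the study.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import cross_val_score

        # Synthetic stand-in for LIBS spectra (rows = spectra, cols = wavelength channels).
        rng = np.random.default_rng(2)
        X = rng.standard_normal((60, 400))
        y = X[:, :5] @ np.array([0.8, 0.5, 0.3, 0.2, 0.1]) + 0.05 * rng.standard_normal(60)

        # Choose the number of PLS components by cross-validated prediction error,
        # mirroring the optimization step described in the abstract.
        for n in (1, 2, 5, 10):
            score = cross_val_score(PLSRegression(n_components=n), X, y, cv=5,
                                    scoring="neg_root_mean_squared_error").mean()
            print(f"{n:2d} components: CV RMSE = {-score:.3f}")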

  6. Asteroid orbital error analysis: Theory and application

    NASA Technical Reports Server (NTRS)

    Muinonen, K.; Bowell, Edward

    1992-01-01

    We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation does give the same results as Fisherian estimation when no a priori information is assumed (Lehtinen 1988, and references therein).
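
    In the linearized Gaussian setting, the law of error propagation reduces to a one-line matrix product, sketched below with toy numbers; the covariance and Jacobian values are illustrative, not actual orbital quantities.

        import numpy as np

        # Linearized error propagation: if elements x have covariance P and a derived
        # quantity is y = f(x), then to first order  P_y = J P J^T  with J = df/dx.
        P = np.diag([1e-8, 4e-10, 2.5e-9])     # toy covariance of 3 orbital elements
        J = np.array([[1.0, 0.3, 0.0],         # toy Jacobian of predicted sky position
                      [0.0, 0.8, 0.5]])        # w.r.t. the elements at a future epoch

        P_y = J @ P @ J.T                      # covariance of the predicted position
        evals, _ = np.linalg.eigh(P_y)
        print("1-sigma uncertainty ellipse semi-axes:", np.sqrt(evals))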

  7. Multi-Reader ROC studies with Split-Plot Designs: A Comparison of Statistical Methods

    PubMed Central

    Obuchowski, Nancy A.; Gallas, Brandon D.; Hillis, Stephen L.

    2012-01-01

    Rationale and Objectives Multi-reader imaging trials often use a factorial design, where study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of the design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper we compare three methods of analysis for the split-plot design. Materials and Methods Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean ANOVA approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power and confidence interval coverage of the three test statistics. Results The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% CIs falls close to the nominal coverage for small and large sample sizes. Conclusions The split-plot MRMC study design can be statistically efficient compared with the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rate, similar power, and nominal CI coverage, are available for this study design. PMID:23122570

  8. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    PubMed Central

    Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2009-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650

  9. The Relation Between Inflation in Type-I and Type-II Error Rate and Population Divergence in Genome-Wide Association Analysis of Multi-Ethnic Populations.

    PubMed

    Derks, E M; Zwinderman, A H; Gamazon, E R

    2017-05-01

    Population divergence impacts the degree of population stratification in Genome Wide Association Studies. We aim to: (i) investigate type-I error rate as a function of population divergence (FST) in multi-ethnic (admixed) populations; (ii) evaluate the statistical power and effect size estimates; and (iii) investigate the impact of population stratification on the results of gene-based analyses. Quantitative phenotypes were simulated. Type-I error rate was investigated for Single Nucleotide Polymorphisms (SNPs) with varying levels of FST between the ancestral European and African populations. Type-II error rate was investigated for a SNP characterized by a high value of FST. In all tests, genomic MDS components were included to correct for population stratification. Type-I and type-II error rates were adequately controlled in a population that included two distinct ethnic populations but not in admixed samples. Statistical power was reduced in the admixed samples. Gene-based tests showed no residual inflation in type-I error rate.

  10. An Intuitive Graphical Approach to Understanding the Split-Plot Experiment

    ERIC Educational Resources Information Center

    Robinson, Timothy J.; Brenneman, William A.; Myers, William R.

    2009-01-01

    While split-plot designs have received considerable attention in the literature over the past decade, there seems to be a general lack of intuitive understanding of the error structure of these designs and the resulting statistical analysis. Typically, students learn the proper error terms for testing factors of a split-plot design via "expected…

  11. Laser Doppler, velocimeter system for turbine stator cascade studies and analysis of statistical biasing errors

    NASA Technical Reports Server (NTRS)

    Seasholtz, R. G.

    1977-01-01

    A laser Doppler velocimeter (LDV) built for use in the Lewis Research Center's turbine stator cascade facilities is described. The signal processing and self-contained data processing are based on a computing counter. A procedure is given for mode matching the laser to the probe volume. An analysis is presented of biasing errors that were observed in turbulent flow when the mean flow was not normal to the fringes.

  12. Dealing with AFLP genotyping errors to reveal genetic structure in Plukenetia volubilis (Euphorbiaceae) in the Peruvian Amazon

    PubMed Central

    Vašek, Jakub; Viehmannová, Iva; Ocelák, Martin; Cachique Huansi, Danter; Vejl, Pavel

    2017-01-01

    An analysis of the population structure and genetic diversity for any organism often depends on one or more molecular marker techniques. Nonetheless, these techniques are not absolutely reliable because of various sources of errors arising during the genotyping process. Thus, a complex analysis of genotyping error was carried out with the AFLP method in 169 samples of the oil seed plant Plukenetia volubilis L. from small isolated subpopulations in the Peruvian Amazon. Samples were collected in nine localities from the region of San Martin. Analysis was done in eight datasets with a genotyping error from 0 to 5%. Using eleven primer combinations, 102 to 275 markers were obtained according to the dataset. It was found that it is only possible to obtain the most reliable and robust results through a multiple-level filtering process. Genotyping error and software setup influence both the estimation of population structure and genetic diversity, where in our case population number (K) varied between 2 and 9 depending on the dataset and statistical method used. Surprisingly, discrepancies in K number were caused more by statistical approaches than by genotyping errors themselves. However, for estimation of genetic diversity, the degree of genotyping error was critical because descriptive parameters (He, FST, PLP 5%) varied substantially (by at least 25%). Due to low gene flow, P. volubilis mostly consists of small isolated subpopulations (ΦPT = 0.252–0.323) with some degree of admixture given by socio-economic connectivity among the sites; a direct link between the genetic and geographic distances was not confirmed. The study illustrates the successful application of AFLP to infer genetic structure in non-model plants. PMID:28910307

  13. Comment on “Two statistics for evaluating parameter identifiability and error reduction” by John Doherty and Randall J. Hunt

    USGS Publications Warehouse

    Hill, Mary C.

    2010-01-01

    Doherty and Hunt (2009) present important ideas for first-order-second moment sensitivity analysis, but five issues are discussed in this comment. First, considering the composite-scaled sensitivity (CSS) jointly with parameter correlation coefficients (PCC) in a CSS/PCC analysis addresses the difficulties with CSS mentioned in the introduction. Second, their new parameter identifiability statistic is actually likely to do a poor job of evaluating parameter identifiability in common situations. The statistic instead performs the very useful role of showing how model parameters are included in the estimated singular value decomposition (SVD) parameters. Its close relation to CSS is shown. Third, the idea from p. 125 that a suitable truncation point for SVD parameters can be identified using the prediction variance is challenged using results from Moore and Doherty (2005). Fourth, the relative error reduction statistic of Doherty and Hunt is shown to belong to an emerging set of statistics here named perturbed calculated variance statistics. Finally, the perturbed calculated variance statistics OPR and PPR mentioned on p. 121 are shown to explicitly include the parameter null-space component of uncertainty. Indeed, OPR and PPR results that account for null-space uncertainty have appeared in the literature since 2000.

  14. A Study on Mutil-Scale Background Error Covariances in 3D-Var Data Assimilation

    NASA Astrophysics Data System (ADS)

    Zhang, Xubin; Tan, Zhe-Min

    2017-04-01

    The construction of background error covariances is a key component of three-dimensional variational data assimilation. Background errors exist at different scales and interact with one another in numerical weather prediction. However, the influence of these errors and their interactions cannot be represented in the background error covariance statistics when estimated by the leading methods. It is therefore necessary to construct background error covariances that account for multi-scale interactions among errors. Using the NMC method, this article first estimates the background error covariances at given model-resolution scales. Then the information of errors whose scales are larger and smaller than the given ones is introduced, respectively, using different nesting techniques to estimate the corresponding covariances. Comparison of the three background error covariance statistics influenced by error information at different scales reveals that the background error variances increase, particularly at large scales and higher levels, when information on larger-scale errors is introduced through the lateral boundary condition provided by a lower-resolution model. On the other hand, the variances decrease at medium scales at the higher levels, while showing slight increases at lower levels in the nested domain, especially at medium and small scales, when information on smaller-scale errors is introduced by nesting a higher-resolution model. In addition, the introduction of information on larger- (smaller-) scale errors leads to larger (smaller) horizontal and vertical correlation scales of background errors. Considering the multivariate correlations, the Ekman coupling increases (decreases) when information on larger- (smaller-) scale errors is included, whereas the geostrophic coupling in the free atmosphere weakens in both situations. The three covariances obtained above are each used in a data assimilation and model forecast system, and analysis-forecast cycles for a period of 1 month are conducted. Comparison of both analyses and forecasts from this system shows that the trends in analysis increments as information on different scale errors is introduced are consistent with the trends in the variances and correlations of background errors. In particular, introduction of smaller-scale errors leads to larger-amplitude analysis increments for winds at medium scales at the heights of both the high- and low-level jets. Analysis increments for both temperature and humidity are also greater at the corresponding scales at middle and upper levels under this circumstance. These analysis increments improve the intensity of the jet-convection system, which includes jets at different levels and the coupling between them associated with latent heat release, and these changes in the analyses contribute to better forecasts for winds and temperature in the corresponding areas. When smaller-scale errors are included, analysis increments for humidity increase significantly at large scales at lower levels, moistening the southern analyses. This humidification helps correct the dry bias there and eventually improves the forecast skill for humidity. Moreover, inclusion of larger- (smaller-) scale errors is beneficial for the forecast quality of heavy (light) precipitation at large (small) scales, due to the amplification (diminution) of intensity and area in precipitation forecasts, but tends to overestimate (underestimate) light (heavy) precipitation.
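
    The NMC estimation step referred to above can be shown in miniature: the background error covariance is approximated from differences between forecast pairs valid at the same time. The numpy sketch below uses synthetic forecast fields with invented sizes and correlation scales.

        import numpy as np

        # NMC method in miniature: approximate the background error covariance B from
        # differences between 48 h and 24 h forecasts valid at the same time. The
        # forecast pairs below are synthetic, with spatially correlated differences.
        rng = np.random.default_rng(3)
        n_times, n_grid = 120, 50
        corr = np.exp(-np.abs(np.subtract.outer(np.arange(n_grid), np.arange(n_grid))) / 5.0)
        L = np.linalg.cholesky(corr)
        f24 = rng.standard_normal((n_times, n_grid))
        f48 = f24 + 0.3 * rng.standard_normal((n_times, n_grid)) @ L.T

        diff = f48 - f24
        B = 0.5 * np.cov(diff, rowvar=False)      # halved difference covariance as a B proxy
        print("mean error variance:", B.diagonal().mean().round(3))
        print("correlation of errors 5 grid points apart:",
              (B[0, 5] / np.sqrt(B[0, 0] * B[5, 5])).round(2))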

  15. Improving Accuracy and Temporal Resolution of Learning Curve Estimation for within- and across-Session Analysis

    PubMed Central

    Tabelow, Karsten; König, Reinhard; Polzehl, Jörg

    2016-01-01

    Estimation of learning curves is ubiquitously based on proportions of correct responses within moving trial windows. Thereby, it is tacitly assumed that learning performance is constant within the moving windows, which, however, is often not the case. In the present study we demonstrate that violations of this assumption lead to systematic errors in the analysis of learning curves, and we explored the dependency of these errors on window size, different statistical models, and learning phase. To reduce these errors in the analysis of single-subject data as well as on the population level, we propose adequate statistical methods for the estimation of learning curves and the construction of confidence intervals, trial by trial. Applied to data from an avoidance learning experiment with rodents, these methods revealed performance changes occurring at multiple time scales within and across training sessions which were otherwise obscured in the conventional analysis. Our work shows that the proper assessment of the behavioral dynamics of learning at high temporal resolution can shed new light on specific learning processes, and thus makes it possible to refine existing learning concepts. It further disambiguates the interpretation of neurophysiological signal changes recorded during training in relation to learning. PMID:27303809
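
    The window-size artifact is easy to reproduce. The sketch below computes moving-window proportions with Wilson score intervals on synthetic trial data containing an abrupt performance jump; it illustrates the problem the paper addresses, not the authors' proposed estimator, and the trial counts and probabilities are invented.

        import numpy as np

        def wilson_ci(k, n, z=1.96):
            """Wilson score interval for a proportion of k successes in n trials."""
            p = k / n
            denom = 1 + z**2 / n
            center = (p + z**2 / (2 * n)) / denom
            half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
            return center - half, center + half

        # Toy trial-by-trial outcomes with a performance jump mid-session -- exactly
        # the situation in which a wide moving window smears out the dynamics.
        rng = np.random.default_rng(4)
        correct = rng.random(200) < np.where(np.arange(200) < 80, 0.4, 0.85)

        for width in (10, 50):
            k = np.convolve(correct, np.ones(width), "valid")
            lo, hi = wilson_ci(k, width)
            i = 100 - width                       # window ending at trial 100
            print(f"window {width}: estimate = {k[i] / width:.2f} "
                  f"[{lo[i]:.2f}, {hi[i]:.2f}]")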

  16. Bias and heteroscedastic memory error in self-reported health behavior: an investigation using covariance structure analysis

    PubMed Central

    Kupek, Emil

    2002-01-01

    Background Frequent use of self-reports for investigating recent and past behavior in medical research requires statistical techniques capable of analyzing complex sources of bias associated with this methodology. In particular, although decreasing accuracy of recalling more distant past events is commonplace, the bias due to the differential memory errors resulting from it has rarely been modeled statistically. Methods Covariance structure analysis was used to estimate the recall error of self-reported number of sexual partners for past periods of varying duration and its implication for the bias. Results Results indicated increasing levels of inaccuracy for reports about the more distant past. Considerable positive bias was found for a small fraction of respondents who reported ten or more partners in the last year, last two years and last five years. This is consistent with the effect of heteroscedastic random error where the majority of partners had been acquired in the more distant past and therefore were recalled less accurately than the partners acquired more recently to the time of interviewing. Conclusions Memory errors of this type depend on the salience of the events recalled and are likely to be present in many areas of health research based on self-reported behavior. PMID:12435276

  17. On the Calculation of Uncertainty Statistics with Error Bounds for CFD Calculations Containing Random Parameters and Fields

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2016-01-01

    This chapter discusses the ongoing development of combined uncertainty and error bound estimates for computational fluid dynamics (CFD) calculations subject to imposed random parameters and random fields. An objective of this work is the construction of computable error bound formulas for output uncertainty statistics that guide CFD practitioners in systematically determining how accurately CFD realizations should be approximated and how accurately uncertainty statistics should be approximated for output quantities of interest. Formal error bound formulas for moment statistics that properly account for the presence of numerical errors in CFD calculations and numerical quadrature errors in the calculation of moment statistics have been previously presented in [8]. In this past work, hierarchical node-nested dense and sparse tensor product quadratures are used to calculate moment statistics integrals. In the present work, a framework has been developed that exploits the hierarchical structure of these quadratures in order to simplify the calculation of an estimate of the quadrature error needed in error bound formulas. When signed estimates of realization error are available, this signed error may also be used to estimate output quantity of interest probability densities as a means to assess the impact of realization error on these density estimates. Numerical results are presented for CFD problems with uncertainty to demonstrate the capabilities of this framework.
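
    The hierarchy-based quadrature error estimate can be illustrated in one dimension: compute the moment statistic at increasingly fine quadrature levels and read the successive differences as a computable surrogate for the remaining quadrature error. The sketch below uses non-nested Gauss-Legendre rules and an invented output function for brevity, whereas the framework itself exploits node-nested hierarchies.

        import numpy as np

        # One-dimensional miniature of the moment-statistic quadrature: the mean of a
        # stand-in output quantity under a uniform random parameter on [-1, 1].
        def output_qoi(xi):
            return np.sin(3.0 * xi) + 0.5 * xi**2       # stand-in for a CFD output

        prev = None
        for n_nodes in (5, 9, 17):
            nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
            mean_est = np.sum(weights * output_qoi(nodes)) / 2.0
            err_est = abs(mean_est - prev) if prev is not None else float("nan")
            print(f"{n_nodes:2d} nodes: mean = {mean_est:.10f}, "
                  f"level-difference error estimate = {err_est:.2e}")
            prev = mean_est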

  18. Is adult gait less susceptible than paediatric gait to hip joint centre regression equation error?

    PubMed

    Kiernan, D; Hosking, J; O'Brien, T

    2016-03-01

    Hip joint centre (HJC) regression equation error during paediatric gait has recently been shown to have clinical significance. In relation to adult gait, it has been inferred that comparable errors with children in absolute HJC position may in fact result in less significant kinematic and kinetic error. This study investigated the clinical agreement of three commonly used regression equation sets (Bell et al., Davis et al. and Orthotrak) for adult subjects against the equations of Harrington et al. The relationship between HJC position error and subject size was also investigated for the Davis et al. set. Full 3-dimensional gait analysis was performed on 12 healthy adult subjects with data for each set compared to Harrington et al. The Gait Profile Score, Gait Variable Score and GDI-kinetic were used to assess clinical significance while differences in HJC position between the Davis and Harrington sets were compared to leg length and subject height using regression analysis. A number of statistically significant differences were present in absolute HJC position. However, all sets fell below the clinically significant thresholds (GPS <1.6°, GDI-Kinetic <3.6 points). Linear regression revealed a statistically significant relationship for both increasing leg length and increasing subject height with decreasing error in anterior/posterior and superior/inferior directions. Results confirm a negligible clinical error for adult subjects suggesting that any of the examined sets could be used interchangeably. Decreasing error with both increasing leg length and increasing subject height suggests that the Davis set should be used cautiously on smaller subjects. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. The Accuracy of GBM GRB Localizations

    NASA Astrophysics Data System (ADS)

    Briggs, Michael Stephen; Connaughton, V.; Meegan, C.; Hurley, K.

    2010-03-01

    We report a study of the accuracy of GBM GRB localizations, analyzing three types of localizations: those produced automatically by the GBM Flight Software on board GBM, those produced automatically with ground software in near real time, and localizations produced with human guidance. The two types of automatic locations are distributed in near real-time via GCN Notices; the human-guided locations are distributed on a timescale of many minutes or hours using GCN Circulars. This work uses a Bayesian analysis that models the distribution of the GBM total location error by comparing GBM locations to more accurate locations obtained with other instruments. Reference locations are obtained from Swift, Super-AGILE, the LAT, and from the IPN. We model the GBM total location errors as having systematic errors in addition to the statistical errors and use the Bayesian analysis to constrain the systematic errors.
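
    A simplified, non-Bayesian stand-in for the error modeling is sketched below: observed GBM-versus-reference offsets are treated as one-dimensional Gaussian deviates whose scale combines statistical and systematic components, and the systematic component is fitted by maximum likelihood. The numbers are invented and the treatment of angular offsets is deliberately crude.

        import numpy as np
        from scipy import stats, optimize

        # Fit a systematic error component: total error scale per burst is
        # sqrt(stat_i**2 + sys**2), with sys estimated by maximum likelihood.
        offsets = np.array([2.1, 4.0, 1.2, 6.5, 3.3, 5.0, 2.8])     # degrees, toy values
        stat_err = np.array([1.0, 2.0, 0.8, 3.0, 1.5, 2.5, 1.2])    # per-burst statistical

        def neg_log_like(sys):
            scale = np.sqrt(stat_err**2 + sys**2)
            return -np.sum(stats.norm.logpdf(offsets, scale=scale))

        res = optimize.minimize_scalar(neg_log_like, bounds=(0.0, 20.0), method="bounded")
        print(f"maximum-likelihood systematic error ~ {res.x:.2f} degrees")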

  20. Prediction of transmission distortion for wireless video communication: analysis.

    PubMed

    Chen, Zhifeng; Wu, Dapeng

    2012-03-01

    Transmitting video over wireless is a challenging problem since video may be seriously distorted due to packet errors caused by wireless channels. The capability of predicting transmission distortion (i.e., video distortion caused by packet errors) can assist in designing video encoding and transmission schemes that achieve maximum video quality or minimum end-to-end video distortion. This paper is aimed at deriving formulas for predicting transmission distortion. The contribution of this paper is twofold. First, we identify the governing law that describes how the transmission distortion process evolves over time and analytically derive the transmission distortion formula as a closed-form function of video frame statistics, channel error statistics, and system parameters. Second, we identify, for the first time, two important properties of transmission distortion. The first property is that the clipping noise, which is produced by nonlinear clipping, causes decay of propagated error. The second property is that the correlation between motion-vector concealment error and propagated error is negative and has dominant impact on transmission distortion, compared with other correlations. Due to these two properties and elegant error/distortion decomposition, our formula provides not only more accurate prediction but also lower complexity than the existing methods.

  1. Distribution of the two-sample t-test statistic following blinded sample size re-estimation.

    PubMed

    Lu, Kaifeng

    2016-05-01

    We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.
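
    A Monte Carlo sketch of the re-estimation procedure follows. For brevity the interim observations are not carried into the final analysis, which sets aside the exact distributional subtlety the paper characterizes; all design values are illustrative assumptions.

        import numpy as np
        from scipy import stats

        # Blinded re-estimation: at the interim, the pooled (one-sample) variance is
        # estimated without unblinding, the per-arm sample size is re-set to hit the
        # target power, and the final two-sample t-test is performed (under H0 here).
        rng = np.random.default_rng(5)
        delta, alpha, power, true_sd = 0.5, 0.05, 0.80, 1.3
        n_pilot, n_sim, rejections = 30, 2000, 0
        z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)

        for _ in range(n_sim):
            pilot = rng.normal(0.0, true_sd, n_pilot)    # blinded pooled interim data
            sd_hat = pilot.std(ddof=1)                   # one-sample variance estimator
            n_arm = max(2, int(np.ceil(2 * (sd_hat * z / delta) ** 2)))
            x = rng.normal(0.0, true_sd, n_arm)          # no treatment effect
            y = rng.normal(0.0, true_sd, n_arm)
            rejections += stats.ttest_ind(x, y).pvalue < alpha

        print(f"empirical type I error: {rejections / n_sim:.3f}")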

  2. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    PubMed

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t-test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (random clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.

  3. Technology and medication errors: impact in nursing homes.

    PubMed

    Baril, Chantal; Gascon, Viviane; St-Pierre, Liette; Lagacé, Denis

    2014-01-01

    The purpose of this paper is to study a medication distribution technology's (MDT) impact on medication errors reported in public nursing homes in Québec Province. The work was carried out in six nursing homes (800 patients). Medication error data were collected from nursing staff through a voluntary reporting process before and after MDT was implemented. The errors were analysed by total errors, medication error type, severity, and patient consequences. A statistical analysis verified whether there was a significant difference between the variables before and after introducing MDT. The results show that the MDT detected medication errors. The authors' analysis also indicates that errors are detected more rapidly, resulting in less severe consequences for patients. MDT is a step towards safer and more efficient medication processes. Our findings should convince healthcare administrators to implement technology such as electronic prescribers or bar code medication administration systems to improve medication processes and to provide better healthcare to patients. Few studies have been carried out in long-term healthcare facilities such as nursing homes. The authors' study extends what is known about MDT's impact on medication errors in nursing homes.

  4. Errors in statistical decision making Chapter 2 in Applied Statistics in Agricultural, Biological, and Environmental Sciences

    USDA-ARS?s Scientific Manuscript database

    Agronomic and Environmental research experiments result in data that are analyzed using statistical methods. These data are unavoidably accompanied by uncertainty. Decisions about hypotheses, based on statistical analyses of these data are therefore subject to error. This error is of three types,...

  5. Simultaneous Control of Error Rates in fMRI Data Analysis

    PubMed Central

    Kang, Hakmook; Blume, Jeffrey; Ombao, Hernando; Badre, David

    2015-01-01

    The key idea of statistical hypothesis testing is to fix, and thereby control, the Type I error (false positive) rate across samples of any size. Multiple comparisons inflate the global (family-wise) Type I error rate and the traditional solution to maintaining control of the error rate is to increase the local (comparison-wise) Type II error (false negative) rates. However, in the analysis of human brain imaging data, the number of comparisons is so large that this solution breaks down: the local Type II error rate ends up being so large that scientifically meaningful analysis is precluded. Here we propose a novel solution to this problem: allow the Type I error rate to converge to zero along with the Type II error rate. It works because when the Type I error rate per comparison is very small, the accumulation (or global) Type I error rate is also small. This solution is achieved by employing the Likelihood paradigm, which uses likelihood ratios to measure the strength of evidence on a voxel-by-voxel basis. In this paper, we provide theoretical and empirical justification for a likelihood approach to the analysis of human brain imaging data. In addition, we present extensive simulations that show the likelihood approach is viable, leading to ‘cleaner’ looking brain maps and operational superiority (lower average error rate). Finally, we include a case study on cognitive control related activation in the prefrontal cortex of the human brain. PMID:26272730
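
    At its core the approach replaces thresholded test statistics with per-voxel likelihood ratios. The numpy sketch below computes likelihood ratios for "no activation" versus a fixed effect size under a known-variance normal model; the effect size, variance, evidence thresholds, and data dimensions are illustrative assumptions.

        import numpy as np

        # Voxel-wise likelihood ratios for mean 0 (no activation) versus mean delta,
        # given per-voxel sample means of n_scans normal observations.
        rng = np.random.default_rng(6)
        n_scans, n_voxels, delta, sigma = 100, 10000, 0.5, 1.0
        data = rng.normal(0.0, sigma, (n_scans, n_voxels))      # null data
        xbar = data.mean(axis=0)

        # LR = L(delta) / L(0) for a normal mean with known sigma:
        lr = np.exp(n_scans * (delta * xbar - delta**2 / 2) / sigma**2)
        print("voxels with strong evidence for activation (LR > 32):", np.sum(lr > 32))
        print("voxels with strong evidence for the null (LR < 1/32):", np.sum(lr < 1 / 32))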

  6. Combined Uncertainty and A-Posteriori Error Bound Estimates for General CFD Calculations: Theory and Software Implementation

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2014-01-01

    This workshop presentation discusses the design and implementation of numerical methods for the quantification of statistical uncertainty, including a-posteriori error bounds, for output quantities computed using CFD methods. Hydrodynamic realizations often contain numerical error arising from finite-dimensional approximation (e.g. numerical methods using grids, basis functions, particles) and statistical uncertainty arising from incomplete information and/or statistical characterization of model parameters and random fields. The first task at hand is to derive formal error bounds for statistics given realizations containing finite-dimensional numerical error [1]. The error in computed output statistics contains contributions from both realization error and the error resulting from the calculation of statistics integrals using a numerical method. A second task is to devise computable a-posteriori error bounds by numerically approximating all terms arising in the error bound estimates. For the same reason that CFD calculations including error bounds but omitting uncertainty modeling are only of limited value, CFD calculations including uncertainty modeling but omitting error bounds are only of limited value. To gain maximum value from CFD calculations, a general software package for uncertainty quantification with quantified error bounds has been developed at NASA. The package provides implementations for a suite of numerical methods used in uncertainty quantification: dense tensorization basis methods [3] and a subscale recovery variant [1] for non-smooth data; sparse tensorization methods [2] utilizing node-nested hierarchies; and sampling methods [4] for high-dimensional random variable spaces.

  7. Synoptic scale forecast skill and systematic errors in the MASS 2.0 model. [Mesoscale Atmospheric Simulation System

    NASA Technical Reports Server (NTRS)

    Koch, S. E.; Skillman, W. C.; Kocin, P. J.; Wetzel, P. J.; Brill, K. F.

    1985-01-01

    The synoptic scale performance characteristics of MASS 2.0 are determined by comparing filtered 12-24 hr model forecasts to same-case forecasts made by the National Meteorological Center's synoptic-scale Limited-area Fine Mesh model. Characteristics of the two systems are contrasted, and the analysis methodology used to determine statistical skill scores and systematic errors is described. The overall relative performance of the two models in the sample is documented, and important systematic errors uncovered are presented.

  8. Using MERRA Gridded Innovations for Quantifying Uncertainties in Analysis Fields and Diagnosing Observing System Inhomogeneities

    NASA Technical Reports Server (NTRS)

    da Silva, Arlindo; Redder, Christopher

    2010-01-01

    MERRA is a NASA reanalysis for the satellite era using a major new version of the Goddard Earth Observing System Data Assimilation System Version 5 (GEOS-5). The project focuses on historical analyses of the hydrological cycle on a broad range of weather and climate time scales and places the NASA EOS suite of observations in a climate context. The characterization of uncertainty in reanalysis fields is a feature commonly requested by users of such data. While intercomparison with reference data sets is common practice for ascertaining the realism of the datasets, such studies typically are restricted to long term climatological statistics and seldom provide state dependent measures of the uncertainties involved. In principle, variational data assimilation algorithms have the ability of producing error estimates for the analysis variables (typically surface pressure, winds, temperature, moisture and ozone) consistent with the assumed background and observation error statistics. However, these "perceived error estimates" are expensive to obtain and are limited by the somewhat simplistic errors assumed in the algorithm. The observation minus forecast residuals (innovations) by-product of any assimilation system constitutes a powerful tool for estimating the systematic and random errors in the analysis fields. Unfortunately, such data is usually not readily available with reanalysis products, often requiring the tedious decoding of large datasets and not-so-user-friendly file formats. With MERRA we have introduced a gridded version of the observations/innovations used in the assimilation process, using the same grid and data formats as the regular datasets. Such a dataset empowers the user with the ability to conveniently perform observing-system-related analyses and error estimates. The scope of this dataset will be briefly described. We will present a systematic analysis of MERRA innovation time series for the conventional observing system, including maximum-likelihood estimates of background and observation errors, as well as global bias estimates. Starting with the joint PDF of innovations and analysis increments at observation locations we propose a technique for diagnosing bias among the observing systems, and document how these contextual biases have evolved during the satellite era covered by MERRA.

  9. Using MERRA Gridded Innovation for Quantifying Uncertainties in Analysis Fields and Diagnosing Observing System Inhomogeneities

    NASA Astrophysics Data System (ADS)

    da Silva, A.; Redder, C. R.

    2010-12-01

    MERRA is a NASA reanalysis for the satellite era using a major new version of the Goddard Earth Observing System Data Assimilation System Version 5 (GEOS-5). The project focuses on historical analyses of the hydrological cycle on a broad range of weather and climate time scales and places the NASA EOS suite of observations in a climate context. The characterization of uncertainty in reanalysis fields is a feature commonly requested by users of such data. While intercomparison with reference data sets is common practice for ascertaining the realism of the datasets, such studies typically are restricted to long term climatological statistics and seldom provide state dependent measures of the uncertainties involved. In principle, variational data assimilation algorithms have the ability of producing error estimates for the analysis variables (typically surface pressure, winds, temperature, moisture and ozone) consistent with the assumed background and observation error statistics. However, these "perceived error estimates" are expensive to obtain and are limited by the somewhat simplistic errors assumed in the algorithm. The observation minus forecast residuals (innovations) by-product of any assimilation system constitutes a powerful tool for estimating the systematic and random errors in the analysis fields. Unfortunately, such data is usually not readily available with reanalysis products, often requiring the tedious decoding of large datasets and not-so-user-friendly file formats. With MERRA we have introduced a gridded version of the observations/innovations used in the assimilation process, using the same grid and data formats as the regular datasets. Such a dataset empowers the user with the ability to conveniently perform observing-system-related analyses and error estimates. The scope of this dataset will be briefly described. We will present a systematic analysis of MERRA innovation time series for the conventional observing system, including maximum-likelihood estimates of background and observation errors, as well as global bias estimates. Starting with the joint PDF of innovations and analysis increments at observation locations we propose a technique for diagnosing bias among the observing systems, and document how these contextual biases have evolved during the satellite era covered by MERRA.

  10. Routes to failure: analysis of 41 civil aviation accidents from the Republic of China using the human factors analysis and classification system.

    PubMed

    Li, Wen-Chin; Harris, Don; Yu, Chung-San

    2008-03-01

    The human factors analysis and classification system (HFACS) is based upon Reason's organizational model of human error. HFACS was developed as an analytical framework for the investigation of the role of human error in aviation accidents; however, there is little empirical work formally describing the relationship between the components in the model. This research analyses 41 civil aviation accidents occurring to aircraft registered in the Republic of China (ROC) between 1999 and 2006 using the HFACS framework. The results show statistically significant relationships between errors at the operational level and organizational inadequacies at both the immediately adjacent level (preconditions for unsafe acts) and higher levels in the organization (unsafe supervision and organizational influences). The pattern of the 'routes to failure' observed in the data from this analysis of civil aircraft accidents shows great similarities to that observed in the analysis of military accidents. This research lends further support to Reason's model that suggests that active failures are promoted by latent conditions in the organization. Statistical relationships linking fallible decisions in upper management levels were found to directly affect supervisory practices, thereby creating the psychological preconditions for unsafe acts and hence indirectly impairing the performance of pilots, ultimately leading to accidents.

  11. An analysis of the Kalman filter in the Gamma Ray Observatory (GRO) onboard attitude determination subsystem

    NASA Technical Reports Server (NTRS)

    Snow, Frank; Harman, Richard; Garrick, Joseph

    1988-01-01

    The Gamma Ray Observatory (GRO) spacecraft needs highly accurate attitude knowledge to achieve its mission objectives. Utilizing the fixed-head star trackers (FHSTs) for observations and gyroscopes for attitude propagation, the discrete Kalman Filter processes the attitude data to obtain an onboard accuracy of 86 arc seconds (3 sigma). A combination of linear analysis and simulations using the GRO Software Simulator (GROSS) is employed to investigate the stability of the Kalman filter and the effects of corrupted observations (misalignment, noise), incomplete dynamic modeling, and nonlinear errors on the Kalman filter. In the simulations, on-board attitude is compared with true attitude, the sensitivity of attitude error to model errors is graphed, and a statistical analysis is performed on the residuals of the Kalman Filter. In this paper, the modeling and sensor errors that degrade the Kalman filter solution beyond mission requirements are studied, and methods are offered to identify the source of these errors.

  12. Statistically Controlling for Confounding Constructs Is Harder than You Think

    PubMed Central

    Westfall, Jacob; Yarkoni, Tal

    2016-01-01

    Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity. PMID:27031707
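
    The artifact is straightforward to reproduce by simulation, as in the numpy sketch below: x2 has no incremental effect beyond the construct, yet its coefficient tests significant at far above the nominal rate because x1 measures the construct unreliably. The sample sizes, reliability, and variable names are illustrative assumptions.

        import numpy as np
        from scipy import stats

        # Monte Carlo sketch of the unreliability artifact: x2 carries no incremental
        # signal beyond the construct measured by x1, yet its regression coefficient
        # is routinely "significant" because x1 is a noisy proxy for the construct.
        rng = np.random.default_rng(7)
        n, reliability, n_sim, false_pos = 1000, 0.7, 500, 0
        noise_sd = np.sqrt(1 / reliability - 1)              # measurement error scale

        for _ in range(n_sim):
            construct = rng.standard_normal(n)
            y = construct + rng.standard_normal(n)           # outcome driven by construct only
            x1 = construct + noise_sd * rng.standard_normal(n)   # unreliable measure
            x2 = construct + noise_sd * rng.standard_normal(n)   # second measure, same construct
            X = np.column_stack([np.ones(n), x1, x2])
            beta = np.linalg.lstsq(X, y, rcond=None)[0]
            resid = y - X @ beta
            se = np.sqrt(resid @ resid / (n - 3) * np.linalg.inv(X.T @ X)[2, 2])
            t = beta[2] / se
            false_pos += 2 * stats.t.sf(abs(t), n - 3) < 0.05

        print(f"type I error rate for x2's 'incremental validity': {false_pos / n_sim:.2f}")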

  13. Refractive errors in children and adolescents in Bucaramanga (Colombia).

    PubMed

    Galvis, Virgilio; Tello, Alejandro; Otero, Johanna; Serrano, Andrés A; Gómez, Luz María; Castellanos, Yuly

    2017-01-01

    The aim of this study was to establish the frequency of refractive errors in children and adolescents aged between 8 and 17 years old, living in the metropolitan area of Bucaramanga (Colombia). This study was a secondary analysis of two descriptive cross-sectional studies that applied sociodemographic surveys and assessed visual acuity and refraction. Ametropias were classified as myopic errors, hyperopic errors, and mixed astigmatism. Eyes were considered emmetropic if none of these classifications were made. The data were collated using free software and analyzed with STATA/IC 11.2. One thousand two hundred twenty-eight individuals were included in this study. Girls showed a higher rate of ametropia than boys. Hyperopic refractive errors were present in 23.1% of the subjects, and myopic errors in 11.2%. Only 0.2% of the eyes had high myopia (≤-6.00 D). Mixed astigmatism and anisometropia were uncommon, and myopia frequency increased with age. There were statistically significant steeper keratometric readings in myopic compared to hyperopic eyes. The frequency of refractive errors that we found of 36.7% is moderate compared to the global data. The rates and parameters statistically differed by sex and age groups. Our findings are useful for establishing refractive error rate benchmarks in low-middle-income countries and as a baseline for following their variation by sociodemographic factors.

  14. Knowledge of healthcare professionals about medication errors in hospitals

    PubMed Central

    Abdel-Latif, Mohamed M. M.

    2016-01-01

    Context: Medication errors are the most common type of medical error in hospitals and a leading cause of morbidity and mortality among patients. Aims: The aim of the present study was to assess the knowledge of healthcare professionals about medication errors in hospitals. Settings and Design: A self-administered questionnaire was distributed to randomly selected healthcare professionals in eight hospitals in Madinah, Saudi Arabia. Subjects and Methods: An 18-item survey was designed and comprised questions on demographic data, knowledge of medication errors, availability of reporting systems in hospitals, attitudes toward error reporting, and causes of medication errors. Statistical Analysis Used: Data were analyzed with Statistical Package for the Social Sciences software Version 17. Results: A total of 323 healthcare professionals completed the questionnaire (a 64.6% response rate): 138 (42.72%) physicians, 34 (10.53%) pharmacists, and 151 (46.75%) nurses. A majority of the participants had good knowledge of the concept of medication errors and their dangers to patients. Only 68.7% of them were aware of reporting systems in hospitals. Healthcare professionals revealed that there was no clear mechanism available for reporting of errors in most hospitals. Prescribing (46.5%) and administration (29%) errors were the main causes of errors. The medication errors most frequently encountered involved anti-hypertensives, antidiabetics, antibiotics, digoxin, and insulin. Conclusions: This study revealed differences in the awareness among healthcare professionals toward medication errors in hospitals. The poor knowledge about medication errors emphasized the urgent necessity to adopt appropriate measures to raise awareness about medication errors in Saudi hospitals. PMID:27330261

  15. A Novel Genome-Information Content-Based Statistic for Genome-Wide Association Analysis Designed for Next-Generation Sequencing Data

    PubMed Central

    Luo, Li; Zhu, Yun

    2012-01-01

    Abstract The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests, which analyze collective frequency differences between cases and controls, have shifted the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, the collapsing method, the combined multivariate and collapsing (CMC) method, the individual χ2 test, the weighted-sum statistic, and the variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from the ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly better type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812

  16. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    PubMed

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests, which analyze collective frequency differences between cases and controls, have shifted the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, the collapsing method, the combined multivariate and collapsing (CMC) method, the individual χ2 test, the weighted-sum statistic, and the variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from the ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly better type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  17. Grid Resolution Study over Operability Space for a Mach 1.7 Low Boom External Compression Inlet

    NASA Technical Reports Server (NTRS)

    Anderson, Bernhard H.

    2014-01-01

    This paper presents a statistical methodology whereby the probability limits associated with CFD grid resolution in inlet flow analysis can be determined, providing quantitative information on the distribution of that error over the specified operability range. The objective of this investigation is to quantify the effects of both random (precision) and systematic (biasing) errors associated with grid resolution in the analysis of the Lockheed Martin Company (LMCO) N+2 Low Boom external compression supersonic inlet. The study covers the entire operability space as defined previously by the High Speed Civil Transport (HSCT) High Speed Research (HSR) program goals. The probability limits, in terms of a 95.0% confidence interval on the analysis data, were evaluated for four ARP1420 inlet metrics, namely (1) total pressure recovery (PFAIP), (2) radial hub distortion (DPH/P), (3) radial tip distortion (DPT/P), and (4) circumferential distortion (DPC/P). In general, the resulting +/-0.95 delta Y interval was unacceptably large in comparison to the stated goals of the HSCT program. Therefore, the conclusion was reached that the "standard grid" size was insufficient for this type of analysis. However, in examining the statistical data, it was determined that the CFD analysis results at the outer fringes of the operability space were the determining factor in the measure of statistical uncertainty. Adequate grids are grids that are free of biasing (systematic) errors and exhibit low random (precision) errors in comparison to their operability goals. In order to be 100% certain that the operability goals have indeed been achieved for each of the inlet metrics, the Y +/- 0.95 delta Y limit must fall inside the stated operability goals. For example, if the operability goal for DPC/P circumferential distortion is ≤0.06, then the forecast Y for DPC/P plus the 95% confidence interval on DPC/P, i.e. +/-0.95 delta Y, must be less than or equal to 0.06.

  18. OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis.

    PubMed

    Verzotto, Davide; M Teo, Audrey S; Hillmer, Axel M; Nagarajan, Niranjan

    2016-01-01

    Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200%) and precise in their alignments (nearly 99% precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.

  19. The use and misuse of statistical analyses. [in geophysics and space physics]

    NASA Technical Reports Server (NTRS)

    Reiff, P. H.

    1983-01-01

    The statistical techniques most often used in space physics include Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density, and superposed epoch analysis. Tests are presented which can evaluate the significance of the results obtained through each of these. Data presented without some form of error analysis are frequently useless, since they offer no way of assessing whether a bump on a spectrum or on a superposed epoch analysis is real or merely a statistical fluctuation. Among many of the published linear correlations, for instance, the uncertainty in the intercept and slope is not given, so that the significance of the fitted parameters cannot be assessed.

  20. Statistical analysis of global horizontal solar irradiation GHI in Fez city, Morocco

    NASA Astrophysics Data System (ADS)

    Bounoua, Z.; Mechaqrane, A.

    2018-05-01

    An accurate knowledge of the solar energy reaching the ground is necessary for sizing and optimizing the performance of solar installations. This paper describes a statistical analysis of the global horizontal solar irradiation (GHI) at Fez city, Morocco. For better reliability, we first applied a set of check procedures to test the quality of the hourly GHI measurements. We then eliminated erroneous values, which are generally due to measurement errors or cosine-effect errors. Statistical analysis shows that the annual mean daily value of GHI is approximately 5 kWh/m²/day. Monthly mean daily values and other parameters are also calculated.

  1. Error of the slanted edge method for measuring the modulation transfer function of imaging systems.

    PubMed

    Xie, Xufen; Fan, Hongda; Wang, Hongyuan; Wang, Zebin; Zou, Nianyu

    2018-03-01

    The slanted edge method is a basic approach for measuring the modulation transfer function (MTF) of imaging systems; however, its measurement accuracy is limited in practice. Theoretical analysis of the slanted edge MTF measurement method performed in this paper reveals that inappropriate edge angles and random noise reduce this accuracy. The error caused by edge angles is analyzed using sampling and reconstruction theory. Furthermore, an error model combining noise and edge angles is proposed. We verify the analyses and model with respect to (i) the edge angle, (ii) a statistical analysis of the measurement error, (iii) the full width at half-maximum of a point spread function, and (iv) the error model. The experimental results verify the theoretical findings. This research can serve as a reference for applications of the slanted edge MTF measurement method.
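
    The abstract analyzes the method's error sources without reproducing its steps, so a minimal sketch of the basic slanted-edge procedure may help: project pixels onto the edge normal to build an oversampled edge spread function (ESF), differentiate it into a line spread function (LSF), and take the normalized |FFT|. The function name and the synthetic test image below are illustrative, not from the paper.

      import numpy as np

      def slanted_edge_mtf(img, angle_deg, oversample=4):
          # Project pixels onto the edge normal to build an oversampled ESF;
          # its derivative is the LSF, whose normalized |FFT| is the MTF.
          rows, cols = img.shape
          y, x = np.mgrid[0:rows, 0:cols]
          theta = np.deg2rad(angle_deg)
          d = (x - cols / 2) * np.cos(theta) - (y - rows / 2) * np.sin(theta)
          bins = np.round(d * oversample).astype(int).ravel()
          bins -= bins.min()
          counts = np.bincount(bins)
          esf = np.bincount(bins, weights=img.ravel()) / np.maximum(counts, 1)
          lsf = np.gradient(esf) * np.hanning(len(esf))  # window tames noise at the tails
          mtf = np.abs(np.fft.rfft(lsf))
          freq = np.fft.rfftfreq(len(lsf), d=1.0 / oversample)  # cycles/pixel
          return freq, mtf / mtf[0]

      # Ideal step edge slanted 5 degrees (synthetic test image).
      rows = cols = 64
      y, x = np.mgrid[0:rows, 0:cols]
      theta = np.deg2rad(5.0)
      img = (((x - cols / 2) * np.cos(theta) - (y - rows / 2) * np.sin(theta)) > 0).astype(float)
      freq, mtf = slanted_edge_mtf(img, angle_deg=5.0)
      print(freq[:5], mtf[:5])

    Replacing the ideal step with a blurred, noisy edge and varying angle_deg is a direct way to reproduce the qualitative behavior the paper analyzes.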

  2. Survey of editors and reviewers of high-impact psychology journals: statistical and research design problems in submitted manuscripts.

    PubMed

    Harris, Alex; Reeder, Rachelle; Hyun, Jenny

    2011-01-01

    The authors surveyed 21 editors and reviewers from major psychology journals to identify and describe the statistical and design errors they encounter most often and to get their advice regarding prevention of these problems. Content analysis of the text responses revealed themes in 3 major areas: (a) problems with research design and reporting (e.g., lack of an a priori power analysis, lack of congruence between research questions and study design/analysis, failure to adequately describe statistical procedures); (b) inappropriate data analysis (e.g., improper use of analysis of variance, too many statistical tests without adjustments, inadequate strategy for addressing missing data); and (c) misinterpretation of results. If researchers attended to these common methodological and analytic issues, the scientific quality of manuscripts submitted to high-impact psychology journals might be significantly improved.

  3. Implementation errors in the GingerALE Software: Description and recommendations.

    PubMed

    Eickhoff, Simon B; Laird, Angela R; Fox, P Mickle; Lancaster, Jack L; Fox, Peter T

    2017-01-01

    Neuroscience imaging is a burgeoning, highly sophisticated field the growth of which has been fostered by grant-funded, freely distributed software libraries that perform voxel-wise analyses in anatomically standardized three-dimensional space on multi-subject, whole-brain, primary datasets. Despite the ongoing advances made using these non-commercial computational tools, the replicability of individual studies is an acknowledged limitation. Coordinate-based meta-analysis offers a practical solution to this limitation and, consequently, plays an important role in filtering and consolidating the enormous corpus of functional and structural neuroimaging results reported in the peer-reviewed literature. In both primary data and meta-analytic neuroimaging analyses, correction for multiple comparisons is a complex but critical step for ensuring statistical rigor. Reports of errors in multiple-comparison corrections in primary-data analyses have recently appeared. Here, we report two such errors in GingerALE, a widely used, US National Institutes of Health (NIH)-funded, freely distributed software package for coordinate-based meta-analysis. These errors have given rise to published reports with more liberal statistical inferences than were specified by the authors. The intent of this technical report is threefold. First, we inform authors who used GingerALE of these errors so that they can take appropriate actions including re-analyses and corrective publications. Second, we seek to exemplify and promote an open approach to error management. Third, we discuss the implications of these and similar errors in a scientific environment dependent on third-party software. Hum Brain Mapp 38:7-11, 2017. © 2016 Wiley Periodicals, Inc.

  4. Analyzing communication errors in an air medical transport service.

    PubMed

    Dalto, Joseph D; Weir, Charlene; Thomas, Frank

    2013-01-01

    Poor communication can result in adverse events. Presently, no standards exist for classifying and analyzing air medical communication errors. This study sought to determine the frequency and types of communication errors reported within an air medical quality and safety assurance reporting system. Of 825 quality assurance reports submitted in 2009, 278 were randomly selected and analyzed for communication errors. Each communication error was classified and mapped to Clark's communication level hierarchy (ie, levels 1-4). Descriptive statistics were performed, and comparisons were evaluated using chi-square analysis. Sixty-four communication errors were identified in 58 reports (21% of 278). Of the 64 identified communication errors, only 18 (28%) were classified by the staff to be communication errors. Communication errors occurred most often at level 1 (n = 42/64, 66%), followed by level 4 (n = 21/64, 33%). Level 2 and 3 communication failures were rare (<1%). Communication errors were found in a fifth of quality and safety assurance reports. The reporting staff identified less than a third of these errors. Nearly all communication errors (99%) occurred at either the lowest level of communication (level 1, 66%) or the highest level (level 4, 33%). An air medical communication ontology is necessary to improve the recognition and analysis of communication errors. Copyright © 2013 Air Medical Journal Associates. Published by Elsevier Inc. All rights reserved.

  5. A comparison of correlation-length estimation methods for the objective analysis of surface pollutants at Environment and Climate Change Canada.

    PubMed

    Ménard, Richard; Deshaies-Jacques, Martin; Gasset, Nicolas

    2016-09-01

    An objective analysis is one of the main components of data assimilation. By combining observations with the output of a predictive model we combine the best features of each source of information: the complete spatial and temporal coverage provided by models, with a close representation of the truth provided by observations. The process of combining observations with a model output is called an analysis. Producing an analysis requires knowledge of the observation and model error variances, as well as the spatial correlation of the model error. This paper is devoted to the development of methods for estimating these error variances and the characteristic length-scale of the model error correlation for operational use in the Canadian objective analysis system. We first argue in favor of using compact-support correlation functions, and then introduce three estimation methods: the Hollingsworth-Lönnberg (HL) method in local and global form, the maximum likelihood method (ML), and the [Formula: see text] diagnostic method. We perform one-dimensional (1D) simulation studies where the error variance and true correlation length are known, and estimate both error variances and the correlation length where both are non-uniform. We show that a local version of the HL method can capture accurately the error variances and correlation length at each observation site, provided that spatial variability is not too strong. However, the operational objective analysis requires only a single and globally valid correlation length. We examine whether any statistic of the local HL correlation lengths could be a useful estimate, or whether other global estimation methods such as the global HL, ML, or [Formula: see text] should be used. We found, in both 1D simulation and using real data, that the ML method is able to capture physically significant aspects of the correlation length, while most other estimates give unphysical and larger length-scale values. This paper describes a proposed improvement of the objective analysis of surface pollutants at Environment and Climate Change Canada (formerly known as Environment Canada). Objective analyses are essentially surface maps of air pollutants that are obtained by combining observations with an air quality model output, and are thought to provide a complete and more accurate representation of the air quality. The highlight of this study is an analysis of methods to estimate the model (or background) error correlation length-scale. The error statistics are an important and critical component of the analysis scheme.
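
    As a rough illustration of the HL idea the abstract builds on, the sketch below bins innovation covariances by separation distance, fits an isotropic exponential correlation model, and reads off background-error variance, observation-error variance, and the correlation length. It is a minimal 1D sketch under simplifying assumptions (an exponential model rather than the compact-support functions the paper advocates, hypothetical site coordinates), not the operational Canadian system.

      import numpy as np
      from scipy.optimize import curve_fit

      def hl_estimate(innov, x, nbins=20, rmax=400.0):
          # Bin covariances of innovation pairs (observation minus model) by
          # separation distance and fit b*exp(-r/L). Extrapolating to r = 0,
          # the intercept b is the background-error variance; the remainder of
          # the total innovation variance is attributed to (spatially
          # uncorrelated) observation error; L is the correlation length.
          innov = innov - innov.mean()
          n = len(innov)
          iu = np.triu_indices(n, k=1)
          dists = np.abs(x[:, None] - x[None, :])[iu]
          prods = np.outer(innov, innov)[iu]
          edges = np.linspace(0.0, rmax, nbins + 1)
          centers = 0.5 * (edges[:-1] + edges[1:])
          cov = np.array([prods[(dists >= lo) & (dists < hi)].mean()
                          for lo, hi in zip(edges[:-1], edges[1:])])
          (b, L), _ = curve_fit(lambda r, b, L: b * np.exp(-r / L),
                                centers, cov, p0=(innov.var(), rmax / 5))
          return b, innov.var() - b, L  # background var, obs var, length scale

      # Toy 1D test: correlated background error (L = 80) plus white obs error.
      rng = np.random.default_rng(0)
      x = np.sort(rng.uniform(0.0, 1000.0, 300))
      r = np.abs(x[:, None] - x[None, :])
      bg = np.linalg.cholesky(np.exp(-r / 80.0) + 1e-9 * np.eye(300)) @ rng.standard_normal(300)
      innov = bg + np.sqrt(0.5) * rng.standard_normal(300)
      print(hl_estimate(innov, x))  # roughly (1.0, 0.5, 80)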

  6. Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976.

    PubMed Central

    Gore, S M; Jones, I G; Rytter, E C

    1977-01-01

    Sixty-two reports that appeared as Papers and Originals (excluding short reports) in 13 consecutive issues of the British Medical Journal included statistical analysis. Thirty-two had statistical errors of one kind or another; in 18, fairly serious faults were discovered. The summaries of five reports made some claim that was unsupportable on re-examination of the data. Medical investigators should consult with people who have a real understanding of statistical methods throughout their projects. PMID:832023

  7. Moments of inclination error distribution computer program

    NASA Technical Reports Server (NTRS)

    Myler, T. R.

    1981-01-01

    A FORTRAN coded computer program is described which calculates orbital inclination error statistics using a closed-form solution. This solution uses a data base of trajectory errors from actual flights to predict the orbital inclination error statistics. The Scott flight history data base consists of orbit insertion errors in the trajectory parameters - altitude, velocity, flight path angle, flight azimuth, latitude and longitude. The methods used to generate the error statistics are of general interest since they have other applications. Program theory, user instructions, output definitions, subroutine descriptions and detailed FORTRAN coding information are included.

  8. Analysis of Solar Spectral Irradiance Measurements from the SBUV/2-Series and the SSBUV Instruments

    NASA Technical Reports Server (NTRS)

    Cebula, Richard P.; DeLand, Matthew T.; Hilsenrath, Ernest

    1997-01-01

    During this period of performance, 1 March 1997 - 31 August 1997, the NOAA-11 SBUV/2 solar spectral irradiance data set was validated using both internal and external assessments. Initial quality checking revealed minor problems with the data (e.g., residual goniometric errors that were manifest as differences between the two scans acquired each day). The sources of these errors were determined and the errors were corrected. Time series were constructed for selected wavelengths and the solar irradiance changes measured by the instrument were compared to a Mg II proxy-based model of short- and long-term solar irradiance variations. This analysis suggested that errors due to residual, uncorrected long-term instrument drift have been reduced to less than 1-2% over the entire 5.5 year NOAA-11 data record. Detailed statistical analysis was performed. This analysis, which will be documented in a manuscript now in preparation, conclusively demonstrates the evolution of solar rotation periodicity and strength during solar cycle 22.

  9. Precision, Reliability, and Effect Size of Slope Variance in Latent Growth Curve Models: Implications for Statistical Power Analysis

    PubMed Central

    Brandmaier, Andreas M.; von Oertzen, Timo; Ghisletta, Paolo; Lindenberger, Ulman; Hertzog, Christopher

    2018-01-01

    Latent Growth Curve Models (LGCM) have become a standard technique to model change over time. Prediction and explanation of inter-individual differences in change are major goals in lifespan research. The major determinants of statistical power to detect individual differences in change are the magnitude of true inter-individual differences in linear change (LGCM slope variance), design precision, alpha level, and sample size. Here, we show that design precision can be expressed as the inverse of effective error. Effective error is determined by instrument reliability and the temporal arrangement of measurement occasions. However, it also depends on another central LGCM component, the variance of the latent intercept and its covariance with the latent slope. We derive a new reliability index for LGCM slope variance—effective curve reliability (ECR)—by scaling slope variance against effective error. ECR is interpretable as a standardized effect size index. We demonstrate how effective error, ECR, and statistical power for a likelihood ratio test of zero slope variance formally relate to each other and how they function as indices of statistical power. We also provide a computational approach to derive ECR for arbitrary intercept-slope covariance. With practical use cases, we argue for the complementary utility of the proposed indices of a study's sensitivity to detect slope variance when making a priori longitudinal design decisions or communicating study designs. PMID:29755377
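
    The abstract states that ECR is obtained by scaling slope variance against effective error but does not print the expression; one natural reading is a reliability-style variance ratio, shown below purely as an assumption to be checked against the paper.

      def effective_curve_reliability(slope_var, effective_error_var):
          # Assumed reliability-style ratio: true slope variance scaled against
          # effective error. The paper's exact expression is not given in the
          # abstract, so treat this as a reading to verify against the source.
          return slope_var / (slope_var + effective_error_var)

      print(effective_curve_reliability(2.0, 6.0))  # 0.25 with these toy values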

  10. Statistical approaches to account for false-positive errors in environmental DNA samples.

    PubMed

    Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid

    2016-05-01

    Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies. © 2015 John Wiley & Sons Ltd.
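
    A minimal simulation (with assumed rates; all parameter names are illustrative) reproduces the bias the authors describe: a naive occupancy estimate based on sites with at least one detection is inflated even when the per-replicate false-positive rate is only 2%.

      import numpy as np

      rng = np.random.default_rng(1)
      S, K = 2000, 3                    # sites and PCR replicates per site (assumed)
      psi, p11 = 0.30, 0.80             # occupancy and per-replicate true detection

      occupied = rng.random(S) < psi
      for p10 in (0.00, 0.02):          # per-replicate false-positive rate
          det_prob = np.where(occupied, p11, p10)
          detected = (rng.random((S, K)) < det_prob[:, None]).any(axis=1)
          print(f"p10={p10:.2f}: naive occupancy estimate = {detected.mean():.3f}")
      # With p10 = 0 the naive estimate sits near the true 0.30; with p10 = 0.02
      # it is inflated by false detections at unoccupied sites.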

  11. A Categorization of Dynamic Analyzers

    NASA Technical Reports Server (NTRS)

    Lujan, Michelle R.

    1997-01-01

    Program analysis techniques and tools are essential to the development process because of the support they provide in detecting errors and deficiencies at different phases of development. The types of information rendered through analysis include the following: statistical measurements of code, type checks, dataflow analysis, consistency checks, test data, verification of code, and debugging information. Analyzers can be broken into two major categories: dynamic and static. Static analyzers examine programs with respect to syntax errors and structural properties. This includes gathering statistical information on program content, such as the number of lines of executable code, source lines, and cyclomatic complexity. In addition, static analyzers provide the ability to check for the consistency of programs with respect to variables. Dynamic analyzers, in contrast, depend on input and the execution of a program, providing the ability to find errors that cannot be detected through the use of static analysis alone. Dynamic analysis provides information on the behavior of a program rather than on the syntax. Both types of analysis detect errors in a program, but dynamic analyzers accomplish this through run-time behavior. This paper focuses on the following broad classification of dynamic analyzers: 1) Metrics; 2) Models; and 3) Monitors. Metrics are those analyzers that provide measurement. The next category, models, captures those analyzers that present the state of the program to the user at specified points in time. The last category, monitors, checks specified code based on some criteria. The paper discusses each classification and the techniques that are included under them. In addition, the role of each technique in the software life cycle is discussed. Familiarization with the tools that measure, model, and monitor programs provides a framework for understanding the program's dynamic behavior from different perspectives through analysis of the input/output data.

  12. New features added to EVALIDator: ratio estimation and county choropleth maps

    Treesearch

    Patrick D. Miles; Mark H. Hansen

    2012-01-01

    The EVALIDator Web application, developed in 2007, provides estimates and sampling errors for many user selected forest statistics from the Forest Inventory and Analysis Database (FIADB). Among the statistics estimated are forest area, number of trees, biomass, volume, growth, removals, and mortality. A new release of EVALIDator, developed in 2012, has an option to...

  13. Batch reporting of forest inventory statistics using the EVALIDator

    Treesearch

    Patrick D. Miles

    2015-01-01

    The EVALIDator Web application, developed in 2007, provides estimates and sampling errors of forest statistics (e.g., forest area, number of trees, tree biomass) from data stored in the Forest Inventory and Analysis database. In response to user demand, new features have been added to the EVALIDator. The most recent additions are 1) the ability to generate multiple...

  14. Consistent Tolerance Bounds for Statistical Distributions

    NASA Technical Reports Server (NTRS)

    Mezzacappa, M. A.

    1983-01-01

    Assumption that sample comes from population with particular distribution is made with confidence C if data lie between certain bounds. These "confidence bounds" depend on C and assumption about distribution of sampling errors around regression line. Graphical test criteria using tolerance bounds are applied in industry where statistical analysis influences product development and use. Applied to evaluate equipment life.

  15. Directional variance adjustment: bias reduction in covariance matrices based on factor analysis with an application to portfolio optimization.

    PubMed

    Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W; Müller, Klaus-Robert; Lemm, Steven

    2013-01-01

    Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock markets, we show that our proposed method leads to improved portfolio allocation.

  16. Directional Variance Adjustment: Bias Reduction in Covariance Matrices Based on Factor Analysis with an Application to Portfolio Optimization

    PubMed Central

    Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W.; Müller, Klaus-Robert; Lemm, Steven

    2013-01-01

    Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock markets, we show that our proposed method leads to improved portfolio allocation. PMID:23844016

  17. Phantom Effects in Multilevel Compositional Analysis: Problems and Solutions

    ERIC Educational Resources Information Center

    Pokropek, Artur

    2015-01-01

    This article combines statistical and applied research perspectives, showing problems that might arise when measurement error is ignored in multilevel compositional effects analysis. This article focuses on data where independent variables are constructed measures. Simulation studies are conducted evaluating methods that could overcome the…

  18. The Applicability of Standard Error of Measurement and Minimal Detectable Change to Motor Learning Research-A Behavioral Study.

    PubMed

    Furlan, Leonardo; Sterr, Annette

    2018-01-01

    Motor learning studies face the challenge of differentiating between real changes in performance and random measurement error. While the traditional p-value-based analyses of difference (e.g., t-tests, ANOVAs) provide information on the statistical significance of a reported change in performance scores, they do not inform as to the likely cause or origin of that change, that is, the contribution of both real modifications in performance and random measurement error to the reported change. One way of differentiating between real change and random measurement error is through the utilization of the statistics of standard error of measurement (SEM) and minimal detectable change (MDC). SEM is estimated from the standard deviation of a sample of scores at baseline and a test-retest reliability index of the measurement instrument or test employed. MDC, in turn, is estimated from SEM and a degree of confidence, usually 95%. The MDC value might be regarded as the minimum amount of change that needs to be observed for it to be considered a real change, or a change to which the contribution of real modifications in performance is likely to be greater than that of random measurement error. A computer-based motor task was designed to illustrate the applicability of SEM and MDC to motor learning research. Two studies were conducted with healthy participants. Study 1 assessed the test-retest reliability of the task and Study 2 consisted of a typical motor learning study, where participants practiced the task for five consecutive days. In Study 2, the data were analyzed with a traditional p-value-based analysis of difference (ANOVA) and also with SEM and MDC. The findings showed good test-retest reliability for the task and that the p-value-based analysis alone identified statistically significant improvements in performance over time even when the observed changes could in fact have been smaller than the MDC and thereby caused mostly by random measurement error, as opposed to by learning. We suggest therefore that motor learning studies could complement their p-value-based analyses of difference with statistics such as SEM and MDC in order to inform as to the likely cause or origin of any reported changes in performance.
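
    A short sketch of the two statistics follows, using the standard formulas SEM = SD·sqrt(1 − r) and MDC95 = 1.96·sqrt(2)·SEM, which match the quantities the abstract describes; the baseline scores and reliability value below are hypothetical.

      import numpy as np

      def sem_mdc(baseline_scores, reliability, z=1.96):
          # SEM from the baseline SD and a test-retest reliability index r
          # (e.g., an ICC); MDC = z * sqrt(2) * SEM, where sqrt(2) accounts for
          # measurement error at both test and retest.
          sd = np.std(baseline_scores, ddof=1)
          sem = sd * np.sqrt(1.0 - reliability)
          return sem, z * np.sqrt(2.0) * sem

      scores = [412, 398, 455, 430, 388, 441, 405, 419]  # hypothetical reaction times (ms)
      sem, mdc95 = sem_mdc(scores, reliability=0.90)
      print(f"SEM = {sem:.1f} ms, MDC95 = {mdc95:.1f} ms")
      # A pre-post change smaller than MDC95 may be mostly measurement error.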

  19. Does bad inference drive out good?

    PubMed

    Marozzi, Marco

    2015-07-01

    The (mis)use of statistics in practice is widely debated, and a field where the debate is particularly active is medicine. Many scholars emphasize that a large proportion of published medical research contains statistical errors. It has been noted that top class journals like Nature Medicine and The New England Journal of Medicine publish a considerable proportion of papers that contain statistical errors and poorly document the application of statistical methods. This paper joins the debate on the (mis)use of statistics in the medical literature. Even though the validation process of a statistical result may be quite elusive, a careful assessment of underlying assumptions is central in medicine as well as in other fields where a statistical method is applied. Unfortunately, a careful assessment of underlying assumptions is missing in many papers, including those published in top class journals. In this paper, it is shown that nonparametric methods are good alternatives to parametric methods when the assumptions for the latter ones are not satisfied. A key point to solve the problem of the misuse of statistics in the medical literature is that all journals have their own statisticians to review the statistical method/analysis section in each submitted paper. © 2015 Wiley Publishing Asia Pty Ltd.

  20. Modeling longitudinal data, I: principles of multivariate analysis.

    PubMed

    Ravani, Pietro; Barrett, Brendan; Parfrey, Patrick

    2009-01-01

    Statistical models are used to study the relationship between exposure and disease while accounting for the potential impact of other factors on outcomes. This adjustment is useful for obtaining unbiased estimates of true effects or for predicting future outcomes. Statistical models include a systematic component and an error component. The systematic component explains the variability of the response variable as a function of the predictors and is summarized in the effect estimates (model coefficients). The error component of the model represents the variability in the data unexplained by the model and is used to build measures of precision around the point estimates (confidence intervals).
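
    A toy regression (hypothetical data) makes the two components concrete: the fitted coefficients are the systematic component, and the residual (error) variance drives the confidence intervals.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(2)
      n = 200
      exposure = rng.binomial(1, 0.5, n)      # exposure of interest
      age = rng.normal(50, 10, n)             # adjustment factor
      y = 2.0 * exposure + 0.05 * age + rng.normal(0, 1, n)  # systematic + error

      X = sm.add_constant(np.column_stack([exposure, age]))
      fit = sm.OLS(y, X).fit()
      print(fit.params)     # systematic component: adjusted effect estimates
      print(fit.conf_int()) # precision from the error component (residual variance)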

  1. Model Error Estimation for the CPTEC Eta Model

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; daSilva, Arlindo

    1999-01-01

    Statistical data assimilation systems require the specification of forecast and observation error statistics. Forecast error is due to model imperfections and differences between the initial condition and the actual state of the atmosphere. Practical four-dimensional variational (4D-Var) methods try to fit the forecast state to the observations and assume that the model error is negligible. Here, with a number of simplifying assumptions, a framework is developed for isolating the model error given the forecast error at two lead-times. Two definitions are proposed for the Talagrand ratio tau, the fraction of the forecast error due to model error rather than initial condition error. Data from the CPTEC Eta Model running operationally over South America are used to calculate forecast error statistics and lower bounds for tau.

  2. Rate, causes and reporting of medication errors in Jordan: nurses' perspectives.

    PubMed

    Mrayyan, Majd T; Shishani, Kawkab; Al-Faouri, Ibrahim

    2007-09-01

    The aim of the study was to describe Jordanian nurses' perceptions about various issues related to medication errors. This is the first nursing study about medication errors in Jordan. This was a descriptive study. A convenience sample of 799 nurses from 24 hospitals was obtained. Descriptive and inferential statistics were used for data analysis. Over the course of their nursing careers, the average number of recalled committed medication errors per nurse was 2.2. The rate of medication errors reported to nurse managers using incident reports was 42.1%. Medication errors occurred mainly when medication labels/packaging were of poor quality or damaged. Nurses failed to report medication errors because they were afraid that they might be subjected to disciplinary actions or even lose their jobs. In the stepwise regression model, gender was the only predictor of medication errors in Jordan. Strategies to reduce or eliminate medication errors are required.

  3. Standard deviation and standard error of the mean.

    PubMed

    Lee, Dong Kyu; In, Junyong; Lee, Sangseok

    2015-06-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage between the SD and SEM in medical literature. Because the process of calculating the SD and SEM includes different statistical inferences, each of them has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents sample data. However, the meaning of SEM includes statistical inference based on the sampling distribution. SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of reasonable methods with which to use SD and SEM. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results.
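
    A short numerical illustration of the distinction, with hypothetical readings; SEM = SD/sqrt(n) follows from the definition of the sampling distribution of the mean.

      import numpy as np

      rng = np.random.default_rng(3)
      sample = rng.normal(loc=120, scale=15, size=25)  # e.g., systolic BP readings

      sd = sample.std(ddof=1)           # dispersion of individual observations
      sem = sd / np.sqrt(len(sample))   # SD of the sampling distribution of the mean

      print(f"mean = {sample.mean():.1f}, SD = {sd:.1f}, SEM = {sem:.1f}")
      # SD describes the data; SEM describes the precision of the estimated mean
      # and shrinks as n grows, while SD does not.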

  4. Standard deviation and standard error of the mean

    PubMed Central

    In, Junyong; Lee, Sangseok

    2015-01-01

    In most clinical and experimental studies, the standard deviation (SD) and the estimated standard error of the mean (SEM) are used to present the characteristics of sample data and to explain statistical analysis results. However, some authors occasionally muddle the distinctive usage between the SD and SEM in medical literature. Because the process of calculating the SD and SEM includes different statistical inferences, each of them has its own meaning. SD is the dispersion of data in a normal distribution. In other words, SD indicates how accurately the mean represents sample data. However, the meaning of SEM includes statistical inference based on the sampling distribution. SEM is the SD of the theoretical distribution of the sample means (the sampling distribution). While either SD or SEM can be applied to describe data and statistical results, one should be aware of reasonable methods with which to use SD and SEM. We aim to elucidate the distinctions between SD and SEM and to provide proper usage guidelines for both, which summarize data and describe statistical results. PMID:26045923

  5. Round-off errors in cutting plane algorithms based on the revised simplex procedure

    NASA Technical Reports Server (NTRS)

    Moore, J. E.

    1973-01-01

    This report statistically analyzes computational round-off errors associated with the cutting plane approach to solving linear integer programming problems. Cutting plane methods require that the inverse of a sequence of matrices be computed. The problem basically reduces to one of minimizing round-off errors in the sequence of inverses. Two procedures for minimizing this problem are presented, and their influence on error accumulation is statistically analyzed. One procedure employs a very small tolerance factor to round computed values to zero. The other procedure is a numerical analysis technique for reinverting or improving the approximate inverse of a matrix. The results indicated that round-off accumulation can be effectively minimized by employing a tolerance factor which reflects the number of significant digits carried for each calculation and by applying the reinversion procedure once to each computed inverse. If 18 significant digits plus an exponent are carried for each variable during computations, then a tolerance value of 0.1 × 10^-12 is reasonable.
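
    The report's exact reinversion scheme is not given in this summary, so the sketch below pairs the described tolerance rounding with one standard way to improve an approximate inverse, the Newton-Schulz iteration X ← X(2I − AX); treat it as an illustration of the two procedures' flavor, not the report's code.

      import numpy as np

      TOL = 1e-13  # tolerance reflecting roughly the significant digits carried

      def clean(M, tol=TOL):
          # Round entries below the tolerance in magnitude to zero.
          M = M.copy()
          M[np.abs(M) < tol] = 0.0
          return M

      def reinvert(A, X):
          # One Newton-Schulz refinement step of an approximate inverse X of A
          # (a standard 'improve the approximate inverse' iteration; assumed
          # here, since the report's procedure is not reproduced above).
          n = A.shape[0]
          return clean(X @ (2.0 * np.eye(n) - A @ X))

      rng = np.random.default_rng(4)
      A = rng.normal(size=(6, 6)) + 6 * np.eye(6)
      X = np.linalg.inv(A) + 1e-6 * rng.normal(size=(6, 6))  # perturbed inverse
      print(np.linalg.norm(np.eye(6) - A @ X))               # residual before
      print(np.linalg.norm(np.eye(6) - A @ reinvert(A, X)))  # residual after one step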

  6. Accounting for response misclassification and covariate measurement error improves power and reduces bias in epidemiologic studies.

    PubMed

    Cheng, Dunlei; Branscum, Adam J; Stamey, James D

    2010-07-01

    To quantify the impact of ignoring misclassification of a response variable and measurement error in a covariate on statistical power, and to develop software for sample size and power analysis that accounts for these flaws in epidemiologic data. A Monte Carlo simulation-based procedure is developed to illustrate the differences in design requirements and inferences between analytic methods that properly account for misclassification and measurement error and those that do not, in regression models for cross-sectional and cohort data. We found that failure to account for these flaws in epidemiologic data can lead to a substantial reduction in statistical power, over 25% in some cases. The proposed method reduced bias substantially, by up to a ten-fold margin, compared with naive estimates obtained by ignoring misclassification and mismeasurement. We recommend as routine practice that researchers account for errors in measurement of both response and covariate data when determining sample size, performing power calculations, or analyzing data from epidemiological studies. 2010 Elsevier Inc. All rights reserved.
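
    A stripped-down Monte Carlo in the spirit of the abstract (assumed parameter values; the authors' corrected estimators, which model the misclassification, are not implemented here) shows how outcome misclassification alone erodes power in a naive logistic analysis.

      import numpy as np
      import statsmodels.api as sm

      def power_sim(n=300, beta=0.7, sens=0.90, spec=0.90, reps=500, alpha=0.05):
          # Power for the exposure coefficient in a logistic model, comparing
          # correctly classified and misclassified binary responses.
          rng = np.random.default_rng(5)
          hits = np.zeros(2)
          for _ in range(reps):
              x = rng.binomial(1, 0.5, n)
              y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + beta * x))))
              # Imperfect sensitivity/specificity flips some responses.
              y_obs = np.where(y == 1, rng.binomial(1, sens, n),
                               rng.binomial(1, 1.0 - spec, n))
              X = sm.add_constant(x)
              for k, resp in enumerate((y, y_obs)):
                  hits[k] += sm.Logit(resp, X).fit(disp=0).pvalues[1] < alpha
          return hits / reps  # [power with true y, power with misclassified y]

      print(power_sim())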

  7. Heterogenic Solid Biofuel Sampling Methodology and Uncertainty Associated with Prompt Analysis

    PubMed Central

    Pazó, Jose A.; Granada, Enrique; Saavedra, Ángeles; Patiño, David; Collazo, Joaquín

    2010-01-01

    Accurate determination of the properties of biomass is of particular interest in studies on biomass combustion or cofiring. The aim of this paper is to develop a methodology for prompt analysis of heterogeneous solid fuels with an acceptable degree of accuracy. Special care must be taken with the sampling procedure to achieve an acceptable degree of error and low statistical uncertainty. A sampling and error determination methodology for prompt analysis is presented and validated. Two approaches for the propagation of errors are also given and some comparisons are made in order to determine which may be better in this context. Results show in general low, acceptable levels of uncertainty, demonstrating that the samples obtained in the process are representative of the overall fuel composition. PMID:20559506
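
    The abstract mentions two approaches for propagating errors without naming them; the sketch below contrasts the two generic options usually compared in this setting, first-order (quadrature) propagation and Monte Carlo, on a hypothetical derived quantity.

      import numpy as np

      # Derived quantity q = c * m, e.g., an energy content from a measured
      # calorific value c and mass fraction m (illustrative numbers only; the
      # paper's own two approaches are not spelled out in the abstract).
      c, u_c = 18.5, 0.4    # MJ/kg and its standard uncertainty
      m, u_m = 0.92, 0.03   # mass fraction and its standard uncertainty

      # Approach 1: first-order propagation (relative variances in quadrature).
      q = c * m
      u_q = q * np.sqrt((u_c / c) ** 2 + (u_m / m) ** 2)

      # Approach 2: Monte Carlo propagation.
      rng = np.random.default_rng(6)
      u_mc = (rng.normal(c, u_c, 100_000) * rng.normal(m, u_m, 100_000)).std()

      print(f"analytic: {u_q:.3f}  monte carlo: {u_mc:.3f}")  # should roughly agree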

  8. Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

    PubMed Central

    Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

    2014-01-01

    A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941

  9. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components, and the reconstruction back to time series, provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the tool ...
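
    The goodness-of-fit measures named above have standard textbook forms, sketched below for reference (MPESA's exact implementations may differ).

      import numpy as np

      def performance_metrics(obs, sim):
          # Standard definitions: RMSE, Nash-Sutcliffe efficiency (NSE), and
          # the coefficient of determination (R^2).
          obs, sim = np.asarray(obs, float), np.asarray(sim, float)
          err = sim - obs
          rmse = np.sqrt(np.mean(err ** 2))
          nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
          r2 = np.corrcoef(obs, sim)[0, 1] ** 2
          return {"RMSE": rmse, "NSE": nse, "R2": r2}

      obs = [3.1, 4.5, 2.2, 5.0, 4.1, 3.3]   # hypothetical observed series
      sim = [2.8, 4.9, 2.5, 4.6, 4.4, 3.0]   # hypothetical simulated series
      print(performance_metrics(obs, sim))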

  10. Ranking and validation of spallation models for isotopic production cross sections of heavy residua

    NASA Astrophysics Data System (ADS)

    Sharma, Sushil K.; Kamys, Bogusław; Goldenbaum, Frank; Filges, Detlef

    2017-07-01

    The production cross sections of isotopically identified residual nuclei from spallation reactions induced by 136Xe projectiles at 500A MeV on a hydrogen target were analyzed in a two-step model. The first stage of the reaction was described by the INCL4.6 model of an intranuclear cascade of nucleon-nucleon and pion-nucleon collisions, whereas the second stage was analyzed by means of four different models: ABLA07, GEM2, GEMINI++, and SMM. The quality of the data description was judged quantitatively using two statistical deviation factors: the H-factor and the M-factor. It was found that the present analysis leads to a different ranking of models compared to that obtained from qualitative inspection of the data reproduction. The disagreement was caused by the sensitivity of the deviation factors to large statistical errors present in some of the data. A new deviation factor, the A-factor, was proposed that is not sensitive to the statistical errors of the cross sections. The quantitative ranking of models performed using the A-factor agreed well with the qualitative analysis of the data. It was concluded that using deviation factors weighted by statistical errors may lead to erroneous conclusions when the data cover a large range of values. The quality of data reproduction by the theoretical models is discussed. Some systematic deviations of the theoretical predictions from the experimental results are observed.
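
    The abstract does not define the factors, so the forms below are assumed conventions, marked as such in the comments; they only illustrate the structural point that an error-weighted factor can be dominated or masked by points with unusual quoted uncertainties, while a log-ratio factor ignores the uncertainties entirely.

      import numpy as np

      def h_factor(exp, calc, sigma_exp):
          # One common error-weighted convention (an assumption here): the RMS
          # of deviations scaled by the experimental uncertainties.
          return np.sqrt(np.mean(((calc - exp) / sigma_exp) ** 2))

      def log_ratio_factor(exp, calc):
          # Error-independent mean |log10 ratio|, in the spirit of the A-factor
          # the paper proposes (its exact definition is not given here).
          return np.mean(np.abs(np.log10(calc / exp)))

      exp = np.array([10.0, 5.0, 1.0, 0.10])
      calc = np.array([12.0, 4.0, 1.3, 0.50])   # last point is 5x off
      print(h_factor(exp, calc, np.array([0.5, 0.5, 0.1, 0.05])))  # tight errors
      print(h_factor(exp, calc, np.array([0.5, 0.5, 0.1, 5.00])))  # a huge quoted
      # error on the most discrepant point shrinks H and masks the failure
      print(log_ratio_factor(exp, calc))        # unaffected by the quoted errors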

  11. Estimations of ABL fluxes and other turbulence parameters from Doppler lidar data

    NASA Technical Reports Server (NTRS)

    Gal-Chen, Tzvi; Xu, Mei; Eberhard, Wynn

    1989-01-01

    Techniques for extracting boundary layer parameters from measurements of a short-pulse CO2 Doppler lidar are described. The measurements are those collected during the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE). By continuously operating the lidar for about an hour, stable statistics of the radial velocities can be extracted. Assuming that the turbulence is horizontally homogeneous, the mean wind, its standard deviations, and the momentum fluxes were estimated. Spectral analysis of the radial velocities is also performed, from which, by examining the amplitude of the power spectrum in the inertial range, the kinetic energy dissipation was deduced. Finally, using the statistical form of the Navier-Stokes equations, the surface heat flux is derived as the residual balance between the vertical gradient of the third moment of the vertical velocity and the kinetic energy dissipation. Combining many measurements would normally reduce the error, provided that it is unbiased and uncorrelated. The nature of some of the algorithms, however, is such that biased and correlated errors may be generated even though the raw measurements are not. Data processing procedures were developed that eliminate bias and minimize error correlation. Once bias and error correlations are accounted for, the large sample size is shown to reduce the errors substantially. The principal features of the derived turbulence statistics for the two cases studied are presented.

  12. Performance analysis of adaptive equalization for coherent acoustic communications in the time-varying ocean environment.

    PubMed

    Preisig, James C

    2005-07-01

    Equations are derived for analyzing the performance of channel-estimate-based equalizers. The performance is characterized in terms of the mean squared soft decision error (σ_s²) of each equalizer. This error is decomposed into two components: the minimum achievable error (σ_0²) and the excess error (σ_e²). The former is the soft decision error that would be realized by the equalizer if the filter coefficient calculation were based upon perfect knowledge of the channel impulse response and the statistics of the interfering noise field. The latter is the additional soft decision error that is realized due to errors in the estimates of these channel parameters. These expressions accurately predict the equalizer errors observed in the processing of experimental data by a channel-estimate-based decision feedback equalizer (DFE) and a passive time-reversal equalizer. Further expressions are presented that allow equalizer performance to be predicted given the scattering function of the acoustic channel. The analysis using these expressions yields insights into the features of surface scattering that most significantly impact equalizer performance in shallow water environments and motivates the implementation of a DFE that is robust with respect to channel estimation errors.

  13. Measurement uncertainty and feasibility study of a flush airdata system for a hypersonic flight experiment

    NASA Technical Reports Server (NTRS)

    Whitmore, Stephen A.; Moes, Timothy R.

    1994-01-01

    Presented is a feasibility and error analysis for a hypersonic flush airdata system on a hypersonic flight experiment (HYFLITE). HYFLITE heating loads make intrusive airdata measurement impractical. Although this analysis is specifically for the HYFLITE vehicle and trajectory, the problems analyzed are generally applicable to hypersonic vehicles. A layout of the flush-port matrix is shown. Surface pressures are related to airdata parameters using a simple aerodynamic model. The model is linearized using small perturbations and inverted using nonlinear least-squares. Effects of various error sources on the overall uncertainty are evaluated using an error simulation. Error sources modeled include boundary-layer/viscous interactions, pneumatic lag, thermal transpiration in the sensor pressure tubing, misalignment in the matrix layout, thermal warping of the vehicle nose, sampling resolution, and transducer error. Using simulated pressure data as input to the estimation algorithm, effects caused by various error sources are analyzed by comparing estimator outputs with the original trajectory. To obtain ensemble averages, the simulation is run repeatedly and output statistics are compiled. Output errors resulting from the various error sources are presented as a function of Mach number. Final uncertainties with all modeled error sources included are presented as a function of Mach number.
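
    A common flush-airdata port model and its least-squares inversion, sketched with hypothetical numbers; the paper's actual model and port layout are not reproduced in this summary, so treat the pressure model below as an assumption.

      import numpy as np
      from scipy.optimize import least_squares

      # Assumed port model: p_i = qc * (cos^2(theta_i) + eps * sin^2(theta_i)) + p_inf,
      # with theta_i the local flow-incidence angle at port i. Given measured
      # port pressures, the states (qc, eps, p_inf) are recovered by nonlinear
      # least squares, mirroring the abstract's inversion step.
      theta = np.deg2rad([0, 15, 30, 45, 60])                 # port incidence angles
      truth = (12_000.0, -0.1, 26_500.0)                      # (qc, eps, p_inf), Pa

      def model(params, theta):
          qc, eps, p_inf = params
          return qc * (np.cos(theta) ** 2 + eps * np.sin(theta) ** 2) + p_inf

      rng = np.random.default_rng(7)
      p_meas = model(truth, theta) + rng.normal(0, 20.0, theta.size)  # transducer noise

      fit = least_squares(lambda b: model(b, theta) - p_meas,
                          x0=(10_000.0, 0.0, 25_000.0))
      print(fit.x)  # recovered (qc, eps, p_inf)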

  14. Response Surface Analysis of Experiments with Random Blocks

    DTIC Science & Technology

    1988-09-01

    partitioned into a lack-of-fit sum of squares, SSLOF, and a pure-error sum of squares, SSPE. The latter is obtained by pooling the pure-error sums of squares ... from the blocks. Tests concerning the polynomial effects can then proceed using SSPE as the error term in the denominators of the F test statistics. ... the center point in each of the three blocks is equal to SSPE = 2.0127 with 5 degrees of freedom. Hence, the lack-of-fit sum of squares is SSLOF ...

  15. MetaGenyo: a web tool for meta-analysis of genetic association studies.

    PubMed

    Martorell-Marugan, Jordi; Toro-Dominguez, Daniel; Alarcon-Riquelme, Marta E; Carmona-Saez, Pedro

    2017-12-16

    Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of this type of study has increased exponentially, but the results are not always reproducible due to experimental designs, low sample sizes and other methodological errors. In this field, meta-analysis techniques are becoming very popular tools to combine results across studies to increase statistical power and to resolve discrepancies in genetic association studies. A meta-analysis summarizes research findings, increases statistical power and enables the identification of genuine associations between genotypes and phenotypes. Meta-analysis techniques are increasingly used in GAS, but the number of published meta-analyses containing errors is also increasing. Although there are several software packages that implement meta-analysis, none of them are specifically designed for genetic association studies and in most cases their use requires advanced programming or scripting expertise. We have developed MetaGenyo, a web tool for meta-analysis in GAS. MetaGenyo implements a complete and comprehensive workflow that can be executed in an easy-to-use environment without programming knowledge. MetaGenyo has been developed to guide users through the main steps of a GAS meta-analysis, covering the Hardy-Weinberg test, statistical association for different genetic models, analysis of heterogeneity, testing for publication bias, subgroup analysis and robustness testing of the results. MetaGenyo is a useful tool to conduct comprehensive genetic association meta-analyses. The application is freely available at http://bioinfo.genyo.es/metagenyo/.
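
    As a reference point for the pooling step such a workflow performs, here is a generic inverse-variance fixed-effect combination of study odds ratios (hypothetical studies; this is not MetaGenyo's internal code).

      import numpy as np
      from scipy import stats

      def pool_odds_ratios(ors, ci_los, ci_his):
          # Standard inverse-variance fixed-effect pooling on the log-OR scale.
          log_or = np.log(ors)
          se = (np.log(ci_his) - np.log(ci_los)) / (2 * 1.96)  # SE backed out of 95% CIs
          w = 1.0 / se ** 2
          pooled = np.sum(w * log_or) / np.sum(w)
          pooled_se = np.sqrt(1.0 / np.sum(w))
          p = 2 * stats.norm.sf(abs(pooled / pooled_se))
          ci = np.exp([pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se])
          return np.exp(pooled), ci, p

      # Three hypothetical studies of the same variant under an allelic model.
      print(pool_odds_ratios([1.4, 1.1, 1.6], [1.00, 0.80, 1.10], [1.96, 1.51, 2.33]))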

  16. Inherent Conservatism in Deterministic Quasi-Static Structural Analysis

    NASA Technical Reports Server (NTRS)

    Verderaime, V.

    1997-01-01

    The cause of the long-suspected excessive conservatism in the prevailing structural deterministic safety factor has been identified as an inherent violation of the error propagation laws when reducing statistical data to deterministic values and then combining them algebraically through successive structural computational processes. These errors are restricted to the applied stress computations, and because mean and variations of the tolerance limit format are added, the errors are positive, serially cumulative, and excessively conservative. Reliability methods circumvent these errors and provide more efficient and uniform safe structures. The document is a tutorial on the deficiencies and nature of the current safety factor and of its improvement and transition to absolute reliability.

  17. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors.

  18. FAMA: Fast Automatic MOOG Analysis

    NASA Astrophysics Data System (ADS)

    Magrini, Laura; Randich, Sofia; Friel, Eileen; Spina, Lorenzo; Jacobson, Heather; Cantat-Gaudin, Tristan; Donati, Paolo; Baglioni, Roberto; Maiorca, Enrico; Bragaglia, Angela; Sordo, Rosanna; Vallenari, Antonella

    2014-02-01

    FAMA (Fast Automatic MOOG Analysis), written in Perl, computes the atmospheric parameters and abundances of a large number of stars using measurements of equivalent widths (EWs) automatically and independently of any subjective approach. Based on the widely used MOOG code, it simultaneously searches for three equilibria: excitation equilibrium, ionization balance, and the relationship between log n(FeI) and the reduced EWs. FAMA also evaluates the statistical errors on individual element abundances and the errors due to uncertainties in the stellar parameters. Convergence criteria are not fixed a priori but instead are based on the quality of the spectra.

  19. Catastrophic photometric redshift errors: Weak-lensing survey requirements

    DOE PAGES

    Bernstein, Gary; Huterer, Dragan

    2010-01-11

    We study the sensitivity of weak lensing surveys to the effects of catastrophic redshift errors - cases where the true redshift is misestimated by a significant amount. To compute the biases in cosmological parameters, we adopt an efficient linearized analysis where the redshift errors are directly related to shifts in the weak lensing convergence power spectra. We estimate the number N_spec of unbiased spectroscopic redshifts needed to determine the catastrophic error rate well enough that biases in cosmological parameters are below the statistical errors of weak lensing tomography. While the straightforward estimate of N_spec is ~10^6, we find that using only the photometric redshifts with z ≤ 2.5 leads to a drastic reduction in N_spec to ~30,000 while negligibly increasing statistical errors in dark energy parameters. Therefore, the size of spectroscopic survey needed to control catastrophic errors is similar to that previously deemed necessary to constrain the core of the z_s - z_p distribution. We also study the efficacy of the recent proposal to measure redshift errors by cross-correlation between the photo-z and spectroscopic samples. We find that this method requires ~10% a priori knowledge of the bias and stochasticity of the outlier population, and is also easily confounded by lensing magnification bias. In conclusion, the cross-correlation method is therefore unlikely to supplant the need for a complete spectroscopic redshift survey of the source population.

  20. A simulator for evaluating methods for the detection of lesion-deficit associations

    NASA Technical Reports Server (NTRS)

    Megalooikonomou, V.; Davatzikos, C.; Herskovits, E. H.

    2000-01-01

    Although much has been learned about the functional organization of the human brain through lesion-deficit analysis, the variety of statistical and image-processing methods developed for this purpose precludes a closed-form analysis of the statistical power of these systems. Therefore, we developed a lesion-deficit simulator (LDS), which generates artificial subjects, each of which consists of a set of functional deficits, and a brain image with lesions; the deficits and lesions conform to predefined distributions. We used probability distributions to model the number, sizes, and spatial distribution of lesions, to model the structure-function associations, and to model registration error. We used the LDS to evaluate, as examples, the effects of the complexities and strengths of lesion-deficit associations, and of registration error, on the power of lesion-deficit analysis. We measured the numbers of recovered associations from these simulated data, as a function of the number of subjects analyzed, the strengths and number of associations in the statistical model, the number of structures associated with a particular function, and the prior probabilities of structures being abnormal. The number of subjects required to recover the simulated lesion-deficit associations was found to have an inverse relationship to the strength of associations, and to the smallest probability in the structure-function model. The number of structures associated with a particular function (i.e., the complexity of associations) had a much greater effect on the performance of the analysis method than did the total number of associations. We also found that registration error of 5 mm or less reduces the number of associations discovered by approximately 13% compared to perfect registration. The LDS provides a flexible framework for evaluating many aspects of lesion-deficit analysis.

  1. Statistical Analysis of Big Data on Pharmacogenomics

    PubMed Central

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905

  2. Experiments with a three-dimensional statistical objective analysis scheme using FGGE data

    NASA Technical Reports Server (NTRS)

    Baker, Wayman E.; Bloom, Stephen C.; Woollen, John S.; Nestler, Mark S.; Brin, Eugenia

    1987-01-01

    A three-dimensional (3D), multivariate, statistical objective analysis scheme (referred to as optimum interpolation or OI) has been developed for use in numerical weather prediction studies with the FGGE data. Some novel aspects of the present scheme include: (1) a multivariate surface analysis over the oceans, which employs an Ekman balance instead of the usual geostrophic relationship, to model the pressure-wind error cross correlations, and (2) the capability to use an error correlation function which is geographically dependent. A series of 4-day data assimilation experiments are conducted to examine the importance of some of the key features of the OI in terms of their effects on forecast skill, as well as to compare the forecast skill using the OI with that utilizing a successive correction method (SCM) of analysis developed earlier. For the three cases examined, the forecast skill is found to be rather insensitive to varying the error correlation function geographically. However, significant differences are noted between forecasts from a two-dimensional (2D) version of the OI and those from the 3D OI, with the 3D OI forecasts exhibiting better forecast skill. The 3D OI forecasts are also more accurate than those from the SCM initial conditions. The 3D OI with the multivariate oceanic surface analysis was found to produce forecasts which were slightly more accurate, on the average, than a univariate version.
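
    The analysis update at the heart of any OI scheme can be sketched generically as x_a = x_b + B H^T (H B H^T + R)^(-1) (y - H x_b); the 1D grid, Gaussian background-error correlation and observation values below are illustrative stand-ins, not the FGGE system.

```python
import numpy as np

def oi_analysis(xb, B, H, R, y):
    """Optimum interpolation / statistical objective analysis update."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # weight (gain) matrix
    return xb + K @ (y - H @ xb)

# 1D grid with a Gaussian background-error correlation (length scale L)
n, L = 50, 5.0
grid = np.arange(n, dtype=float)
B = 1.0 * np.exp(-((grid[:, None] - grid[None, :]) ** 2) / (2 * L**2))

# Three observations at grid points 10, 25, 40 with error variance 0.25
obs_idx = [10, 25, 40]
H = np.zeros((3, n)); H[range(3), obs_idx] = 1.0
R = 0.25 * np.eye(3)

xb = np.zeros(n)                    # background field
y = np.array([1.0, -0.5, 0.8])      # observed increments
xa = oi_analysis(xb, B, H, R, y)
print(xa[obs_idx])                  # analysis drawn toward the observations
```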

  3. Algorithm for computing descriptive statistics for very large data sets and the exa-scale era

    NASA Astrophysics Data System (ADS)

    Beekman, Izaak

    2017-11-01

    An algorithm for Single-point, Parallel, Online, Converging Statistics (SPOCS) is presented. It is suited for in situ analysis that traditionally would be relegated to post-processing, and can be used to monitor statistical convergence and estimate the error/residual in the quantity, which is also useful for uncertainty quantification. Today, data may be generated at an overwhelming rate by numerical simulations and proliferating sensing apparatuses in experiments and engineering applications. Monitoring descriptive statistics in real time lets costly computations and experiments be gracefully aborted if an error has occurred, and monitoring the level of statistical convergence allows them to be run for the shortest amount of time required to obtain good results. This algorithm extends work by Pébay (Sandia Report SAND2008-6212). Pébay's algorithms are recast into a converging delta formulation with provably favorable properties. The mean, variance, covariances and arbitrary higher-order statistical moments are computed in one pass. The algorithm is tested using Sillero, Jiménez, & Moser's (2013, 2014) publicly available UPM high Reynolds number turbulent boundary layer data set, demonstrating numerical robustness, efficiency and other favorable properties.
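
    The single-pass update that Pébay's formulas generalize to covariances and higher moments reduces, for mean and variance, to a Welford-type recurrence; a minimal sketch (not the SPOCS code itself) follows.

```python
import random

class OnlineStats:
    """Single-pass (online) mean and variance via Welford-type updates,
    the building block that Pebay's formulas generalize to higher moments."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean          # the "delta" shrinks as the statistics converge
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")

s = OnlineStats()
for _ in range(100000):
    s.update(random.gauss(3.0, 2.0))
print(s.mean, s.variance)              # ~3.0 and ~4.0, computed in one pass
```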

  4. Statistical Analysis of speckle noise reduction techniques for echocardiographic Images

    NASA Astrophysics Data System (ADS)

    Saini, Kalpana; Dewal, M. L.; Rohit, Manojkumar

    2011-12-01

    Echocardiography is a safe, easy and fast technology for diagnosing cardiac diseases. As with other ultrasound images, these images contain speckle noise. In some cases this speckle noise is useful, such as in motion detection, but in general noise removal is required for better analysis of the image and proper diagnosis. Different adaptive and anisotropic filters are included in the statistical analysis. Statistical parameters such as Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Root Mean Square Error (RMSE) are calculated for performance measurement. Another important aspect is that blurring may occur during speckle noise removal, so it is preferred that the filter be able to preserve and enhance edges while removing noise.
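
    The three quality metrics named above have standard definitions, sketched below on a synthetic image with multiplicative (speckle-like) noise; the image and the gamma-distributed speckle model are illustrative assumptions.

```python
import numpy as np

def rmse(ref, img):
    return np.sqrt(np.mean((ref.astype(float) - img.astype(float)) ** 2))

def snr_db(ref, img):
    noise = ref.astype(float) - img.astype(float)
    return 10 * np.log10(np.sum(ref.astype(float) ** 2) / np.sum(noise ** 2))

def psnr_db(ref, img, peak=255.0):
    return 20 * np.log10(peak / rmse(ref, img))

rng = np.random.default_rng(1)
clean = rng.uniform(50, 200, (128, 128))
speckled = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)  # unit-mean speckle
print(f"RMSE={rmse(clean, speckled):.2f}, SNR={snr_db(clean, speckled):.1f} dB, "
      f"PSNR={psnr_db(clean, speckled):.1f} dB")
```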

  5. Multipath induced errors in meteorological Doppler/interferometer location systems

    NASA Technical Reports Server (NTRS)

    Wallace, R. G.

    1984-01-01

    One application of an RF interferometer aboard a low-orbiting spacecraft to determine the location of ground-based transmitters is in tracking high-altitude balloons for meteorological studies. A source of error in this application is reflection of the signal from the sea surface. Through propagation and signal analysis, the magnitude of the reflection-induced error in both Doppler frequency measurements and interferometer phase measurements was estimated. The theory of diffuse scattering from random surfaces was applied to obtain the power spectral density of the reflected signal. The processing of the combined direct and reflected signals was then analyzed to find the statistics of the measurement error. It was found that the error varies greatly during the satellite overpass and attains its maximum value at closest approach. The maximum values of interferometer phase error and Doppler frequency error found for the system configuration considered were comparable to thermal noise-induced error.

  6. Satellite Sampling and Retrieval Errors in Regional Monthly Rain Estimates from TMI AMSR-E, SSM/I, AMSU-B and the TRMM PR

    NASA Technical Reports Server (NTRS)

    Fisher, Brad; Wolff, David B.

    2010-01-01

    Passive and active microwave rain sensors onboard earth-orbiting satellites estimate monthly rainfall from the instantaneous rain statistics collected during satellite overpasses. It is well known that climate-scale rain estimates from meteorological satellites incur sampling errors resulting from the process of discrete temporal sampling and statistical averaging. Sampling and retrieval errors ultimately become entangled in the estimation of the mean monthly rain rate. The sampling component of the error budget effectively introduces statistical noise into climate-scale rain estimates that obscures the error component associated with the instantaneous rain retrieval. Estimating the accuracy of the retrievals on monthly scales therefore necessitates a decomposition of the total error budget into sampling and retrieval error quantities. This paper presents results from a statistical evaluation of the sampling and retrieval errors for five different space-borne rain sensors on board nine orbiting satellites. Using an error decomposition methodology developed by one of the authors, sampling and retrieval errors were estimated at 0.25° resolution within 150 km of ground-based weather radars located at Kwajalein, Marshall Islands and Melbourne, Florida. Error and bias statistics were calculated according to the land, ocean and coast classifications of the surface terrain mask developed for the Goddard Profiling (GPROF) rain algorithm. Variations in the comparative error statistics are attributed to various factors related to differences in the swath geometry of each rain sensor, the orbital and instrument characteristics of the satellite and the regional climatology. The most significant result of this study is that each of the satellites incurred negative long-term oceanic retrieval biases of 10% to 30%.

  7. Statistical Quality Control of Moisture Data in GEOS DAS

    NASA Technical Reports Server (NTRS)

    Dee, D. P.; Rukhovets, L.; Todling, R.

    1999-01-01

    A new statistical quality control algorithm was recently implemented in the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The final step in the algorithm consists of an adaptive buddy check that either accepts or rejects outlier observations based on a local statistical analysis of nearby data. A basic assumption in any such test is that the observed field is spatially coherent, in the sense that nearby data can be expected to confirm each other. However, the buddy check resulted in excessive rejection of moisture data, especially during the Northern Hemisphere summer. The analysis moisture variable in GEOS DAS is water vapor mixing ratio. Observational evidence shows that the distribution of mixing ratio errors is far from normal. Furthermore, spatial correlations among mixing ratio errors are highly anisotropic and difficult to identify. Both factors contribute to the poor performance of the statistical quality control algorithm. To alleviate the problem, we applied the buddy check to relative humidity data instead. This variable explicitly depends on temperature and therefore exhibits a much greater spatial coherence. As a result, reject rates of moisture data are much more reasonable and homogeneous in time and space.
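
    A buddy check can be sketched schematically as follows: each observation is compared with the mean of its neighbors and rejected when it departs by more than a multiple of the local spread. The neighborhood radius, threshold, and synthetic humidity values are assumptions; this is not the GEOS DAS implementation.

```python
import numpy as np

def buddy_check(values, coords, radius=1.0, k=3.0):
    """Flag observations that disagree with their neighbors ("buddies").
    An observation is rejected when it departs from the local buddy mean
    by more than k times the local spread. Schematic, not the GEOS DAS code."""
    values, coords = np.asarray(values, float), np.asarray(coords, float)
    keep = np.ones(len(values), dtype=bool)
    for i in range(len(values)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        buddies = (d > 0) & (d <= radius)
        if buddies.sum() < 2:
            continue                                  # too few buddies to judge
        mu, sd = values[buddies].mean(), values[buddies].std(ddof=1)
        if abs(values[i] - mu) > k * max(sd, 1e-6):
            keep[i] = False
    return keep

rng = np.random.default_rng(2)
coords = rng.uniform(0, 5, size=(50, 2))
rh = 60 + 5 * rng.standard_normal(50)                 # relative-humidity-like field
rh[7] = 5.0                                           # one gross outlier
print("rejected:", np.where(~buddy_check(rh, coords))[0])
```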

  8. Error modeling and sensitivity analysis of a parallel robot with SCARA (selective compliance assembly robot arm) motions

    NASA Astrophysics Data System (ADS)

    Chen, Yuzhen; Xie, Fugui; Liu, Xinjun; Zhou, Yanhua

    2014-07-01

    Parallel robots with SCARA (selective compliance assembly robot arm) motions are widely used in the field of high-speed pick-and-place manipulation. Error modeling for these robots generally simplifies the parallelogram structures included in the robots to a single link. As such an error model fails to reflect the error features of the parallelogram structures, the effectiveness of accuracy design and kinematic calibration based on the error model is undermined. An error modeling methodology is proposed to establish an error model of parallel robots with parallelogram structures. The error model can embody the geometric errors of all joints, including the joints of the parallelogram structures, and thus captures more exhaustively the factors that reduce the accuracy of the robot. Based on the error model and sensitivity indices defined in the statistical sense, a sensitivity analysis is carried out. Accordingly, atlases are depicted to express each geometric error's influence on the moving platform's pose errors. From these atlases, the geometric errors that have greater impact on the accuracy of the moving platform are identified, and sensitive areas where the pose errors of the moving platform are extremely sensitive to the geometric errors are also identified. By taking into account error factors that are generally neglected in existing modeling methods, the proposed modeling method can thoroughly disclose the process of error transmission and enhance the efficacy of accuracy design and calibration.

  9. Uncertainty Analysis and Order-by-Order Optimization of Chiral Nuclear Interactions

    DOE PAGES

    Carlsson, Boris; Forssen, Christian; Fahlin Strömberg, D.; ...

    2016-02-24

    Chiral effective field theory (χEFT) provides a systematic approach to describe low-energy nuclear forces. Moreover, χEFT is able to provide well-founded estimates of statistical and systematic uncertainties, although this unique advantage has not yet been fully exploited. We fill this gap by performing an optimization and statistical analysis of all the low-energy constants (LECs) up to next-to-next-to-leading order. Our optimization protocol corresponds to a simultaneous fit to scattering and bound-state observables in the pion-nucleon, nucleon-nucleon, and few-nucleon sectors, thereby utilizing the full model capabilities of χEFT. Finally, we study the effect on other observables by demonstrating forward-error-propagation methods that can easily be adopted by future works. We employ mathematical optimization and implement automatic differentiation to attain efficient and machine-precise first- and second-order derivatives of the objective function with respect to the LECs. This is also vital for the regression analysis. We use power-counting arguments to estimate the systematic uncertainty that is inherent to χEFT, and we construct chiral interactions at different orders with quantified uncertainties. Statistical error propagation is compared with Monte Carlo sampling, showing that statistical errors are in general small compared to systematic ones. In conclusion, we find that a simultaneous fit to different sets of data is critical to (i) identify the optimal set of LECs, (ii) capture all relevant correlations, (iii) reduce the statistical uncertainty, and (iv) attain order-by-order convergence in χEFT. Furthermore, certain systematic uncertainties in the few-nucleon sector are shown to get substantially magnified in the many-body sector, in particular when varying the cutoff in the chiral potentials. The methodology and results presented in this paper open a new frontier for uncertainty quantification in ab initio nuclear theory.
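
    The forward error propagation mentioned, and its Monte Carlo cross-check, can be sketched generically: linearize the observable about the optimized LECs and form the sandwich J Cov(LEC) J^T. The two-parameter observable and covariance below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical observable depending nonlinearly on two low-energy constants (LECs)
def observable(c):
    return np.array([c[0] + 0.5 * c[1] ** 2, np.sin(c[0]) * c[1]])

c0 = np.array([1.2, -0.7])                       # optimized LECs (illustrative)
cov_c = np.array([[0.04, 0.01], [0.01, 0.09]])   # statistical covariance of the LECs

# Linear (sandwich) propagation: Cov(O) ~= J Cov(c) J^T with J from finite differences
eps = 1e-6
J = np.column_stack([(observable(c0 + eps * e) - observable(c0 - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
cov_lin = J @ cov_c @ J.T

# Monte Carlo cross-check of the linear propagation
samples = rng.multivariate_normal(c0, cov_c, size=20000)
cov_mc = np.cov(np.array([observable(c) for c in samples]).T)
print(np.round(cov_lin, 4)); print(np.round(cov_mc, 4))
```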

  10. Alaska national hydrography dataset positional accuracy assessment study

    USGS Publications Warehouse

    Arundel, Samantha; Yamamoto, Kristina H.; Constance, Eric; Mantey, Kim; Vinyard-Houx, Jeremy

    2013-01-01

    Initial visual assessments showed a wide range in the quality of fit between features in the NHD and these new image sources. No statistical analysis has been performed to actually quantify accuracy. Determining absolute accuracy is cost prohibitive (it requires collecting independent, well-defined test points), but quantitative analysis of relative positional error is feasible.

  11. Scripts for TRUMP data analyses. Part II (HLA-related data): statistical analyses specific for hematopoietic stem cell transplantation.

    PubMed

    Kanda, Junya

    2016-01-01

    The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.

  12. Linear error analysis of slope-area discharge determinations

    USGS Publications Warehouse

    Kirby, W.H.

    1987-01-01

    The slope-area method can be used to calculate peak flood discharges when current-meter measurements are not possible. This calculation depends on several quantities, such as water-surface fall, that are subject to large measurement errors. Other critical quantities, such as Manning's n, are not even amenable to direct measurement but can only be estimated. Finally, scour and fill may cause gross discrepancies between the observed condition of the channel and the hydraulic conditions during the flood peak. The effects of these potential errors on the accuracy of the computed discharge have been estimated by statistical error analysis using a Taylor-series approximation of the discharge formula and the well-known formula for the variance of a sum of correlated random variates. The resultant error variance of the computed discharge is a weighted sum of covariances of the various observational errors. The weights depend on the hydraulic and geometric configuration of the channel. The mathematical analysis confirms the rule of thumb that relative errors in computed discharge increase rapidly when velocity heads exceed the water-surface fall, when the flow field is expanding and when lateral velocity variation (alpha) is large. It also confirms the extreme importance of accurately assessing the presence of scour or fill. © 1987.
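
    A minimal sketch of this Taylor-series propagation, applied to a Manning-type slope-area formula with hypothetical input values and standard errors (and, for simplicity, independent errors, so the covariance matrix is diagonal):

```python
import numpy as np

# First-order (Taylor-series) error propagation for a Manning-type discharge formula,
# Q = (1.49/n) * A * R**(2/3) * S**(1/2); inputs and errors are illustrative only.
def Q(x):
    n, A, R, S = x
    return (1.49 / n) * A * R ** (2.0 / 3.0) * np.sqrt(S)

x0 = np.array([0.035, 500.0, 8.0, 0.002])        # n, area (ft^2), hyd. radius (ft), slope
sd = np.array([0.005, 25.0, 0.4, 0.0004])        # standard errors of the inputs

# Numerical gradient of Q at x0
eps = 1e-6 * np.abs(x0)
grad = np.array([(Q(x0 + dx) - Q(x0 - dx)) / (2 * e)
                 for e, dx in zip(eps, np.diag(eps))])

# Variance of a sum of correlated variates: grad^T Cov grad (here Cov is diagonal;
# off-diagonal terms would add the covariance contributions)
var_Q = grad @ np.diag(sd ** 2) @ grad
print(f"Q = {Q(x0):.0f} cfs, relative standard error = {np.sqrt(var_Q)/Q(x0):.1%}")
```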

  13. Performance of Modified Test Statistics in Covariance and Correlation Structure Analysis under Conditions of Multivariate Nonnormality.

    ERIC Educational Resources Information Center

    Fouladi, Rachel T.

    2000-01-01

    Provides an overview of standard and modified normal theory and asymptotically distribution-free covariance and correlation structure analysis techniques and details Monte Carlo simulation results on Type I and Type II error control. Demonstrates through the simulation that robustness and nonrobustness of structure analysis techniques vary as a…

  14. Soil moisture assimilation using a modified ensemble transform Kalman filter with water balance constraint

    NASA Astrophysics Data System (ADS)

    Wu, Guocan; Zheng, Xiaogu; Dan, Bo

    2016-04-01

    Shallow soil moisture observations are assimilated into the Common Land Model (CoLM) to estimate the soil moisture in different layers. The forecast error is inflated to improve the accuracy of the analysis state, and a water balance constraint is adopted to reduce the water budget residual in the assimilation procedure. The experimental results illustrate that adaptive forecast error inflation can reduce the analysis error, while the proper inflation layer can be selected based on the -2 log-likelihood function of the innovation statistic. The water balance constraint substantially reduces the water budget residual at a small cost in assimilation accuracy. The assimilation scheme can potentially be applied to assimilate remote sensing data.

  15. Measurement error in time-series analysis: a simulation study comparing modelled and monitored data.

    PubMed

    Butland, Barbara K; Armstrong, Ben; Atkinson, Richard W; Wilkinson, Paul; Heal, Mathew R; Doherty, Ruth M; Vieno, Massimo

    2013-11-13

    Assessing health effects from background exposure to air pollution is often hampered by the sparseness of pollution monitoring networks. However, regional atmospheric chemistry-transport models (CTMs) can provide pollution data with national coverage at fine geographical and temporal resolution. We used statistical simulation to compare the impact on epidemiological time-series analysis of additive measurement error in sparse monitor data as opposed to geographically and temporally complete model data. Statistical simulations were based on a theoretical area of 4 regions each consisting of twenty-five 5 km × 5 km grid-squares. In the context of a 3-year Poisson regression time-series analysis of the association between mortality and a single pollutant, we compared the error impact of using daily grid-specific model data as opposed to daily regional average monitor data. We investigated how this comparison was affected if we changed the number of grids per region containing a monitor. To inform simulations, estimates (e.g. of pollutant means) were obtained from observed monitor data for 2003-2006 for national network sites across the UK and corresponding model data that were generated by the EMEP-WRF CTM. Average within-site correlations between observed monitor and model data were 0.73 and 0.76 for rural and urban daily maximum 8-hour ozone respectively, and 0.67 and 0.61 for rural and urban log_e(daily 1-hour maximum NO2). When regional averages were based on 5 or 10 monitors per region, health effect estimates exhibited little bias. However, with only 1 monitor per region, the regression coefficient in our time-series analysis was attenuated by an estimated 6% for urban background ozone, 13% for rural ozone, 29% for urban background log_e(NO2) and 38% for rural log_e(NO2). For grid-specific model data the corresponding figures were 19%, 22%, 54% and 44% respectively, i.e. similar for rural log_e(NO2) but more marked for urban log_e(NO2). Even if correlations between model and monitor data appear reasonably strong, additive classical measurement error in model data may lead to appreciable bias in health effect estimates. As process-based air pollution models become more widely used in epidemiological time-series analysis, assessments of error impact that include statistical simulation may be useful.
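
    The attenuation mechanism at work here can be reproduced with a toy simulation. The sketch below uses a simple linear outcome rather than the paper's Poisson time-series regression, but shows the same classical-error attenuation of the exposure coefficient toward zero; all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1095                                     # roughly 3 years of daily data

x_true = rng.normal(40.0, 10.0, n)           # true daily pollutant level
beta = 0.02
y = beta * x_true + rng.normal(0.0, 1.0, n)  # linear health-outcome proxy

for sd_u in (0.0, 5.0, 10.0):                # increasing classical measurement error
    x_obs = x_true + rng.normal(0.0, sd_u, n)
    b_hat = np.polyfit(x_obs, y, 1)[0]
    lam = 10.0**2 / (10.0**2 + sd_u**2)      # theoretical attenuation factor
    print(f"error sd={sd_u:4.1f}: beta_hat={b_hat:.4f} (expected ~{beta*lam:.4f})")
```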

  16. Attitudes of Mashhad Public Hospital's Nurses and Midwives toward the Causes and Rates of Medical Errors Reporting.

    PubMed

    Mobarakabadi, Sedigheh Sedigh; Ebrahimipour, Hosein; Najar, Ali Vafaie; Janghorban, Roksana; Azarkish, Fatemeh

    2017-03-01

    Patient safety is one of the main objectives in healthcare services; however, medical errors are a prevalent potential occurrence for patients in treatment systems. Medical errors lead to an increase in the mortality rate of patients and to challenges such as prolonged inpatient periods in hospitals and increased costs. Controlling medical errors is very important, because these errors, besides being costly, threaten patient safety. The objective was to evaluate the attitudes of nurses and midwives toward the causes and rates of medical errors reporting. It was a cross-sectional observational study. The study population was 140 midwives and nurses employed in Mashhad Public Hospitals. Data collection was done through the Goldstone (2001) revised questionnaire. SPSS 11.5 software was used for data analysis, applying both descriptive and inferential statistics. Descriptive statistics (mean, standard deviation and relative frequency distribution) were used to summarize the data, and the results were presented as tables and charts. The chi-square test was used for the inferential analysis of the data. Most of the midwives and nurses (39.4%) were in the age range of 25 to 34 years, and the lowest percentage (2.2%) were in the age range of 55-59 years. The highest average of medical errors was related to employees with three to four years of work experience, while the lowest average was related to those with one to two years of work experience. The highest average of medical errors occurred during the evening shift, while the lowest occurred during the night shift. Three main causes of medical errors were identified: illegible physician prescription orders, similarity of names among different drugs, and nurse fatigue. The most important causes of medical errors from the viewpoints of nurses and midwives were illegible physician's orders, drug name similarity with other drugs, nurse fatigue and damaged labels or packaging of drugs, respectively. Head nurse feedback, peer feedback, and fear of punishment or job loss were considered reasons for under-reporting of medical errors. This research demonstrates the need for greater attention to be paid to the causes of medical errors.

  17. [Errors in laboratory daily practice].

    PubMed

    Larrose, C; Le Carrer, D

    2007-01-01

    Legislation set by the GBEA (Guide de bonne exécution des analyses) requires that, before performing analysis, laboratory directors check both the nature of the samples and the patients' identity. The data processing of requisition forms, which identifies key errors, was established in 2000 and in 2002 by the specialized biochemistry laboratory, with the contribution of the reception centre for biological samples. The laboratories follow strict criteria of acceptability as a starting point for the reception, and then check requisition forms and biological samples. All errors are logged into the laboratory database, and analysis reports are sent to the care unit specifying the problems and the consequences they have on the analysis. The data are then assessed by the laboratory directors to produce monthly or annual statistical reports. These indicate the number of errors, which are then indexed to patient files to reveal the specific problem areas, thereby allowing the laboratory directors to train the nurses and enable corrective action.

  18. Investigation of Error Patterns in Geographical Databases

    NASA Technical Reports Server (NTRS)

    Dryer, David; Jacobs, Derya A.; Karayaz, Gamze; Gronbech, Chris; Jones, Denise R. (Technical Monitor)

    2002-01-01

    The objective of the research conducted in this project is to develop a methodology to investigate the accuracy of Airport Safety Modeling Data (ASMD) using statistical, visualization, and Artificial Neural Network (ANN) techniques. Such a methodology can contribute to answering the following research questions: Over a representative sampling of ASMD databases, can statistical error analysis techniques be accurately learned and replicated by ANN modeling techniques? This representative ASMD sample should include numerous airports and a variety of terrain characterizations. Is it possible to identify and automate the recognition of patterns of error related to geographical features? Do such patterns of error relate to specific geographical features, such as elevation or terrain slope? Is it possible to combine the errors in small regions into an error prediction for a larger region? What are the data density reduction implications of this work? ASMD may be used as the source of terrain data for a synthetic vision system to be used in the cockpit of aircraft when visual reference to ground features is not possible during conditions of marginal weather or reduced visibility. In this research, United States Geological Survey (USGS) digital elevation model (DEM) data have been selected as the benchmark. Artificial Neural Networks (ANNs) have been used and tested as alternative methods in place of statistical methods in similar problems. They often perform better in pattern recognition, prediction, classification, and categorization problems. Many studies show that when the data are complex and noisy, the accuracy of ANN models is generally higher than that of comparable traditional methods.

  19. Cost-Effectiveness Analysis of an Automated Medication System Implemented in a Danish Hospital Setting.

    PubMed

    Risør, Bettina Wulff; Lisby, Marianne; Sørensen, Jan

    To evaluate the cost-effectiveness of an automated medication system (AMS) implemented in a Danish hospital setting. An economic evaluation was performed alongside a controlled before-and-after effectiveness study with one control ward and one intervention ward. The primary outcome measure was the number of errors in the medication administration process observed prospectively before and after implementation. To determine the difference in proportion of errors after implementation of the AMS, logistic regression was applied with the presence of error(s) as the dependent variable. Time, group, and interaction between time and group were the independent variables. The cost analysis used the hospital perspective with a short-term incremental costing approach. The total 6-month costs with and without the AMS were calculated as well as the incremental costs. The number of avoided administration errors was related to the incremental costs to obtain the cost-effectiveness ratio expressed as the cost per avoided administration error. The AMS resulted in a statistically significant reduction in the proportion of errors in the intervention ward compared with the control ward. The cost analysis showed that the AMS increased the ward's 6-month cost by €16,843. The cost-effectiveness ratio was estimated at €2.01 per avoided administration error, €2.91 per avoided procedural error, and €19.38 per avoided clinical error. The AMS was effective in reducing errors in the medication administration process at a higher overall cost. The cost-effectiveness analysis showed that the AMS was associated with affordable cost-effectiveness rates. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  20. [Practical aspects regarding sample size in clinical research].

    PubMed

    Vega Ramos, B; Peraza Yanes, O; Herrera Correa, G; Saldívar Toraya, S

    1996-01-01

    Knowledge of the right sample size lets us judge whether the results published in medical papers had a suitable design and a proper conclusion according to the statistical analysis. To estimate the sample size we must consider the type I error, type II error, variance, the size of the effect, and the significance and power of the test. To decide which mathematical formula will be used, we must define what kind of study we have: a prevalence study, a study of mean values, or a comparative one. In this paper we explain some basic topics of statistics and describe four simple examples of estimation of sample size.

  1. Statistical methodology for estimating the mean difference in a meta-analysis without study-specific variance information.

    PubMed

    Sangnawakij, Patarawan; Böhning, Dankmar; Adams, Stephen; Stanton, Michael; Holling, Heinz

    2017-04-30

    Statistical inference for analyzing the results from several independent studies on the same quantity of interest has been investigated frequently in recent decades. Typically, any meta-analytic inference requires that the quantity of interest is available from each study together with an estimate of its variability. The current work is motivated by a meta-analysis on comparing two treatments (thoracoscopic and open) of congenital lung malformations in young children. Quantities of interest include continuous end-points such as length of operation or number of chest tube days. As studies only report mean values (and no standard errors or confidence intervals), the question arises how meta-analytic inference can be developed. We suggest two methods to estimate study-specific variances in such a meta-analysis, where only sample means and sample sizes are available in the treatment arms. A general likelihood ratio test is derived for testing equality of variances in two groups. By means of simulation studies, the bias and estimated standard error of the overall mean difference from both methodologies are evaluated and compared with two existing approaches: complete study analysis only and partial variance information. The performance of the test is evaluated in terms of type I error. Additionally, we illustrate these methods in the meta-analysis on comparing thoracoscopic and open surgery for congenital lung malformations and in a meta-analysis on the change in renal function after kidney donation. Copyright © 2017 John Wiley & Sons, Ltd.

  2. A shift from significance test to hypothesis test through power analysis in medical research.

    PubMed

    Singh, G

    2006-01-01

    Medical research literature until recently exhibited a substantial dominance of Fisher's significance test approach to statistical inference, which concentrates on the probability of type I error, over Neyman-Pearson's hypothesis test, which considers the probabilities of both type I and type II errors. Fisher's approach dichotomises results into significant or non-significant on the basis of a P value. The Neyman-Pearson approach speaks of acceptance or rejection of the null hypothesis. Based on the same theory, these two approaches address the same objective and conclude in their own ways. The advancement in computing techniques and the availability of statistical software have resulted in the increasing application of power calculations in medical research, and thereby in reporting the results of significance tests in the light of the power of the test as well. The significance test approach, when it incorporates power analysis, contains the essence of the hypothesis test approach. It may be safely argued that the rising application of power analysis in medical research may have initiated a shift from Fisher's significance test to the Neyman-Pearson hypothesis test procedure.

  3. Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves.

    PubMed

    Ernst, Dominique; Köhler, Jürgen

    2013-01-21

    We provide experimental results on the accuracy of diffusion coefficients obtained by a mean squared displacement (MSD) analysis of single-particle trajectories. We have recorded very long trajectories comprising more than 1.5 × 10^5 data points and decomposed these long trajectories into shorter segments, providing us with ensembles of trajectories of variable lengths. This enabled a statistical analysis of the resulting MSD curves as a function of the lengths of the segments. We find that the relative error of the diffusion coefficient can be minimized by taking an optimum number of points into account for fitting the MSD curves, and that this optimum does not depend on the segment length. Yet, the magnitude of the relative error for the diffusion coefficient does, and achieving an accuracy on the order of 10% requires the recording of trajectories with about 1000 data points. Finally, we compare our results with theoretical predictions and find very good qualitative and quantitative agreement between experiment and theory.
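
    The analysis pipeline described, simulating a trajectory, forming the time-averaged MSD, and fitting only the first few lags, can be sketched as follows for two-dimensional Brownian motion (MSD = 4Dt); the diffusion coefficient, time step and number of fit points are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
D, dt, n_steps = 0.5, 0.01, 5000           # um^2/s, s, points (illustrative)

# Simulate a 2D Brownian trajectory: steps ~ N(0, 2*D*dt) per coordinate
traj = np.cumsum(rng.normal(0, np.sqrt(2 * D * dt), (n_steps, 2)), axis=0)

def msd(traj, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag."""
    return np.array([np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

# Fit only the first few MSD points; using too many lags inflates the error of D
# because long-lag MSD values are highly correlated and noisy.
n_fit = 4
lags = dt * np.arange(1, n_fit + 1)
slope = np.polyfit(lags, msd(traj, n_fit), 1)[0]
print(f"D_est = {slope / 4:.3f} (true {D})")     # MSD = 4*D*t in two dimensions
```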

  4. Sex differences in discriminative power of volleyball game-related statistics.

    PubMed

    João, Paulo Vicente; Leite, Nuno; Mesquita, Isabel; Sampaio, Jaime

    2010-12-01

    To identify sex differences in volleyball game-related statistics, the game-related statistics of several World Championships in 2007 (N=132) were analyzed using the software VIS from the International Volleyball Federation. Discriminant analysis was used to identify the game-related statistics which better discriminated performances by sex. Analysis yielded an emphasis on fault serves (SC = -.40), shot spikes (SC = .40), and reception digs (SC = .31). Specific robust numbers represent that considerable variability was evident in the game-related statistics profile, as men's volleyball games were better associated with terminal actions (errors of service), and women's volleyball games were characterized by continuous actions (in defense and attack). These differences may be related to the anthropometric and physiological differences between women and men and their influence on performance profiles.

  5. Clinical and Radiographic Evaluation of Procedural Errors during Preparation of Curved Root Canals with Hand and Rotary Instruments: A Randomized Clinical Study

    PubMed Central

    Khanna, Rajesh; Handa, Aashish; Virk, Rupam Kaur; Ghai, Deepika; Handa, Rajni Sharma; Goel, Asim

    2017-01-01

    Background: The process of cleaning and shaping the canal is not an easy goal to obtain, as canal curvature played a significant role during the instrumentation of the curved canals. Aim: The present in vivo study was conducted to evaluate procedural errors during the preparation of curved root canals using hand Nitiflex and rotary K3XF instruments. Materials and Methods: Procedural errors such as ledge formation, instrument separation, and perforation (apical, furcal, strip) were determined in sixty patients, divided into two groups. In Group I, thirty teeth in thirty patients were prepared using hand Nitiflex system, and in Group II, thirty teeth in thirty patients were prepared using K3XF rotary system. The evaluation was done clinically as well as radiographically. The results recorded from both groups were compiled and put to statistical analysis. Statistical Analysis: Chi-square test was used to compare the procedural errors (instrument separation, ledge formation, and perforation). Results: In the present study, both hand Nitiflex and rotary K3XF showed ledge formation and instrument separation. Although ledge formation and instrument separation by rotary K3XF file system was less as compared to hand Nitiflex. No perforation was seen in both the instrument groups. Conclusion: Canal curvature played a significant role during the instrumentation of the curved canals. Procedural errors such as ledge formation and instrument separation by rotary K3XF file system were less as compared to hand Nitiflex. PMID:29042727

  6. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    PubMed

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is modest. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than on a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
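
    A schematic of the proposed procedure, assuming three common two-sample statistics as the pre-specified candidates: compute the minimum p-value, then calibrate it against its own permutation distribution so that the type I error rate is controlled.

```python
import numpy as np
from scipy import stats

def min_p(x, y):
    """Smallest p-value across several candidate two-sample tests."""
    return min(stats.ttest_ind(x, y).pvalue,
               stats.mannwhitneyu(x, y, alternative="two-sided").pvalue,
               stats.ks_2samp(x, y).pvalue)

rng = np.random.default_rng(6)
x = rng.normal(0.0, 1.0, 40)               # control arm (illustrative data)
y = rng.normal(0.5, 1.5, 40)               # treatment arm

observed = min_p(x, y)
pooled = np.concatenate([x, y])
perm = []
for _ in range(2000):                      # permutation (re-randomization) distribution
    rng.shuffle(pooled)
    perm.append(min_p(pooled[:40], pooled[40:]))

# Permutation p-value: how often a re-randomized min-p beats the observed one
p_value = (1 + np.sum(np.array(perm) <= observed)) / (1 + len(perm))
print(f"min-p = {observed:.4f}, permutation p = {p_value:.4f}")
```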

  7. Consistency errors in p-values reported in Spanish psychology journals.

    PubMed

    Caperos, José Manuel; Pardo, Antonio

    2013-01-01

    Recent reviews have drawn attention to frequent consistency errors when reporting statistical results. We have reviewed the statistical results reported in 186 articles published in four Spanish psychology journals. Of these articles, 102 contained at least one of the statistics selected for our study: Fisher's F, Student's t and Pearson's χ2. Out of the 1,212 complete statistics reviewed, 12.2% presented a consistency error, meaning that the reported p-value did not correspond to the reported value of the statistic and its degrees of freedom. In 2.3% of the cases, the correct calculation would have led to a different conclusion than the reported one. In terms of articles, 48% included at least one consistency error, and 17.6% would have to change at least one conclusion. In meta-analytical terms, with a focus on effect size, consistency errors can be considered substantial in 9.5% of the cases. These results imply a need to improve the quality and precision with which statistical results are reported in Spanish psychology journals.
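
    A consistency check of this kind is easy to automate: recompute the p-value from the reported statistic and its degrees of freedom, and compare it with the reported p-value. The sketch below covers the three statistics reviewed; the tolerance is an arbitrary choice.

```python
from scipy import stats

def check_consistency(stat_name, value, df, reported_p, tol=0.005):
    """Recompute the p-value from the reported statistic and degrees of freedom
    and compare it with the reported p-value (two-sided for t)."""
    if stat_name == "F":
        p = stats.f.sf(value, *df)
    elif stat_name == "t":
        p = 2 * stats.t.sf(abs(value), df)
    elif stat_name == "chi2":
        p = stats.chi2.sf(value, df)
    else:
        raise ValueError(stat_name)
    return p, abs(p - reported_p) <= tol

# e.g. a reported "F(2, 57) = 4.32, p = .02"
p, ok = check_consistency("F", 4.32, (2, 57), 0.02)
print(f"recomputed p = {p:.4f}, consistent: {ok}")
```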

  8. A practical method of estimating standard error of age in the fission track dating method

    USGS Publications Warehouse

    Johnson, N.M.; McGee, V.E.; Naeser, C.W.

    1979-01-01

    A first-order approximation formula for the propagation of error in the fission track age equation is given by P_A = C[P_s^2 + P_i^2 + P_φ^2 - 2rP_sP_i]^(1/2), where P_A, P_s, P_i and P_φ are the percentage errors of age, of spontaneous track density, of induced track density, and of neutron dose, respectively, and C is a constant. The correlation, r, between spontaneous and induced track densities is a crucial element in the error analysis, acting generally to improve the standard error of age. In addition, the correlation parameter r is instrumental in specifying the level of neutron dose, a controlled variable, which will minimize the standard error of age. The results from the approximation equation agree closely with the results from an independent statistical model for the propagation of errors in the fission-track dating method. © 1979.
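
    Evaluating the formula directly shows the role of the correlation term; the percentage errors used below are hypothetical.

```python
import numpy as np

def age_percent_error(p_s, p_i, p_phi, r, C=1.0):
    """First-order propagated percentage error of a fission-track age:
    P_A = C * sqrt(P_s^2 + P_i^2 + P_phi^2 - 2*r*P_s*P_i)."""
    return C * np.sqrt(p_s**2 + p_i**2 + p_phi**2 - 2.0 * r * p_s * p_i)

# Illustrative percentage errors; a positive correlation r between spontaneous
# and induced track densities reduces the propagated age error.
for r in (0.0, 0.3, 0.6):
    print(r, round(age_percent_error(p_s=8.0, p_i=7.0, p_phi=3.0, r=r), 2))
```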

  9. Structural interpretation in composite systems using powder X-ray diffraction: applications of error propagation to the pair distribution function.

    PubMed

    Moore, Michael D; Shi, Zhenqi; Wildfong, Peter L D

    2010-12-01

    To develop a method for drawing statistical inferences from differences between multiple experimental pair distribution function (PDF) transforms of powder X-ray diffraction (PXRD) data. The appropriate treatment of initial PXRD error estimates using traditional error propagation algorithms was tested using Monte Carlo simulations on amorphous ketoconazole. An amorphous felodipine:polyvinyl pyrrolidone:vinyl acetate (PVPva) physical mixture was prepared to define an error threshold. Co-solidified products of felodipine:PVPva and terfenadine:PVPva were prepared using a melt-quench method and subsequently analyzed using PXRD and PDF. Differential scanning calorimetry (DSC) was used as an additional characterization method. The appropriate manipulation of initial PXRD error estimates through the PDF transform was confirmed using the Monte Carlo simulations for amorphous ketoconazole. The felodipine:PVPva physical mixture PDF analysis determined ±3σ to be an appropriate error threshold. Using the PDF and error propagation principles, the felodipine:PVPva co-solidified product was determined to be completely miscible, and the terfenadine:PVPva co-solidified product, although having the appearance of an amorphous molecular solid dispersion by DSC, was determined to be phase-separated. Statistically based inferences were successfully drawn from PDF transforms of PXRD patterns obtained from composite systems. The principles applied herein may be universally adapted to many different systems and provide a fundamentally sound basis for drawing structural conclusions from PDF studies.

  10. Performance analysis of different tuning rules for an isothermal CSTR using integrated EPC and SPC

    NASA Astrophysics Data System (ADS)

    Roslan, A. H.; Karim, S. F. Abd; Hamzah, N.

    2018-03-01

    This paper demonstrates the integration of Engineering Process Control (EPC) and Statistical Process Control (SPC) for the control of product concentration in an isothermal CSTR. The objectives of this study are to evaluate the performance of the Ziegler-Nichols (Z-N), Direct Synthesis (DS) and Internal Model Control (IMC) tuning methods and to determine the most effective method for this process. The simulation model was obtained from past literature and re-constructed in SIMULINK MATLAB to evaluate the process response. Additionally, the process stability, capability and normality were analyzed using Process Capability Sixpack reports in Minitab. Based on the results, DS displays the best response, having the smallest rise time, settling time, overshoot, undershoot, Integral Time Absolute Error (ITAE) and Integral Square Error (ISE). Also, based on the statistical analysis, DS emerges as the best tuning method, as it exhibits the highest process stability and capability.
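
    The two integral criteria used for the comparison have standard definitions, sketched below on illustrative closed-loop step responses standing in for the three tuning rules (the response shapes are invented, not the paper's simulations):

```python
import numpy as np

def itae(t, error):
    """Integral of Time-weighted Absolute Error, via trapezoidal quadrature."""
    return np.trapz(t * np.abs(error), t)

def ise(t, error):
    """Integral of Squared Error."""
    return np.trapz(error ** 2, t)

# Illustrative closed-loop step responses with different dynamics
t = np.linspace(0.0, 20.0, 2001)
responses = {
    "Z-N": 1 - np.exp(-t / 2.0) * np.cos(1.5 * t),   # oscillatory
    "DS":  1 - np.exp(-t / 1.0),                      # smooth, fast
    "IMC": 1 - np.exp(-t / 3.0),                      # smooth, slower
}
for name, y in responses.items():
    e = 1.0 - y                                       # error relative to setpoint = 1
    print(f"{name}: ITAE={itae(t, e):8.3f}  ISE={ise(t, e):6.3f}")
```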

  11. Phase noise optimization in temporal phase-shifting digital holography with partial coherence light sources and its application in quantitative cell imaging.

    PubMed

    Remmersmann, Christian; Stürwald, Stephan; Kemper, Björn; Langehanenberg, Patrik; von Bally, Gert

    2009-03-10

    In temporal phase-shifting-based digital holographic microscopy, high-resolution phase contrast imaging requires optimized conditions for hologram recording and phase retrieval. To optimize the phase resolution, for the example of a variable three-step algorithm, a theoretical analysis on statistical errors, digitalization errors, uncorrelated errors, and errors due to a misaligned temporal phase shift is carried out. In a second step the theoretically predicted results are compared to the measured phase noise obtained from comparative experimental investigations with several coherent and partially coherent light sources. Finally, the applicability for noise reduction is demonstrated by quantitative phase contrast imaging of pancreas tumor cells.

  12. Impact of Communication Errors in Radiology on Patient Care, Customer Satisfaction, and Work-Flow Efficiency.

    PubMed

    Siewert, Bettina; Brook, Olga R; Hochman, Mary; Eisenberg, Ronald L

    2016-03-01

    The purpose of this study is to analyze the impact of communication errors on patient care, customer satisfaction, and work-flow efficiency and to identify opportunities for quality improvement. We performed a search of our quality assurance database for communication errors submitted from August 1, 2004, through December 31, 2014. Cases were analyzed regarding the step in the imaging process at which the error occurred (i.e., ordering, scheduling, performance of examination, study interpretation, or result communication). The impact on patient care was graded on a 5-point scale from none (0) to catastrophic (4). The severity of impact between errors in result communication and those that occurred at all other steps was compared. Error evaluation was performed independently by two board-certified radiologists. Statistical analysis was performed using the chi-square test and kappa statistics. Three hundred eighty of 422 cases were included in the study. One hundred ninety-nine of the 380 communication errors (52.4%) occurred at steps other than result communication, including ordering (13.9%; n = 53), scheduling (4.7%; n = 18), performance of examination (30.0%; n = 114), and study interpretation (3.7%; n = 14). Result communication was the single most common step, accounting for 47.6% (181/380) of errors. There was no statistically significant difference in impact severity between errors that occurred during result communication and those that occurred at other times (p = 0.29). In 37.9% of cases (144/380), there was an impact on patient care, including 21 minor impacts (5.5%; result communication, n = 13; all other steps, n = 8), 34 moderate impacts (8.9%; result communication, n = 12; all other steps, n = 22), and 89 major impacts (23.4%; result communication, n = 45; all other steps, n = 44). In 62.1% (236/380) of cases, no impact was noted, but 52.6% (200/380) of cases had the potential for an impact. Among 380 communication errors in a radiology department, 37.9% had a direct impact on patient care, with an additional 52.6% having a potential impact. Most communication errors (52.4%) occurred at steps other than result communication, with similar severity of impact.

  13. Error quantification of a high-resolution coupled hydrodynamic-ecosystem coastal-ocean model: Part 2. Chlorophyll-a, nutrients and SPM

    NASA Astrophysics Data System (ADS)

    Allen, J. Icarus; Holt, Jason T.; Blackford, Jerry; Proctor, Roger

    2007-12-01

    Marine systems models are becoming increasingly complex and sophisticated, but far too little attention has been paid to model errors and the extent to which model outputs actually relate to ecosystem processes. Here we describe the application of summary error statistics to a complex 3D model (POLCOMS-ERSEM) run for the period 1988-1989 in the southern North Sea utilising information from the North Sea Project, which collected a wealth of observational data. We demonstrate that to understand model data misfit and the mechanisms creating errors, we need to use a hierarchy of techniques, including simple correlations, model bias, model efficiency, binary discriminator analysis and the distribution of model errors to assess model errors spatially and temporally. We also demonstrate that a linear cost function is an inappropriate measure of misfit. This analysis indicates that the model has some skill for all variables analysed. A summary plot of model performance indicates that model performance deteriorates as we move through the ecosystem from the physics, to the nutrients and plankton.
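
    Several of the summary error statistics in this hierarchy, bias, correlation and model efficiency (in its Nash-Sutcliffe form), can be sketched as follows on synthetic observation/model pairs; the data are illustrative, not the POLCOMS-ERSEM output.

```python
import numpy as np

def bias(obs, mod):
    return np.mean(mod - obs)

def model_efficiency(obs, mod):
    """Nash-Sutcliffe model efficiency: 1 is perfect, <= 0 means the model
    has no more skill than predicting the observed mean."""
    return 1.0 - np.sum((mod - obs) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

rng = np.random.default_rng(7)
obs = 5.0 + 3.0 * np.sin(np.linspace(0, 4 * np.pi, 200))      # seasonal-like signal
mod = obs + rng.normal(0.5, 1.0, obs.shape)                    # biased, noisy "model"

print(f"bias = {bias(obs, mod):.2f}")
print(f"correlation = {np.corrcoef(obs, mod)[0, 1]:.3f}")
print(f"model efficiency = {model_efficiency(obs, mod):.3f}")
```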

  14. Cluster size statistic and cluster mass statistic: two novel methods for identifying changes in functional connectivity between groups or conditions.

    PubMed

    Ing, Alex; Schwarzbauer, Christian

    2014-01-01

    Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods--the cluster size statistic (CSS) and cluster mass statistic (CMS)--are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73%N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity.

  16. On the application of the Principal Component Analysis for an efficient climate downscaling of surface wind fields

    NASA Astrophysics Data System (ADS)

    Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver

    2013-04-01

    With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, the application and verification of a statistical methodology for the climate downscaling of surface-level wind fields is presented in this work. The procedure combines the Monte Carlo method and Principal Component Analysis (PCA). First, the Monte Carlo method is used to create a large number of daily-based annual time series, so-called climate representative years, by stratified sampling of a 33-year time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Second, the representative years are evaluated and the best set is chosen according to its capability to recreate the temporal and spatial Sea Level Pressure (SLP) fields from the R-2 data set. The measure of this correspondence is the Euclidean distance between the Empirical Orthogonal Function (EOF) spaces generated by the PCA decomposition of the SLP fields from the long-term and representative-year data sets. The methodology was verified by comparing the selected 365-day period against a 9-year period of wind fields generated by dynamically downscaling Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. The results showed that, compared with the traditional approach of dynamically downscaling a randomly chosen 365-day period, the error in the average wind velocity obtained with the PCA-based representative year was reduced by almost 30%. Moreover, the Mean Absolute Errors (MAE) of the monthly and daily wind profiles were also reduced by almost 25% across all SKIRON grid points. The methodology exhibited maximum errors in the mean wind speed of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. Beyond these bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as velocity and directional frequencies. Additional repetitions were performed to demonstrate the reliability and robustness of this kind of statistical-dynamical downscaling method.
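
    A minimal sketch of the EOF comparison step, assuming PCA via SVD of the anomaly matrix and a simple Euclidean distance between leading EOFs; the grid size, sample lengths and distance variant are illustrative, not the paper's exact procedure.

      import numpy as np

      rng = np.random.default_rng(1)
      long_term = rng.normal(size=(365 * 33, 50))   # (days, grid points), synthetic stand-in for R-2 SLP
      candidate = rng.normal(size=(365, 50))        # one candidate representative year

      def leading_eofs(fields, k=3):
          """Leading spatial EOFs via PCA (SVD of the anomaly matrix)."""
          anomalies = fields - fields.mean(axis=0)
          _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
          return vt[:k]                             # rows are EOFs, ordered by explained variance

      # One plausible distance; EOF signs are ambiguous, so a production version
      # should align signs (or compare subspaces via principal angles) first.
      d = np.linalg.norm(leading_eofs(long_term) - leading_eofs(candidate))
      print(d)   # smaller distance -> candidate better reproduces the SLP modes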

  17. Spatial uncertainty of a geoid undulation model in Guayaquil, Ecuador

    NASA Astrophysics Data System (ADS)

    Chicaiza, E. G.; Leiva, C. A.; Arranz, J. J.; Buenaño, X. E.

    2017-06-01

    Geostatistics is a discipline that deals with the statistical analysis of regionalized variables. In this case study, geostatistics is used to estimate geoid undulation in the rural area of the town of Guayaquil, Ecuador. The geostatistical approach was chosen because it yields an estimation error map alongside the prediction map. The open-source statistical software R was used, mainly through the geoR, gstat and RGeostats libraries. Exploratory data analysis (EDA), trend analysis and structural analysis were carried out. Automatic model fitting by iterative least squares and other fitting procedures were employed to fit the variogram. Finally, kriging with the Bouguer gravity anomaly as external drift and universal kriging were used to obtain a detailed map of geoid undulation. The estimation uncertainty fell within the interval [-0.5, +0.5] m for errors, with a maximum estimation standard deviation of 2 mm in relation to the interpolation method applied. According to a comparison with independent validation points, the error distribution of the geoid undulation map obtained in this study is better than that of the Earth gravitational models publicly available for the study area. The main goal of this paper is to confirm the feasibility of combining geoid undulations from Global Navigation Satellite Systems, levelling field measurements and geostatistical techniques for use in high-accuracy engineering projects.
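
    The study itself works in R with the geoR, gstat and RGeostats libraries; as a language-neutral illustration of the structural-analysis step, here is a minimal empirical semivariogram sketch in Python, with point locations and values invented for the example.

      import numpy as np

      def empirical_semivariogram(coords, values, n_bins=10):
          """gamma(h): mean of 0.5*(z_i - z_j)^2 over point pairs binned by distance.
          Empty bins yield NaN, which is acceptable for a sketch."""
          diff = coords[:, None, :] - coords[None, :, :]
          dist = np.sqrt((diff ** 2).sum(-1))
          sq = 0.5 * (values[:, None] - values[None, :]) ** 2
          iu = np.triu_indices(len(values), k=1)    # count each pair once
          dist, sq = dist[iu], sq[iu]
          edges = np.linspace(0.0, dist.max(), n_bins + 1)
          gamma = np.array([sq[(dist >= lo) & (dist < hi)].mean()
                            for lo, hi in zip(edges[:-1], edges[1:])])
          return 0.5 * (edges[:-1] + edges[1:]), gamma

      rng = np.random.default_rng(2)
      pts = rng.uniform(0.0, 100.0, size=(200, 2))   # hypothetical GNSS/levelling points
      z = np.sin(pts[:, 0] / 20.0) + rng.normal(scale=0.1, size=200)
      lags, gamma = empirical_semivariogram(pts, z)
      print(np.c_[lags, gamma])                      # gamma rising with lag indicates spatial structure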

  18. The High Cost of Complexity in Experimental Design and Data Analysis: Type I and Type II Error Rates in Multiway ANOVA.

    ERIC Educational Resources Information Center

    Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.

    2002-01-01

    Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…

  19. An Application of M[subscript 2] Statistic to Evaluate the Fit of Cognitive Diagnostic Models

    ERIC Educational Resources Information Center

    Liu, Yanlou; Tian, Wei; Xin, Tao

    2016-01-01

    The fit of cognitive diagnostic models (CDMs) to response data needs to be evaluated, since CDMs might yield misleading results when they do not fit the data well. Limited-information statistic M[subscript 2] and the associated root mean square error of approximation (RMSEA[subscript 2]) in item factor analysis were extended to evaluate the fit of…

  20. Statistical error in simulations of Poisson processes: Example of diffusion in solids

    NASA Astrophysics Data System (ADS)

    Nilsson, Johan O.; Leetmaa, Mikael; Vekilova, Olga Yu.; Simak, Sergei I.; Skorodumova, Natalia V.

    2016-08-01

    Simulations of diffusion in solids often produce poor statistics of diffusion events. We present an analytical expression for the statistical error in ion conductivity obtained in such simulations. The error expression is not restricted to any particular computational method, but is valid in the context of simulation of Poisson processes in general. This analytical error expression is verified numerically for the case of Gd-doped ceria by running a large number of kinetic Monte Carlo calculations.
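
    The paper's analytical error expression is not reproduced here, but the generic Poisson result it builds on -- a relative statistical error of roughly 1/sqrt(N) for N observed events -- can be checked by direct simulation, as in this sketch.

      import numpy as np

      rng = np.random.default_rng(3)
      for expected_events in (10, 100, 1000, 10000):
          counts = rng.poisson(expected_events, size=100_000)
          print(expected_events,
                counts.std() / counts.mean(),        # simulated relative error
                1.0 / np.sqrt(expected_events))      # generic Poisson prediction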

  1. Standard Errors and Confidence Intervals of Norm Statistics for Educational and Psychological Tests.

    PubMed

    Oosterhuis, Hannah E M; van der Ark, L Andries; Sijtsma, Klaas

    2016-11-14

    Norm statistics allow for the interpretation of scores on psychological and educational tests, by relating the test score of an individual test taker to the test scores of individuals belonging to the same gender, age, or education groups, et cetera. Given the uncertainty due to sampling error, one would expect researchers to report standard errors for norm statistics. In practice, standard errors are seldom reported; they are either unavailable or derived under strong distributional assumptions that may not be realistic for test scores. We derived standard errors for four norm statistics (standard deviation, percentile ranks, stanine boundaries and Z-scores) under the mild assumption that the test scores are multinomially distributed. A simulation study showed that the standard errors were unbiased and that corresponding Wald-based confidence intervals had good coverage. Finally, we discuss the possibilities for applying the standard errors in practical test use in education and psychology. The procedure is provided via the R function check.norms, which is available in the mokken package.
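
    The paper derives analytic standard errors; a multinomial bootstrap gives an equivalent numerical check for one norm statistic. The score range, sample and evaluated score below are illustrative assumptions, not data from the paper.

      import numpy as np

      rng = np.random.default_rng(4)
      scores = rng.integers(0, 41, size=500)   # hypothetical norm sample, scores 0..40

      def percentile_rank(sample, score):
          return 100.0 * np.mean(sample <= score)

      # Resampling the observed sample with replacement is exactly multinomial
      # resampling over the observed score frequencies.
      boot = [percentile_rank(rng.choice(scores, size=scores.size, replace=True), 25)
              for _ in range(2000)]
      print(percentile_rank(scores, 25), np.std(boot))   # point estimate and bootstrap SE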

  2. Analysis and discussion on the experimental data of electrolyte analyzer

    NASA Astrophysics Data System (ADS)

    Dong, XinYu; Jiang, JunJie; Liu, MengJun; Li, Weiwei

    2018-06-01

    During the subsequent verification of electrolyte analyzers, we found that an instrument can achieve good repeatability and stability in repeated measurements over a short period of time, in line with the verification regulation's requirements for linear error and cross-contamination rate. However, large indication errors are very common, and the measurement results of different manufacturers differ greatly. In order to identify and solve this problem, help enterprises improve product quality, and obtain accurate and reliable measurement data, we conducted an experimental evaluation of electrolyte analyzers and analyzed the data statistically.

  3. Orbit Determination (OD) Error Analysis Results for the Triana Sun-Earth L1 Libration Point Mission and for the Fourier Kelvin Stellar Interferometer (FKSI) Sun-Earth L2 Libration Point Mission Concept

    NASA Technical Reports Server (NTRS)

    Marr, Greg C.

    2003-01-01

    The Triana spacecraft was designed to be launched by the Space Shuttle. The nominal Triana mission orbit will be a Sun-Earth L1 libration point orbit. Using the NASA Goddard Space Flight Center's Orbit Determination Error Analysis System (ODEAS), orbit determination (OD) error analysis results are presented for all phases of the Triana mission from the first correction maneuver through approximately launch plus 6 months. Results are also presented for the science data collection phase of the Fourier Kelvin Stellar Interferometer Sun-Earth L2 libration point mission concept with momentum unloading thrust perturbations during the tracking arc. The Triana analysis includes extensive analysis of an initial short arc orbit determination solution and results using both Deep Space Network (DSN) and commercial Universal Space Network (USN) statistics. These results could be utilized in support of future Sun-Earth libration point missions.

  4. A new SAS program for behavioral analysis of Electrical Penetration Graph (EPG) data

    USDA-ARS?s Scientific Manuscript database

    A new program is introduced that uses SAS software to duplicate output of descriptive statistics from the Sarria Excel workbook for EPG waveform analysis. Not only are publishable means and standard errors or deviations output, the user also is guided through four relatively simple sub-programs for ...

  5. Bayesian correction for covariate measurement error: A frequentist evaluation and comparison with regression calibration.

    PubMed

    Bartlett, Jonathan W; Keogh, Ruth H

    2018-06-01

    Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
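
    A minimal sketch of regression calibration, the comparator method discussed above: the error-prone covariate W is replaced by a calibrated estimate of E[X | W] before fitting the outcome model. The simulation parameters are illustrative, and the calibration model is fit against the simulated truth purely so the attenuation and its correction are visible; real analyses use replicate measurements or a validation subsample.

      import numpy as np

      rng = np.random.default_rng(5)
      n = 2000
      x = rng.normal(size=n)                  # true covariate (unobserved in practice)
      w = x + rng.normal(scale=0.8, size=n)   # error-prone measurement of x
      y = 1.0 + 2.0 * x + rng.normal(size=n)  # outcome; true slope is 2

      # Stage 1: calibration model for E[X | W] (here fit on simulated truth).
      x_hat = np.polyval(np.polyfit(w, x, 1), w)

      # Stage 2: fit the outcome model with the calibrated covariate.
      naive_slope = np.polyfit(w, y, 1)[0]           # attenuated toward zero
      calibrated_slope = np.polyfit(x_hat, y, 1)[0]  # approximately 2
      print(naive_slope, calibrated_slope)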

  6. Statistical analysis of AFE GN&C aeropass performance

    NASA Technical Reports Server (NTRS)

    Chang, Ho-Pen; French, Raymond A.

    1990-01-01

    Performance of the guidance, navigation, and control (GN&C) system used on the Aeroassist Flight Experiment (AFE) spacecraft has been studied with Monte Carlo techniques. The performance of the AFE GN&C is investigated with a 6-DOF numerical dynamic model which includes a Global Reference Atmospheric Model (GRAM) and a gravitational model with oblateness corrections. The study considers all the uncertainties due to the environment and the system itself. In the AFE's aeropass phase, perturbations of the system performance arise from an error space with more than 20 dimensions of correlated and uncorrelated error sources. The goal of this study is to determine, in a statistical sense, how much flight path angle error can be tolerated at entry interface (EI) while still retaining acceptable delta-V capability at exit to position the AFE spacecraft for recovery. Assuming there is fuel available to produce 380 ft/sec of delta-V at atmospheric exit, a 3-sigma standard deviation in flight path angle error of 0.04 degrees at EI would result in a 98-percent probability of mission success.

  7. Teaching Statistics Online Using "Excel"

    ERIC Educational Resources Information Center

    Jerome, Lawrence

    2011-01-01

    As anyone who has taught or taken a statistics course knows, statistical calculations can be tedious and error-prone, with the details of a calculation sometimes distracting students from understanding the larger concepts. Traditional statistics courses typically use scientific calculators, which can relieve some of the tedium and errors but…

  8. Multicollinearity and Regression Analysis

    NASA Astrophysics Data System (ADS)

    Daoud, Jamal I.

    2017-12-01

    In regression analysis, correlation between the response and the predictor(s) is expected, but correlation among the predictors themselves is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data and experience. Ultimately, the selection of the most important predictors is a subjective choice made by the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; when this happens, the standard errors of the coefficients increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may not be found to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
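
    The standard-error inflation described above is usually quantified by the variance inflation factor, VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors; a minimal sketch with invented data follows.

      import numpy as np

      def vif(X):
          """Variance inflation factor for each column of X (n samples x p predictors)."""
          X = np.asarray(X, dtype=float)
          out = []
          for j in range(X.shape[1]):
              others = np.delete(X, j, axis=1)
              A = np.column_stack([np.ones(len(X)), others])   # intercept + other predictors
              beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
              resid = X[:, j] - A @ beta
              r2 = 1.0 - resid.var() / X[:, j].var()
              out.append(1.0 / (1.0 - r2))
          return np.array(out)

      rng = np.random.default_rng(6)
      x1, x2 = rng.normal(size=200), rng.normal(size=200)
      x3 = x1 + rng.normal(scale=0.1, size=200)    # nearly collinear with x1
      print(vif(np.column_stack([x1, x2, x3])))    # large VIFs for the collinear columns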

  9. Probabilistic/Fracture-Mechanics Model For Service Life

    NASA Technical Reports Server (NTRS)

    Watkins, T., Jr.; Annis, C. G., Jr.

    1991-01-01

    Computer program makes probabilistic estimates of lifetime of engine and components thereof. Developed to fill need for more accurate life-assessment technique that avoids errors in estimated lives and provides for statistical assessment of levels of risk created by engineering decisions in designing system. Implements mathematical model combining techniques of statistics, fatigue, fracture mechanics, nondestructive analysis, life-cycle cost analysis, and management of engine parts. Used to investigate effects of such engine-component life-controlling parameters as return-to-service intervals, stresses, capabilities for nondestructive evaluation, and qualities of materials.

  10. M-071 critical data analysis

    NASA Technical Reports Server (NTRS)

    Hegsted, D. M.

    1975-01-01

    A prototype balance study was conducted on earth prior to the balance studies conducted in Skylab itself. Collected were daily dietary intake data of 6 minerals and nitrogen, and fecal and urinary outputs on each of three astronauts. Essential statistical issues show what quantities need to be estimated and establish the scope of inference associated with alternative variance estimates. The procedures for obtaining the final variability due both to errors of measurement and total error (total = measurement and biological variability) are exhibited.

  11. Analysis of MMU FDIR expert system

    NASA Technical Reports Server (NTRS)

    Landauer, Christopher

    1990-01-01

    This paper describes the analysis of a rulebase for fault diagnosis, isolation, and recovery for NASA's Manned Maneuvering Unit (MMU). The MMU is used by a human astronaut to move around a spacecraft in space. In order to provide maneuverability, there are several thrusters oriented in various directions, and hand-controlled devices for useful groups of them. The rulebase describes some error detection procedures, and corrective actions that can be applied in a few cases. The approach taken in this paper is to treat rulebases as symbolic objects and compute correctness and 'reasonableness' criteria that use the statistical distribution of various syntactic structures within the rulebase. The criteria should identify awkward situations, and otherwise signal anomalies that may be errors. The rulebase analysis algorithms are derived from mathematical and computational criteria that implement certain principles developed for rulebase evaluation. The principles are Consistency, Completeness, Irredundancy, Connectivity, and finally, Distribution. Several errors were detected in the delivered rulebase. Some of these errors were easily fixed. Some errors could not be fixed with the available information. A geometric model of the thruster arrangement is needed to show how to correct certain other distribution anomalies that are in fact errors. The investigations reported here were partially supported by The Aerospace Corporation's Sponsored Research Program.

  12. Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cheung, WanYin; Zhang, Jie; Florita, Anthony

    2015-12-08

    Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
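
    One common definition of the NRMSE used in such comparisons normalizes the RMSE by the mean observed value; the study may normalize differently (e.g., by plant capacity), so the following sketch is only one plausible variant, with toy numbers.

      import numpy as np

      def nrmse(forecast, observed):
          """Root mean squared error normalized by the mean observed value."""
          forecast, observed = np.asarray(forecast, float), np.asarray(observed, float)
          return np.sqrt(np.mean((forecast - observed) ** 2)) / observed.mean()

      print(nrmse([48.0, 50.5, 52.0], [50.0, 51.0, 49.0]))   # toy power values, kW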

  13. Location error uncertainties - an advanced using of probabilistic inverse theory

    NASA Astrophysics Data System (ADS)

    Debski, Wojciech

    2016-04-01

    The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analyzed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. While estimating the location of earthquake foci is relatively simple, quantitatively estimating the location accuracy is a challenging task, even when a probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling, and a priori uncertainties. In this presentation we address this task for the common situation in which the statistics of observational and/or modeling errors are unknown; this requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we illustrate an approach based on Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on the uncertainties of the solution found.
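
    A minimal sketch of the entropy meta-characteristic, assuming a gridded a posteriori location probability map: a broad (uncertain) posterior has a higher Shannon entropy than a sharply peaked one. The Gaussian posteriors below are illustrative.

      import numpy as np

      def shannon_entropy(p):
          """Shannon entropy (nats) of a discrete distribution; zero cells are ignored."""
          p = np.asarray(p, dtype=float).ravel()
          p = p / p.sum()
          p = p[p > 0]
          return -np.sum(p * np.log(p))

      xx, yy = np.meshgrid(np.linspace(-5, 5, 101), np.linspace(-5, 5, 101))
      for width in (0.5, 2.0):                       # sharp vs broad posterior
          posterior = np.exp(-(xx ** 2 + yy ** 2) / (2 * width ** 2))
          print(width, shannon_entropy(posterior))   # broader posterior -> higher entropy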

  14. Anatomy of emotion: a 3D study of facial mimicry.

    PubMed

    Ferrario, V F; Sforza, C

    2007-01-01

    Alterations in facial motion severely impair the quality of life and social interaction of patients, and an objective grading of facial function is necessary. A method for the non-invasive detection of 3D facial movements was developed. Sequences of six standardized facial movements (maximum smile; free smile; surprise with closed mouth; surprise with open mouth; right side eye closure; left side eye closure) were recorded in 20 healthy young adults (10 men, 10 women) using an optoelectronic motion analyzer. For each subject, 21 cutaneous landmarks were identified by 2-mm reflective markers, and their 3D movements during each facial animation were computed. Three repetitions of each expression were recorded (within-session error), and four separate sessions were used (between-session error). To assess the within-session error, the technical error of the measurement (random error, TEM) was computed separately for each sex, movement and landmark. To assess the between-session repeatability, the standard deviation among the mean displacements of each landmark (four independent sessions) was computed for each movement. TEM for the individual landmarks ranged between 0.3 and 9.42 mm (intrasession error). The sex- and movement-related differences were statistically significant (two-way analysis of variance, p=0.003 for the sex comparison, p=0.009 for the six movements, p<0.001 for the sex x movement interaction). Among the four independent sessions, the left eye closure had the worst repeatability and the right eye closure the best; the differences among the various movements were statistically significant (one-way analysis of variance, p=0.041). In conclusion, the current protocol demonstrated sufficient repeatability for future clinical application. Great care should be taken to ensure consistent marker positioning across subjects.
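
    A minimal sketch of the standard technical error of measurement for two repeated sessions, TEM = sqrt(sum d_i^2 / (2n)) over within-subject differences d_i; the values are invented, and the study's exact computation may differ in detail.

      import numpy as np

      def tem(rep1, rep2):
          """Technical error of measurement for two repeated measurement sessions."""
          d = np.asarray(rep1, float) - np.asarray(rep2, float)
          return np.sqrt(np.sum(d ** 2) / (2 * d.size))

      session1 = np.array([10.1, 12.3, 9.8, 11.0])   # e.g. landmark displacements, mm
      session2 = np.array([10.4, 12.0, 10.1, 11.3])
      print(tem(session1, session2))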

  15. Coordinate based random effect size meta-analysis of neuroimaging studies.

    PubMed

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and coordinate-based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate-based random-effect-size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and the reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by random-effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of the censoring inherent in the published summary results. Type 1 error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family-wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating, and even amplifying, the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research

    PubMed Central

    Bakker, Marjan; Wicherts, Jelte M.

    2014-01-01

    Background The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses. Methods and Findings We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample sizes, or prevalence of all reporting errors, large reporting errors, and reporting errors that concerned the statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests common failure to report data exclusions (or missingness) in psychological articles. Conclusions We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. Results therefore highlight the importance of more transparent reporting of statistical analyses. PMID:25072606

  17. Distortion Representation of Forecast Errors for Model Skill Assessment and Objective Analysis

    NASA Technical Reports Server (NTRS)

    Hoffman, Ross N.; Nehrkorn, Thomas; Grassotti, Christopher

    1998-01-01

    We proposed a novel characterization of errors for numerical weather predictions. A general distortion representation allows for the displacement and amplification or bias correction of forecast anomalies. Characterizing and decomposing forecast error in this way has several important applications, including the model assessment application and the objective analysis application. In this project, we have focused on the assessment application, restricted to a realistic but univariate 2-dimensional situation. Specifically, we study the forecast errors of the sea level pressure (SLP), the 500 hPa geopotential height, and the 315 K potential vorticity fields for forecasts of the short and medium range. The forecasts are generated by the Goddard Earth Observing System (GEOS) data assimilation system with and without ERS-1 scatterometer data. A great deal of novel work has been accomplished under the current contract. In broad terms, we have developed and tested an efficient algorithm for determining distortions. The algorithm and constraints are now ready for application to larger data sets to be used to determine the statistics of the distortion as outlined above, and to be applied in data analysis by using GEOS water vapor imagery to correct short-term forecast errors.

  18. Covariance Analysis Tool (G-CAT) for Computing Ascent, Descent, and Landing Errors

    NASA Technical Reports Server (NTRS)

    Boussalis, Dhemetrios; Bayard, David S.

    2013-01-01

    G-CAT is a covariance analysis tool that enables fast and accurate computation of error ellipses for descent, landing, ascent, and rendezvous scenarios, and quantifies knowledge error contributions needed for error budgeting purposes. Because G-CAT supports hardware/system trade studies in spacecraft and mission design, it is useful in both early and late mission/proposal phases where Monte Carlo simulation capability is not mature, Monte Carlo simulation takes too long to run, and/or there is a need to perform multiple parametric system design trades that would require an unwieldy number of Monte Carlo runs. G-CAT is formulated as a variable-order square-root linearized Kalman filter (LKF), typically using over 120 filter states. An important property of G-CAT is that it is based on a 6-DOF (degrees of freedom) formulation that completely captures the combined effects of both attitude and translation errors on the propagated trajectories. This ensures its accuracy for guidance, navigation, and control (GN&C) analysis. G-CAT provides the desired fast-turnaround analysis needed for error budgeting in support of mission concept formulations, design trade studies, and proposal development efforts. The main usefulness of a covariance analysis tool such as G-CAT is its ability to calculate the performance envelope directly from a single run, in sharp contrast to running thousands of simulations to obtain similar information using Monte Carlo methods. It does this by propagating the "statistics" of the overall design, rather than simulating individual trajectories. G-CAT supports applications to lunar, planetary, and small body missions. It characterizes onboard knowledge propagation errors associated with inertial measurement unit (IMU) errors (gyro and accelerometer), gravity errors/dispersions (spherical harmonics, mascons), and radar errors (multiple altimeter beams, multiple Doppler velocimeter beams). G-CAT is a standalone MATLAB-based tool intended to run on any engineer's desktop computer.
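
    A minimal sketch of the core step behind such a covariance analysis, propagating the covariance as P <- F P F^T + Q instead of simulating individual trajectories; the two-state constant-velocity model and noise values are illustrative assumptions, not G-CAT's 120-state formulation.

      import numpy as np

      dt = 1.0
      F = np.array([[1.0, dt],
                    [0.0, 1.0]])        # state transition (position, velocity)
      Q = np.diag([1e-4, 1e-6])         # process noise covariance
      P = np.diag([1.0, 0.01])          # initial state covariance

      for _ in range(100):              # propagate the statistics, not trajectories
          P = F @ P @ F.T + Q
      print(np.sqrt(np.diag(P)))        # 1-sigma position and velocity errors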

  19. Network problem threshold

    NASA Technical Reports Server (NTRS)

    Gejji, Raghvendra R.

    1992-01-01

    Network transmission errors such as collisions, CRC errors, misalignment, etc. are statistical in nature. Although errors can vary randomly, a high level of errors does indicate specific network problems, e.g. equipment failure. In this project, we have studied the random nature of collisions theoretically as well as by gathering statistics, and established a numerical threshold above which a network problem is indicated with high probability.
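
    As a hedged illustration of such a threshold (not necessarily the one established in the report), collision counts modeled as Poisson can be flagged when they exceed a high quantile of the fitted distribution; the rate and quantile below are assumptions.

      from scipy import stats

      baseline_rate = 40.0                                 # mean collisions per interval on a healthy network
      threshold = stats.poisson.ppf(0.999, baseline_rate)  # exceeded ~0.1% of the time when healthy
      print(threshold)                                     # counts above this flag a probable network problem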

  20. Analysis/forecast experiments with a flow-dependent correlation function using FGGE data

    NASA Technical Reports Server (NTRS)

    Baker, W. E.; Bloom, S. C.; Carus, H.; Nestler, M. S.

    1986-01-01

    The use of a flow-dependent correlation function to improve the accuracy of an optimum interpolation (OI) scheme is examined. The development of the correlation function for the OI analysis scheme used for numerical weather prediction is described. The scheme uses a multivariate surface analysis over the oceans to model the pressure-wind error cross-correlation, and it can use an error correlation function that is both flow- and geographically-dependent. A series of four-day data assimilation experiments, conducted from January 5-9, 1979, was used to investigate the effect of the different features of the OI scheme (error correlation) on forecast skill for barotropic lows and highs. The skill of the OI was compared with that of a successive correlation method (SCM) of analysis. It is observed that the largest difference in the correlation statistics occurred in barotropic and baroclinic lows and highs. The comparison reveals that the OI forecasts were more accurate than the SCM forecasts.

  1. Performance Reports: Mirror alignment system performance prediction comparison between SAO and EKC

    NASA Technical Reports Server (NTRS)

    Tananbaum, H. D.; Zhang, J. P.

    1994-01-01

    The objective of this study is to perform an independent analysis of the residual high resolution mirror assembly (HRMA) mirror distortions caused by force and moment errors in the mirror alignment system (MAS) to statistically predict the HRMA performance. These performance predictions are then compared with those performed by Kodak to verify their analysis results.

  2. Drought Persistence Errors in Global Climate Models

    NASA Astrophysics Data System (ADS)

    Moon, H.; Gudmundsson, L.; Seneviratne, S. I.

    2018-04-01

    The persistence of drought events largely determines the severity of socioeconomic and ecological impacts, but the capability of current global climate models (GCMs) to simulate such events is subject to large uncertainties. In this study, the representation of drought persistence in GCMs is assessed by comparing state-of-the-art GCM model simulations to observation-based data sets. For doing so, we consider dry-to-dry transition probabilities at monthly and annual scales as estimates for drought persistence, where a dry status is defined as negative precipitation anomaly. Though there is a substantial spread in the drought persistence bias, most of the simulations show systematic underestimation of drought persistence at global scale. Subsequently, we analyzed to which degree (i) inaccurate observations, (ii) differences among models, (iii) internal climate variability, and (iv) uncertainty of the employed statistical methods contribute to the spread in drought persistence errors using an analysis of variance approach. The results show that at monthly scale, model uncertainty and observational uncertainty dominate, while the contribution from internal variability is small in most cases. At annual scale, the spread of the drought persistence error is dominated by the statistical estimation error of drought persistence, indicating that the partitioning of the error is impaired by the limited number of considered time steps. These findings reveal systematic errors in the representation of drought persistence in current GCMs and suggest directions for further model improvement.
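
    A minimal sketch of the persistence estimate described above, assuming dry status is a negative precipitation anomaly and estimating the dry-to-dry transition probability from a monthly series; the synthetic data and the use of a single overall mean (rather than a monthly climatology) are simplifications.

      import numpy as np

      rng = np.random.default_rng(7)
      precip = rng.gamma(2.0, 50.0, size=600)   # 50 years of monthly totals (synthetic)
      anomaly = precip - precip.mean()          # real studies would use a monthly climatology

      dry = anomaly < 0                                   # dry status = negative anomaly
      p_dry_to_dry = np.mean(dry[1:][dry[:-1]])           # P(dry at t+1 | dry at t)
      print(p_dry_to_dry)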

  3. Does raising type 1 error rate improve power to detect interactions in linear regression models? A simulation study.

    PubMed

    Durand, Casey P

    2013-01-01

    Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model. A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations. In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified. Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.

  4. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    PubMed

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p value, no matter how many loci. Copyright © 2013 S. Karger AG, Basel.

  5. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p-value, no matter how many loci. PMID:23594495

  6. Statistics 101 for Radiologists.

    PubMed

    Anvari, Arash; Halpern, Elkan F; Samir, Anthony E

    2015-10-01

    Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.

  7. Constraining the mass–richness relationship of redMaPPer clusters with angular clustering

    DOE PAGES

    Baxter, Eric J.; Rozo, Eduardo; Jain, Bhuvnesh; ...

    2016-08-04

    The potential of using cluster clustering for calibrating the mass–richness relation of galaxy clusters has been recognized theoretically for over a decade. In this paper, we demonstrate the feasibility of this technique to achieve high-precision mass calibration using redMaPPer clusters in the Sloan Digital Sky Survey North Galactic Cap. By including cross-correlations between several richness bins in our analysis, we significantly improve the statistical precision of our mass constraints. The amplitude of the mass–richness relation is constrained to 7 per cent statistical precision by our analysis. However, the error budget is systematics dominated, reaching a 19 per cent total error that is dominated by theoretical uncertainty in the bias–mass relation for dark matter haloes. We confirm the result from Miyatake et al. that the clustering amplitude of redMaPPer clusters depends on galaxy concentration as defined therein, and we provide additional evidence that this dependence cannot be sourced by mass dependences: some other effect must account for the observed variation in clustering amplitude with galaxy concentration. Assuming that the observed dependence of redMaPPer clustering on galaxy concentration is a form of assembly bias, we find that such effects introduce a systematic error on the amplitude of the mass–richness relation that is comparable to the error bar from statistical noise. Finally, the results presented here demonstrate the power of cluster clustering for mass calibration and cosmology provided the current theoretical systematics can be ameliorated.

  8. Statistical aspects of the TNK-S2B trial of tenecteplase versus alteplase in acute ischemic stroke: an efficient, dose-adaptive, seamless phase II/III design.

    PubMed

    Levin, Bruce; Thompson, John L P; Chakraborty, Bibhas; Levy, Gilberto; MacArthur, Robert; Haley, E Clarke

    2011-08-01

    TNK-S2B, an innovative, randomized, seamless phase II/III trial of tenecteplase versus rt-PA for acute ischemic stroke, terminated for slow enrollment before regulatory approval of use of phase II patients in phase III. (1) To review the trial design and comprehensive type I error rate simulations and (2) to discuss issues raised during regulatory review, to facilitate future approval of similar designs. In phase II, an early (24-h) outcome and adaptive sequential procedure selected one of three tenecteplase doses for phase III comparison with rt-PA. Decision rules comparing this dose to rt-PA would cause stopping for futility at phase II end, or continuation to phase III. Phase III incorporated two co-primary hypotheses, allowing for a treatment effect at either end of the trichotomized Rankin scale. Assuming no early termination, four interim analyses and one final analysis of 1908 patients provided an experiment-wise type I error rate of <0.05. Over 1,000 distribution scenarios, each involving 40,000 replications, the maximum type I error in phase III was 0.038. Inflation from the dose selection was more than offset by the one-half continuity correction in the test statistics. Inflation from repeated interim analyses was more than offset by the reduction from the clinical stopping rules for futility at the first interim analysis. Design complexity and evolving regulatory requirements lengthened the review process. (1) The design was innovative and efficient. Per protocol, type I error was well controlled for the co-primary phase III hypothesis tests, and experiment-wise. (2a) Time must be allowed for communications with regulatory reviewers from first design stages. (2b) Adequate type I error control must be demonstrated. (2c) Greater clarity is needed on (i) whether this includes demonstration of type I error control if the protocol is violated and (ii) whether simulations of type I error control are acceptable. (2d) Regulatory agency concerns that protocols for futility stopping may not be followed may be allayed by submitting interim analysis results to them as these analyses occur.

  9. Analysis of potential errors in real-time streamflow data and methods of data verification by digital computer

    USGS Publications Warehouse

    Lystrom, David J.

    1972-01-01

    Various methods of verifying real-time streamflow data are outlined in part II. Relatively large errors (those greater than 20-30 percent) can be detected readily by use of well-designed verification programs for a digital computer, and smaller errors can be detected only by discharge measurements and field observations. The capability to substitute a simulated discharge value for missing or erroneous data is incorporated in some of the verification routines described. The routines represent concepts ranging from basic statistical comparisons to complex watershed modeling and provide a selection from which real-time data users can choose a suitable level of verification.

  10. Efficiency degradation due to tracking errors for point focusing solar collectors

    NASA Technical Reports Server (NTRS)

    Hughes, R. O.

    1978-01-01

    An important parameter in the design of point-focusing solar collectors is the intercept factor, which is a measure of efficiency and of the energy available for use in the receiver. Using statistical methods, an expression for the expected value of the intercept factor is derived for various configurations and control law implementations. The analysis assumes that a radially symmetric flux distribution (not necessarily Gaussian) is generated at the focal plane due to the sun's finite image and various reflector errors. The time-varying tracking errors are assumed to be uniformly distributed within the threshold limits, which allows the expected value to be calculated.

  11. Cannonical [sic] confusions, an illusory allusion, and more: a critique of Haggbloom, et al.'s list of eminent psychologists (2002).

    PubMed

    Black, Stephen L

    2003-06-01

    The analysis by Haggbloom, et al. (2002) establishing a list of the most eminent psychologists of the 20th century contains significant errors. In one case the achievements of Walter B. Cannon are misattributed to W. Gary Cannon. Other errors are eponyms misattributed to Margaret F. Washburn, Morton Deutsch, Wolfgang Köhler, and G. Stanley Hall. A further mistake is to miscalculate the statistic for introductory psychology textbook citations for Hans J. Eysenck. These errors have consequences for the ranking of individuals on this list. Care must be taken to guard against such mistakes.

  12. Controlling false-negative errors in microarray differential expression analysis: a PRIM approach.

    PubMed

    Cole, Steve W; Galic, Zoran; Zack, Jerome A

    2003-09-22

    Theoretical considerations suggest that current microarray screening algorithms may fail to detect many true differences in gene expression (Type II analytic errors). We assessed 'false negative' error rates in differential expression analyses by conventional linear statistical models (e.g. t-test), microarray-adapted variants (e.g. SAM, Cyber-T), and a novel strategy based on hold-out cross-validation. The latter approach employs the machine-learning algorithm Patient Rule Induction Method (PRIM) to infer minimum thresholds for reliable change in gene expression from Boolean conjunctions of fold-induction and raw fluorescence measurements. Monte Carlo analyses based on four empirical data sets show that conventional statistical models and their microarray-adapted variants overlook more than 50% of genes showing significant up-regulation. Conjoint PRIM prediction rules recover approximately twice as many differentially expressed transcripts while maintaining strong control over false-positive (Type I) errors. As a result, experimental replication rates increase and total analytic error rates decline. RT-PCR studies confirm that gene inductions detected by PRIM but overlooked by other methods represent true changes in mRNA levels. PRIM-based conjoint inference rules thus represent an improved strategy for high-sensitivity screening of DNA microarrays. Freestanding JAVA application at http://microarray.crump.ucla.edu/focus

  13. Error and its meaning in forensic science.

    PubMed

    Christensen, Angi M; Crowder, Christian M; Ousley, Stephen D; Houck, Max M

    2014-01-01

    The discussion of "error" has gained momentum in forensic science in the wake of the Daubert guidelines and has intensified with the National Academy of Sciences' Report. Error has many different meanings, and too often, forensic practitioners themselves as well as the courts misunderstand scientific error and statistical error rates, often confusing them with practitioner error (or mistakes). Here, we present an overview of these concepts as they pertain to forensic science applications, discussing the difference between practitioner error (including mistakes), instrument error, statistical error, and method error. We urge forensic practitioners to ensure that potential sources of error and method limitations are understood and clearly communicated and advocate that the legal community be informed regarding the differences between interobserver errors, uncertainty, variation, and mistakes. © 2013 American Academy of Forensic Sciences.

  14. Death Certification Errors and the Effect on Mortality Statistics.

    PubMed

    McGivern, Lauri; Shulman, Leanne; Carney, Jan K; Shapiro, Steven; Bundock, Elizabeth

    Errors in cause and manner of death on death certificates are common and affect families, mortality statistics, and public health research. The primary objective of this study was to characterize errors in the cause and manner of death on death certificates completed by non-Medical Examiners. A secondary objective was to determine the effects of errors on national mortality statistics. We retrospectively compared 601 death certificates completed between July 1, 2015, and January 31, 2016, from the Vermont Electronic Death Registration System with clinical summaries from medical records. Medical Examiners, blinded to original certificates, reviewed summaries, generated mock certificates, and compared mock certificates with original certificates. They then graded errors using a scale from 1 to 4 (higher numbers indicated increased impact on interpretation of the cause) to determine the prevalence of minor and major errors. They also compared International Classification of Diseases, 10th Revision (ICD-10) codes on original certificates with those on mock certificates. Of 601 original death certificates, 319 (53%) had errors; 305 (51%) had major errors; and 59 (10%) had minor errors. We found no significant differences by certifier type (physician vs nonphysician). We did find significant differences in major errors by place of death (P < .001). Certificates for deaths occurring in hospitals were more likely to have major errors than certificates for deaths occurring at a private residence (59% vs 39%, P < .001). A total of 580 (93%) death certificates had a change in ICD-10 codes between the original and mock certificates, of which 348 (60%) had a change in the underlying cause-of-death code. Error rates on death certificates in Vermont are high and extend to ICD-10 coding, thereby affecting national mortality statistics. Surveillance and certifier education must expand beyond local and state efforts. Simplifying and standardizing underlying literal text for cause of death may improve accuracy, decrease coding errors, and improve national mortality statistics.

  15. An Analysis of Operational Suitability for Test and Evaluation of Highly Reliable Systems

    DTIC Science & Technology

    1994-03-04

    Exposition," Journal of the American Statistical A iation-59: 353-375 (June 1964). 17. SYS 229, Test and Evaluation Management Coursebook , School of Systems...in hours, 0 is 2-5 the desired MTBCF in hours, R is the number of critical failures, and a is the P[type-I error] of the X2 statistic with 2*R+2...design of experiments (DOE) tables and the use of Bayesian statistics to increase the confidence level of the test results that will be obtained from

  16. Impact of Assimilating Surface Velocity Observations on the Model Sea Surface Height Using the NCOM-4DVAR

    DTIC Science & Technology

    2016-09-26

    statistical analysis is done by not only examining the SSH forecast error across the entire domain, but also by concentrating on the area most densely covered... over (b) entire GoM domain and (d) GLAD region only. Statistics shown for FR (thin black), SSH1 (thick black), and VEL (gray) experiment 96-h SSH... coefficient. To statistically... FIG. 9. Sea surface height (m) for AVISO (a) 1 Aug, (b) 20 Aug, (c) 10 Sep, and (d) 30 Sep; for SSH1 experiment (e) 1

  17. Statistical Modeling for Radiation Hardness Assurance

    NASA Technical Reports Server (NTRS)

    Ladbury, Raymond L.

    2014-01-01

    We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: what models are used, what errors exist in real test data, and what the models allow us to say about the device under test (DUT). In addition, we cover how to use other sources of data, such as historical, heritage, and similar-part data, and how to apply experience, physics, and expert opinion to the analysis. Also included are the concepts of Bayesian statistics, data fitting, and bounding rates.

  18. Sensor Analytics: Radioactive gas Concentration Estimation and Error Propagation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, Dale N.; Fagan, Deborah K.; Suarez, Reynold

    2007-04-15

    This paper develops the mathematical statistics of a radioactive gas quantity measurement and the associated error propagation. The probabilistic development is a different approach to deriving attenuation equations and offers easy extensions to more complex gas analysis components through simulation. The mathematical development assumes a sequential process with three components: (I) the collection of an environmental sample, (II) extraction of the component gas from the sample through the application of gas separation chemistry, and (III) the estimation of the radioactivity of the component gases.
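
    Because the measurement is a chain of three sequential stages, the error propagation derived analytically in the paper can also be approximated by simulation, which is the kind of easy extension the authors mention. The following is a minimal sketch, not the paper's actual equations; all efficiencies, uncertainties, and count values are hypothetical placeholders.

      import numpy as np

      rng = np.random.default_rng(42)
      N = 100_000  # Monte Carlo draws

      # Hypothetical stage parameters (mean, standard deviation) for the
      # three sequential components of the measurement process.
      collection = rng.normal(0.85, 0.03, N)    # I: sample collection efficiency
      extraction = rng.normal(0.70, 0.05, N)    # II: gas separation efficiency
      counts     = rng.normal(1200.0, 40.0, N)  # III: net detector counts

      # Activity implied by each simulated realization of the chain.
      activity = counts / (collection * extraction)

      print(f"estimated activity: {activity.mean():.1f}")
      print(f"propagated standard error: {activity.std(ddof=1):.1f}")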

  19. Considerations in the design of large space structures

    NASA Technical Reports Server (NTRS)

    Hedgepeth, J. M.; Macneal, R. H.; Knapp, K.; Macgillivray, C. S.

    1981-01-01

    Several analytical studies of topics relevant to the design of large space structures are presented. Topics covered are: the types and quantitative evaluation of the disturbances to which large Earth-oriented microwave reflectors would be subjected, and the resulting attitude errors of such spacecraft; the influence of errors in the structural geometry on the performance of radio-frequency antennas; the effect of creasing on the flatness of a tensioned reflector membrane surface; and an analysis of the statistics of damage to truss-type structures due to meteoroids.

  20. Zonal average earth radiation budget measurements from satellites for climate studies

    NASA Technical Reports Server (NTRS)

    Ellis, J. S.; Haar, T. H. V.

    1976-01-01

    Data from 29 months of satellite radiation budget measurements, taken intermittently over the period 1964 through 1971, are composited into mean monthly, seasonal, and annual zonally averaged meridional profiles. The individual months that comprise the 29-month set were selected as representing the best available total flux data for compositing into large-scale statistics for climate studies. A discussion of the spatial resolution of the measurements is presented, along with an error analysis that includes both the uncertainty and the standard error of the mean.

  1. The vocabulary profile of Slovak children with primary language impairment compared to typically developing Slovak children measured by LITMUS-CLT.

    PubMed

    Kapalková, Svetlana; Slančová, Daniela

    2017-01-01

    This study compared a sample of children with primary language impairment (PLI) and typically developing age-matched children using the crosslinguistic lexical tasks (CLT-SK). We also compared the PLI children with typically developing younger children who were matched on the basis of receptive vocabulary. Overall, statistical testing showed that the vocabulary of the PLI children was significantly different from the vocabulary of the age-matched children, but not statistically different from that of the younger children who were matched on the basis of their receptive vocabulary size. Qualitative analysis of the correct answers revealed that the PLI children showed higher rigidity compared to the younger language-matched children, who were able to use more synonyms or derivations across word classes in naming tasks. Similarly, an examination of the children's naming errors indicated that the language-matched children exhibited more semantic errors, whereas the PLI children showed more associative errors.

  2. Estimating errors in least-squares fitting

    NASA Technical Reports Server (NTRS)

    Richter, P. H.

    1995-01-01

    While least-squares fitting procedures are commonly used in data analysis and are extensively discussed in the literature devoted to this subject, the proper assessment of errors resulting from such fits has received relatively little attention. The present work considers statistical errors in the fitted parameters, as well as in the values of the fitted function itself, resulting from random errors in the data. Expressions are derived for the standard error of the fit, as a function of the independent variable, for the general nonlinear and linear fitting problems. Additionally, closed-form expressions are derived for some examples commonly encountered in the scientific and engineering fields, namely ordinary polynomial and Gaussian fitting functions. These results have direct application to the assessment of antenna gain and system temperature characteristics, in addition to a broad range of problems in data analysis. The effects of the nature of the data and the choice of fitting function on the ability to accurately model the system under study are discussed, and some general rules are deduced to assist workers intent on maximizing the amount of information obtained from a given set of measurements.
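
    For the ordinary polynomial case, the parameter and fitted-function standard errors treated in the paper can be obtained numerically from the parameter covariance matrix. The sketch below uses synthetic data and NumPy's covariance output rather than the paper's closed-form expressions.

      import numpy as np

      rng = np.random.default_rng(0)
      x = np.linspace(0.0, 10.0, 50)
      y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, x.size)  # random errors in the data

      # Linear fit; cov=True also returns the parameter covariance matrix.
      coef, C = np.polyfit(x, y, deg=1, cov=True)
      print("parameters:", coef, "+/-", np.sqrt(np.diag(C)))

      # Standard error of the fitted function at each x:
      # var(f(x)) = g(x)^T C g(x), where g is the gradient of f with respect
      # to the parameters (here the Vandermonde row [x, 1]).
      G = np.vander(x, N=2)
      se_fit = np.sqrt(np.einsum("ij,jk,ik->i", G, C, G))
      print("standard error of the fit ranges from", se_fit.min(), "to", se_fit.max())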

  3. The (mis)reporting of statistical results in psychology journals.

    PubMed

    Bakker, Marjan; Wicherts, Jelte M

    2011-09-01

    In order to study the prevalence, nature (direction), and causes of reporting errors in psychology, we checked the consistency of reported test statistics, degrees of freedom, and p values in a random sample of high- and low-impact psychology journals. In a second study, we established the generality of reporting errors in a random sample of recent psychological articles. Our results, on the basis of 281 articles, indicate that around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result nonsignificant, or vice versa. These errors were often in line with researchers' expectations. We classified the most common errors and contacted authors to shed light on the origins of the errors.
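
    The consistency check at the heart of such studies is mechanical: recompute the p value from the reported test statistic and degrees of freedom, then compare it with the reported p value. A minimal sketch for t tests follows; the tolerance and rounding conventions are illustrative choices, not those of the study.

      from scipy import stats

      def check_t_report(t_value, df, reported_p, tol=1e-2):
          """Recompute a two-sided p value from a reported t and df, and
          flag the report if it disagrees with the reported p value."""
          recomputed = 2.0 * stats.t.sf(abs(t_value), df)
          return recomputed, abs(recomputed - reported_p) > tol

      # Example: t(28) = 2.20 gives p = .036, so a reported p = .01 is flagged.
      p, inconsistent = check_t_report(2.20, 28, 0.01)
      print(f"recomputed p = {p:.3f}, inconsistent: {inconsistent}")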

  4. Sources of Error and the Statistical Formulation of Ms:mb Seismic Event Screening Analysis

    NASA Astrophysics Data System (ADS)

    Anderson, D. N.; Patton, H. J.; Taylor, S. R.; Bonner, J. L.; Selby, N. D.

    2014-03-01

    The Comprehensive Nuclear-Test-Ban Treaty (CTBT), a global ban on nuclear explosions, is currently in a ratification phase. Under the CTBT, an International Monitoring System (IMS) of seismic, hydroacoustic, infrasonic and radionuclide sensors is operational, and the data from the IMS are analysed by the International Data Centre (IDC). The IDC provides CTBT signatories basic seismic event parameters and a screening analysis indicating whether an event exhibits explosion characteristics (for example, shallow depth). An important component of the screening analysis is a statistical test of the null hypothesis H0: explosion characteristics, using empirical measurements of seismic energy (magnitudes). The established magnitude used for event size is the body-wave magnitude (denoted mb), computed from the initial segment of a seismic waveform. IDC screening analysis is applied to events with mb greater than 3.5. The Rayleigh-wave magnitude (denoted Ms) is a measure of later-arriving surface wave energy. Magnitudes are measurements of seismic energy that include adjustments (a physical correction model) for path and distance effects between event and station. Relative to mb, earthquakes generally have a larger Ms magnitude than explosions. This article proposes a hypothesis test (screening analysis) using Ms and mb that expressly accounts for physical-correction-model inadequacy in the standard error of the test statistic. With this hypothesis test formulation, the 2009 Democratic People's Republic of Korea announced nuclear weapon test fails to reject the null hypothesis H0: explosion characteristics.
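
    The screening logic can be illustrated with a one-sided test on the Ms - mb difference in which the standard error carries both measurement scatter and correction-model inadequacy, the paper's central point. All numerical values below are illustrative placeholders, not the IDC's or the authors' calibrated parameters.

      from math import sqrt
      from scipy import stats

      def ms_mb_screen(ms, mb, offset=1.25, se_path=0.20, se_model=0.10, alpha=0.05):
          # Null hypothesis: 'explosion characteristics'; a large Ms - mb
          # favours an earthquake. The test's standard error combines
          # path/distance measurement error with correction-model inadequacy.
          se = sqrt(se_path**2 + se_model**2)
          z = (ms - mb + offset) / se
          p = stats.norm.sf(z)  # small p => screen the event out as an earthquake
          return z, p, p < alpha

      z, p, screened_out = ms_mb_screen(ms=3.6, mb=4.5)
      print(f"z = {z:.2f}, p = {p:.3f}, screened out: {screened_out}")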

  5. Bias correction of bounded location errors in presence-only data

    USGS Publications Warehouse

    Hefley, Trevor J.; Brost, Brian M.; Hooten, Mevin B.

    2017-01-01

    Location error occurs when the true location is different than the reported location. Because habitat characteristics at the true location may be different than those at the reported location, ignoring location error may lead to unreliable inference concerning species–habitat relationships. We explain how a transformation known in the spatial statistics literature as a change of support (COS) can be used to correct for location errors when the true locations are points with unknown coordinates contained within arbitrarily shaped polygons. We illustrate the flexibility of the COS by modelling the resource selection of Whooping Cranes (Grus americana) using citizen-contributed records with locations that were reported with error. We also illustrate the COS with a simulation experiment. In our analysis of Whooping Crane resource selection, we found that location error can result in up to a five-fold change in coefficient estimates. Our simulation study shows that location error can result in coefficient estimates that have the wrong sign, but a COS can efficiently correct for the bias.

  6. Robustness of Multiple Objective Decision Analysis Preference Functions

    DTIC Science & Technology

    2002-06-01

    p, p′: the probability of some event. p_i, q_i: the probability of event i. Π: an aggregation of proportional data used in calculating a test ...statistical tests of the significance of the term, and also is conducted in a multivariate framework rather than the ROSA univariate approach. ...the residual error is e = y − ŷ (45). The coefficient provides a ready indicator of the contribution for the associated variable, and statistical tests

  7. A Non-Intrusive Algorithm for Sensitivity Analysis of Chaotic Flow Simulations

    NASA Technical Reports Server (NTRS)

    Blonigan, Patrick J.; Wang, Qiqi; Nielsen, Eric J.; Diskin, Boris

    2017-01-01

    We demonstrate a novel algorithm for computing the sensitivity of statistics in chaotic flow simulations to parameter perturbations. The algorithm is non-intrusive but requires exposing an interface. Based on the principle of shadowing in dynamical systems, this algorithm is designed to reduce the effect of the sampling error in computing sensitivity of statistics in chaotic simulations. We compare the effectiveness of this method to that of the conventional finite difference method.

  8. Modeling and characterization of multipath in global navigation satellite system ranging signals

    NASA Astrophysics Data System (ADS)

    Weiss, Jan Peter

    The Global Positioning System (GPS) provides position, velocity, and time information to users anywhere near the earth, in real time and regardless of weather conditions. Since the system became operational, improvements in many areas have reduced the systematic errors affecting GPS measurements, such that multipath, defined as any signal taking a path other than the direct one, has become a significant, if not dominant, error source for many applications. This dissertation utilizes several approaches to characterize and model multipath errors in GPS measurements. Multipath errors in GPS ranging signals are characterized for several receiver systems and environments. Experimental P(Y) code multipath data are analyzed for ground stations with multipath levels ranging from minimal to severe, a C-12 turboprop, an F-18 jet, and an aircraft carrier. Comparisons between receivers utilizing single patch antennas and multi-element arrays are also made. In general, the results show significant reductions in multipath with antenna array processing, although large errors can occur even with this kind of equipment. Analysis of airborne platform multipath shows that the errors tend to be small in magnitude, because the size of the aircraft limits the geometric delay of multipath signals, and high in frequency, because aircraft dynamics cause rapid variations in geometric delay. A comprehensive multipath model is developed and validated. The model integrates 3D structure models, satellite ephemerides, electromagnetic ray-tracing algorithms, and detailed antenna and receiver models to predict multipath errors. Validation is performed by comparing experimental and simulated multipath via overall error statistics, per-satellite time histories, and frequency content analysis. The validation environments include two urban buildings, an F-18, an aircraft carrier, and a rural area where terrain multipath dominates. The validated models are used to identify multipath sources, characterize signal properties, evaluate additional antenna and receiver tracking configurations, and estimate the reflection coefficients of multipath-producing surfaces. Dynamic models for an F-18 landing on an aircraft carrier correlate aircraft dynamics to multipath frequency content; the model also characterizes the separate contributions of multipath due to the aircraft, ship, and ocean to the overall error statistics. Finally, reflection coefficients for multipath produced by terrain are estimated via a least-squares algorithm.

  9. Physics-based statistical learning approach to mesoscopic model selection.

    PubMed

    Taverniers, Søren; Haut, Terry S; Barros, Kipton; Alexander, Francis J; Lookman, Turab

    2015-11-01

    In materials science and many other research areas, models are frequently inferred without considering their generalization to unseen data. We apply statistical learning using cross-validation to obtain an optimally predictive coarse-grained description of a two-dimensional kinetic nearest-neighbor Ising model with Glauber dynamics (GD), based on the stochastic Ginzburg-Landau equation (sGLE). The latter is learned from GD "training" data using a log-likelihood analysis, and its predictive ability for various model complexities is tested on GD "test" data independent of the data used to train the model. Using two different error metrics, we perform a detailed analysis of the error between magnetization time trajectories simulated using the learned sGLE coarse-grained description and those obtained using the GD model. We show that both for equilibrium and out-of-equilibrium GD training trajectories, the standard phenomenological description using a quartic free energy does not always yield the most predictive coarse-grained model. Moreover, increasing the amount of training data can shift the optimal model complexity to higher values. Our results are promising in that they pave the way for the use of statistical learning as a general tool for materials modeling and discovery.
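
    The model-selection machinery here is ordinary k-fold cross-validation: the model is fit on part of the data and scored on the held-out part, so complexity is chosen by generalization error rather than training fit. A generic sketch follows, in which polynomial degree stands in for sGLE complexity; nothing below is specific to the Ising/sGLE setting.

      import numpy as np

      def cv_error(fit, predict, X, y, k=5, seed=0):
          """Mean held-out squared error over k folds."""
          idx = np.random.default_rng(seed).permutation(len(y))
          folds = np.array_split(idx, k)
          errs = []
          for i in range(k):
              test = folds[i]
              train = np.concatenate([folds[j] for j in range(k) if j != i])
              model = fit(X[train], y[train])
              errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
          return float(np.mean(errs))

      rng = np.random.default_rng(1)
      X = np.linspace(-1, 1, 80)
      y = X**3 - X + rng.normal(0, 0.1, X.size)
      for deg in (1, 3, 7):  # candidate model complexities
          err = cv_error(lambda a, b, d=deg: np.polyfit(a, b, d),
                         lambda m, a: np.polyval(m, a), X, y)
          print(f"degree {deg}: CV error {err:.4f}")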

  10. A heteroskedastic error covariance matrix estimator using a first-order conditional autoregressive Markov simulation for deriving asymptotically efficient estimates from ecologically sampled Anopheles arabiensis aquatic habitat covariates

    PubMed Central

    Jacob, Benjamin G; Griffith, Daniel A; Muturi, Ephantus J; Caamano, Erick X; Githure, John I; Novak, Robert J

    2009-01-01

    Background Autoregressive regression coefficients for Anopheles arabiensis aquatic habitat models are usually assessed using global error techniques and are reported as error covariance matrices. A global statistic, however, will summarize error estimates from multiple habitat locations. This makes it difficult to identify where there are clusters of An. arabiensis aquatic habitats of acceptable prediction. It is therefore useful to conduct some form of spatial error analysis to detect clusters of An. arabiensis aquatic habitats based on uncertainty residuals from individual sampled habitats. In this research, a method of error estimation for spatial simulation models was demonstrated using autocorrelation indices and eigenfunction spatial filters to distinguish among the effects of parameter uncertainty on a stochastic simulation of ecologically sampled Anopheles aquatic habitat covariates. A test for diagnostic checking of error residuals in an An. arabiensis aquatic habitat model may enable intervention efforts targeting productive habitat clusters, based on larval/pupal productivity, by using the asymptotic distribution of parameter estimates from a residual autocovariance matrix. The models considered in this research extend a normal regression analysis previously considered in the literature. Methods Field and remote-sampled data were collected from July 2006 to December 2007 in the Karima rice-village complex in Mwea, Kenya. SAS 9.1.4® was used to explore univariate statistics, correlations, and distributions, and to generate global autocorrelation statistics from the ecologically sampled datasets. A local autocorrelation index was also generated using spatial covariance parameters (i.e., Moran's Indices) in a SAS/GIS® database. The Moran's statistic was decomposed into orthogonal and uncorrelated synthetic map pattern components using a Poisson model with a gamma-distributed mean (i.e., negative binomial regression). The eigenfunction values from the spatial configuration matrices were then used to define expectations for prior distributions using a Markov chain Monte Carlo (MCMC) algorithm. A set of posterior means were defined in WinBUGS 1.4.3®. After the model had converged, samples from the conditional distributions were used to summarize the posterior distribution of the parameters. Thereafter, a spatial residual trend analysis was used to evaluate variance uncertainty propagation in the model using an autocovariance error matrix. Results By specifying coefficient estimates in a Bayesian framework, the covariate number of tillers was found to be a significant predictor, positively associated with An. arabiensis aquatic habitats. The spatial filter models accounted for approximately 19% redundant locational information in the ecologically sampled An. arabiensis aquatic habitat data. In the residual error estimation model there was significant positive autocorrelation (i.e., clustering of habitats in geographic space) based on log-transformed larval/pupal data and the sampled covariate depth of habitat. Conclusion An autocorrelation error covariance matrix and a spatial filter analysis can prioritize mosquito control strategies by providing a computationally attractive and feasible description of variance uncertainty estimates for correctly identifying clusters of prolific An. arabiensis aquatic habitats based on larval/pupal productivity. PMID:19772590
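
    The global autocorrelation indices referred to are variants of Moran's I, which measures whether residuals at nearby habitats are more similar than chance would allow. A compact sketch of the global statistic follows, with toy weights and values rather than the study's data.

      import numpy as np

      def morans_i(values, W):
          """Global Moran's I for values observed at n sites, with W an
          (n, n) spatial weights matrix having a zero diagonal."""
          z = values - values.mean()
          return (len(values) / W.sum()) * (z @ W @ z) / (z @ z)

      vals = np.array([2.0, 2.5, 7.0, 7.5])      # spatially clustered residuals
      W = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)  # neighbours on a line
      print(f"Moran's I = {morans_i(vals, W):.3f}")  # > 0 indicates clustering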

  11. Short-Term Load Forecasting Error Distributions and Implications for Renewable Integration Studies: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hodge, B. M.; Lew, D.; Milligan, M.

    2013-01-01

    Load forecasting in the day-ahead timescale is a critical aspect of power system operations that is used in the unit commitment process. It is also an important factor in renewable energy integration studies, where the combination of load and wind or solar forecasting techniques creates the net load uncertainty that must be managed by the economic dispatch process or with suitable reserves. An understanding of the load forecasting errors that may be expected in this process can lead to better decisions about the amount of reserves necessary to compensate for errors. In this work, we performed a statistical analysis of the day-ahead (and two-day-ahead) load forecasting errors observed in two independent system operators for a one-year period. Comparisons were made with the normal distribution commonly assumed in power system operation simulations used for renewable power integration studies. Further analysis identified time periods when the load is more likely to be under- or overforecast.
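
    The comparison against the commonly assumed normal distribution amounts to fitting a normal to the observed forecast errors and testing the fit. A sketch follows, with synthetic heavy-tailed errors standing in for the ISO data; the t-distributed sample is only a stand-in.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(7)
      errors = rng.standard_t(df=4, size=365) * 150.0  # stand-in forecast errors, MW

      mu, sd = errors.mean(), errors.std(ddof=1)
      print(f"excess kurtosis: {stats.kurtosis(errors):.2f} (0 for a normal)")

      # Kolmogorov-Smirnov test of the fitted normal against the sample;
      # a small p value indicates the normal assumption is a poor fit.
      stat, p = stats.kstest(errors, "norm", args=(mu, sd))
      print(f"KS statistic {stat:.3f}, p = {p:.3g}")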

  12. Error correction and diversity analysis of population mixtures determined by NGS

    PubMed Central

    Burroughs, Nigel J.; Evans, David J.; Ryabov, Eugene V.

    2014-01-01

    The impetus for this work was the need to analyse nucleotide diversity in a viral mix taken from honeybees. The paper has two findings. First, a method for correction of next generation sequencing error in the distribution of nucleotides at a site is developed. Second, a package of methods for assessment of nucleotide diversity is assembled. The error correction method is statistically based and works at the level of the nucleotide distribution rather than the level of individual nucleotides. The method relies on an error model and a sample of known viral genotypes that is used for model calibration. A compendium of existing and new diversity analysis tools is also presented, allowing hypotheses about diversity and mean diversity to be tested and associated confidence intervals to be calculated. The methods are illustrated using honeybee viral samples. Software in both Excel and Matlab and a guide are available at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/, the Warwick University Systems Biology Centre software download site. PMID:25405074

  13. Performance Analysis of an Inter-Relay Co-operation in FSO Communication System

    NASA Astrophysics Data System (ADS)

    Khanna, Himanshu; Aggarwal, Mona; Ahuja, Swaran

    2018-04-01

    In this work, we analyze the outage and error performance of a one-way inter-relay-assisted free-space optical link. The analysis assumes the absence of a direct link between the source and destination nodes, and the feasibility of such a system configuration is studied. We consider the influence of path loss, atmospheric turbulence, and pointing-error impairments, and investigate the effect of these parameters on the system performance. The turbulence-induced fading is modeled by independent but not necessarily identically distributed gamma-gamma fading statistics. Closed-form expressions for the outage probability and the probability of error are derived and illustrated by numerical plots. It is concluded that the absence of a line-of-sight path between the source and destination nodes does not lead to significant performance degradation. Moreover, for the system model under consideration, interconnected relaying provides better error performance than non-interconnected relaying and dual-hop serial relaying techniques.

  14. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation

    NASA Astrophysics Data System (ADS)

    Reis, D. S.; Stedinger, J. R.; Martins, E. S.

    2005-10-01

    This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
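
    In the Tasker-Stedinger form of GLS that this work extends, the regional model is y = Xβ + δ + ε, with independent model error δ and sampling error ε. A standard formulation, sketched here for orientation rather than quoted from the paper, is

      \Lambda(\sigma_\delta^2) = \sigma_\delta^2 I + \Sigma, \qquad
      \hat{\beta} = \left( X^\top \Lambda^{-1} X \right)^{-1} X^\top \Lambda^{-1} y,

    where σ_δ² is the model error variance and Σ is the sampling covariance of the at-site estimators. The Bayesian extension places a prior on σ_δ² and computes its marginal posterior quasi-analytically, instead of relying on a point estimate of a quantity that is poorly determined when it is small relative to sampling error.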

  15. Adaptive Error Estimation in Linearized Ocean General Circulation Models

    NASA Technical Reports Server (NTRS)

    Chechelnitsky, Michael Y.

    1999-01-01

    Data assimilation methods are routinely used in oceanography. The statistics of the model and measurement errors need to be specified a priori. This study addresses the problem of estimating model and measurement error statistics from observations. We start by testing innovation-based methods of adaptive error estimation with low-dimensional models in the North Pacific (5-60 deg N, 132-252 deg E), using TOPEX/POSEIDON (T/P) sea level anomaly data, acoustic tomography data from the ATOC project, and the MIT General Circulation Model (GCM). A reduced-state linear model that describes large-scale internal (baroclinic) error dynamics is used. The methods are shown to be sensitive to the initial guess for the error statistics and the type of observations. A new off-line approach is developed, the covariance matching approach (CMA), where covariance matrices of model-data residuals are "matched" to their theoretical expectations using familiar least squares methods. This method uses observations directly instead of the innovations sequence and is shown to be related to the MT method and the method of Fu et al. (1993). Twin experiments using the same linearized MIT GCM suggest that altimetric data are ill-suited to the estimation of internal GCM errors, but that such estimates can in theory be obtained using acoustic data. The CMA is then applied to T/P sea level anomaly data and a linearization of a global GFDL GCM which uses two vertical modes. We show that the CMA method can be used with a global model and a global data set, and that the estimates of the error statistics are robust. We show that the fraction of the GCM-T/P residual variance explained by the model error is larger than that derived in Fukumori et al. (1999) with the method of Fu et al. (1993). Most of the model error is explained by the barotropic mode. However, we find that the impact of the change in the error statistics on the data assimilation estimates is very small. This is explained by the large representation error, i.e., the dominance of the mesoscale eddies in the T/P signal, which are not part of the 2° by 1° GCM. Therefore, the impact of the observations on the assimilation is very small even after the adjustment of the error statistics. This work demonstrates that simultaneous estimation of the model and measurement error statistics for data assimilation with global ocean data sets and linearized GCMs is possible. However, the error covariance estimation problem is in general highly underdetermined, much more so than the state estimation problem. In other words, there exists a very large number of statistical models that can be made consistent with the available data. Therefore, methods for obtaining quantitative error estimates, powerful though they may be, cannot replace physical insight. Used in the right context, as a tool for guiding the choice of a small number of model error parameters, covariance matching can be a useful addition to the repertory of tools available to oceanographers.

  16. Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.

    PubMed

    Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia

    2017-06-01

    Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effects of noise and bias error on the use of CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data at different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to the quantitative analysis of methane gas spectra, and of methane/toluene gas mixture spectra, as measured using FT-IR spectrometry with CLS, WLS, and SWLS. The standard error of prediction (SEP), bias of prediction (bias), and residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In the methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In the methane/toluene mixture analysis, a modification of the SWLS is presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases, but not the lowest bias and RSS. The modification of SWLS reduced the bias and showed a lower RSS than CLS, especially for small components.
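
    One way to picture the SWLS idea: keep unit weights (CLS behaviour) at wavenumbers below the absorbance threshold, where baseline bias dominates, and inverse-variance weights (WLS behaviour) above it, where noise dominates. The sketch below is a schematic reading of the method with invented inputs, not the authors' implementation.

      import numpy as np

      def weighted_lstsq(A, y, w):
          # Minimize sum_i w_i * (y_i - (A @ x)_i)^2 via row scaling.
          sw = np.sqrt(w)
          x, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
          return x

      def swls(A, y, noise_var, threshold):
          """A: (n_wavenumbers, n_components) reference absorbances;
          y: measured spectrum; noise_var: per-wavenumber noise variance."""
          absorbance = A.sum(axis=1)
          w = np.where(absorbance < threshold, 1.0, 1.0 / noise_var)
          return weighted_lstsq(A, y, w)

      # Toy two-component system with noise that grows with absorbance.
      rng = np.random.default_rng(3)
      A = np.abs(rng.normal(0.5, 0.3, (200, 2)))
      noise_var = 0.0005 + 0.002 * A.sum(axis=1)
      y = A @ np.array([0.7, 0.3]) + rng.normal(0.0, np.sqrt(noise_var))
      print(swls(A, y, noise_var, threshold=1.0))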

  17. Combined proportional and additive residual error models in population pharmacokinetic modelling.

    PubMed

    Proost, Johannes H

    2017-11-15

    In pharmacokinetic modelling, a combined proportional and additive residual error model is often preferred over a proportional or additive residual error model. Different approaches have been proposed, but a comparison between approaches is still lacking. The theoretical background of the methods is described. Method VAR assumes that the variance of the residual error is the sum of the statistically independent proportional and additive components; this method can be coded in three ways. Method SD assumes that the standard deviation of the residual error is the sum of the proportional and additive components. Using datasets from the literature and simulations based on these datasets, the methods are compared using NONMEM. The different codings of method VAR yield identical results. Using method SD, the values of the parameters describing the residual error are lower than for method VAR, but the values of the structural parameters and their inter-individual variability are hardly affected by the choice of the method. Both methods are valid approaches to combined proportional and additive residual error modelling, and selection may be based on OFV. When the result of an analysis is used for simulation purposes, it is essential that the simulation tool uses the same method as used during analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
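
    The two formulations are easy to state side by side. With f the model prediction, a the additive and b the proportional standard deviation, method VAR adds variances while method SD adds standard deviations. A minimal sketch of the residual error magnitude under each (parameter values invented):

      import numpy as np

      def sd_method_var(f, a, b):
          """Method VAR: independent components, variances add:
          Var = a^2 + (b * f)^2."""
          return np.sqrt(a**2 + (b * f) ** 2)

      def sd_method_sd(f, a, b):
          """Method SD: standard deviations add: SD = a + b * f."""
          return a + b * f

      f = np.array([0.1, 1.0, 10.0])  # model-predicted observations
      print(sd_method_var(f, a=0.05, b=0.15))
      print(sd_method_sd(f, a=0.05, b=0.15))

    The closing note about simulation matters because the two methods imply different error magnitudes at the same parameter values, so simulating with the formula from the other method distorts the spread of the simulated observations.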

  18. Analysis of Medication Error Reports

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whitney, Paul D.; Young, Jonathan; Santell, John

    In medicine, as in many areas of research, technological innovation and the shift from paper-based information to electronic records has created a climate of ever increasing availability of raw data. There has been, however, a corresponding lag in our abilities to analyze this overwhelming mass of data, and classic forms of statistical analysis may not allow researchers to interact with data in the most productive way. This is true in the emerging area of patient safety improvement. Traditionally, a majority of the analysis of error and incident reports has been carried out based on an approach of data comparison, and starts with a specific question which needs to be answered. Newer data analysis tools have been developed which allow the researcher to not only ask specific questions but also to "mine" data: approach an area of interest without preconceived questions, and explore the information dynamically, allowing questions to be formulated based on patterns brought up by the data itself. Since 1991, United States Pharmacopeia (USP) has been collecting data on medication errors through voluntary reporting programs. USP's MEDMARX(SM) reporting program is the largest national medication error database and currently contains well over 600,000 records. Traditionally, USP has conducted an annual quantitative analysis of data derived from "pick-lists" (i.e., items selected from a list of items) without an in-depth analysis of free-text fields. In this paper, the application of text analysis and data analysis tools used by Battelle to analyze the medication error reports already analyzed in the traditional way by USP is described. New insights and findings were revealed, including the value of language normalization and the distribution of error incidents by day of the week. The motivation for this effort is to gain additional insight into the nature of medication errors to support improvements in medication safety.

  19. Low power and type II errors in recent ophthalmology research.

    PubMed

    Khan, Zainab; Milko, Jordan; Iqbal, Munir; Masri, Moness; Almeida, David R P

    2016-10-01

    To investigate the power of unpaired t tests in prospective, randomized controlled trials when these tests failed to detect a statistically significant difference and to determine the frequency of type II errors. Systematic review and meta-analysis. We examined all prospective, randomized controlled trials published between 2010 and 2012 in 4 major ophthalmology journals (Archives of Ophthalmology, British Journal of Ophthalmology, Ophthalmology, and American Journal of Ophthalmology). Studies that used unpaired t tests were included. Power was calculated using the number of subjects in each group, standard deviations, and α = 0.05. The difference between control and experimental means was set to be (1) 20% and (2) 50% of the absolute value of the control's initial conditions. Power and Precision version 4.0 software was used to carry out calculations. Finally, the proportion of articles with type II errors was calculated. β = 0.3 was set as the largest acceptable value for the probability of type II errors. In total, 280 articles were screened. Final analysis included 50 prospective, randomized controlled trials using unpaired t tests. The median power of tests to detect a 50% difference between means was 0.9 and was the same for all 4 journals regardless of the statistical significance of the test. The median power of tests to detect a 20% difference between means ranged from 0.26 to 0.9 for the 4 journals. The median power of these tests to detect a 50% and 20% difference between means was 0.9 and 0.5 for tests that did not achieve statistical significance. A total of 14% and 57% of articles with negative unpaired t tests contained results with β > 0.3 when power was calculated for differences between means of 50% and 20%, respectively. A large portion of studies demonstrate high probabilities of type II errors when detecting small differences between means. The power to detect small differences between means varies across journals. It is, therefore, worthwhile for authors to mention the minimum clinically important difference for individual studies. Journals can consider publishing statistical guidelines for authors to use. Day-to-day clinical decisions rely heavily on the evidence base formed by the plethora of studies available to clinicians. Prospective, randomized controlled clinical trials are highly regarded as a robust study design and are used to make important clinical decisions that directly affect patient care. The quality of study designs and statistical methods in major clinical journals is improving over time [1], and researchers and journals are being more attentive to the statistical methodologies incorporated by studies. The results of well-designed ophthalmic studies with robust methodologies, therefore, have the ability to modify the ways in which diseases are managed. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
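
    Post-hoc power for an unpaired t test can be reproduced with standard tools once the hypothesized difference is converted to a standardized effect size (Cohen's d = difference between means / pooled SD). A sketch using statsmodels follows; the sample size and effect sizes are illustrative, not taken from the reviewed trials.

      from statsmodels.stats.power import TTestIndPower

      analysis = TTestIndPower()
      for d in (0.2, 0.5, 0.8):  # small, medium, large standardized effects
          power = analysis.power(effect_size=d, nobs1=30, alpha=0.05)
          print(f"d = {d}: power = {power:.2f}, beta = {1 - power:.2f}")

    With 30 subjects per group, only the large effect clears the β = 0.3 criterion used in the review, illustrating how negative trials can carry high type II error rates for small differences between means.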

  20. Evaluating Neurotoxicity of a Mixture of Five OP Pesticides Using a Composite Score

    EPA Science Inventory

    The evaluation of the cumulative effects of neurotoxic pesticides often involves the analysis of both neurochemical and behavioral endpoints. Multiple statistical tests on many endpoints can greatly inflate Type I error rates. Multiple comparison adjustments are often overly con...

  1. Principal component analysis and neurocomputing-based models for total ozone concentration over different urban regions of India

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi

    2012-07-01

    The present study deals with daily total ozone concentration time series over four metro cities of India, namely Kolkata, Mumbai, Chennai, and New Delhi, in a multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration is suitable for principal component analysis. Subsequently, by introducing a rotated component matrix for the principal components, the predictors suitable for generating an artificial neural network (ANN) for daily total ozone prediction are identified; the multicollinearity is removed in this way. ANN models in the form of multilayer perceptrons trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics such as Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.

  2. EvolQG - An R package for evolutionary quantitative genetics

    PubMed Central

    Melo, Diogo; Garcia, Guilherme; Hubbe, Alex; Assis, Ana Paula; Marroig, Gabriel

    2016-01-01

    We present an open source package for performing evolutionary quantitative genetics analyses in the R environment for statistical computing. Evolutionary theory shows that evolution depends critically on the available variation in a given population. When dealing with many quantitative traits this variation is expressed in the form of a covariance matrix, particularly the additive genetic covariance matrix or sometimes the phenotypic matrix, when the genetic matrix is unavailable and there is evidence the phenotypic matrix is sufficiently similar to the genetic matrix. Given this mathematical representation of available variation, the EvolQG package provides functions for calculation of relevant evolutionary statistics; estimation of sampling error; corrections for this error; matrix comparison via correlations, distances and matrix decomposition; analysis of modularity patterns; and functions for testing evolutionary hypotheses on taxa diversification. PMID:27785352

  3. Discovering Randomness, Recovering Expertise: The Different Approaches to the Quality in Measurement of Coulomb and Gauss and of Today's Students

    ERIC Educational Resources Information Center

    Heinicke, Susanne; Heering, Peter

    2013-01-01

    The aim of this paper is to discuss different approaches to the quality (or uncertainty) of measurement data considering both historical examples and today's students' views. Today's teaching of data analysis is very much focussed on the application of statistical routines (often called the "Gaussian approach" to error analysis). Studies on…

  4. Consequences of Assumption Violations Revisited: A Quantitative Review of Alternatives to the One-Way Analysis of Variance "F" Test.

    ERIC Educational Resources Information Center

    Lix, Lisa M.; And Others

    1996-01-01

    Meta-analytic techniques were used to summarize the statistical robustness literature on Type I error properties of alternatives to the one-way analysis of variance "F" test. The James (1951) and Welch (1951) tests performed best under violations of the variance homogeneity assumption, although their use is not always appropriate. (SLD)

  5. On the Choice of Variable for Atmospheric Moisture Analysis

    NASA Technical Reports Server (NTRS)

    Dee, Dick P.; DaSilva, Arlindo M.; Atlas, Robert (Technical Monitor)

    2002-01-01

    The implications of using different control variables for the analysis of moisture observations in a global atmospheric data assimilation system are investigated. A moisture analysis based on either mixing ratio or specific humidity is prone to large extrapolation errors, due to the high variability in space and time of these parameters and to the difficulties in modeling their error covariances. Using the logarithm of specific humidity does not alleviate these problems, and has the further disadvantage that very dry background estimates cannot be effectively corrected by observations. Relative humidity is a better choice from a statistical point of view, because this field is spatially and temporally more coherent and error statistics are therefore easier to obtain. If, however, the analysis is designed to preserve relative humidity in the absence of moisture observations, then the analyzed specific humidity field depends entirely on analyzed temperature changes. If the model has a cool bias in the stratosphere this will lead to an unstable accumulation of excess moisture there. A pseudo-relative humidity can be defined by scaling the mixing ratio by the background saturation mixing ratio. A univariate pseudo-relative humidity analysis will preserve the specific humidity field in the absence of moisture observations. A pseudo-relative humidity analysis is shown to be equivalent to a mixing ratio analysis with flow-dependent covariances. In the presence of multivariate (temperature-moisture) observations it produces analyzed relative humidity values that are nearly identical to those produced by a relative humidity analysis. Based on a time series analysis of radiosonde observed-minus-background differences, it appears to be more justifiable to neglect specific humidity-temperature correlations (in a univariate pseudo-relative humidity analysis) than to neglect relative humidity-temperature correlations (in a univariate relative humidity analysis). A pseudo-relative humidity analysis is easily implemented in an existing moisture analysis system, by simply scaling observed-minus-background moisture residuals prior to solving the analysis equation, and rescaling the analyzed increments afterward.
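
    The scaling step is simple enough to sketch. Assuming a Magnus-type saturation formula (an illustrative choice; operational systems use their own moist thermodynamics), the pseudo-relative humidity transform and its inverse are:

      import numpy as np

      def saturation_mixing_ratio(T, p):
          """Approximate saturation mixing ratio (kg/kg) from temperature (K)
          and pressure (Pa), using a Magnus-type vapor pressure formula."""
          T_c = T - 273.15
          e_s = 610.94 * np.exp(17.625 * T_c / (T_c + 243.04))  # Pa
          return 0.622 * e_s / (p - e_s)

      def to_pseudo_rh(q_residual, T_bg, p):
          # Scale observed-minus-background mixing-ratio residuals by the
          # background saturation mixing ratio before the analysis solve.
          return q_residual / saturation_mixing_ratio(T_bg, p)

      def from_pseudo_rh(increment, T_bg, p):
          # Rescale analyzed increments back to mixing ratio afterward.
          return increment * saturation_mixing_ratio(T_bg, p)

    Because the scaling uses the background (not analyzed) saturation value, the transform leaves specific humidity unchanged when there are no moisture observations, which is the property the authors highlight.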

  6. Voxel-based statistical analysis of uncertainties associated with deformable image registration

    NASA Astrophysics Data System (ADS)

    Li, Shunshan; Glide-Hurst, Carri; Lu, Mei; Kim, Jinkoo; Wen, Ning; Adams, Jeffrey N.; Gordon, James; Chetty, Indrin J.; Zhong, Hualiang

    2013-09-01

    Deformable image registration (DIR) algorithms have inherent uncertainties in their displacement vector fields (DVFs). The purpose of this study is to develop an optimal metric to estimate DIR uncertainties. Six computational phantoms were developed from the CT images of lung cancer patients using a finite element method (FEM). The FEM-generated DVFs were used as a standard for registrations performed on each of these phantoms. A mechanics-based metric, unbalanced energy (UE), was developed to evaluate these registration DVFs. The potential correlation between UE and DIR errors was explored using multivariate analysis, and the results were validated by a landmark approach and compared with two other error metrics: DVF inverse consistency (IC) and image intensity difference (ID). Landmark-based validation was performed using the POPI model. The results show that the Pearson correlation coefficient between UE and DIR error is r_UE-error = 0.50. This is higher than r_IC-error = 0.29 for IC and DIR error and r_ID-error = 0.37 for ID and DIR error. The Pearson correlation coefficient between UE and the product of the DIR displacements and errors is r_UE-(error x DVF) = 0.62 for the six patients and 0.73 for the POPI-model data. It has been demonstrated that UE has a strong correlation with DIR errors, and the UE metric outperforms the IC and ID metrics in estimating DIR uncertainties. The quantified UE metric can be a useful tool for adaptive treatment strategies, including probability-based adaptive treatment planning.

  7. Can Family Planning Service Statistics Be Used to Track Population-Level Outcomes?

    PubMed

    Magnani, Robert J; Ross, John; Williamson, Jessica; Weinberger, Michelle

    2018-03-21

    The need for annual family planning program tracking data under the Family Planning 2020 (FP2020) initiative has contributed to renewed interest in family planning service statistics as a potential data source for annual estimates of the modern contraceptive prevalence rate (mCPR). We sought to assess (1) how well a set of commonly recorded data elements in routine service statistics systems could, with some fairly simple adjustments, track key population-level outcome indicators, and (2) whether some data elements performed better than others. We used data from 22 countries in Africa and Asia to analyze 3 data elements collected from service statistics: (1) number of contraceptive commodities distributed to clients, (2) number of family planning service visits, and (3) number of current contraceptive users. Data quality was assessed via analysis of mean square errors, using the United Nations Population Division World Contraceptive Use annual mCPR estimates as the "gold standard." We also examined the magnitude of several components of measurement error: (1) variance, (2) level bias, and (3) slope (or trend) bias. Our results indicate modest levels of tracking error for data on commodities to clients (7%) and service visits (10%), and somewhat higher error rates for data on current users (19%). Variance and slope bias were relatively small for all data elements. Level bias was by far the largest contributor to tracking error. Paired comparisons of data elements in countries that collected at least 2 of the 3 data elements indicated a modest advantage of data on commodities to clients. None of the data elements considered was sufficiently accurate to be used to produce reliable stand-alone annual estimates of mCPR. However, the relatively low levels of variance and slope bias indicate that trends calculated from these 3 data elements can be productively used in conjunction with the Family Planning Estimation Tool (FPET) currently used to produce annual mCPR tracking estimates for FP2020. © Magnani et al.
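
    The error decomposition used here can be sketched as follows: difference each annual service-statistics estimate from the "gold standard" series, then split the differences into a level bias (mean offset), a slope bias (trend in the offset), and residual variance. The code below is an illustrative decomposition with invented numbers, not the authors' exact computation.

      import numpy as np

      def tracking_error_components(estimates, gold):
          d = np.asarray(estimates, float) - np.asarray(gold, float)
          t = np.arange(len(d))
          slope, intercept = np.polyfit(t, d, 1)   # trend in the differences
          resid = d - (slope * t + intercept)
          return {"level_bias": d.mean(),          # mean offset from gold standard
                  "slope_bias": slope,             # drift per year
                  "variance": resid.var(ddof=1),   # scatter around the trend
                  "rmse": float(np.sqrt((d ** 2).mean()))}

      print(tracking_error_components(estimates=[20.1, 21.0, 22.4, 23.1],
                                      gold=[19.5, 20.8, 21.5, 22.6]))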

  8. Quantum error-correction failure distributions: Comparison of coherent and stochastic error models

    NASA Astrophysics Data System (ADS)

    Barnes, Jeff P.; Trout, Colin J.; Lucarelli, Dennis; Clader, B. D.

    2017-06-01

    We compare failure distributions of quantum error correction circuits for stochastic errors and coherent errors. We utilize a fully coherent simulation of a fault-tolerant quantum error correcting circuit for a d = 3 Steane and surface code. We find that the output distributions are markedly different for the two error models, showing that no simple mapping between the two error models exists. Coherent errors create very broad and heavy-tailed failure distributions. This suggests that they are susceptible to outlier events and that mean statistics, such as pseudothreshold estimates, may not provide the key figure of merit. This provides further statistical insight into why coherent errors can be so harmful for quantum error correction. These output probability distributions may also provide a useful metric that can be utilized when optimizing quantum error correcting codes and decoding procedures for purely coherent errors.

  9. Sampling Errors in Monthly Rainfall Totals for TRMM and SSM/I, Based on Statistics of Retrieved Rain Rates and Simple Models

    NASA Technical Reports Server (NTRS)

    Bell, Thomas L.; Kundu, Prasun K.; Einaudi, Franco (Technical Monitor)

    2000-01-01

    Estimates from TRMM satellite data of monthly total rainfall over an area are subject to substantial sampling errors due to the limited number of visits to the area by the satellite during the month. Quantitative comparisons of TRMM averages with data collected by other satellites and by ground-based systems require some estimate of the size of this sampling error. A method of estimating this sampling error based on the actual statistics of the TRMM observations and on some modeling work has been developed. "Sampling error" in TRMM monthly averages is defined here relative to the monthly total a hypothetical satellite permanently stationed above the area would have reported. "Sampling error" therefore includes contributions from the random and systematic errors introduced by the satellite remote sensing system. As part of our long-term goal of providing error estimates for each grid point accessible to the TRMM instruments, sampling error estimates for TRMM based on rain retrievals from TRMM microwave (TMI) data are compared for different times of the year and different oceanic areas (to minimize changes in the statistics due to algorithmic differences over land and ocean). Changes in sampling error estimates due to changes in rain statistics arising 1) from evolution of the official algorithms used to process the data, and 2) from differences relative to other remote sensing systems, such as the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave/Imager (SSM/I), are analyzed.

  10. Explanation of Two Anomalous Results in Statistical Mediation Analysis.

    PubMed

    Fritz, Matthew S; Taylor, Aaron B; Mackinnon, David P

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
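
    For readers unfamiliar with the procedure under study, the bias-corrected bootstrap for the mediated effect ab works roughly as follows. This is a bare-bones sketch of the standard algorithm, without the acceleration term; variable names and data are invented.

      import numpy as np
      from scipy import stats

      def bc_bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
          """Bias-corrected bootstrap CI for a*b in the model X -> M -> Y."""
          rng = np.random.default_rng(seed)

          def ab(xi, mi, yi):
              a = np.polyfit(xi, mi, 1)[0]                  # M regressed on X
              D = np.column_stack([xi, mi, np.ones_like(xi)])
              b = np.linalg.lstsq(D, yi, rcond=None)[0][1]  # Y on M, given X
              return a * b

          est = ab(x, m, y)
          n = len(x)
          boots = np.empty(n_boot)
          for i in range(n_boot):
              idx = rng.integers(0, n, n)                   # resample cases
              boots[i] = ab(x[idx], m[idx], y[idx])

          z0 = stats.norm.ppf((boots < est).mean())         # bias correction
          zlo, zhi = stats.norm.ppf([alpha / 2, 1 - alpha / 2])
          lo = np.quantile(boots, stats.norm.cdf(2 * z0 + zlo))
          hi = np.quantile(boots, stats.norm.cdf(2 * z0 + zhi))
          return est, (lo, hi)

      rng = np.random.default_rng(1)
      x = rng.normal(size=100)
      m = 0.4 * x + rng.normal(size=100)
      y = 0.3 * m + rng.normal(size=100)
      print(bc_bootstrap_ci(x, m, y))

    The bias-correction factor z0 shifts the percentile endpoints; when the bootstrap distribution is asymmetric this shift is what buys power, and it is also implicated in the elevated Type I error rates the simulations document for small samples.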

  11. A novel measure and significance testing in data analysis of cell image segmentation.

    PubMed

    Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L

    2017-03-14

    Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing the CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, computing the standard errors (SE) of the measures and their correlation coefficient is not described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER) by taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by the pairwise comparisons of MERs using 106 manually segmented ground-truth cells with different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of TER are computed based on the SE of MER that is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct the hypothesis testing, while the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. A novel measure TER of CIS is proposed. The TER's SEs and correlation coefficient are computed. Thereafter, CIS algorithms can be evaluated and compared statistically by conducting the significance testing.
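
    The aggregation step behind the TER is a size-weighted average of per-cell error rates, which is what lets large cells dominate the risk assessment. A short sketch with invented values:

      import numpy as np

      def total_error_rate(mers, cell_sizes):
          """TER: per-cell misclassification error rates weighted by cell size."""
          mers, w = np.asarray(mers, float), np.asarray(cell_sizes, float)
          return float((mers * w).sum() / w.sum())

      # The large cell's MER dominates the aggregate.
      print(total_error_rate(mers=[0.02, 0.10, 0.05],
                             cell_sizes=[5000, 800, 1200]))  # ~0.034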

  12. Risk factors for refractive errors in primary school children (6-12 years old) in Nakhon Pathom Province.

    PubMed

    Yingyong, Penpimol

    2010-11-01

    Refractive error is one of the leading causes of visual impairment in children. An analysis of risk factors for refractive error is required to reduce and prevent this common eye disease. To identify the risk factors associated with refractive errors in primary school children (6-12 years old) in Nakhon Pathom province, a population-based cross-sectional analytic study was conducted between October 2008 and September 2009 in Nakhon Pathom. Refractive error, parental refractive status, and hours per week of near activities (studying, reading books, watching television, playing with video games, or working on the computer) were assessed in 377 children who participated in this study. The most common type of refractive error in primary school children was myopia. Myopic children were more likely to have parents with myopia. Children with myopia spent more time on near activities. The multivariate odds ratio (95% confidence interval) for two myopic parents was 6.37 (2.26-17.78) and for each diopter-hour per week of near work was 1.019 (1.005-1.033). Multivariate logistic regression models showed no confounding effects between parental myopia and near work, suggesting that each factor has an independent association with myopia. Statistical analysis by logistic regression revealed that family history of refractive error and hours of near work were significantly associated with refractive error in primary school children.

  13. Improving the Glucose Meter Error Grid With the Taguchi Loss Function.

    PubMed

    Krouwer, Jan S

    2016-07-01

    Glucose meters often have similar performance when compared by error grid analysis. This is one reason that other statistics such as mean absolute relative deviation (MARD) are used to further differentiate performance. The problem with MARD is that too much information is lost. But additional information is available within the A zone of an error grid by using the Taguchi loss function. Applying the Taguchi loss function gives each glucose meter difference from reference a value ranging from 0 (no error) to 1 (error reaches the A zone limit). Values are averaged over all data which provides an indication of risk of an incorrect medical decision. This allows one to differentiate glucose meter performance for the common case where meters have a high percentage of values in the A zone and no values beyond the B zone. Examples are provided using simulated data. © 2015 Diabetes Technology Society.
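
    The Taguchi loss function is quadratic in the deviation, so a difference halfway to the A-zone boundary scores 0.25 rather than 0.5, penalizing larger in-zone errors disproportionately. A sketch of the idea follows; the flat 15% boundary is a placeholder, since the real grid's A-zone limits depend on the glucose level.

      def taguchi_loss(meter, reference, a_zone_limit_pct=15.0):
          """0 at no error, rising quadratically to 1 at the A-zone boundary."""
          rel_err_pct = abs(meter - reference) / reference * 100.0
          return min(1.0, (rel_err_pct / a_zone_limit_pct) ** 2)

      # Averaging the losses over paired meter/reference readings differentiates
      # meters even when every point falls inside the A zone.
      pairs = [(102, 100), (95, 100), (112, 100)]
      print(sum(taguchi_loss(m, r) for m, r in pairs) / len(pairs))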

  14. Biostatistical analysis of quantitative immunofluorescence microscopy images.

    PubMed

    Giles, C; Albrecht, M A; Lam, V; Takechi, R; Mamo, J C

    2016-12-01

    Semiquantitative immunofluorescence microscopy has become a key methodology in biomedical research. Typical statistical workflows are considered in the context of avoiding pseudo-replication and marginalising experimental error. However, immunofluorescence microscopy naturally generates hierarchically structured data that can be leveraged to improve statistical power and enrich biological interpretation. Herein, we describe a robust distribution fitting procedure and compare several statistical tests, outlining their potential advantages/disadvantages in the context of biological interpretation. Further, we describe tractable procedures for power analysis that incorporate the underlying distribution, sample size, and number of images captured per sample. The procedures outlined have significant potential for increasing understanding of biological processes and decreasing both ethical and financial burden through experimental optimization. © 2016 The Authors Journal of Microscopy © 2016 Royal Microscopical Society.

  15. Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity

    PubMed Central

    Beasley, T. Mark

    2013-01-01

    Increasing the correlation between the independent variable and the mediator (the a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard error of product tests increases. The variance inflation due to increases in a at some point outweighs the increase in the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
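
    A sketch illustrating the phenomenon on synthetic data: as the a path grows (with standardized variables), the bootstrap distribution of the indirect effect ab widens. The model, sample size, and coefficients are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def boot_sd_ab(a, b, n=100, n_boot=2000):
    """Bootstrap SD of the indirect effect a*b in a simple X -> M -> Y model."""
    x = rng.normal(size=n)
    m = a * x + np.sqrt(max(1.0 - a**2, 1e-9)) * rng.normal(size=n)  # var(M) kept at 1
    y = b * m + rng.normal(size=n)
    ab = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)
        xi, mi, yi = x[idx], m[idx], y[idx]
        a_hat = np.polyfit(xi, mi, 1)[0]
        # b path estimated controlling for X, as in standard mediation models
        b_hat = np.linalg.lstsq(np.column_stack([mi, xi, np.ones(n)]), yi, rcond=None)[0][0]
        ab[i] = a_hat * b_hat
    return ab.std()

for a in (0.3, 0.6, 0.9):   # raising a raises X-M collinearity
    print(a, round(boot_sd_ab(a, b=0.3), 4))
```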

  16. Visualization and statistical comparisons of microbial communities using R packages on Phylochip data.

    PubMed

    Holmes, Susan; Alekseyenko, Alexander; Timme, Alden; Nelson, Tyrrell; Pasricha, Pankaj Jay; Spormann, Alfred

    2011-01-01

    This article explains the statistical and computational methodology used to analyze species abundances collected using the LBNL PhyloChip in a study of Irritable Bowel Syndrome (IBS) in rats. Some tools already available for the analysis of ordinary microarray data are useful in this type of statistical analysis. For instance, in correcting for multiple testing we use family-wise error rate control and step-down tests (available in the multtest package). Once the most significant species are chosen, we use the hypergeometric tests familiar from testing GO categories to test specific phyla and families. We provide examples of normalization, multivariate projections, batch effect detection and integration of phylogenetic covariation, as well as tree equalization and robustification methods.
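
    A sketch of the two testing steps described, using Python equivalents in place of the R multtest machinery: Holm's step-down procedure for family-wise error rate control over per-species p-values, then a hypergeometric test for enrichment of a phylum among the significant species. All counts are invented.

```python
from scipy.stats import hypergeom
from statsmodels.stats.multitest import multipletests

# Step-down FWER control (Holm) over per-species p-values
pvals = [0.0001, 0.004, 0.03, 0.2, 0.6]
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print(reject, p_adj)

# Hypergeometric enrichment: of N_total species on the chip, K belong to a phylum;
# n_sig were called significant, k of those fall in the phylum.
N_total, K, n_sig, k = 8000, 400, 120, 18
p_enrich = hypergeom.sf(k - 1, N_total, K, n_sig)   # P(X >= k)
print(p_enrich)
```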

  17. Evaluating measurement models in clinical research: covariance structure analysis of latent variable models of self-conception.

    PubMed

    Hoyle, R H

    1991-02-01

    Indirect measures of psychological constructs are vital to clinical research. On occasion, however, the meaning of indirect measures of psychological constructs is obfuscated by statistical procedures that do not account for the complex relations between items and latent variables and among latent variables. Covariance structure analysis (CSA) is a statistical procedure for testing hypotheses about the relations among items that indirectly measure a psychological construct and relations among psychological constructs. This article introduces clinical researchers to the strengths and limitations of CSA as a statistical procedure for conceiving and testing structural hypotheses that are not tested adequately with other statistical procedures. The article is organized around two empirical examples that illustrate the use of CSA for evaluating measurement models with correlated error terms, higher-order factors, and measured and latent variables.

  18. The statistical evaluation of duct tape end match as physical evidence

    NASA Astrophysics Data System (ADS)

    Chan, Ka Lok

    Duct tapes are often submitted to crime laboratories as evidence associated with abductions, homicides, or the construction of explosive devices. As a result, trace evidence examiners are often asked to analyze and compare commercial duct tapes so that they can establish possible evidentiary links. Duct tape end matches are believed to be the strongest association between exemplar and questioned samples because they are considered evidence with unique individual characteristics. While end match analysis and comparison have long been undertaken by trace evidence examiners, there is a significant lack of scientific research for associating two or more segments of duct tape. This study is designed to obtain statistical inferences on the uniqueness of duct tape tears. Three experiments were devised to compile the basis for a statistical assessment of the probability of duct tape end matches along with a proposed error rate. In one experiment, we conducted the equivalent of 10,000 end match examinations with an error rate of 0%. In the second experiment, we performed 2,704 end match examinations, again with a 0% error rate. In the third experiment, using duct tape torn by an Elmendorf Tear tester, we conducted 576 end match examinations with an error rate of 0%, with all samples correctly associated. The results of this study indicate that end matches are distinguishable both within a single roll of duct tape and between two different rolls having very similar surface features and weave patterns.
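
    With zero observed errors, the examinations bound the true error rate rather than establishing it as exactly zero. A sketch of the exact one-sided (Clopper-Pearson) upper bound, which for zero errors in n trials reduces to 1 - alpha^(1/n) (the familiar "rule of three" approximation gives roughly 3/n):

```python
def upper_bound_zero_errors(n, confidence=0.95):
    """Exact one-sided upper confidence bound on the error rate after 0 errors in n trials."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)

for n in (10_000, 2_704, 576):   # examination counts from the three experiments
    print(n, upper_bound_zero_errors(n))
```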

  19. Using meta-information of a posteriori Bayesian solutions of the hypocentre location task for improving accuracy of location error estimation

    NASA Astrophysics Data System (ADS)

    Debski, Wojciech

    2015-06-01

    The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analysed in many branches of physics, including seismology and oceanology. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. Although estimating the location of earthquake foci is relatively simple, quantitatively estimating the location accuracy is a genuinely challenging task, even when a probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling and a priori uncertainties. In this paper, we address this task for the case when the statistics of observational and/or modelling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we propose an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on the uncertainties of the solution found.
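
    A sketch of the meta-characteristic described: Shannon's entropy of a discretized a posteriori probability distribution, here over a hypothetical 3-D location grid. A sharply peaked posterior (well-constrained location) yields lower entropy than a broad one.

```python
import numpy as np

def shannon_entropy(posterior):
    """Shannon entropy (in nats) of a discretized posterior distribution."""
    p = np.asarray(posterior, dtype=float).ravel()
    p = p / p.sum()        # normalize to a probability mass function
    p = p[p > 0.0]         # terms with p = 0 contribute nothing
    return -np.sum(p * np.log(p))

peaked = np.zeros((20, 20, 20)); peaked[10, 10, 10] = 1.0
broad = np.ones((20, 20, 20))
print(shannon_entropy(peaked), shannon_entropy(broad))
```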

  20. Implementation of neural network for color properties of polycarbonates

    NASA Astrophysics Data System (ADS)

    Saeed, U.; Ahmad, S.; Alsadi, J.; Ross, D.; Rizvi, G.

    2014-05-01

    In the present paper, the applicability of artificial neural networks (ANN) is investigated for the color properties of plastics. The neural networks toolbox of Matlab 6.5 is used to develop and test the ANN model on a personal computer. An optimal design is completed for 10, 12, 14, 16, 18, and 20 hidden neurons on a single hidden layer with five different algorithms: batch gradient descent (GD), batch variable learning rate (GDX), resilient back-propagation (RP), scaled conjugate gradient (SCG), and Levenberg-Marquardt (LM) in the feed-forward back-propagation neural network model. The training data for the ANN are obtained from experimental measurements. There were twenty-two inputs, including resins, additives and pigments, while the three tristimulus color values L*, a* and b* were used as the output layer. Statistical analysis in terms of root-mean-square (RMS) error, absolute fraction of variance (R squared), and mean square error is used to investigate the performance of the ANN. The LM algorithm with fourteen neurons on the hidden layer in the feed-forward back-propagation ANN model showed the best results in the present study. The degree of accuracy of the ANN model in reducing errors proved acceptable in all statistical analyses. It is concluded that the ANN provides a feasible method for error reduction in specific color tristimulus values.
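
    A sketch of the performance statistics named, for arrays of measured and ANN-predicted tristimulus values; "absolute fraction of variance" is taken here to be the usual R-squared definition, and the data are invented.

```python
import numpy as np

def performance(measured, predicted):
    resid = measured - predicted
    mse = np.mean(resid**2)                         # mean square error
    rms = np.sqrt(mse)                              # root-mean-square error
    r2 = 1.0 - resid.dot(resid) / np.sum((measured - measured.mean())**2)
    return rms, r2, mse

measured = np.array([52.1, 48.7, 50.3, 49.9])    # e.g. L* values
predicted = np.array([51.8, 49.1, 50.0, 50.4])   # ANN outputs
print(performance(measured, predicted))
```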

  1. Robust Mean and Covariance Structure Analysis through Iteratively Reweighted Least Squares.

    ERIC Educational Resources Information Center

    Yuan, Ke-Hai; Bentler, Peter M.

    2000-01-01

    Adapts robust schemes to mean and covariance structures, providing an iteratively reweighted least squares approach to robust structural equation modeling. Each case is weighted according to its distance, based on first and second order moments. Test statistics and standard error estimators are given. (SLD)

  2. Repeat-aware modeling and correction of short read errors.

    PubMed

    Yang, Xiao; Aluru, Srinivas; Dorman, Karin S

    2011-02-15

    High-throughput short read sequencing is revolutionizing genomics and systems biology research by enabling cost-effective deep coverage sequencing of genomes and transcriptomes. Error detection and correction are crucial to many short read sequencing applications including de novo genome sequencing, genome resequencing, and digital gene expression analysis. Short read error detection is typically carried out by counting the observed frequencies of k-mers in reads and validating those with frequencies exceeding a threshold. In the case of genomes with high repeat content, an erroneous k-mer may be frequently observed if it has few nucleotide differences from valid k-mers with multiple occurrences in the genome. Error detection and correction have mostly been applied to genomes with low repeat content, and this remains a challenging problem for genomes with high repeat content. We develop a statistical model and a computational method for error detection and correction in the presence of genomic repeats. We propose a method to infer genomic frequencies of k-mers from their observed frequencies by analyzing the misread relationships among observed k-mers. We also propose a method to estimate the threshold useful for validating k-mers whose estimated genomic frequency exceeds the threshold. We demonstrate that superior error detection is achieved using these methods. Furthermore, we break away from the common assumption of uniformly distributed errors within a read, and provide a framework to model position-dependent error occurrence frequencies common to many short read platforms. Lastly, we achieve better error correction in genomes with high repeat content. The software is implemented in C++ and is freely available under the GNU GPL3 license and the Boost Software V1.0 license at http://aluru-sun.ece.iastate.edu/doku.php?id=redeem. We introduce a statistical framework to model sequencing errors in next-generation reads, which led to promising results in detecting and correcting errors for genomes with high repeat content.
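
    A sketch of the naive frequency-threshold step that the paper improves on: count observed k-mers across reads and flag those below a validation threshold as likely errors. The value of k and the threshold are illustrative; the paper's contribution is inferring genomic frequencies when repeats make this rule unreliable.

```python
from collections import Counter

def kmer_counts(reads, k):
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

reads = ["ACGTACGTAC", "ACGTACGTAG", "ACGTACGTAC"]
counts = kmer_counts(reads, k=5)
threshold = 2   # k-mers seen fewer times than this are treated as suspect
suspect = sorted(kmer for kmer, c in counts.items() if c < threshold)
print(suspect)  # in a repeat-rich genome this naive rule misclassifies; hence the model
```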

  3. Error tolerance analysis of wave diagnostic based on coherent modulation imaging in high power laser system

    NASA Astrophysics Data System (ADS)

    Pan, Xingchen; Liu, Cheng; Zhu, Jianqiang

    2018-02-01

    Coherent modulation imaging, which provides fast convergence and high resolution from a single diffraction pattern, is a promising technique for meeting the urgent demand for on-line multiple-parameter diagnostics with a single setup in high power laser facilities (HPLF). However, the influence of noise on the final calculated parameters has not yet been investigated. Based on a series of simulations with twenty different sampling beams generated from the practical parameters and performance of the HPLF, a quantitative analysis of statistical results was carried out for five different error sources. We found that detector background noise and high quantization error seriously affect the final accuracy, and that different parameters have different sensitivities to different noise sources. The simulation results and the corresponding analysis indicate directions for further improving the final accuracy of parameter diagnostics, which is critically important for formal application in the daily routines of the HPLF.

  4. Type I error probability spending for post-market drug and vaccine safety surveillance with binomial data.

    PubMed

    Silva, Ivair R

    2018-01-15

    Type I error probability spending functions are commonly used for designing sequential analysis of binomial data in clinical trials, and they are also quickly emerging for near-continuous sequential analysis in post-market drug and vaccine safety surveillance. It is well known that, for clinical trials, it is still important to minimize the sample size when the null hypothesis is not rejected; in post-market drug and vaccine safety surveillance, that is not the case. In post-market safety surveillance, especially when the surveillance involves identification of potential signals, the meaningful statistical performance measure to be minimized is the expected sample size when the null hypothesis is rejected. The present paper shows that, instead of the convex Type I error spending shape conventionally used in clinical trials, a concave shape is more suitable for post-market drug and vaccine safety surveillance. This is shown for both continuous and group sequential analysis. Copyright © 2017 John Wiley & Sons, Ltd.
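
    A sketch of a power-family Type I error spending function, alpha(t) = alpha * t^rho: rho > 1 gives the convex shape conventional in clinical trials, rho < 1 the concave shape argued for here. The rho values are illustrative, not taken from the paper.

```python
import numpy as np

def alpha_spending(t, alpha=0.05, rho=2.0):
    """Cumulative Type I error spent by information fraction t in [0, 1]."""
    t = np.clip(np.asarray(t, dtype=float), 0.0, 1.0)
    return alpha * t**rho

t = np.linspace(0.0, 1.0, 6)
print(alpha_spending(t, rho=2.0))   # convex: little alpha spent early
print(alpha_spending(t, rho=0.5))   # concave: alpha spent aggressively early
```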

  5. Emergency department discharge prescription errors in an academic medical center

    PubMed Central

    Belanger, April; Devine, Lauren T.; Lane, Aaron; Condren, Michelle E.

    2017-01-01

    This study described discharge prescription medication errors written for emergency department patients. This study used content analysis in a cross-sectional design to systematically categorize prescription errors found in a report of 1000 discharge prescriptions submitted in the electronic medical record in February 2015. Two pharmacy team members reviewed the discharge prescription list for errors. Open-ended data were coded by an additional rater for agreement on coding categories. Coding was based upon majority rule. Descriptive statistics were used to address the study objective. Categories evaluated were patient age, provider type, drug class, and type and time of error. The discharge prescription error rate out of 1000 prescriptions was 13.4%, with “incomplete or inadequate prescription” being the most commonly detected error (58.2%). The adult and pediatric error rates were 11.7% and 22.7%, respectively. The antibiotics reviewed had the highest number of errors. The highest within-class error rates were with antianginal medications, antiparasitic medications, antacids, appetite stimulants, and probiotics. Emergency medicine residents wrote the highest percentage of prescriptions (46.7%) and had an error rate of 9.2%. Residents of other specialties wrote 340 prescriptions and had an error rate of 20.9%. Errors occurred most often between 10:00 am and 6:00 pm. PMID:28405061

  6. Multi-year objective analyses of warm season ground-level ozone and PM2.5 over North America using real-time observations and Canadian operational air quality models

    NASA Astrophysics Data System (ADS)

    Robichaud, A.; Ménard, R.

    2013-05-01

    We present multi-year objective analyses (OA) at high spatio-temporal resolution (15 or 21 km, every hour) for the warm season period (1 May-31 October) for ground-level ozone (2002-2012) and for fine particulate matter of diameter less than 2.5 microns (PM2.5) (2004-2012). The OA used here combines the Canadian Air Quality forecast suite with US and Canadian surface air quality monitoring sites. The analysis is based on an optimal interpolation with capabilities for adaptive error statistics for ozone and PM2.5 and an explicit bias correction scheme for the PM2.5 analyses. The error statistics have been estimated using a modified version of the Hollingsworth-Lönnberg (H-L) method. Various quality controls (gross error check, sudden jump test and background check) have been applied to the observations to remove outliers. An additional quality control checks the consistency of the error statistics estimation model at each observing station and for each hour. The error statistics are further tuned "on the fly" using a χ2 (chi-square) diagnostic, a procedure which yields significantly better verification than without tuning. Successful cross-validation experiments were performed with an OA set-up using 90% of observations to build the objective analysis, with the remainder left out as an independent set of data for verification purposes. Furthermore, comparisons with other external sources of information (global models and PM2.5 satellite-derived surface measurements) show reasonable agreement. The multi-year analyses obtained provide relatively high precision, with an absolute yearly averaged systematic error of less than 0.6 ppbv (parts per billion by volume) for ozone and 0.7 μg m-3 (micrograms per cubic meter) for PM2.5, and a random error generally less than 9 ppbv for ozone and under 12 μg m-3 for PM2.5. In this paper, we focus on two applications: (1) presenting long-term averages of the objective analyses and analysis increments as a form of summer climatology, and (2) analyzing long-term (decadal) trends and inter-annual fluctuations using OA outputs. Our results show that the high percentiles of both ozone and PM2.5 follow an overall decreasing trend in North America, with the eastern part of the United States (US) showing the largest decrease, likely due to more effective pollution controls. Some locations, however, exhibited an increasing trend in mean ozone and PM2.5, such as the northwestern part of North America (northwest US and Alberta). The low percentiles are generally rising for ozone, which may be linked to increasing emissions from emerging countries and the resulting pollution brought by intercontinental transport. After removing the decadal trend, we demonstrate that the inter-annual fluctuations of the high percentiles are significantly correlated with temperature fluctuations for ozone and precipitation fluctuations for PM2.5. We also show a moderately significant correlation between the inter-annual fluctuations of the high percentiles of ozone and PM2.5 and economic indices such as the Dow Jones Industrial Average and/or the US gross domestic product growth rate.
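
    A sketch of the kind of chi-square diagnostic used for on-the-fly tuning, reduced to a single scalar observation-error variance: the assumed variance is rescaled until the normalized innovation statistic averages one. The one-parameter simplification and all numbers are mine, not the paper's scheme.

```python
import numpy as np

rng = np.random.default_rng(5)
innov = rng.normal(0.0, 3.0, 1000)   # synthetic innovations (observation minus background)
sigma_b2 = 4.0                       # assumed background-error variance
sigma_o2 = 1.0                       # first guess for observation-error variance

for _ in range(20):
    chi2 = np.mean(innov**2 / (sigma_b2 + sigma_o2))  # ~1 when variances are consistent
    sigma_o2 *= chi2                                  # rescale toward consistency
print(sigma_o2, np.mean(innov**2 / (sigma_b2 + sigma_o2)))
```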

  7. Accounting for rate instability and spatial patterns in the boundary analysis of cancer mortality maps

    PubMed Central

    Goovaerts, Pierre

    2006-01-01

    Boundary analysis of cancer maps may highlight areas where causative exposures change through geographic space, the presence of local populations with distinct cancer incidences, or the impact of different cancer control methods. Too often, such analysis ignores the spatial pattern of incidence or mortality rates and overlooks the fact that rates computed from sparsely populated geographic entities can be very unreliable. This paper proposes a new methodology that accounts for the uncertainty and spatial correlation of rate data in the detection of significant edges between adjacent entities or polygons. Poisson kriging is first used to estimate the risk value and the associated standard error within each polygon, accounting for the population size and the risk semivariogram computed from raw rates. The boundary statistic is then defined as half the absolute difference between kriged risks. Its reference distribution, under the null hypothesis of no boundary, is derived through the generation of multiple realizations of the spatial distribution of cancer risk values. This paper presents three types of neutral models generated using methods of increasing complexity: a common random shuffle of the estimated risk values, a spatial re-ordering of these risks, or p-field simulation that accounts for the population size within each polygon. The approach is illustrated using age-adjusted pancreatic cancer mortality rates for white females in 295 US counties of the Northeast (1970-1994). Simulation studies demonstrate that Poisson kriging yields more accurate estimates of the cancer risk and of how its value changes between polygons (i.e. the boundary statistic), relative to the use of raw rates or the local empirical Bayes smoother. When used in conjunction with spatial neutral models generated by p-field simulation, the boundary analysis based on Poisson kriging estimates minimizes the proportion of type I errors (i.e. edges wrongly declared significant), while the frequency of these errors is predicted well by the p-value of the statistical test. PMID:19023455
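
    A sketch of the simplest neutral model described, the random shuffle: the boundary statistic (half the absolute difference between kriged risks of adjacent polygons) is compared to its distribution under random permutations of the risk values. Risks and adjacencies are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

def boundary_stats(risks, edges):
    """Half the absolute risk difference across each pair of adjacent polygons."""
    return np.array([0.5 * abs(risks[i] - risks[j]) for i, j in edges])

risks = np.array([4.1, 4.3, 6.8, 4.0, 4.2])        # kriged risks per polygon (invented)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]   # polygon adjacency

observed = boundary_stats(risks, edges)
null = np.array([boundary_stats(rng.permutation(risks), edges) for _ in range(5000)])
print((null >= observed).mean(axis=0))   # one-sided p-value per candidate boundary
```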

  8. Statistical methods for astronomical data with upper limits. I - Univariate distributions

    NASA Technical Reports Server (NTRS)

    Feigelson, E. D.; Nelson, P. I.

    1985-01-01

    The statistical treatment of univariate censored data is discussed. A heuristic derivation of the Kaplan-Meier maximum-likelihood estimator from first principles is presented which results in an expression amenable to analytic error analysis. Methods for comparing two or more censored samples are given along with simple computational examples, stressing the fact that most astronomical problems involve upper limits while the standard mathematical methods require lower limits. The application of univariate survival analysis to six data sets in the recent astrophysical literature is described, and various aspects of the use of survival analysis in astronomy, such as the limitations of various two-sample tests and the role of parametric modelling, are discussed.

  9. A neural network for real-time retrievals of PWV and LWP from Arctic millimeter-wave ground-based observations.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cadeddu, M. P.; Turner, D. D.; Liljegren, J. C.

    2009-07-01

    This paper presents a new neural network (NN) algorithm for real-time retrievals of low amounts of precipitable water vapor (PWV) and integrated liquid water from millimeter-wave ground-based observations. Measurements are collected by the 183.3-GHz G-band vapor radiometer (GVR) operating at the Atmospheric Radiation Measurement (ARM) Program Climate Research Facility, Barrow, AK. The NN provides the means to explore the nonlinear regime of the measurements and investigate the physical boundaries of the operability of the instrument. A methodology to compute individual error bars associated with the NN output is developed, and a detailed error analysis of the network output is provided. Through the error analysis, it is possible to isolate several components contributing to the overall retrieval errors and to analyze the dependence of the errors on the inputs. The network outputs and associated errors are then compared with results from a physical retrieval and with the ARM two-channel microwave radiometer (MWR) statistical retrieval. When the NN is trained with a seasonal training data set, the retrievals of water vapor yield results that are comparable to those obtained from a traditional physical retrieval, with a retrieval error percentage of approximately 5% when the PWV is between 2 and 10 mm, but with the advantages that the NN algorithm does not require vertical profiles of temperature and humidity as input and is significantly faster computationally. Liquid water path (LWP) retrievals from the NN have a significantly improved clear-sky bias (mean of approximately 2.4 g/m²) and a retrieval error varying from 1 to about 10 g/m² when the PWV amount is between 1 and 10 mm. As an independent validation of the LWP retrieval, the longwave downwelling surface flux was computed and compared with observations. The comparison shows a significant improvement with respect to the MWR statistical retrievals, particularly for LWP amounts of less than 60 g/m².

  10. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.

  12. Evaluation of trauma care using TRISS method: the role of adjusted misclassification rate and adjusted w-statistic.

    PubMed

    Llullaku, Sadik S; Hyseni, Nexhmi Sh; Bytyçi, Cen I; Rexhepi, Sylejman K

    2009-01-15

    Major trauma is a leading cause of death worldwide. Evaluation of trauma care using the Trauma and Injury Severity Score (TRISS) method focuses on trauma outcome (deaths and survivors). The TRISS misclassification rate is used for testing the TRISS method, and by calculating the w-statistic, the difference between observed and TRISS-expected survivors, we compare our trauma care results with the TRISS standard. The aim of this study is to analyze the interaction between the misclassification rate and the w-statistic and to adjust these parameters to be closer to the truth. We analyze the components of the TRISS misclassification rate and w-statistic against actual trauma outcome. The false negative (FN) component (deaths unexpected by the TRISS method) has two parts: preventable (Pd) and non-preventable (nonPd) trauma deaths. Pd represents inappropriate trauma care by an institution, whereas non-preventable trauma deaths represent errors in the TRISS method. Removing patients with preventable trauma deaths gives an adjusted misclassification rate: (FP + FN - Pd)/N, or (b + c - Pd)/N. Subtracting nonPd from the FN value in the w-statistic formula gives an adjusted w-statistic: [FP - (FN - nonPd)]/N, which equals (FP - Pd)/N, or (b - Pd)/N. Because the adjusted formulas clean the method of inappropriate trauma care, and clean trauma care of the method's error, the TRISS adjusted misclassification rate and adjusted w-statistic give more realistic results and may be used in research on trauma outcome.
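
    A sketch of the adjusted formulas with invented counts; b and c denote the off-diagonal cells of the 2x2 classification table, i.e., FP and FN.

```python
def adjusted_triss(FP, FN, Pd, N):
    """Adjusted misclassification rate and adjusted w-statistic.

    FP: unexpected survivors, FN: unexpected deaths (by TRISS),
    Pd: preventable deaths among the FN, N: total number of patients.
    """
    non_pd = FN - Pd                      # deaths attributable to TRISS model error
    adj_misclass = (FP + FN - Pd) / N     # removes inappropriate-care deaths from the rate
    adj_w = (FP - (FN - non_pd)) / N      # algebraically equal to (FP - Pd) / N
    return adj_misclass, adj_w

print(adjusted_triss(FP=12, FN=9, Pd=4, N=300))   # counts are invented
```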

  13. Resting-state fMRI data reflects default network activity rather than null data: A defense of commonly employed methods to correct for multiple comparisons.

    PubMed

    Slotnick, Scott D

    2017-07-01

    Analysis of functional magnetic resonance imaging (fMRI) data typically involves over one hundred thousand independent statistical tests; therefore, it is necessary to correct for multiple comparisons to control familywise error. In a recent paper, Eklund, Nichols, and Knutsson used resting-state fMRI data to evaluate commonly employed methods to correct for multiple comparisons and reported unacceptable rates of familywise error. Eklund et al.'s analysis was based on the assumption that resting-state fMRI data reflect null data; however, their 'null data' actually reflected default network activity that inflated familywise error. As such, Eklund et al.'s results provide no basis to question the validity of the thousands of published fMRI studies that have corrected for multiple comparisons or the commonly employed methods to correct for multiple comparisons.

  14. Theoretical Aspects of the Patterns Recognition Statistical Theory Used for Developing the Diagnosis Algorithms for Complicated Technical Systems

    NASA Astrophysics Data System (ADS)

    Obozov, A. A.; Serpik, I. N.; Mihalchenko, G. S.; Fedyaeva, G. A.

    2017-01-01

    In this article, the problem of applying pattern recognition (a relatively young area of engineering cybernetics) to the analysis of complicated technical systems is examined. It is shown that the application of a statistical approach to hard-to-distinguish situations could be the most effective. The different recognition algorithms are based on the Bayes approach, which estimates the a posteriori probabilities of a certain event and an assumed error. Application of the statistical approach to pattern recognition is possible for solving the problem of technical diagnosis of complicated systems, particularly large high-powered marine diesel engines.

  15. Peripheral refraction and image blur in four meridians in emmetropes and myopes.

    PubMed

    Shen, Jie; Spors, Frank; Egan, Donald; Liu, Chunming

    2018-01-01

    The peripheral refractive error of the human eye has been hypothesized to be a major stimulus for the development of its central refractive error. The purpose of this study was to investigate the changes in peripheral refractive error across the horizontal, vertical and two diagonal meridians in emmetropic and low, moderate and high myopic adults. Thirty-four adult subjects were recruited and aberrations were measured using a modified commercial aberrometer. We then computed the refractive error in power vector notation from second-order Zernike terms. Statistical analysis was performed to evaluate the differences in refractive error profiles between the subject groups and across all measured visual field meridians. Small amounts of relative myopic shift were observed in emmetropic and low myopic subjects. However, moderate and high myopic subjects exhibited a relative hyperopic shift in all four meridians. The astigmatism components J0 and J45 showed quadratic or linear changes depending on the visual field meridian. Peripheral sphero-cylindrical retinal image blur increased in emmetropic eyes in most of the measured visual fields. The findings indicate an overall emmetropic or slightly relative myopic periphery (spherical or oblate retinal shape) in emmetropes and low myopes, while moderate and high myopes form a relative hyperopic periphery (prolate, or less oblate, retinal shape). In general, human emmetropic eyes demonstrate a higher amount of peripheral retinal image blur.

  16. Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis

    ERIC Educational Resources Information Center

    Marin-Martinez, Fulgencio; Sanchez-Meca, Julio

    2010-01-01

    Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
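
    A sketch of the basic fixed-effect form of the scheme the abstract starts from: each study's effect size is weighted by the inverse of its (estimated, hence noisy) variance, and the pooled standard error follows from the summed weights. All numbers are invented.

```python
import numpy as np

def inverse_variance_pooled(effects, variances):
    """Pooled effect size and standard error under inverse-variance weighting."""
    w = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(w * effects) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return pooled, se

effects = np.array([0.42, 0.31, 0.55, 0.10])     # per-study effect sizes
variances = np.array([0.04, 0.09, 0.02, 0.12])   # estimated sampling variances
print(inverse_variance_pooled(effects, variances))
```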

  17. The microcomputer scientific software series 3: general linear model--analysis of variance.

    Treesearch

    Harold M. Rauscher

    1985-01-01

    A BASIC language set of programs, designed for use on microcomputers, is presented. This set of programs will perform the analysis of variance for any statistical model describing either balanced or unbalanced designs. The program computes and displays the degrees of freedom, Type I sum of squares, and the mean square for the overall model, the error, and each factor...

  18. Intelligence/Electronic Warfare (IEW) Direction-Finding and Fix Estimation Analysis Report. Volume 2. Trailblazer

    DTIC Science & Technology

    1985-12-20

    [Garbled report documentation page; recoverable keywords: Fix Estimation, Statistical Assumptions, Error Budget, Unmodeled Errors, Coding. The algorithms analyzed are used in other current IEW systems; the report examines their underlying statistical assumptions.]

  19. The statistical analysis of global climate change studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hardin, J.W.

    1992-01-01

    The focus of this work is to contribute to the enhancement of the relationship between climatologists and statisticians. The analysis of global change data has been underway for many years by atmospheric scientists. Much of this analysis includes a heavy reliance on statistics and statistical inference. Some specific climatological analyses are presented and the dependence on statistics is documented before the analysis is undertaken. The first problem presented involves the fluctuation-dissipation theorem and its application to global climate models. This problem has a sound theoretical niche in the literature of both climate modeling and physics, but a statistical analysis in which the data are obtained from the model to show the relationship graphically has not been undertaken. It is under this motivation that the author presents this problem. A second problem, concerning the standard errors in estimating global temperatures, is purely statistical in nature, although very little material exists for sampling on such a frame. This problem has not only climatological and statistical ramifications, but political ones as well. It is planned to use these results in a further analysis of global warming using actual data collected on the earth. In order to simplify the analysis of these problems, the development of a computer program, MISHA, is presented. This interactive program contains many of the routines, functions, graphics, and map projections needed by the climatologist in order to effectively enter the arena of data visualization.

  20. Quantitative comparison of tympanic membrane displacements using two optical methods to recover the optical phase

    NASA Astrophysics Data System (ADS)

    Santiago-Lona, Cynthia V.; Hernández-Montes, María del Socorro; Mendoza-Santoyo, Fernando; Esquivel-Tejeda, Jesús

    2018-02-01

    The study and quantification of tympanic membrane (TM) displacements add important information to advance knowledge about the hearing process. A comparative statistical analysis between two commonly used demodulation methods employed to recover the optical phase in digital holographic interferometry, namely the fast Fourier transform and phase-shifting interferometry, is presented as applied to the study of thin tissues such as the TM. The resulting experimental TM surface displacement data are used to contrast both methods through analysis of variance and F tests. Data are gathered when the TMs are excited with continuous sound stimuli at levels of 86, 89 and 93 dB SPL for frequencies of 800, 1300 and 2500 Hz under the same experimental conditions. The statistical analysis shows repeatability in z-direction displacements with standard deviations of 0.086, 0.098 and 0.080 μm using the Fourier method, and 0.080, 0.104 and 0.055 μm with the phase-shifting method, at a 95% confidence level for all frequencies. Precision and accuracy are evaluated by means of the coefficient of variation; the results are 0.06143, 0.06125 and 0.06154 with the Fourier method and 0.06154, 0.06118 and 0.06111 with phase-shifting. The relative error between the two methods is 7.143, 6.250 and 30.769%. On comparing the measured displacements, the results indicate that there is no statistically significant difference between the two methods for frequencies of 800 and 1300 Hz; however, errors and other statistics increase at 2500 Hz.

  1. LV software support for supersonic flow analysis

    NASA Technical Reports Server (NTRS)

    Bell, W. A.; Lepicovsky, J.

    1992-01-01

    The software for configuring an LV counter processor system has been developed using structured design. The LV system includes up to three counter processors and a rotary encoder. The software for configuring and testing the LV system has been developed, tested, and included in an overall software package for data acquisition, analysis, and reduction. Error handling routines respond to both operator and instrument errors which often arise in the course of measuring complex, high-speed flows. The use of networking capabilities greatly facilitates the software development process by allowing software development and testing from a remote site. In addition, high-speed transfers allow graphics files or commands to provide viewing of the data from a remote site. Further advances in data analysis require corresponding advances in procedures for statistical and time series analysis of nonuniformly sampled data.

  3. Application of the Auto-Tuned Land Assimilation System (ATLAS) to ASCAT and SMOS soil moisture retrieval products

    USDA-ARS?s Scientific Manuscript database

    Land data assimilation is typically based on highly uncertain assumptions regarding the statistical structure of observation and modeling errors. Left uncorrected, poor assumptions can degrade the quality of the analysis products generated by land data assimilation systems. Recently, Crow and van de...

  4. Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity

    ERIC Educational Resources Information Center

    Beasley, T. Mark

    2014-01-01

    Increasing the correlation between the independent variable and the mediator ("a" coefficient) increases the effect size ("ab") for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard error of product tests increase. The variance inflation caused by…

  5. Multilevel Analysis of Structural Equation Models via the EM Algorithm.

    ERIC Educational Resources Information Center

    Jo, See-Heyon

    The question of how to analyze unbalanced hierarchical data generated from structural equation models has been a common problem for researchers and analysts. Among difficulties plaguing statistical modeling are estimation bias due to measurement error and the estimation of the effects of the individual's hierarchical social milieu. This paper…

  6. Errors in logic and statistics plague a meta-analysis

    USDA-ARS?s Scientific Manuscript database

    The non-target effects of transgenic insecticidal crops have been a topic of debate for over a decade, and many laboratory and field studies have addressed the issue in numerous countries. In 2009 Lovei et al. (Transgenic Insecticidal Crops and Natural Enemies: A Detailed Review of Laboratory Studies)...

  7. The potential of 2D Kalman filtering for soil moisture data assimilation

    USDA-ARS?s Scientific Manuscript database

    We examine the potential for parameterizing a two-dimensional (2D) land data assimilation system using spatial error auto-correlation statistics gleaned from a triple collocation analysis and the triplet of: (1) active microwave-, (2) passive microwave- and (3) land surface model-based surface soil ...
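
    A sketch of the covariance form of triple collocation assumed here: given three collocated estimates of the same signal with mutually independent errors, each product's error variance follows from the pairwise covariances. Data are synthetic and already on a common scale.

```python
import numpy as np

rng = np.random.default_rng(6)
truth = rng.normal(0.25, 0.08, 5000)                  # synthetic soil moisture signal
active = truth + rng.normal(0.0, 0.03, truth.size)    # (1) active microwave
passive = truth + rng.normal(0.0, 0.04, truth.size)   # (2) passive microwave
model = truth + rng.normal(0.0, 0.02, truth.size)     # (3) land surface model

def tc_error_var(x, y, z):
    """Error variance of x, assuming independent errors across x, y, z."""
    c = np.cov(np.vstack([x, y, z]))
    return c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]

for name, triplet in (("active", (active, passive, model)),
                      ("passive", (passive, active, model)),
                      ("model", (model, active, passive))):
    print(name, np.sqrt(tc_error_var(*triplet)))   # should recover ~0.03, 0.04, 0.02
```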

  8. On P values and effect modification.

    PubMed

    Mayer, Martin

    2017-12-01

    A crucial element of evidence-based healthcare is the sound understanding and use of statistics. As part of instilling sound statistical knowledge and practice, it seems useful to highlight instances of unsound statistical reasoning or practice, not merely in captious or vitriolic spirit, but rather, to use such error as a springboard for edification by giving tangibility to the concepts at hand and highlighting the importance of avoiding such error. This article aims to provide an instructive overview of two key statistical concepts: effect modification and P values. A recent article published in the Journal of the American College of Cardiology on side effects related to statin therapy offers a notable example of errors in understanding effect modification and P values, and although not so critical as to entirely invalidate the article, the errors still demand considerable scrutiny and correction. In doing so, this article serves as an instructive overview of the statistical concepts of effect modification and P values. Judicious handling of statistics is imperative to avoid muddying their utility. This article contributes to the body of literature aiming to improve the use of statistics, which in turn will help facilitate evidence appraisal, synthesis, translation, and application.

  9. Verification of forecast ensembles in complex terrain including observation uncertainty

    NASA Astrophysics Data System (ADS)

    Dorninger, Manfred; Kloiber, Simon

    2017-04-01

    Traditionally, verification means verifying a forecast (ensemble) against the truth represented by observations. The observation errors are quite often neglected, on the argument that they are small compared to the forecast error. In this study, part of the MesoVICT (Mesoscale Verification Inter-comparison over Complex Terrain) project, it will be shown that observation errors have to be taken into account for verification purposes. The observation uncertainty is estimated from the VERA (Vienna Enhanced Resolution Analysis) and represented via two analysis ensembles, which are compared to the forecast ensemble. Throughout the study, results from COSMO-LEPS provided by Arpae-SIMC Emilia-Romagna are used as the forecast ensemble. The time period covers the MesoVICT core case from 20-22 June 2007. In a first step, all ensembles are investigated concerning their distribution. Several tests were executed (Kolmogorov-Smirnov, Finkelstein-Schafer, chi-square, etc.), none of which identified an exact mathematical distribution. The main focus is therefore on non-parametric statistics (e.g. kernel density estimation, boxplots) and on the deviation between data forced to a normal distribution and the kernel density estimates. In a next step, the observational deviations due to the analysis ensembles are analysed. In a first approach, scores are calculated multiple times, with each member of the analysis ensemble in turn regarded as the "true" observation. The results are presented as boxplots for the different scores and parameters. Additionally, the bootstrapping method is applied to the ensembles. These possible approaches to incorporating observational uncertainty into the computation of statistics will be discussed in the talk.
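
    A sketch of two of the tools mentioned, on synthetic station data: a two-sample Kolmogorov-Smirnov test between forecast and analysis ensemble values, and a bootstrap interval for a simple score (the mean forecast-minus-analysis difference).

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)
forecast = rng.normal(20.0, 2.5, 500)   # forecast ensemble values at one station
analysis = rng.normal(20.4, 2.0, 500)   # analysis ensemble values (observation proxy)

print(ks_2samp(forecast, analysis))     # do the two samples share a distribution?

diffs = forecast - analysis
boot = np.array([rng.choice(diffs, diffs.size, replace=True).mean() for _ in range(2000)])
print(np.percentile(boot, [2.5, 97.5]))  # 95% bootstrap interval for the mean difference
```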

  10. Error Analysis: How Precise is Fused Deposition Modeling in Fabrication of Bone Models in Comparison to the Parent Bones?

    PubMed

    Reddy, M V; Eachempati, Krishnakiran; Gurava Reddy, A V; Mugalur, Aakash

    2018-01-01

    Rapid prototyping (RP) is used widely in dental and faciomaxillary surgery, with anecdotal uses in orthopedics. The purview of RP in orthopedics is vast. However, there is no error analysis reported in the literature on bone models generated using office-based RP. This study evaluates the accuracy of fused deposition modeling (FDM) using standard tessellation language (STL) files and the errors generated during the fabrication of bone models. Nine dry bones were selected and scanned by computed tomography (CT). STL files were procured from the CT scans, and three-dimensional (3D) models of the bones were printed on our in-house FDM-based 3D printer using acrylonitrile butadiene styrene (ABS) filament. Measurements were made on the bones and the 3D models according to data collection procedures for forensic skeletal material. Statistical analysis was performed with SPSS version 13.0 to establish interobserver correlation for measurements on the dry bones and the 3D bone models. Interobserver reliability was established using the intraclass correlation coefficient for both the dry bones and the 3D models. The mean absolute difference was 0.4, which is very small; the 3D models are comparable to the dry bones. STL-file-dependent FDM using ABS material produces near-anatomical 3D models. The high 3D accuracy holds promise in the clinical scenario for preoperative planning, mock surgery, and the choice of implants and prostheses, especially in complicated acetabular trauma and complex hip surgeries.

  11. Fine-scale landscape genetics of the American badger (Taxidea taxus): disentangling landscape effects and sampling artifacts in a poorly understood species

    PubMed Central

    Kierepka, E M; Latch, E K

    2016-01-01

    Landscape genetics is a powerful tool for conservation because it identifies landscape features that are important for maintaining genetic connectivity between populations within heterogeneous landscapes. However, using landscape genetics in poorly understood species presents a number of challenges, namely, limited life history information for the focal population and spatially biased sampling. Both obstacles can reduce statistical power, particularly in individual-based studies. In this study, we genotyped 233 American badgers in Wisconsin at 12 microsatellite loci to identify alternative statistical approaches that can be applied to poorly understood species in an individual-based framework. Badgers are protected in Wisconsin owing to an overall lack of life history information, so our study utilized partial redundancy analysis (RDA) and spatially lagged regressions to quantify how three landscape factors (the Wisconsin River, ecoregions and land cover) impacted gene flow. We also performed simulations to quantify errors created by spatially biased sampling. Statistical analyses first found that geographic distance was an important influence on gene flow, mainly driven by fine-scale positive spatial autocorrelation. After controlling for geographic distance, both RDA and regressions found that the Wisconsin River and Agriculture were correlated with genetic differentiation. However, only Agriculture had an acceptable type I error rate (3-5%) to be considered biologically relevant. Collectively, this study highlights the benefits of combining robust statistics with error assessment via simulations and provides a method for hypothesis testing in individual-based landscape genetics. PMID:26243136

  12. Robust LOD scores for variance component-based linkage analysis.

    PubMed

    Blangero, J; Williams, J T; Almasy, L

    2000-01-01

    The variance component method is now widely used for linkage analysis of quantitative traits. Although this approach offers many advantages, the importance of the underlying assumption of multivariate normality of the trait distribution within pedigrees has not been studied extensively. Simulation studies have shown that traits with leptokurtic distributions yield linkage test statistics that exhibit excessive Type I error when analyzed naively. We derive analytical formulae relating the deviation from the expected asymptotic distribution of the LOD score to the kurtosis and total heritability of the quantitative trait. A simple correction constant yields a robust LOD score for any deviation from normality and for any pedigree structure, and effectively eliminates the problem of inflated Type I error due to misspecification of the underlying probability model in variance component-based linkage analysis.

  13. RCT: Module 2.03, Counting Errors and Statistics, Course 8768

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hillmer, Kurt T.

    2017-04-01

    Radiological sample analysis involves the observation of a random process that may or may not occur and an estimation of the amount of radioactive material present based on that observation. Across the country, radiological control personnel are using activity measurements to make decisions that may affect the health and safety of workers at those facilities and their surrounding environments. This course will present an overview of measurement processes, a statistical evaluation of both measurements and equipment performance, and some actions to take to minimize the sources of error in count room operations. This course will prepare the student with the skills necessary for radiological control technician (RCT) qualification by passing quizzes, tests, and the RCT Comprehensive Phase 1, Unit 2 Examination (TEST 27566) and by providing in-the-field skills.

  14. Weight Vector Fluctuations in Adaptive Antenna Arrays Tuned Using the Least-Mean-Square Error Algorithm with Quadratic Constraint

    NASA Astrophysics Data System (ADS)

    Zimina, S. V.

    2015-06-01

    We present the results of a statistical analysis of an adaptive antenna array tuned using the least-mean-square error algorithm with a quadratic constraint on the useful-signal amplification, with allowance for weight-coefficient fluctuations. Using perturbation theory, expressions for the correlation function and power of the output signal of the adaptive antenna array, as well as the formula for the weight-vector covariance matrix, are obtained in the first approximation. The fluctuations are shown to lead to signal distortions at the antenna-array output. The weight-coefficient fluctuations result in the appearance of additional terms in the statistical characteristics of the antenna array. It is also shown that the weight-vector fluctuations are isotropic, i.e., identical in all directions of the weight-coefficient space.
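
    A sketch of the unconstrained complex LMS core on which the analyzed algorithm builds (the quadratic constraint would add a projection step not reproduced here); array size, step size, and signals are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
n_elem, n_snap, mu = 8, 2000, 0.01

# Complex array snapshots and a reference (desired) signal correlated with element 0
x = (rng.normal(size=(n_snap, n_elem)) + 1j * rng.normal(size=(n_snap, n_elem))) / np.sqrt(2)
d = x[:, 0] + 0.1 * (rng.normal(size=n_snap) + 1j * rng.normal(size=n_snap))

w = np.zeros(n_elem, dtype=complex)
for k in range(n_snap):
    y = np.vdot(w, x[k])            # array output, w^H x
    e = d[k] - y                    # instantaneous error
    w += mu * np.conj(e) * x[k]     # LMS update; mu sets the fluctuation level

print(np.round(w, 3))  # w wanders about the optimum: these residual fluctuations are
                       # the source of the extra terms in the output statistics
```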

  15. Analytical Assessment of Simultaneous Parallel Approach Feasibility from Total System Error

    NASA Technical Reports Server (NTRS)

    Madden, Michael M.

    2014-01-01

    In a simultaneous paired approach to closely-spaced parallel runways, a pair of aircraft flies in close proximity on parallel approach paths. The aircraft pair must maintain a longitudinal separation within a range that avoids wake encounters and, if one of the aircraft blunders, avoids collision. Wake avoidance defines the rear gate of the longitudinal separation. The lead aircraft generates a wake vortex that, with the aid of crosswinds, can travel laterally onto the path of the trail aircraft. As runway separation decreases, the wake has less distance to traverse to reach the path of the trail aircraft. The total system error of each aircraft further reduces this distance. The total system error is often modeled as a probability distribution function. Therefore, Monte-Carlo simulations are a favored tool for assessing a "safe" rear-gate. However, safety for paired approaches typically requires that a catastrophic wake encounter be a rare one-in-a-billion event during normal operation. Using a Monte-Carlo simulation to assert this event rarity with confidence requires a massive number of runs. Such large runs do not lend themselves to rapid turn-around during the early stages of investigation when the goal is to eliminate the infeasible regions of the solution space and to perform trades among the independent variables in the operational concept. One can employ statistical analysis using simplified models more efficiently to narrow the solution space and identify promising trades for more in-depth investigation using Monte-Carlo simulations. These simple, analytical models not only have to address the uncertainty of the total system error but also the uncertainty in navigation sources used to alert an abort of the procedure. This paper presents a method for integrating total system error, procedure abort rates, avionics failures, and surveillance errors into a statistical analysis that identifies the likely feasible runway separations for simultaneous paired approaches.

  16. The statistical validity of nursing home survey findings.

    PubMed

    Woolley, Douglas C

    2011-11-01

    The Medicare nursing home survey is a high-stakes process whose findings greatly affect nursing homes, their current and potential residents, and the communities they serve. Therefore, survey findings must achieve high validity. This study looked at the validity of one key assessment made during a nursing home survey: the observation of the rate of errors in the administration of medications to residents (med-pass). Statistical analysis of the case under study and of alternative hypothetical cases. A skilled nursing home affiliated with a local medical school. The nursing home administrators and the medical director. Observational study. The probability that state nursing home surveyors make a Type I or Type II error in observing med-pass error rates, based on the current case and on a series of postulated med-pass error rates. In the common situation, such as our case, where med-pass errors occur at slightly above a 5% rate after 50 observations and therefore trigger a citation, the chance that the true rate remains above 5% after a large number of observations is just above 50%. If the true med-pass error rate were as high as 10%, and the survey team wished to achieve 75% accuracy in determining that a citation was appropriate, they would have to make more than 200 med-pass observations. In the more common situation where med-pass errors are closer to 5%, the team would have to observe more than 2000 med-passes to achieve even a modest 75% accuracy in their determinations. In settings where error rates are low, large numbers of observations of an activity must be made to reach acceptable validity of estimates for the true error rates. In observing key nursing home functions with current methodology, the state Medicare nursing home survey process does not adhere to well-known principles of valid error determination. Alternative approaches to survey methodology are discussed. Copyright © 2011 American Medical Directors Association. Published by Elsevier Inc. All rights reserved.
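
    A sketch of the kind of binomial calculation behind such claims: given a postulated true med-pass error rate and a 5% citation threshold, the probability that a sample of n observations yields an observed rate above the threshold. The code is illustrative and not calibrated to reproduce the paper's exact figures.

```python
from scipy.stats import binom

def prob_observed_exceeds(true_rate, n_obs, threshold=0.05):
    """P(observed error proportion > threshold) for n_obs Bernoulli observations."""
    k_min = int(threshold * n_obs) + 1             # smallest error count above the threshold
    return binom.sf(k_min - 1, n_obs, true_rate)   # P(X >= k_min)

for n in (50, 100, 200, 400):   # how the detection probability grows with sample size
    print(n, round(prob_observed_exceeds(0.10, n), 3))
```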

  17. Assessment of the knowledge and attitudes of intern doctors to medication prescribing errors in a Nigeria tertiary hospital.

    PubMed

    Ajemigbitse, Adetutu A; Omole, Moses Kayode; Ezike, Nnamdi Chika; Erhun, Wilson O

    2013-12-01

    Junior doctors are reported to make most of the prescribing errors in the hospital setting. The aim of this study was to determine what knowledge intern doctors have about prescribing errors and the circumstances that contribute to making them. A structured questionnaire was distributed to intern doctors at the National Hospital Abuja, Nigeria. Respondents gave information about their experience with prescribing medicines, the extent to which they agreed with the definition of a clinically meaningful prescribing error, and the events that constituted such an error. Their experience with prescribing certain categories of medicines was also sought. Data were analyzed with Statistical Package for the Social Sciences (SPSS) software version 17 (SPSS Inc., Chicago, Ill, USA). Chi-squared analysis contrasted differences in proportions; P < 0.05 was considered statistically significant. The response rate was 90.9%, and 27 respondents (90%) had <1 year of prescribing experience. Seventeen respondents (56.7%) totally agreed with the definition of a clinically meaningful prescribing error. The most common reasons for prescribing mistakes were failure to check prescriptions with a reference source (14, 25.5%) and failure to check for adverse drug interactions (14, 25.5%). Omitting essential information such as duration of therapy (13, 20%), patient age (14, 21.5%), and dosage errors (14, 21.5%) were the most common types of prescribing errors made. Respondents considered workload (23, 76.7%), multitasking (19, 63.3%), rushing (18, 60.0%), and tiredness/stress (16, 53.3%) to be important factors contributing to prescribing errors. Interns were least confident prescribing antibiotics (12, 25.5%), opioid analgesics (12, 25.5%), cytotoxics (10, 21.3%), and antipsychotics (9, 19.1%) unsupervised. Respondents seemed to have a low awareness of making prescribing errors. Principles of rational prescribing and the events that constitute prescribing errors should be taught in the practice setting.
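
    A sketch of the paper's decision rule applied to a hypothetical 2x2 table; the counts below are invented for illustration and are not taken from the study:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Invented 2x2 table: confident vs not confident prescribing unsupervised,
# split by prescribing experience (<1 year vs >=1 year).
table = np.array([[12, 18],
                  [ 5, 25]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
# p < 0.05 would be read as a statistically significant difference in
# proportions, the decision rule applied throughout the paper.
```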

  18. Geodetic positioning using a global positioning system of satellites

    NASA Technical Reports Server (NTRS)

    Fell, P. J.

    1980-01-01

    Geodetic positioning using range, integrated Doppler, and interferometric observations from a constellation of twenty-four Global Positioning System satellites is analyzed. A summary of the proposals for geodetic positioning and baseline determination is given, which includes a description of measurement techniques and comments on rank deficiency and error sources. An analysis-of-variance comparison of range, Doppler, and interferometric time delay to determine their relative geometric strength for baseline determination is included. An analytic examination of the effect of a priori constraints on positioning using simultaneous observations from two stations is presented. Dynamic point positioning and baseline determination using range and Doppler are examined in detail. Models for the error sources influencing dynamic positioning are developed, including a discussion of atomic clock stability, and range and Doppler observation-error statistics based on correlated random atomic clock error are derived.

  19. Estimation of trends

    NASA Technical Reports Server (NTRS)

    1981-01-01

    The application of statistical methods to recorded ozone measurements is discussed. A long-term depletion of ozone at the magnitudes predicted by the NAS would be harmful to most forms of life. Empirical prewhitening filters, whose derivation is independent of the underlying physical mechanisms, were analyzed. Statistical analysis serves as a check and balance on the physical modeling: time-series filtering separates variations into systematic and random parts, so that the remaining errors are uncorrelated and significant phase-lag dependencies can be identified. The use of time-series modeling to enhance the capability of detecting trends is discussed.
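
    As a concrete illustration of the prewhitening idea, the numpy-only sketch below fits a trend to a synthetic AR(1) series, estimates the autoregressive coefficient from the residuals, and re-estimates the trend on the filtered series; all parameters are illustrative and not taken from the report:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly series: a small linear trend buried in AR(1) noise.
n, trend, phi = 240, -0.01, 0.6
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = phi * noise[t - 1] + rng.normal()
y = trend * np.arange(n) + noise

# Prewhitening: estimate phi from the lag-1 autocorrelation of the detrended
# residuals, filter the series, and re-estimate the trend on filtered data.
t_idx = np.arange(n)
b, a = np.polyfit(t_idx, y, 1)
resid = y - (a + b * t_idx)
phi_hat = np.corrcoef(resid[:-1], resid[1:])[0, 1]
y_w = y[1:] - phi_hat * y[:-1]            # prewhitened response
x_w = t_idx[1:] - phi_hat * t_idx[:-1]    # identically filtered regressor
b_w = np.polyfit(x_w, y_w, 1)[0]
print(f"phi_hat = {phi_hat:.2f}, trend estimate = {b_w:.4f} (true {trend})")
```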

  20. The Simpson's paradox unraveled

    PubMed Central

    Hernán, Miguel A; Clayton, David; Keiding, Niels

    2011-01-01

    Background In a famous article, Simpson described a hypothetical data example that led to apparently paradoxical results. Methods We make the causal structure of Simpson's example explicit. Results We show how the paradox disappears when the statistical analysis is appropriately guided by subject-matter knowledge. We also review previous explanations of Simpson's paradox that attributed it to two distinct phenomena: confounding and non-collapsibility. Conclusion Analytical errors may occur when the problem is stripped of its causal context and analyzed merely in statistical terms. PMID:21454324
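
    To make the paradox concrete, here is a minimal numeric sketch with kidney-stone-style counts (illustrative, not the numbers from Simpson's original article): treatment is better within every stratum yet worse in the pooled table.

```python
# Hypothetical stratified counts: (recovered, not recovered).
treated   = {"mild": (81, 6),   "severe": (192, 71)}
untreated = {"mild": (234, 36), "severe": (55, 25)}

def rate(recovered, not_recovered):
    return recovered / (recovered + not_recovered)

for stratum in ("mild", "severe"):
    print(stratum, round(rate(*treated[stratum]), 3),
          round(rate(*untreated[stratum]), 3))
# treated wins in BOTH strata ...

def pooled(groups):
    r = sum(v[0] for v in groups.values())
    nr = sum(v[1] for v in groups.values())
    return rate(r, nr)

print("pooled", round(pooled(treated), 3), round(pooled(untreated), 3))
# ... yet loses in the pooled table, because severity confounds treatment:
# the treated group contains far more severe cases.
```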

  1. Effect of the image resolution on the statistical descriptors of heterogeneous media.

    PubMed

    Ledesma-Alonso, René; Barbosa, Romeli; Ortegón, Jaime

    2018-02-01

    The characterization and reconstruction of heterogeneous materials, such as porous media and electrode materials, involve the application of image processing methods to data acquired by scanning electron microscopy or other microscopy techniques. Among them, binarization and decimation are critical in order to compute the correlation functions that characterize the microstructure of the above-mentioned materials. In this study, we present a theoretical analysis of the effects of image-size reduction due to progressive, sequential decimation of the original image. Three different decimation procedures (random, bilinear, and bicubic) were implemented, and their consequences on the discrete correlation functions (two-point, line-path, and pore-size distribution) and on the coarseness (derived from the local volume fraction) are reported and analyzed. The chosen statistical descriptors (correlation functions and coarseness) are typically employed to characterize and reconstruct heterogeneous materials. A normalization for each of the correlation functions has been performed. When the loss of statistical information in a decimated image has not been significant, its normalized correlation function follows the trend of the original image (the reference function). In contrast, when the decimated image no longer holds statistical evidence of the original one, the normalized correlation function deviates from the reference function. Moreover, the equally weighted sum of the average squared differences between the discrete correlation functions of the decimated images and the reference functions leads to the definition of an overall error. During the first stages of gradual decimation, the error remains relatively small and independent of the decimation procedure. Above a threshold defined by the correlation length of the reference function, the error becomes a function of the number of decimation steps; at this stage, some statistical information is lost and the error becomes dependent on the decimation procedure. These results may help to restrict the amount of information that one can afford to lose during a decimation process, in order to reduce the computational and memory cost when one aims to shorten the time consumed by a characterization or reconstruction technique while maintaining the statistical quality of the digitized sample.
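
    The sketch below shows, under simplifying assumptions (a synthetic binary image, and the two-point correlation along one axis only), how a decimated image's correlation function can be compared against the reference; the image and function here are illustrative stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def two_point(img, max_lag):
    """Two-point correlation S2(r) along x: the probability that two pixels
    a distance r apart both belong to the solid phase."""
    s2 = [np.mean(img * img)]                        # r = 0: volume fraction
    s2 += [np.mean(img[:, :-r] * img[:, r:]) for r in range(1, max_lag)]
    return np.array(s2)

# Synthetic "microstructure": smoothed noise thresholded at the median,
# an illustrative stand-in for a binarized SEM image.
field = rng.normal(size=(256, 256))
for _ in range(3):                                   # crude smoothing
    field = (field + np.roll(field, 1, 0) + np.roll(field, 1, 1)) / 3.0
img = (field > np.median(field)).astype(float)

s2_full = two_point(img, 32)
s2_half = two_point(img[::2, ::2], 16)               # one 2x decimation step
# After rescaling lags to common physical units (a lag of r on the decimated
# grid spans 2r original pixels), the decimated curve tracks the reference
# until the lag approaches the correlation length, mirroring the error
# threshold described above.
```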

  3. General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

    PubMed Central

    Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

    2013-01-01

    We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results, via regression coefficients and standard errors, from different studies. In the analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variant tests, such as burden tests and variance-component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by instead calculating score statistics, which require fitting only the null model for each study, and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted on the basis of study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual-level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
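
    A minimal sketch of the score-statistic aggregation idea, for a burden-type test under homogeneous effects and with invented summary numbers: per-study scores and their variances simply add, so individual-level genotypes are never needed.

```python
import numpy as np
from scipy import stats

# Per-study burden score statistics U_k and their variances V_k for one gene
# (numbers invented for illustration).
U = np.array([4.1, -1.3, 6.7])
V = np.array([9.0, 7.5, 12.0])

Z = U.sum() / np.sqrt(V.sum())
p = 2.0 * stats.norm.sf(abs(Z))
print(f"meta-analysis Z = {Z:.2f}, p = {p:.4f}")
```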

  4. A critique of Rasch residual fit statistics.

    PubMed

    Karabatsos, G

    2000-01-01

    In test analysis involving the Rasch model, a large degree of importance is placed on the "objective" measurement of individual abilities and item difficulties. The degree to which the objectivity properties are attained, of course, depends on the degree to which the data fit the Rasch model. It is therefore important to utilize fit statistics that accurately and reliably detect the person-item response inconsistencies that threaten the measurement objectivity of persons and items. Given this argument, it is somewhat surprising that far more emphasis is placed on the objective measurement of persons and items than on the measurement quality of Rasch fit statistics. This paper provides a critical analysis of the residual fit statistics of the Rasch model, arguably the most often used fit statistics, in an effort to illustrate that the task of Rasch fit analysis is not as simple and straightforward as it appears. The faulty statistical properties of the residual fit statistics allow neither a convenient nor a straightforward approach to Rasch fit analysis. For instance, given a residual fit statistic, the use of a single minimum critical value for misfit diagnosis across different testing situations, where the situations vary in sample and test properties, leads to both the overdetection and the underdetection of misfit. To improve this situation, it is argued that psychometricians need to implement residual-free Rasch fit statistics that are based on the number of Guttman response errors, or use indices that are statistically optimal in detecting measurement disturbances.

  5. Power analysis to detect treatment effect in longitudinal studies with heterogeneous errors and incomplete data.

    PubMed

    Vallejo, Guillermo; Ato, Manuel; Fernández García, Paula; Livacic Rojas, Pablo E; Tuero Herrero, Ellián

    2016-08-01

    S. Usami (2014) describes a method to realistically determine sample size in longitudinal research using a multilevel model. The present research extends that work to situations where the assumption of homogeneity of the errors across groups is likely not met and the error term does not follow a scaled identity covariance structure. For this purpose, we followed a procedure based on transforming the variance components of the linear growth model and the parameter related to the treatment effect into specific and easily understandable indices. At the same time, we provide the appropriate statistical machinery for researchers to use when data loss is unavoidable and changes in the expected value of the observed responses are not linear. The empirical powers based on unknown variance components were virtually the same as the theoretical powers derived from the use of the statistically processed indices. The main conclusion of the study is the accuracy of the proposed method for calculating sample size in the described situations with the stipulated power criteria.

  6. The distribution of refractive errors among children attending Lumbini Eye Institute, Nepal.

    PubMed

    Rai, S; Thapa, H B; Sharma, M K; Dhakhwa, K; Karki, R

    2012-01-01

    Uncorrected refractive error is an important cause of childhood blindness and visual impairment. To describe the patterns of refractive errors among children attending the outpatient clinic at the Department of Pediatric Ophthalmology, Lumbini Eye Institute, Bhairahawa, Nepal, the records of 133 children with refractive errors aged 5-15 years, from both urban and rural areas of Nepal and the adjacent territory of India, attending the hospital between September and November 2010 were examined. The SPSS statistical software was used to perform the data analysis. The commonest type of refractive error among the children was astigmatism (47%), followed by myopia (34%) and hyperopia (15%). Refractive error was more prevalent among children of both genders in the 11-15 year age group than among their younger counterparts (RR = 1.22, 95% CI = 0.66-2.25). Refractive error was more common in rural children (70%) than in urban children (26%). Rural females had a higher prevalence of myopia (38%) than urban females (18%). Among the children with refractive errors, only 57% were using spectacles at the initial presentation. Astigmatism is the commonest type of refractive error among children aged 5-15 years, followed by myopia and hypermetropia. Refractive error remains uncorrected in a significant number of children. © NEPjOPH.

  7. Modular reweighting software for statistical mechanical analysis of biased equilibrium data

    NASA Astrophysics Data System (ADS)

    Sindhikara, Daniel J.

    2012-07-01

    Here a simple, useful, modular approach and software suite designed for statistical reweighting and analysis of equilibrium ensembles is presented. Statistical reweighting is useful, and sometimes necessary, for the analysis of equilibrium enhanced-sampling methods, such as umbrella sampling or replica exchange, and also in experimental cases where biasing factors are explicitly known. Essentially, statistical reweighting allows extrapolation of data from one or more equilibrium ensembles to another. Here, the fundamental separable steps of statistical reweighting are broken up into modules, allowing application to the general case and avoiding the black-box nature of some "all-inclusive" reweighting programs. Additionally, the programs included are, by design, written with few dependencies. The required compilers are either pre-installed on most systems or freely available for download with minimal trouble. Examples of the use of this suite applied to umbrella sampling and replica-exchange molecular dynamics simulations are shown, along with advice on how to apply it in the general case.
    New version program summary:
    Program title: Modular reweighting version 2
    Catalogue identifier: AEJH_v2_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJH_v2_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU General Public License, version 3
    No. of lines in distributed program, including test data, etc.: 179 118
    No. of bytes in distributed program, including test data, etc.: 8 518 178
    Distribution format: tar.gz
    Programming language: C++, Python 2.6+, Perl 5+
    Computer: Any
    Operating system: Any
    RAM: 50-500 MB
    Supplementary material: An updated version of the original manuscript (Comput. Phys. Commun. 182 (2011) 2227) is available
    Classification: 4.13
    Catalogue identifier of previous version: AEJH_v1_0
    Journal reference of previous version: Comput. Phys. Commun. 182 (2011) 2227
    Does the new version supersede the previous version?: Yes
    Nature of problem: While equilibrium reweighting is ubiquitous, no public programs are available to perform the reweighting in the general case. Further, specific programs often suffer from many library dependencies and numerical instability.
    Solution method: This package is written in a modular format that allows easy application of reweighting in the general case. Modules are small, numerically stable, and require minimal libraries.
    Reasons for new version: Some minor bugs, some needed upgrades, and the addition of error analysis. analyzeweight.py/analyzeweight.py2 has been replaced by "multihist.py"; the new program performs all the functions of its predecessor while being versatile enough to handle other types of histograms and probability analysis. "bootstrap.py" was added; this script performs basic bootstrap resampling, allowing for error analysis of data. "avg_dev_distribution.py" was added; this program computes the averages and standard deviations of multiple distributions, making error analysis (e.g., from bootstrap resampling) easier to visualize. WRE.cpp was slightly modified, purely for cosmetic reasons. The manual was updated for clarity and to reflect version updates. Examples were removed from the manual in favor of online tutorials (packaged examples remain); the examples were updated to reflect the new format, and an additional example is included to demonstrate error analysis.
    Running time: Preprocessing scripts 1-5 minutes, WHAM engine <1 minute, postprocessing script ~1-5 minutes.
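
    For readers unfamiliar with the resampling step that the new bootstrap.py enables, here is a generic bootstrap sketch (not the packaged script itself; the data are simulated):

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_error(data, statistic, n_boot=2000):
    """Basic bootstrap: resample with replacement and report the mean and
    standard deviation (standard error) of the statistic across resamples."""
    n = len(data)
    estimates = [statistic(data[rng.integers(0, n, n)]) for _ in range(n_boot)]
    return np.mean(estimates), np.std(estimates, ddof=1)

data = rng.exponential(scale=2.0, size=500)   # stand-in for a sampled observable
mean_hat, se_hat = bootstrap_error(data, np.mean)
print(f"mean = {mean_hat:.3f} +/- {se_hat:.3f}")
```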

  8. A systematic review of the quality of statistical methods employed for analysing quality of life data in cancer randomised controlled trials.

    PubMed

    Hamel, Jean-Francois; Saulnier, Patrick; Pe, Madeline; Zikos, Efstathios; Musoro, Jammbe; Coens, Corneel; Bottomley, Andrew

    2017-09-01

    Over recent decades, Health-related Quality of Life (HRQoL) end-points have become an important outcome of randomised controlled trials (RCTs). HRQoL methodology in RCTs has improved following international consensus recommendations. However, no international recommendations exist concerning the statistical analysis of such data. The aim of our study was to identify and characterise the quality of the statistical methods commonly used for analysing HRQoL data in cancer RCTs. Building on our recently published systematic review, we analysed a total of 33 published RCTs, covering the HRQoL methods reported in RCTs since 1991. We focussed on the ability of the methods to deal with the three major problems commonly encountered when analysing HRQoL data: its multidimensional and longitudinal structure and the commonly high rate of missing data. All studies reported HRQoL being assessed repeatedly over time, for periods ranging from 2 to 36 months. Missing data were common, with compliance rates ranging from 45% to 90%. From the 33 studies considered, 12 different statistical methods were identified. Twenty-nine studies analysed each of the questionnaire sub-dimensions without type I error adjustment. Thirteen studies repeated the HRQoL analysis at each assessment time, again without type I error adjustment. Only 8 studies used methods suitable for repeated measurements. Our findings show a lack of consistency in statistical methods for analysing HRQoL data. Problems related to multiple comparisons were rarely considered, leading to a high risk of false positive results. It is therefore critical that international recommendations for improving such statistical practices are developed. Copyright © 2017. Published by Elsevier Ltd.

  9. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators have observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their inability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic based on smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and to account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with rare, common, or both rare and common variants has the correct type 1 error rates. The power of the SFPCA-based statistic and of 22 existing statistics is also evaluated; we found that the SFPCA-based statistic has much higher power than the other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than other existing methods.

  10. Statistical variation in progressive scrambling

    NASA Astrophysics Data System (ADS)

    Clark, Robert D.; Fox, Peter C.

    2004-07-01

    The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity ($Q_s^{*2}$) and the standard error of prediction ($\mathrm{SDEP}_s^*$) - that correct for the effect of the introduced perturbation. We also explore the statistical behavior of the adjusted values ($Q_0^{*2}$ and $\mathrm{SDEP}_0^*$) and of the sensitivity to perturbation ($\mathrm{d}q^2/\mathrm{d}r_{yy'}^2$). It is shown that the three statistics are all robust for stable PLS models, in terms of both the stochastic component of their determination and their variation due to sampling effects involved in training-set selection.

  11. Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

    PubMed Central

    Williams, L. Keoki; Buu, Anne

    2017-01-01

    We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales, so that the empirical distribution of Fisher's combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulations also show that the proposed test maintains power at a level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches (dichotomizing all observed phenotypes or treating them as continuous variables) could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis of the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power to identify markers that might not otherwise be chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
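
    A minimal sketch of Fisher's combination statistic itself: under independent phenotypes the null reference is chi-square with 2k degrees of freedom, whereas the paper's setting (correlated phenotypes) requires an empirically estimated null; the p-values below are illustrative.

```python
import numpy as np
from scipy import stats

def fisher_combination(pvals):
    """Fisher's combination statistic: -2 * sum(log p_i)."""
    return -2.0 * np.sum(np.log(pvals))

pvals = np.array([0.04, 0.20, 0.008])    # invented per-phenotype p-values
T = fisher_combination(pvals)
p_combined = stats.chi2.sf(T, df=2 * len(pvals))   # independence reference
print(f"T = {T:.2f}, naive combined p = {p_combined:.4f}")
# With correlated phenotypes, replace the chi-square reference by the
# empirical null distribution, as the paper does.
```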

  12. Enhancement of CFD validation exercise along the roof profile of a low-rise building

    NASA Astrophysics Data System (ADS)

    Deraman, S. N. C.; Majid, T. A.; Zaini, S. S.; Yahya, W. N. W.; Abdullah, J.; Ismail, M. A.

    2018-04-01

    The aim of this study is to enhance the validation of a CFD exercise along the roof profile of a low-rise building. An isolated gabled-roof house with a 26.6° roof pitch was simulated to obtain the pressure coefficients around the house. Validating a CFD analysis against experimental data requires many input parameters; this study performed the CFD simulation based on data from a previous study, and where input parameters were not clearly stated, new input parameters were established from the open literature. The numerical simulations were performed in FLUENT 14.0 by applying the Computational Fluid Dynamics (CFD) approach based on the steady RANS equations together with the RNG k-ɛ model. The CFD results were then analysed quantitatively (statistical analysis) and compared with the CFD results of the previous study. The statistical results from the ANOVA test and the error measures showed that the CFD results of the current study were in good agreement and exhibited the smallest error relative to the previous study. All the input data used in this study can be extended to other types of CFD simulation involving wind flow over an isolated single-storey house.

  13. A Complementary Note to 'A Lag-1 Smoother Approach to System-Error Estimation': The Intrinsic Limitations of Residual Diagnostics

    NASA Technical Reports Server (NTRS)

    Todling, Ricardo

    2015-01-01

    Recently, this author studied an approach to the estimation of system error based on combining observation residuals derived from a sequential filter and a fixed lag-1 smoother. While the methodology was being extended to a variational formulation, with experiments on simple models confirming consistency between the sequential and variational formulations, the limitations of the residual-based approach came clearly to the surface. This note uses the sequential assimilation application to simple nonlinear dynamics to highlight the issue. Only when some of the underlying error statistics are assumed known is it possible to estimate the unknown component. In general, when considerable uncertainties exist in the underlying statistics as a whole, attempts to obtain separate estimates of the various error covariances are bound to lead to misrepresentation of errors. The conclusions are particularly relevant to present-day attempts to estimate observation-error correlations from observation-residual statistics. A brief illustration of the issue is also provided by comparing estimates of error correlations derived from a quasi-operational assimilation system and from a corresponding Observing System Simulation Experiments framework.

  14. Spatial interpolation of solar global radiation

    NASA Astrophysics Data System (ADS)

    Lussana, C.; Uboldi, F.; Antoniazzi, C.

    2010-09-01

    Solar global radiation is defined as the radiant flux incident on an area element of the terrestrial surface. Direct knowledge of it plays a crucial role in many applications, from agrometeorology to environmental meteorology. The ARPA Lombardia meteorological network includes about one hundred pyranometers, mostly distributed on the southern side of the Alps and in the centre of the Po Plain. A statistical interpolation method based on an implementation of Optimal Interpolation is applied to the hourly averages of the solar global radiation observations measured by the ARPA Lombardia network. The background field is obtained using SMARTS (the Simple Model of the Atmospheric Radiative Transfer of Sunshine; Gueymard, 2001). The model is initialised assuming clear-sky conditions, and it takes into account the solar position and orography-related effects (shade and reflection). The interpolation of pyranometric observations introduces into the analysis fields information about cloud presence and influence. A particular effort is devoted to preventing observations affected by large errors of various kinds (representativeness errors, systematic errors, gross errors) from entering the analysis procedure. The inclusion of direct cloud information from satellite observations is also planned.
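
    A minimal one-dimensional sketch of an optimal-interpolation update, assuming a Gaussian background-error correlation and uncorrelated observation errors; the grid, values, and error parameters are illustrative, not ARPA Lombardia's operational settings:

```python
import numpy as np

def oi_analysis(xb, grid, y, obs_idx, sigma_b, sigma_o, L):
    """One optimal-interpolation update, xa = xb + K (y - H xb), with a
    Gaussian background-error covariance and uncorrelated observation
    errors. Illustrative sketch, not the operational implementation."""
    d = grid[:, None] - grid[None, :]
    B = sigma_b**2 * np.exp(-0.5 * (d / L)**2)   # background-error covariance
    H = np.zeros((len(obs_idx), grid.size))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0    # observations at grid points
    R = sigma_o**2 * np.eye(len(obs_idx))        # observation-error covariance
    S = H @ B @ H.T + R
    return xb + B @ H.T @ np.linalg.solve(S, y - H @ xb)

grid = np.linspace(0.0, 100.0, 101)              # hypothetical 1 km spacing
xb = np.full(grid.size, 500.0)                   # clear-sky background, W/m2
xa = oi_analysis(xb, grid, y=np.array([320.0, 480.0]),
                 obs_idx=np.array([30, 70]), sigma_b=80.0, sigma_o=20.0, L=10.0)
print(xa[30], xa[70])   # analysis drawn toward the cloudy observation at 30
```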

  15. Deriving Color-Color Transformations for VRI Photometry

    NASA Astrophysics Data System (ADS)

    Taylor, B. J.; Joner, M. D.

    2006-12-01

    In this paper, transformations between Cousins R-I and other indices are considered. New transformations to Cousins V-R and Johnson V-K are derived, a published transformation involving T1-T2 on the Washington system is rederived, and the basis for a transformation involving b-y is considered. In addition, a statistically rigorous procedure for deriving such transformations is presented and discussed in detail. Highlights of the discussion include (1) the need for statistical analysis when least-squares relations are determined and interpreted, (2) the permitted forms and best forms for such relations, (3) the essential role played by accidental errors, (4) the decision process for selecting terms to appear in the relations, (5) the use of plots of residuals, (6) detection of influential data, (7) a protocol for assessing systematic effects from absorption features and other sources, (8) the reasons for avoiding extrapolation of the relations, (9) a protocol for ensuring uniformity in data used to determine the relations, and (10) the derivation and testing of the accidental errors of those data. To put the last of these subjects in perspective, it is shown that rms errors for VRI photometry have been as small as 6 mmag for more than three decades and that standard errors for quantities derived from such photometry can be as small as 1 mmag or less.

  16. Multiple imputation of missing fMRI data in whole brain analysis

    PubMed Central

    Vaden, Kenneth I.; Gebregziabher, Mulugeta; Kuchinsky, Stefanie E.; Eckert, Mark A.

    2012-01-01

    Whole-brain fMRI analyses rarely include the entire brain because of missing data that result from data acquisition limits and susceptibility artifact in particular. This missing-data problem is typically addressed by omitting voxels from analysis, which may exclude brain regions that are of theoretical interest and increase the potential for Type II error at cortical boundaries or Type I error when spatial thresholds are used to establish significance. Imputation could significantly expand statistical map coverage, increase power, and enhance interpretations of fMRI results. We examined multiple imputation for group-level analyses of missing fMRI data using methods that leverage the spatial information in fMRI datasets, for both real and simulated data. Available-case analysis, neighbor replacement, and regression-based imputation approaches were compared in a general linear model framework to determine the extent to which these methods quantitatively (effect size) and qualitatively (spatial coverage) increased the sensitivity of group analyses. In both real and simulated data analyses, multiple imputation provided 1) variance that was most similar to estimates for voxels with no missing data, 2) fewer false positive errors in comparison to mean replacement, and 3) fewer false negative errors in comparison to available-case analysis. Compared to the standard analysis approach of omitting voxels with missing data, imputation methods increased brain coverage in this study by 35% (from 33,323 to 45,071 voxels). In addition, multiple imputation increased the size of significant clusters by 58%, and the number of significant clusters across statistical thresholds, compared to the standard voxel-omission approach. While neighbor replacement produced similar results, we recommend multiple imputation because it uses an informed sampling distribution to deal with missing data across subjects that can include neighbor values and other predictors. Multiple imputation is anticipated to be particularly useful for 1) large fMRI data sets with inconsistent missing voxels across subjects and 2) addressing the problem of increased artifact at ultra-high field, which significantly limits the extent of whole-brain coverage and interpretations of results. PMID:22500925
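
    A toy sketch of regression-based multiple imputation for a single voxel, using one neighboring voxel as predictor; this is a drastic simplification of the paper's spatial approach, and all values are simulated:

```python
import numpy as np

rng = np.random.default_rng(7)

# One voxel, 40 subjects, 8 of them missing; a neighboring voxel serves as
# the regression predictor.
n, M = 40, 20
neighbor = rng.normal(size=n)
voxel = 0.8 * neighbor + rng.normal(scale=0.5, size=n)
voxel[rng.choice(n, 8, replace=False)] = np.nan

obs = ~np.isnan(voxel)
b, a = np.polyfit(neighbor[obs], voxel[obs], 1)           # slope, intercept
resid_sd = np.std(voxel[obs] - (a + b * neighbor[obs]), ddof=2)

group_means = []
for _ in range(M):                                        # M imputed datasets
    filled = voxel.copy()
    filled[~obs] = (a + b * neighbor[~obs]
                    + rng.normal(scale=resid_sd, size=(~obs).sum()))
    group_means.append(filled.mean())
# Rubin's rules would pool the M estimates and their within/between variances;
# only the imputation step is shown here.
print(np.mean(group_means), np.std(group_means, ddof=1))
```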

  17. Challenges of Big Data Analysis.

    PubMed

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and of how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity; they can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.

  19. Dealing with dietary measurement error in nutritional cohort studies.

    PubMed

    Freedman, Laurence S; Schatzkin, Arthur; Midthune, Douglas; Kipnis, Victor

    2011-07-20

    Dietary measurement error creates serious challenges to reliably discovering new diet-disease associations in nutritional cohort studies. Such error causes substantial underestimation of relative risks and reduction of statistical power for detecting associations. On the basis of data from the Observing Protein and Energy Nutrition Study, we recommend the following approaches to deal with these problems. Regarding data analysis of cohort studies using food-frequency questionnaires, we recommend 1) using energy adjustment for relative risk estimation; 2) reporting estimates adjusted for measurement error along with the usual relative risk estimates, whenever possible (this requires data from a relevant, preferably internal, validation study in which participants report intakes using both the main instrument and a more detailed reference instrument such as a 24-hour recall or multiple-day food record); 3) performing statistical adjustment of relative risks, based on such validation data, if they exist, using univariate (only for energy-adjusted intakes such as densities or residuals) or multivariate regression calibration. We note that whereas unadjusted relative risk estimates are biased toward the null value, statistical significance tests of unadjusted relative risk estimates are approximately valid. Regarding study design, we recommend increasing the sample size to remedy loss of power; however, it is important to understand that this will often be an incomplete solution because the attenuated signal may be too small to distinguish from unmeasured confounding in the model relating disease to reported intake. Future work should be devoted to alleviating the problem of signal attenuation, possibly through the use of improved self-report instruments or by combining dietary biomarkers with self-report instruments.
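
    A minimal sketch of univariate regression calibration as recommended above, with illustrative numbers (not values from the OPEN study): the naive log relative risk is divided by the calibration slope lambda estimated in a validation substudy.

```python
import numpy as np

# lambda: slope from regressing reference intake (e.g., 24-hour recall) on
# reported intake in the validation substudy; values below are invented.
lam = 0.35
beta_naive = 0.08       # log RR per unit of energy-adjusted reported intake
se_naive = 0.03

beta_adj = beta_naive / lam
se_adj = se_naive / lam       # ignores uncertainty in lambda, for brevity
print(f"adjusted log RR = {beta_adj:.3f} +/- {se_adj:.3f}")
# Consistent with the paper's note: the significance test on beta_naive is
# approximately valid; calibration mainly corrects the attenuated effect size.
```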

  20. Detection of presence of chemical precursors

    NASA Technical Reports Server (NTRS)

    Li, Jing (Inventor); Meyyappan, Meyya (Inventor); Lu, Yijiang (Inventor)

    2009-01-01

    Methods and systems for determining whether one or more target molecules are present in a gas, by exposing a functionalized carbon nanostructure (CNS) to the gas and measuring an electrical parameter value EPV(n) associated with each of N CNS sub-arrays. In a first embodiment, a most-probable concentration value C(opt) is estimated, and an error value, depending upon differences between the measured values EPV(n) and the corresponding values EPV(n;C(opt)), is computed. If the error value is less than a first error threshold value, the system interprets this as indicating that the target molecule is present in a concentration C ≈ C(opt). A second embodiment uses extensive statistical and vector space analysis to estimate the target molecule concentration.

  1. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R² and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R² and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  2. Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses

    NASA Astrophysics Data System (ADS)

    Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong

    2017-04-01

    Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial post-processing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location, or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high-resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote-sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited-area-model ensemble prediction system also operated by ZAMG. The so-called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The SAMOS approach thus statistically combines the in-house developed high-resolution analysis and ensemble prediction systems. The station-based validation of 6-hour precipitation sums shows a mean improvement of more than 40% in CRPS when compared to bilinearly interpolated, uncalibrated ensemble forecasts. The validation on randomly selected grid points, representing the true height distribution over Austria, still indicates a mean improvement of 35%. The applied statistical model is currently set up for 6-hourly and daily accumulation periods, but will be extended to a temporal resolution of 1-3 hours within a new probabilistic nowcasting system operated by ZAMG.
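
    The SAMOS transform itself is one line; the sketch below applies it to made-up values for a single grid point (the climatology and member values are illustrative, not INCA data):

```python
import numpy as np

def standardized_anomaly(x, clim_mean, clim_sd):
    """SAMOS-style transform: subtract the site-specific climatological mean
    and divide by the climatological standard deviation."""
    return (x - clim_mean) / clim_sd

# Hypothetical 6 h precipitation climatology for one grid point (mm); the
# same transform is applied to observations and to every ensemble member
# before the single domain-wide regression is fitted.
clim_mean, clim_sd = 2.4, 3.1
members = np.array([0.0, 1.2, 5.4, 3.3])
print(standardized_anomaly(members, clim_mean, clim_sd))
```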

  3. Preanalytical errors in medical laboratories: a review of the available methodologies of data collection and analysis.

    PubMed

    West, Jamie; Atherton, Jennifer; Costelloe, Seán J; Pourmahram, Ghazaleh; Stretton, Adam; Cornes, Michael

    2017-01-01

    Preanalytical errors have previously been shown to account for a significant proportion of errors in laboratory processes and to contribute to a number of patient safety risks. Accreditation against ISO 15189:2012 requires that laboratory Quality Management Systems consider the impact of preanalytical processes in areas such as the identification and control of non-conformances, continual improvement, internal audit and quality indicators. Previous studies have shown that there is wide variation in the definition, repertoire and collection methods for preanalytical quality indicators. The International Federation of Clinical Chemistry Working Group on Laboratory Errors and Patient Safety has defined a number of quality indicators for the preanalytical stage, and the adoption of harmonized definitions will support interlaboratory comparisons and continual improvement. There are a variety of data collection methods, including audit, manual recording processes, incident reporting mechanisms and laboratory information systems. Quality management processes such as benchmarking, statistical process control, Pareto analysis and failure mode and effect analysis can be used to review data and should be incorporated into clinical governance mechanisms. In this paper, the Association for Clinical Biochemistry and Laboratory Medicine PreAnalytical Specialist Interest Group reviews the various data collection methods available. Our recommendation is the use of laboratory information management systems as the recording mechanism for preanalytical errors, as this provides the easiest and most standardized mechanism of data capture.

  4. At least some errors are randomly generated (Freud was wrong)

    NASA Technical Reports Server (NTRS)

    Sellen, A. J.; Senders, J. W.

    1986-01-01

    An experiment was carried out to expose something about human error generating mechanisms. In the context of the experiment, an error was made when a subject pressed the wrong key on a computer keyboard or pressed no key at all in the time allotted. These might be considered, respectively, errors of substitution and errors of omission. Each of seven subjects saw a sequence of three digital numbers, made an easily learned binary judgement about each, and was to press the appropriate one of two keys. Each session consisted of 1,000 presentations of randomly permuted, fixed numbers broken into 10 blocks of 100. One of two keys should have been pressed within one second of the onset of each stimulus. These data were subjected to statistical analyses in order to probe the nature of the error generating mechanisms. Goodness of fit tests for a Poisson distribution for the number of errors per 50 trial interval and for an exponential distribution of the length of the intervals between errors were carried out. There is evidence for an endogenous mechanism that may best be described as a random error generator. Furthermore, an item analysis of the number of errors produced per stimulus suggests the existence of a second mechanism operating on task driven factors producing exogenous errors. Some errors, at least, are the result of constant probability generating mechanisms with error rate idiosyncratically determined for each subject.
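
    One simple check in the spirit of this analysis, under the assumption that a constant-probability generator produces Poisson block counts: a dispersion-index test on simulated data (the error rate and block structure below are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Under a constant-probability error generator, counts per block of trials
# are approximately Poisson. A quick check is the dispersion index:
# (n - 1) * variance / mean is ~chi-square with n - 1 df under the null.
errors = rng.random(1000) < 0.05             # simulated subject, 5% error rate
counts = errors.reshape(20, 50).sum(axis=1)  # errors per 50-trial block

n = len(counts)
D = (n - 1) * counts.var(ddof=1) / counts.mean()
p = stats.chi2.sf(D, df=n - 1)
print(f"dispersion = {D:.1f}, p = {p:.3f}")  # large p: no evidence against Poisson
```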

  5. Progressive statistics for studies in sports medicine and exercise science.

    PubMed

    Hopkins, William G; Marshall, Stephen W; Batterham, Alan M; Hanin, Juri

    2009-01-01

    Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.

  6. Statistical analysis of the 70 meter antenna surface distortions

    NASA Technical Reports Server (NTRS)

    Kiedron, K.; Chian, C. T.; Chuang, K. L.

    1987-01-01

    A statistical analysis of the surface distortions of the 70 meter NASA/JPL antenna, located at Goldstone, was performed. The purpose of this analysis was to verify whether deviations due to gravity loading can be treated as quasi-random variables with a normal distribution. Histograms of the RF pathlength error distribution for several antenna elevation positions were generated. The results indicate that the deviations from the ideal antenna surface are not normally distributed. The observed density distribution for all antenna elevation angles is taller and narrower than the normal density, which results in large positive values of kurtosis and a significant amount of skewness. The skewness of the distribution changes from positive to negative as the antenna elevation changes from zenith to horizon.
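
    A small sketch of this kind of check, with a synthetic skewed, leptokurtic sample standing in for the measured RF pathlength errors (the distribution and its parameters are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic stand-in for pathlength errors: a gamma sample has positive
# skewness (2/sqrt(k) = 1.0) and positive excess kurtosis (6/k = 1.5).
deviations = rng.gamma(shape=4.0, scale=0.1, size=5000)

print("skewness:", stats.skew(deviations))
print("excess kurtosis:", stats.kurtosis(deviations))      # > 0: leptokurtic
print("normality test p:", stats.normaltest(deviations).pvalue)  # ~0: reject
```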

  7. Comparative analysis on the selection of number of clusters in community detection

    NASA Astrophysics Data System (ADS)

    Kawamoto, Tatsuro; Kabashima, Yoshiyuki

    2018-02-01

    We conduct a comparative analysis of various estimates of the number of clusters in community detection. An exhaustive comparison would require testing all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on the stochastic block model and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, the map equation, the Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendencies of the assessment criteria and algorithms to overfit or underfit become apparent. In addition, we propose the alluvial diagram as a suitable tool for visualizing statistical inference results, which can be useful for determining the number of clusters.

  8. SU-E-T-503: IMRT Optimization Using Monte Carlo Dose Engine: The Effect of Statistical Uncertainty.

    PubMed

    Tian, Z; Jia, X; Graves, Y; Uribe-Sanchez, A; Jiang, S

    2012-06-01

    With the development of ultra-fast GPU-based Monte Carlo (MC) dose engines, it becomes clinically realistic to compute the dose-deposition coefficients (DDC) for IMRT optimization using MC simulation. However, it is still time-consuming to compute the DDC with small statistical uncertainty. This work studies the effects of the statistical error in the DDC matrix on IMRT optimization. The MC-computed DDC matrices are simulated here by adding statistical uncertainties at a desired level to matrices generated with a finite-size pencil-beam algorithm, using a statistical uncertainty model for MC dose calculation. We adopt a penalty-based quadratic optimization model and a gradient descent method to optimize the fluence map, and then recalculate the corresponding actual dose distribution using the noise-free DDC matrix. The impact of DDC noise is assessed in terms of the deviation of the resulting dose distributions. We have also used stochastic perturbation theory to theoretically estimate the statistical errors of dose distributions in a simplified optimization model. A head-and-neck case is used to investigate the perturbation to the IMRT plan due to MC statistical uncertainty. The relative errors of the final dose distributions of the optimized IMRT plan are found to be much smaller than those in the DDC matrix, which is consistent with our theoretical estimation. When the history number is decreased from 10^8 to 10^6, the dose-volume histograms are still very similar to the error-free DVHs, even though the error in the DDC is about 3.8%. The results illustrate that statistical errors in the DDC matrix have a relatively small effect on IMRT optimization in the dose domain. This indicates that we can use a relatively small number of histories to obtain the DDC matrix with MC simulation within a reasonable amount of time, without considerably compromising the accuracy of the optimized treatment plan. This work is supported by Varian Medical Systems through a Master Research Agreement. © 2012 American Association of Physicists in Medicine.

  9. Forest statistics for Northwest Florida, 1987

    Treesearch

    Mark J. Brown

    1987-01-01

    The Forest Inventory and Analysis (Forest Survey) Research Work Unit at the Southeastern Forest Experiment Station recently conducted a review of its data processing procedures. During this process, a computer error was discovered which led to inflated estimates of annual removals, net annual growth, and annual mortality for the 1970-1980 remeasurement period in...

  10. Explanation of Two Anomalous Results in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Fritz, Matthew S.; Taylor, Aaron B.; MacKinnon, David P.

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special…

  11. The Length of a Pestle: A Class Exercise in Measurement and Statistical Analysis.

    ERIC Educational Resources Information Center

    O'Reilly, James E.

    1986-01-01

    Outlines the simple exercise of measuring the length of an object as a concrete paradigm of the entire process of making chemical measurements and treating the resulting data. Discusses the procedure, significant figures, measurement error, spurious data, rejection of results, precision and accuracy, and student responses. (TW)

  12. More Powerful Tests of Simple Interaction Contrasts in the Two-Way Factorial Design

    ERIC Educational Resources Information Center

    Hancock, Gregory R.; McNeish, Daniel M.

    2017-01-01

    For the two-way factorial design in analysis of variance, the current article explicates and compares three methods for controlling the Type I error rate for all possible simple interaction contrasts following a statistically significant interaction, including a proposed modification to the Bonferroni procedure that increases the power of…

  13. Uncertainty Analysis of Instrument Calibration and Application

    NASA Technical Reports Server (NTRS)

    Tripp, John S.; Tcheng, Ping

    1999-01-01

    Experimental aerodynamic researchers require estimated precision and bias uncertainties of measured physical quantities, typically at 95 percent confidence levels. Uncertainties of final computed aerodynamic parameters are obtained by propagation of individual measurement uncertainties through the defining functional expressions. In this paper, rigorous mathematical techniques are extended to determine precision and bias uncertainties of any instrument-sensor system. Through this analysis, instrument uncertainties determined through calibration are expressed as functions of the corresponding measurement for linear and nonlinear univariate and multivariate processes. Treatment of correlated measurement precision error is developed. During laboratory calibration, calibration standard uncertainties are assumed to be an order of magnitude less than those of the instrument being calibrated; often, calibration standards do not satisfy this assumption. This paper applies rigorous statistical methods for inclusion of calibration standard uncertainty and covariance due to the order of their application. The effects of mathematical modeling error on calibration bias uncertainty are quantified, and the effects of experimental design on uncertainty are analyzed. The importance of replication is emphasized, and techniques for estimation of both bias and precision uncertainties using replication are developed. Statistical tests for stationarity of calibration parameters over time are obtained.
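
    As a toy illustration of propagating calibration uncertainty to a measurement (a linear univariate case with invented data; the paper treats the general multivariate, nonlinear setting), one can carry the coefficient covariance of a least-squares fit through the Jacobian of the calibration function:

```python
# Sketch: uncertainty of a calibrated reading as a function of the measurement.
import numpy as np

rng = np.random.default_rng(1)
x_std = np.linspace(0.0, 10.0, 12)                     # calibration-standard inputs
y_obs = 2.0 + 0.5 * x_std + rng.normal(0.0, 0.05, x_std.size)  # sensor readings

# Least-squares fit y = a + b*x and coefficient covariance sigma^2 (X^T X)^-1
X = np.column_stack([np.ones_like(x_std), x_std])
coef, res, *_ = np.linalg.lstsq(X, y_obs, rcond=None)
sigma2 = res[0] / (x_std.size - 2)                     # residual variance estimate
cov = sigma2 * np.linalg.inv(X.T @ X)

# Propagate through the Jacobian g = [1, x]: var(y_hat(x)) = g^T cov g
for x in (0.0, 5.0, 10.0):
    g = np.array([1.0, x])
    u = np.sqrt(g @ cov @ g)
    # 1.96 is the normal approximation to a 95 percent interval
    print(f"x = {x:4.1f}: y_hat = {coef @ g:.3f} +/- {1.96 * u:.4f}")
```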

  14. Multi-template tensor-based morphometry: Application to analysis of Alzheimer's disease

    PubMed Central

    Koikkalainen, Juha; Lötjönen, Jyrki; Thurfjell, Lennart; Rueckert, Daniel; Waldemar, Gunhild; Soininen, Hilkka

    2012-01-01

    In this paper methods for using multiple templates in tensor-based morphometry (TBM) are presented and compared to the conventional single-template approach. TBM analysis requires non-rigid registrations which are often subject to registration errors. When using multiple templates and, therefore, multiple registrations, it can be assumed that the registration errors are averaged and eventually compensated. Four different methods are proposed for multi-template TBM. The methods were evaluated using magnetic resonance (MR) images of healthy controls, patients with stable or progressive mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD) from the ADNI database (N=772). The performance of TBM features in classifying images was evaluated both quantitatively and qualitatively. Classification results show that the multi-template methods are statistically significantly better than the single-template method. The overall classification accuracy was 86.0% for the classification of control and AD subjects, and 72.1% for the classification of stable and progressive MCI subjects. The statistical group-level difference maps produced using multi-template TBM were smoother, formed larger continuous regions, and had larger t-values than the maps obtained with single-template TBM. PMID:21419228

  15. Statistics of the radiated field of a space-to-earth microwave power transfer system

    NASA Technical Reports Server (NTRS)

    Stevens, G. H.; Leininger, G.

    1976-01-01

    Statistics such as the average power density pattern, the variance of the power density pattern, and the variance of the beam pointing error are related to hardware parameters such as transmitter rms phase error and rms amplitude error. A limitation on the spectral width of the phase reference for phase control was also established. A 1 km diameter transmitter appears feasible provided the total rms insertion phase errors of the phase control modules do not exceed 10 deg, amplitude errors do not exceed 10% rms, and the phase reference spectral width does not exceed approximately 3 kHz. Under these conditions the expected radiation pattern is virtually the same as the error-free pattern, and the rms beam pointing error would be insignificant (approximately 10 meters).
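
    A small Monte Carlo sketch along these lines (a uniform linear array with invented geometry, not the paper's transmitter model) shows how rms phase and amplitude errors at the stated levels perturb the average power pattern and the beam pointing:

```python
# Sketch: phased array with random phase/amplitude errors vs. error-free pattern.
import numpy as np

rng = np.random.default_rng(2)
n_elem, d = 100, 0.5                    # elements and spacing (wavelengths)
phase_rms = np.deg2rad(10.0)            # 10 deg rms insertion-phase error
amp_rms = 0.10                          # 10% rms amplitude error
theta = np.deg2rad(np.linspace(-5.0, 5.0, 1001))
pos = d * np.arange(n_elem)
steer = np.exp(1j * 2.0 * np.pi * np.outer(pos, np.sin(theta)))

def power_pattern(phase_err, amp_err):
    """Power pattern normalized to the error-free broadside peak."""
    a = (1.0 + amp_err) * np.exp(1j * phase_err)
    return np.abs(a @ steer) ** 2 / n_elem ** 2

ideal = power_pattern(np.zeros(n_elem), np.zeros(n_elem))
trials = [power_pattern(rng.normal(0.0, phase_rms, n_elem),
                        rng.normal(0.0, amp_rms, n_elem)) for _ in range(200)]
mean_pat = np.mean(trials, axis=0)
point_err = np.rad2deg(np.std([theta[np.argmax(p)] for p in trials]))
print(f"average peak loss: {10 * np.log10(mean_pat.max() / ideal.max()):.3f} dB")
print(f"rms beam pointing error: {point_err:.4f} deg")
```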

  16. Predictors of Errors of Novice Java Programmers

    ERIC Educational Resources Information Center

    Bringula, Rex P.; Manabat, Geecee Maybelline A.; Tolentino, Miguel Angelo A.; Torres, Edmon L.

    2012-01-01

    This descriptive study determined which of the sources of errors would predict the errors committed by novice Java programmers. Descriptive statistics revealed that the respondents perceived that they committed the identified eighteen errors infrequently. Thought error was perceived to be the main source of error during the laboratory programming…

  17. A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data

    PubMed Central

    Zhu, Yun; Fan, Ruzong; Xiong, Momiao

    2017-01-01

    Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information for achieving a deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple-phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotypes and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representations and features from high dimensional genotype and phenotype data. To explore the correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method referred to as quadratically regularized functional CCA (QRFCCA) for association analysis, which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis, and (3) canonical correlation analysis (CCA). Large-scale simulations show that QRFCCA has much higher power than the ten competing statistics while retaining appropriate type 1 error rates. To further evaluate performance, QRFCCA and the ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that QRFCCA substantially outperforms the ten other statistics. PMID:29040274

  18. An Artificial Intelligence Approach to Analyzing Student Errors in Statistics.

    ERIC Educational Resources Information Center

    Sebrechts, Marc M.; Schooler, Lael J.

    1987-01-01

    Describes the development of an artificial intelligence system called GIDE that analyzes student errors in statistics problems by inferring the students' intentions. Learning strategies involved in problem solving are discussed and the inclusion of goal structures is explained. (LRW)

  19. Statistical error model for a solar electric propulsion thrust subsystem

    NASA Technical Reports Server (NTRS)

    Bantell, M. H.

    1973-01-01

    The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.

  20. Empirical investigation into depth-resolution of Magnetotelluric data

    NASA Astrophysics Data System (ADS)

    Piana Agostinetti, N.; Ogaya, X.

    2017-12-01

    We investigate the depth resolution of MT data by comparing reconstructed 1D resistivity profiles with measured resistivity and lithostratigraphy from borehole data. Inversion of MT data has been widely used to reconstruct the 1D fine-layered resistivity structure beneath an isolated Magnetotelluric (MT) station. Uncorrelated noise is generally assumed to be associated with MT data. However, incorrect assumptions about error statistics have been shown to strongly bias the results obtained in geophysical inversions. In particular, the number of resolved layers at depth strongly depends on the error statistics. In this study, we applied a trans-dimensional McMC algorithm to reconstruct the 1D resistivity profile near the location of a 1500 m-deep borehole, using MT data. We solve the MT inverse problem imposing different models for the error statistics associated with the MT data. Following a Hierarchical Bayes approach, we also inverted for the hyper-parameters associated with each error statistics model. Preliminary results indicate that assuming uncorrelated noise leads to a number of resolved layers larger than expected from the retrieved lithostratigraphy. Moreover, inversion of synthetic resistivity data obtained from the "true" resistivity stratification measured along the borehole shows that a consistent number of resistivity layers can be obtained using a Gaussian model for the error statistics, with substantial correlation length.

  1. Automatic identification of bacterial types using statistical imaging methods

    NASA Astrophysics Data System (ADS)

    Trattner, Sigal; Greenspan, Hayit; Tepper, Gapi; Abboud, Shimon

    2003-05-01

    The objective of the current study is to develop an automatic tool to identify bacterial types using computer-vision and statistical modeling techniques. Bacteriophage (phage) typing methods are used to identify and extract representative profiles of bacterial types, such as Staphylococcus aureus. Current systems rely on the subjective reading of plaque profiles by a human expert. This process is time-consuming and prone to errors, especially as technology is enabling the increase in the number of phages used for typing. The statistical methodology presented in this work provides for an automated, objective and robust analysis of visual data, along with the ability to cope with increasing data volumes.

  2. The Role of Margin in Link Design and Optimization

    NASA Technical Reports Server (NTRS)

    Cheung, K.

    2015-01-01

    Link analysis is a system engineering process in the design, development, and operation of communication systems and networks. Link models are mathematical abstractions representing the useful signal power and the undesirable noise and attenuation effects (including weather effects if the signal path traverses the atmosphere); they are integrated into the link budget calculation, which provides the estimates of signal power and noise power at the receiver. The link margin is then applied in an attempt to counteract the fluctuations of signal and noise power and ensure reliable data delivery from transmitter to receiver. (Link margin is dictated by the link margin policy or requirements.) A simple link budgeting approach that assumes link parameters to be deterministic values typically adopts a rule-of-thumb policy of 3 dB link margin. This policy works for most S- and X-band links because of their insensitivity to weather effects, but for higher-frequency links such as Ka-band, Ku-band, and optical communication links, it is unclear whether a 3 dB link margin would guarantee link closure. Statistical link analysis that adopts a 2-sigma or 3-sigma link margin incorporates link uncertainties in the sigma calculation. (The Deep Space Network (DSN) link margin policies are 2-sigma for downlink and 3-sigma for uplink.) Link reliability can therefore be quantified statistically even for higher-frequency links. However, in the current statistical link analysis approach, link reliability is only expressed as the likelihood of exceeding the signal-to-noise ratio (SNR) threshold that corresponds to a given bit-error-rate (BER) or frame-error-rate (FER) requirement. The method does not provide the true BER or FER estimate of the link with margin, or the required SNR that would meet the BER or FER requirement in the statistical sense. In this paper, we perform an in-depth analysis of the relationship between the BER/FER requirement, the operating SNR, and the coding performance curve in the case when the channel coherence time of link fluctuation is comparable to or larger than the time duration of a codeword. We compute the "true" SNR design point that would meet the BER/FER requirement by taking into account the fluctuation of signal power and noise power at the receiver and the shape of the coding performance curve. This analysis yields a number of valuable insights into the design choices of coding scheme and link margin for reliable data delivery in a communication system, space and ground. We illustrate the aforementioned analysis using a number of standard NASA error-correcting codes.
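
    The core calculation can be sketched numerically. Assuming, for illustration only, an uncoded-BPSK curve in place of a real coding performance curve and Gaussian dB fluctuations of the received SNR, the "true" design point is the mean SNR whose fluctuation-averaged BER meets the requirement:

```python
# Sketch: find the mean-SNR design point whose fluctuation-averaged BER meets
# a requirement. Uncoded BPSK stands in for a real coding performance curve.
import numpy as np
from scipy.special import erfc
from scipy.integrate import quad
from scipy.optimize import brentq

def ber(snr_db):
    """Uncoded BPSK bit error rate at the given Eb/N0 in dB."""
    return 0.5 * erfc(np.sqrt(10.0 ** (snr_db / 10.0)))

def avg_ber(mean_snr_db, sigma_db):
    """Expected BER when the received SNR (dB) fluctuates as N(mean, sigma)."""
    f = lambda s: ber(s) * np.exp(-(s - mean_snr_db) ** 2 / (2 * sigma_db ** 2))
    val, _ = quad(f, mean_snr_db - 6 * sigma_db, mean_snr_db + 6 * sigma_db)
    return val / (np.sqrt(2 * np.pi) * sigma_db)

req = 1e-5                                   # BER requirement
static = brentq(lambda m: ber(m) - req, 0.0, 30.0)   # no-fluctuation design point
for sigma_db in (0.5, 1.0, 2.0):
    design = brentq(lambda m: avg_ber(m, sigma_db) - req, 0.0, 30.0)
    print(f"sigma = {sigma_db} dB: design SNR {design:.2f} dB "
          f"(extra margin {design - static:.2f} dB)")
```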

  3. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.
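
    For flavor, the sketch below implements one common eigenvalue-based estimator of the effective number of tests (in the spirit of Li and Ji, 2005; the paper derives its own estimators, which may differ) applied to a block-correlated set of gene-set statistics:

```python
# Sketch: effective number of tests from a correlation matrix of test statistics.
import numpy as np

def effective_tests(corr):
    """Li-Ji style estimate: count 1 per eigenvalue >= 1, plus fractional parts."""
    lam = np.clip(np.linalg.eigvalsh(corr), 0.0, None)
    return float(np.sum((lam >= 1.0) + (lam - np.floor(lam))))

# Example: 3 blocks of 10 highly correlated (overlapping) gene sets
block = 0.9 * np.ones((10, 10)) + 0.1 * np.eye(10)
corr = np.kron(np.eye(3), block)
m_eff = effective_tests(corr)
print(f"nominal tests: 30, effective tests: {m_eff:.1f}")
print(f"Sidak-adjusted alpha: {1 - (1 - 0.05) ** (1 / m_eff):.4f}")
```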

  4. An empirical model for estimating solar radiation in the Algerian Sahara

    NASA Astrophysics Data System (ADS)

    Benatiallah, Djelloul; Benatiallah, Ali; Bouchouicha, Kada; Hamouda, Messaoud; Nasri, Bahous

    2018-05-01

    The present work aims to evaluate the empirical model R.sun for estimating the solar radiation flux on a horizontal plane under clear sky for the city of Adrar, Algeria (27°18 N, 0°11 W), and to compare the estimates with measurements at the site. The expected results of this comparison are of importance for investment studies of solar systems (solar power plants for electricity production, CSP) and also for the design and performance analysis of any system using solar energy. The statistical indicators used to evaluate the accuracy of the model were the mean bias error (MBE), root mean square error (RMSE), and coefficient of determination. The results show that for global radiation, the daily correlation coefficient is 0.9984, the mean absolute percentage error is 9.44%, the daily mean bias error is -7.94%, and the daily root mean square error is 12.31%.
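
    The quoted indicators are straightforward to reproduce; a minimal sketch with made-up measured and modelled values (percentage forms normalized by the mean of the measurements):

```python
# Sketch: MBE, RMSE, MAPE, and correlation for model-vs-measurement comparison.
import numpy as np

measured = np.array([5.1, 6.3, 7.0, 6.8, 5.9, 4.7])   # e.g. kWh/m^2/day
modelled = np.array([4.9, 6.5, 7.4, 6.6, 5.5, 4.8])

mbe = np.mean(modelled - measured)
rmse = np.sqrt(np.mean((modelled - measured) ** 2))
mape = 100 * np.mean(np.abs((modelled - measured) / measured))
r = np.corrcoef(measured, modelled)[0, 1]

print(f"MBE  = {100 * mbe / measured.mean():+.2f} %")
print(f"RMSE = {100 * rmse / measured.mean():.2f} %")
print(f"MAPE = {mape:.2f} %")
print(f"r    = {r:.4f}")
```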

  5. Global Warming Estimation from MSU

    NASA Technical Reports Server (NTRS)

    Prabhakara, C.; Iacovazzi, Robert, Jr.

    1999-01-01

    In this study, we have developed time series of global temperature for 1980-97 based on the Microwave Sounding Unit (MSU) Ch 2 (53.74 GHz) observations taken from polar-orbiting NOAA operational satellites. In order to create these time series, systematic errors (approx. 0.1 K) in the Ch 2 data arising from inter-satellite differences are removed objectively. On the other hand, smaller systematic errors (approx. 0.03 K) in the data due to orbital drift of each satellite cannot be removed objectively. Such errors are expected to remain in the time series and leave an uncertainty in the inferred global temperature trend. With the help of a statistical method, the error in the MSU-inferred global temperature trend resulting from orbital drifts and residual inter-satellite differences of all satellites is estimated to be 0.06 K/decade. Incorporating this error, our analysis shows that the global temperature increased at a rate of 0.13 +/- 0.06 K/decade during 1980-97.

  6. Influence of ultraviolet irradiation on data retention characteristics in resistive random access memory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kimura, K.; Ohmi, K.; Tottori University Electronic Display Research Center, 101 Minami4-chome, Koyama-cho, Tottori-shi, Tottori 680-8551

    With increasing density of memory devices, the issue of soft errors generated by cosmic rays is becoming more and more serious. Therefore, the resistance of resistive random access memory (ReRAM) to cosmic radiation has to be elucidated for practical use. In this paper, we investigated the data retention characteristics of ReRAM with a Pt/NiO/ITO structure under ultraviolet irradiation. Soft errors were confirmed to be caused by ultraviolet irradiation in both low- and high-resistance states. An analysis of the wavelength dependence of light irradiation on data retention characteristics suggested that the errors were caused by electronic excitation from the valence band to the conduction band and to the energy level generated by the introduction of oxygen vacancies. Based on statistically estimated soft error rates, the errors were suggested to be caused by the cohesion and dispersion of oxygen vacancies owing to the generation of electron-hole pairs and valence changes by the ultraviolet irradiation.

  7. Multiple regression and Artificial Neural Network for long-term rainfall forecasting using large scale climate modes

    NASA Astrophysics Data System (ADS)

    Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.

    2013-10-01

    In this study, the application of Artificial Neural Networks (ANN) and Multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated, using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating the ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecasts. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and the Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria, with correlation coefficients of -0.99 to -0.90, compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria, with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of the multiple regression models to forecast out-of-sample sets is comparable with ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to the multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large-scale climate modes.
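
    Of the four skill measures used, the Willmott index of agreement d is the least standard; a short sketch (with invented rainfall values) of its computation alongside MSE and MAE:

```python
# Sketch: Willmott (1981) index of agreement, with MSE and MAE for comparison.
import numpy as np

def willmott_d(obs, pred):
    """d = 1 - sum((p-o)^2) / sum((|p-obar| + |o-obar|)^2), in [0, 1]."""
    obar = obs.mean()
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - obar) + np.abs(obs - obar)) ** 2)
    return 1.0 - num / den

obs = np.array([55.0, 80.0, 120.0, 95.0, 60.0])    # spring rainfall (mm), invented
pred = np.array([62.0, 75.0, 110.0, 101.0, 58.0])
print(f"MSE = {np.mean((pred - obs) ** 2):.1f}, "
      f"MAE = {np.mean(np.abs(pred - obs)):.1f}, d = {willmott_d(obs, pred):.3f}")
```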

  8. Determination of Type I Error Rates and Power of Answer Copying Indices under Various Conditions

    ERIC Educational Resources Information Center

    Yormaz, Seha; Sünbül, Önder

    2017-01-01

    This study aims to determine the Type I error rates and power of S[subscript 1] , S[subscript 2] indices and kappa statistic at detecting copying on multiple-choice tests under various conditions. It also aims to determine how copying groups are created in order to calculate how kappa statistics affect Type I error rates and power. In this study,…

  9. An analytic technique for statistically modeling random atomic clock errors in estimation

    NASA Technical Reports Server (NTRS)

    Fell, P. J.

    1981-01-01

    Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System and with baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first-order Markov processes is used to approximate the power spectral density obtained from the Allan variance.
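
    The building block of such an approximation is the first-order (exponentially correlated) Gauss-Markov process; a sketch with placeholder time constants and variances shows how five such components can be summed:

```python
# Sketch: sum of five first-order Gauss-Markov processes as a clock-error model.
# Time constants and variances are arbitrary placeholders, not fitted values.
import numpy as np

rng = np.random.default_rng(3)

def gauss_markov(n, dt, tau, sigma):
    """x_{k+1} = a x_k + w_k with a = exp(-dt/tau) and stationary std sigma."""
    a = np.exp(-dt / tau)
    q = sigma * np.sqrt(1.0 - a * a)     # keeps the process stationary
    x = np.empty(n)
    x[0] = rng.normal(0.0, sigma)
    for k in range(1, n):
        x[k] = a * x[k - 1] + rng.normal(0.0, q)
    return x

dt, n = 1.0, 10_000
taus = (10.0, 100.0, 1000.0, 5000.0, 20000.0)   # five Markov components
sigmas = (1.0, 0.7, 0.5, 0.3, 0.2)
clock_error = sum(gauss_markov(n, dt, t, s) for t, s in zip(taus, sigmas))
print(f"sample std of summed process: {clock_error.std():.3f}")
```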

  10. Statistical theory and methodology for remote sensing data analysis

    NASA Technical Reports Server (NTRS)

    Odell, P. L.

    1974-01-01

    A model is developed for the evaluation of acreages (proportions) of different crop types over a geographical area using a classification approach, and methods for estimating the crop acreages are given. In estimating the acreage of a specific crop type such as wheat, it is suggested to treat the problem as a two-crop problem, wheat vs. non-wheat, since this simplifies the estimation problem considerably. The error analysis and the sample size problem are investigated for the two-crop approach. Certain numerical results for sample sizes are given for a JSC-ERTS-1 data example on wheat identification performance in Hill County, Montana, and Burke County, North Dakota. Lastly, for a large-area crop acreage inventory, a sampling scheme for acquiring sample data is suggested, and the problems of crop acreage estimation and error analysis are discussed.

  11. Calculating the free energy of transfer of small solutes into a model lipid membrane: Comparison between metadynamics and umbrella sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bochicchio, Davide; Panizon, Emanuele; Ferrando, Riccardo

    2015-10-14

    We compare the performance of two well-established computational algorithms for the calculation of free-energy landscapes of biomolecular systems, umbrella sampling and metadynamics. We look at benchmark systems composed of polyethylene and polypropylene oligomers interacting with lipid (phosphatidylcholine) membranes, aiming at the calculation of the oligomer water-membrane free energy of transfer. We model our test systems at two different levels of description, united-atom and coarse-grained. We provide optimized parameters for the two methods at both resolutions. We devote special attention to the analysis of statistical errors in the two different methods and propose a general procedure for the error estimation in metadynamics simulations. Metadynamics and umbrella sampling yield the same estimates for the water-membrane free energy profile, but metadynamics can be more efficient, providing lower statistical uncertainties within the same simulation time.
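
    A generic way to estimate statistical errors from correlated simulation data is block averaging; the sketch below (an AR(1) surrogate series, not metadynamics output, and not the paper's specific procedure) shows why naive standard errors are too small and how block estimates plateau:

```python
# Sketch: block-averaging error estimate for a correlated time series.
import numpy as np

rng = np.random.default_rng(4)
n, a = 100_000, 0.95                 # AR(1) surrogate for correlated output
x = np.empty(n)
x[0] = 0.0
for k in range(1, n):
    x[k] = a * x[k - 1] + rng.standard_normal()

def block_error(x, n_blocks):
    """Standard error of the mean estimated from n_blocks block averages."""
    m = len(x) // n_blocks
    blocks = x[: m * n_blocks].reshape(n_blocks, m).mean(axis=1)
    return blocks.std(ddof=1) / np.sqrt(n_blocks)

naive = x.std(ddof=1) / np.sqrt(n)
print(f"naive SEM (ignores correlation): {naive:.4f}")
for nb in (100, 50, 20, 10):
    print(f"{nb:3d} blocks -> SEM {block_error(x, nb):.4f}")
```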

  12. Atlas of the Light Curves and Phase Plane Portraits of Selected Long-Period Variables

    NASA Astrophysics Data System (ADS)

    Kudashkina, L. S.; Andronov, I. L.

    2017-12-01

    For a group of Mira-type stars, semi-regular variables, and some RV Tau-type stars, the limit cycles were computed and plotted using phase plane diagrams. As the generalized coordinates x and x', we used φ, the brightness of the star, and its phase derivative. We used mean phase light curves based on observations of various authors from the databases of the AAVSO, AFOEV, VSOLJ, and ASAS, approximated by a trigonometric polynomial of statistically optimal degree. For a simple sine-like light curve, the limit cycle is a simple ellipse. In the case of a more complicated light curve, in which harmonics are statistically significant, the limit cycle deviates from an ellipse. In addition to the classical analysis, we use error estimates of the smoothing function and its derivative to constrain an "error corridor" in the phase plane.
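
    The construction can be sketched as follows (synthetic light curve and a fixed harmonic degree; selection of the statistically optimal degree is omitted): fit a trigonometric polynomial to the phased magnitudes, then trace the limit cycle from the smoothed curve and its phase derivative:

```python
# Sketch: trigonometric-polynomial fit of a phased light curve and the
# corresponding (brightness, phase-derivative) limit cycle.
import numpy as np

rng = np.random.default_rng(5)
phase = np.sort(rng.uniform(0.0, 1.0, 300))
mag = (8.0 + 1.2 * np.cos(2 * np.pi * phase)
       + 0.3 * np.cos(4 * np.pi * phase - 1.0)
       + rng.normal(0.0, 0.05, phase.size))            # observational scatter

deg = 3                                                # harmonics kept (fixed here)
cols = [np.ones_like(phase)]
for j in range(1, deg + 1):
    cols += [np.cos(2 * np.pi * j * phase), np.sin(2 * np.pi * j * phase)]
c, *_ = np.linalg.lstsq(np.column_stack(cols), mag, rcond=None)

grid = np.linspace(0.0, 1.0, 400)
smooth = c[0] + sum(c[2*j - 1] * np.cos(2 * np.pi * j * grid)
                    + c[2*j] * np.sin(2 * np.pi * j * grid)
                    for j in range(1, deg + 1))
deriv = sum(2 * np.pi * j * (c[2*j] * np.cos(2 * np.pi * j * grid)
                             - c[2*j - 1] * np.sin(2 * np.pi * j * grid))
            for j in range(1, deg + 1))
# (smooth, deriv) traces the limit cycle: an ellipse for a pure sinusoid,
# distorted when higher harmonics are significant.
print(f"brightness range: {smooth.min():.2f} to {smooth.max():.2f} mag")
```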

  13. Comparative study of anatomical normalization errors in SPM and 3D-SSP using digital brain phantom.

    PubMed

    Onishi, Hideo; Matsutake, Yuki; Kawashima, Hiroki; Matsutomo, Norikazu; Amijima, Hizuru

    2011-01-01

    In single photon emission computed tomography (SPECT) cerebral blood flow studies, two major algorithms are widely used: statistical parametric mapping (SPM) and three-dimensional stereotactic surface projections (3D-SSP). The aim of this study is to compare an SPM algorithm-based easy Z-score imaging system (eZIS) and a 3D-SSP system with respect to errors of anatomical standardization, using 3D digital brain phantom images. We developed a 3D digital brain phantom based on MR images to simulate the effects of head tilt, perfusion-defective region size, and count value reduction rate on the SPECT images. This digital phantom was used to compare the errors of anatomical standardization by the eZIS and the 3D-SSP algorithms. While the eZIS allowed accurate standardization of the images of the phantom simulating a head in rotation, lateroflexion, anteflexion, or retroflexion without angle dependency, the standardization by 3D-SSP was not accurate enough at head tilts of approximately 25° or more. When the simulated head contained perfusion-defective regions, one of the 3D-SSP images showed an error of 6.9% from the true value, while one of the eZIS images showed an error as large as 63.4%, revealing a significant underestimation. When required to evaluate regions with decreased perfusion due to such causes as hemodynamic cerebral ischemia, 3D-SSP is preferable. In statistical image analysis, the image must always be reconfirmed after anatomical standardization.

  14. Hospital inpatient self-administration of medicine programmes: a critical literature review.

    PubMed

    Wright, Julia; Emerson, Angela; Stephens, Martin; Lennan, Elaine

    2006-06-01

    The Department of Health, pharmaceutical and nursing bodies have advocated the benefits of self-administration programmes (SAPs), but their implementation within UK hospitals has been limited. Perceived barriers are: anticipated increased workload, insufficient resources and patient safety concerns. This review aims to discover if benefits of SAPs are supported in the literature in relation to risk and resource implications. Electronic databases were searched up to March 2004. Published English language articles that described and evaluated implementation of an SAP were included. Outcomes reported were: compliance measures, errors, knowledge, patient satisfaction, and nursing and pharmacy time. Most of the 51 papers reviewed had methodological flaws. SAPs varied widely in content and structure. Twelve studies (10 controlled) measured compliance by tablet counts. Of 7 studies subjected to statistical analysis, four demonstrated a significant difference in compliance between SAP and controls. Eight studies (5 controlled) measured errors as an outcome. Of the two evaluated statistically, only one demonstrated significantly fewer medication errors in the SAP group than in controls. Seventeen papers (11 controlled) studied the effect of SAPs on patients' medication knowledge. Ten of the 11 statistically analysed studies showed that SAP participants knew significantly more about some aspects of their medication than did controls. Seventeen studies (5 controlled), measured patient satisfaction. Two studies were statistically analysed and these studies suggested that patients were satisfied and preferred SAP. Seven papers studied pharmacy time, three studied nursing time but results were not compared to controls. The paucity of well-designed studies, flawed methodology and inadequate reporting in many papers make conclusions hard to draw. Conclusive evidence that SAPs improve compliance was not provided. Although patients participating in SAPs make errors, small numbers of patients are often responsible for a large number of errors. Whilst most studies suggest that SAPs increase patient's knowledge in part, it is difficult to separate out the effect of the educational component of many SAPs. Most patients who participated in SAPs were satisfied with their care and many would choose to take part in a SAP in the future. No studies measured the total resource requirement of implementing and maintaining a SAP.

  15. Estimating Uncertainties in the Multi-Instrument SBUV Profile Ozone Merged Data Set

    NASA Technical Reports Server (NTRS)

    Frith, Stacey; Stolarski, Richard

    2015-01-01

    The MOD data set is uniquely qualified for use in long-term ozone analysis because of its long record, high spatial coverage, and consistent instrument design and algorithm. The estimated MOD uncertainty term significantly increases the uncertainty over the statistical error alone. Trends in the post-2000 period are generally positive in the upper stratosphere, but only significant at 1-1.6 hPa. Remaining uncertainties not yet included in the Monte Carlo model are: smoothing error (from 10 to 1 hPa), relative calibration uncertainty between N11 and N17, and seasonal cycle differences between SBUV records.

  16. Issues with data and analyses: Errors, underlying themes, and potential solutions

    PubMed Central

    Allison, David B.

    2018-01-01

    Some aspects of science, taken at the broadest level, are universal in empirical research. These include collecting, analyzing, and reporting data. In each of these aspects, errors can and do occur. In this work, we first discuss the importance of focusing on statistical and data errors to continually improve the practice of science. We then describe underlying themes of the types of errors and postulate contributing factors. To do so, we describe a case series of relatively severe data and statistical errors coupled with surveys of some types of errors to better characterize the magnitude, frequency, and trends. Having examined these errors, we then discuss the consequences of specific errors or classes of errors. Finally, given the extracted themes, we discuss methodological, cultural, and system-level approaches to reducing the frequency of commonly observed errors. These approaches will plausibly contribute to the self-critical, self-correcting, ever-evolving practice of science, and ultimately to furthering knowledge. PMID:29531079

  17. Assessment of Metronidazole Susceptibility in Helicobacter pylori: Statistical Validation and Error Rate Analysis of Breakpoints Determined by the Disk Diffusion Test

    PubMed Central

    Chaves, Sandra; Gadanho, Mário; Tenreiro, Rogério; Cabrita, José

    1999-01-01

    Metronidazole susceptibility of 100 Helicobacter pylori strains was assessed by determining the inhibition zone diameters by the disk diffusion test and the MICs by agar dilution and the PDM Epsilometer test (E test). Linear regression analysis was performed, allowing the definition of significant linear relations, and revealed correlations of disk diffusion results with both E-test and agar dilution results (r2 = 0.88 and 0.81, respectively). No significant differences (P = 0.84) were found between MICs defined by E test and those defined by agar dilution, taken as the standard. A reproducibility comparison between the E-test and disk diffusion tests showed that they are equivalent, with good precision. Two interpretative susceptibility schemes (with or without an intermediate class) were compared by an interpretative error rate analysis method. The susceptibility classification scheme that included the intermediate category was retained, and breakpoints were assessed for the diffusion assay with 5-μg metronidazole disks. Strains with inhibition zone diameters less than 16 mm were defined as resistant (MIC > 8 μg/ml), those with zone diameters equal to or greater than 16 mm but less than 21 mm were considered intermediate (4 μg/ml < MIC ≤ 8 μg/ml), and those with zone diameters of 21 mm or greater were regarded as susceptible (MIC ≤ 4 μg/ml). Error rate analysis applied to this classification scheme showed occurrence frequencies of 1% for major errors and 7% for minor errors when the results were compared to those obtained by agar dilution. No very major errors were detected, suggesting that disk diffusion might be a good alternative for determining the metronidazole sensitivity of H. pylori strains. PMID:10203543
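
    The interpretative error-rate bookkeeping reduces to comparing the zone-diameter class with the MIC class strain by strain; a sketch using the breakpoints above and invented strain data:

```python
# Sketch: interpretative error rates (very major / major / minor) from paired
# zone diameters and MICs, using the breakpoints quoted in the abstract.
import numpy as np

def class_by_zone(d_mm):
    return "R" if d_mm < 16 else ("I" if d_mm < 21 else "S")

def class_by_mic(mic):
    return "R" if mic > 8 else ("I" if mic > 4 else "S")

zones = np.array([12, 14, 17, 19, 22, 25, 15, 21, 30, 18])   # invented strains
mics = np.array([16, 8, 8, 4, 2, 1, 16, 4, 0.5, 6])

very_major = major = minor = 0
for d, m in zip(zones, mics):
    z, r = class_by_zone(d), class_by_mic(m)
    if z == r:
        continue
    if z == "S" and r == "R":
        very_major += 1      # false susceptibility: the dangerous error
    elif z == "R" and r == "S":
        major += 1           # false resistance
    else:
        minor += 1           # any disagreement involving the intermediate class
print(f"very major {very_major}, major {major}, minor {minor} of {len(zones)}")
```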

  18. Test data analysis for concentrating photovoltaic arrays

    NASA Astrophysics Data System (ADS)

    Maish, A. B.; Cannon, J. E.

    A test data analysis approach for use with steady-state efficiency measurements taken on concentrating photovoltaic arrays is presented. The analysis procedures can be used to identify biased and erroneous data. The steps involved in analyzing the test data are screening the data, developing coefficients for the performance equation, analyzing statistics to ensure adequacy of the regression fit to the data, and plotting the data. In addition, the sources and magnitudes of the precision and bias errors that affect measurement accuracy are analyzed.

  19. On the Correct Analysis of the Foundations of Theoretical Physics

    NASA Astrophysics Data System (ADS)

    Kalanov, Temur Z.

    2007-04-01

    The problem of truth in science, the most urgent problem of our time, is discussed. A correct theoretical analysis of the foundations of theoretical physics is proposed. The principle of the unity of formal logic and rational dialectics is the methodological basis of the analysis. The main result is as follows: the generally accepted foundations of theoretical physics (i.e. Newtonian mechanics, Maxwell electrodynamics, thermodynamics, statistical physics and physical kinetics, the theory of relativity, quantum mechanics) contain a set of logical errors. These errors are explained by the existence of a global cause: the errors are a collateral and inevitable result of the inductive way of cognition of Nature, i.e. a result of movement from the formation of separate concepts to the formation of a system of concepts. Consequently, theoretical physics is entering its greatest crisis. It means that physics as a science of phenomenon leaves the progress stage for a science of essence (information). Acknowledgment: The books "Surprises in Theoretical Physics" (1979) and "More Surprises in Theoretical Physics" (1991) by Sir Rudolf Peierls stimulated my 25-year work.

  20. Mindtagger: A Demonstration of Data Labeling in Knowledge Base Construction.

    PubMed

    Shin, Jaeho; Ré, Christopher; Cafarella, Michael

    2015-08-01

    End-to-end knowledge base construction systems using statistical inference are enabling more people to automatically extract high-quality domain-specific information from unstructured data. As a result of deploying the DeepDive framework across several domains, we found new challenges in debugging and improving such end-to-end systems to construct high-quality knowledge bases. DeepDive has an iterative development cycle in which users improve the data. To help our users, we needed to develop principles for analyzing the system's errors as well as provide tooling for inspecting and labeling various data products of the system. We created guidelines for error analysis modeled after our colleagues' best practices, in which data labeling plays a critical role in every step of the analysis. To enable more productive and systematic data labeling, we created Mindtagger, a versatile tool that can be configured to support a wide range of tasks. In this demonstration, we show in detail what data labeling tasks are modeled in our error analysis guidelines and how each of them is performed using Mindtagger.

  1. Inference of median difference based on the Box-Cox model in randomized clinical trials.

    PubMed

    Maruo, K; Isogawa, N; Gosho, M

    2015-05-10

    In randomized clinical trials, many medical and biological measurements are not normally distributed and are often skewed. The Box-Cox transformation is a powerful procedure for comparing two treatment groups on skewed continuous variables in terms of a statistical test. However, it is difficult to directly estimate and interpret the location difference between the two groups on the original scale of the measurement. We propose a helpful method that infers the difference of the treatment effect on the original scale in a more easily interpretable form. We also provide statistical analysis packages that consistently include an estimate of the treatment effect, covariance adjustments, standard errors, and statistical hypothesis tests. A simulation study focusing on randomized parallel-group clinical trials with two treatment groups indicates that the performance of the proposed method is equivalent to or better than that of existing non-parametric approaches in terms of the type-I error rate and power. We illustrate our method with cluster of differentiation 4 (CD4) data from an acquired immune deficiency syndrome clinical trial. Copyright © 2015 John Wiley & Sons, Ltd.
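
    A hedged sketch of the underlying idea (not the authors' package): fit a common Box-Cox transformation, compare group means on the transformed scale, and back-transform, since under the model the transformed-scale mean maps to the original-scale median:

```python
# Sketch: median difference on the original scale via a Box-Cox normal model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
grp0 = rng.lognormal(mean=5.0, sigma=0.6, size=80)     # skewed outcomes
grp1 = rng.lognormal(mean=5.3, sigma=0.6, size=80)

pooled = np.concatenate([grp0, grp1])
_, lam = stats.boxcox(pooled)                          # common transform parameter

def bc(x, lam):
    return (x ** lam - 1) / lam if lam != 0 else np.log(x)

def bc_inv(z, lam):
    return (lam * z + 1) ** (1 / lam) if lam != 0 else np.exp(z)

# Transformed-scale means back-transform to original-scale medians under the model
m0, m1 = bc(grp0, lam).mean(), bc(grp1, lam).mean()
diff = bc_inv(m1, lam) - bc_inv(m0, lam)
print(f"lambda = {lam:.3f}, model-based median difference = {diff:.1f}")
print(f"empirical median difference = {np.median(grp1) - np.median(grp0):.1f}")
```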

  2. Investigating the role of background and observation error correlations in improving a model forecast of forest carbon balance using four dimensional variational data assimilation.

    NASA Astrophysics Data System (ADS)

    Pinnington, Ewan; Casella, Eric; Dance, Sarah; Lawless, Amos; Morison, James; Nichols, Nancy; Wilkinson, Matthew; Quaife, Tristan

    2016-04-01

    Forest ecosystems play an important role in sequestering human-emitted carbon dioxide from the atmosphere and therefore greatly reduce the effect of anthropogenically induced climate change. For that reason, understanding their response to climate change is of great importance. Efforts to implement variational data assimilation routines with functional ecology models and land surface models have been limited, with sequential and Markov chain Monte Carlo data assimilation methods being prevalent. When data assimilation has been used with models of carbon balance, background "prior" errors and observation errors have largely been treated as independent and uncorrelated. Correlations between background errors have long been known to be a key aspect of data assimilation in numerical weather prediction. More recently, it has been shown that accounting for correlated observation errors in the assimilation algorithm can considerably improve data assimilation results and forecasts. In this paper we implement a 4D-Var scheme with a simple model of forest carbon balance, for joint parameter and state estimation, and assimilate daily observations of Net Ecosystem CO2 Exchange (NEE) taken at the Alice Holt forest CO2 flux site in Hampshire, UK. We then investigate the effect of specifying correlations between parameter and state variables in the background error statistics and the effect of specifying correlations in time between the observation error statistics. The idea of including these correlations in time is new and has not been previously explored in carbon balance model data assimilation. In data assimilation, background and observation error statistics are often described by the background error covariance matrix and the observation error covariance matrix. We outline novel methods for creating correlated versions of these matrices, using a set of previously postulated dynamical constraints to include correlations in the background error statistics and a Gaussian correlation function to include time correlations in the observation error statistics. The methods used in this paper will allow the inclusion of time correlations between many different observation types in the assimilation algorithm, meaning that previously neglected information can be accounted for. In our experiments we compared the results using our new correlated background and observation error covariance matrices and those using diagonal covariance matrices. We found that using the new correlated matrices reduced the root mean square error in the 14-year forecast of daily NEE by 44%, decreasing from 4.22 g C m-2 day-1 to 2.38 g C m-2 day-1.
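
    The time-correlated observation-error covariance described above can be sketched directly with a Gaussian correlation function, R_ij = sigma^2 exp(-(t_i - t_j)^2 / (2 L^2)); sigma and the correlation length L below are illustrative values, not the study's:

```python
# Sketch: Gaussian time-correlated observation-error covariance R and its use
# in the observation term of a 4D-Var cost function, J_o = 0.5 * d^T R^{-1} d.
import numpy as np

def gaussian_R(times, sigma, length):
    dt = times[:, None] - times[None, :]
    return sigma ** 2 * np.exp(-dt ** 2 / (2.0 * length ** 2))

t = np.arange(0.0, 30.0)                       # daily observations (days)
R = gaussian_R(t, sigma=0.5, length=4.0)
R += 1e-8 * np.eye(t.size)                     # jitter keeps R positive definite

d = np.random.default_rng(7).normal(0.0, 0.5, t.size)   # toy innovation vector
J_corr = 0.5 * d @ np.linalg.solve(R, d)
J_diag = 0.5 * np.sum(d ** 2 / np.diag(R))
print(f"J_o with correlated R: {J_corr:.2f}")
print(f"J_o with diagonal  R: {J_diag:.2f}")
```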

  3. Statistical Optimality in Multipartite Ranking and Ordinal Regression.

    PubMed

    Uematsu, Kazuki; Lee, Yoonkyung

    2015-05-01

    Statistical optimality in multipartite ranking is investigated as an extension of bipartite ranking. We consider the optimality of ranking algorithms through minimization of the theoretical risk, which combines pairwise ranking errors of ordinal categories with differential ranking costs. The extension shows that, for a certain class of convex loss functions including exponential loss, the optimal ranking function can be represented as a ratio of weighted conditional probability of upper categories to lower categories, where the weights are given by the misranking costs. This result also bridges traditional ranking methods such as the proportional odds model in statistics with various ranking algorithms in machine learning. Further, the analysis of multipartite ranking with different costs provides a new perspective on non-smooth list-wise ranking measures such as the discounted cumulative gain and preference learning. We illustrate our findings with a simulation study and real data analysis.

  4. Structure-Specific Statistical Mapping of White Matter Tracts

    PubMed Central

    Yushkevich, Paul A.; Zhang, Hui; Simon, Tony; Gee, James C.

    2008-01-01

    We present a new model-based framework for the statistical analysis of diffusion imaging data associated with specific white matter tracts. The framework takes advantage of the fact that several of the major white matter tracts are thin sheet-like structures that can be effectively modeled by medial representations. The approach involves segmenting major tracts and fitting them with deformable geometric medial models. The medial representation makes it possible to average and combine tensor-based features along directions locally perpendicular to the tracts, thus reducing data dimensionality and accounting for errors in normalization. The framework enables the analysis of individual white matter structures, and provides a range of possibilities for computing statistics and visualizing differences between cohorts. The framework is demonstrated in a study of white matter differences in pediatric chromosome 22q11.2 deletion syndrome. PMID:18407524

  5. Bayesian adjustment for measurement error in continuous exposures in an individually matched case-control study.

    PubMed

    Espino-Hernandez, Gabriela; Gustafson, Paul; Burstyn, Igor

    2011-05-14

    In epidemiological studies, explanatory variables are frequently subject to measurement error. The aim of this paper is to develop a Bayesian method to correct for measurement error in multiple continuous exposures in individually matched case-control studies. This is a topic that has not been widely investigated. The new method is illustrated using data from an individually matched case-control study of the association between thyroid hormone levels during pregnancy and exposure to perfluorinated acids. The objective of the motivating study was to examine the risk of maternal hypothyroxinemia due to exposure to three perfluorinated acids measured on a continuous scale. Results from the proposed method are compared with those obtained from a naive analysis. Using a Bayesian approach, the developed method considers a classical measurement error model for the exposures, as well as the conditional logistic regression likelihood as the disease model, together with a random-effect exposure model. Proper and diffuse prior distributions are assigned, and results from a quality control experiment are used to estimate the measurement error variability of the perfluorinated acids. As a result, posterior distributions and 95% credible intervals of the odds ratios are computed. A sensitivity analysis of the method's performance in this application under different measurement error variability was performed. The proposed Bayesian method to correct for measurement error is feasible and can be implemented using statistical software. For the study on perfluorinated acids, a comparison of the inferences which are corrected for measurement error to those which ignore it indicates that little adjustment is manifested for the level of measurement error actually exhibited in the exposures. Nevertheless, a sensitivity analysis shows that more substantial adjustments arise if larger measurement errors are assumed. In individually matched case-control studies, the use of the conditional logistic regression likelihood as a disease model in the presence of measurement error in multiple continuous exposures can be justified by having a random-effect exposure model. The proposed method can be successfully implemented in WinBUGS to correct individually matched case-control studies for several mismeasured continuous exposures under a classical measurement error model.

  6. An assessment of the cultivated cropland class of NLCD 2006 using a multi-source and multi-criteria approach

    USGS Publications Warehouse

    Danielson, Patrick; Yang, Limin; Jin, Suming; Homer, Collin G.; Napton, Darrell

    2016-01-01

    We developed a method that analyzes the quality of the cultivated cropland class mapped in the USA National Land Cover Database (NLCD) 2006. The method integrates multiple geospatial datasets and a Multi Index Integrated Change Analysis (MIICA) change detection method that captures spectral changes to identify the spatial distribution and magnitude of potential commission and omission errors for the cultivated cropland class in NLCD 2006. The majority of the commission and omission errors in NLCD 2006 are in areas where cultivated cropland is not the most dominant land cover type. The errors are primarily attributed to the less accurate training dataset derived from the National Agricultural Statistics Service Cropland Data Layer dataset. In contrast, error rates are low in areas where cultivated cropland is the dominant land cover. Agreement between model-identified commission errors and independently interpreted reference data was high (79%). Agreement was low (40%) for omission error comparison. The majority of the commission errors in the NLCD 2006 cultivated crops were confused with low-intensity developed classes, while the majority of omission errors were from herbaceous and shrub classes. Some errors were caused by inaccurate land cover change from misclassification in NLCD 2001 and the subsequent land cover post-classification process.

  7. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    PubMed Central

    Matochko, Wadim L.; Derda, Ratmir

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, compared to the complexity of long genetic reads and genomes, allowed us to describe the library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n_i||, where n_i is the copy number of the i-th sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing can be described as a product of an N × N matrix and a stochastic sampling operator (S_a). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of S_a and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = S_a I_N, where I_N is the N × N identity matrix. Any bias in sequencing changes I_N to a non-unity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process. PMID:24416071
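
    A toy numerical version of this operator picture (sizes and copy-number distribution invented) realizes the stochastic sampling operator by multinomial draws and counts the clones censored by sampling alone:

```python
# Sketch: a phage library as a frequency vector n, with sequencing-as-sampling
# acting like a random diagonal operator realized by multinomial draws.
import numpy as np

rng = np.random.default_rng(8)
N = 1000                                   # toy "theoretical diversity"
n = rng.zipf(1.5, size=N).astype(float)    # skewed copy numbers, library-like

reads = rng.multinomial(50_000, n / n.sum())       # sample 50k sequencing reads
# The equivalent diagonal sampling operator S_a maps n -> reads elementwise:
S_a = np.divide(reads, n, out=np.zeros_like(n), where=n > 0)
assert np.allclose(S_a * n, reads)

dropped = np.sum((n > 0) & (reads == 0))
print(f"clones present but never sequenced (censored by sampling): {dropped}")
```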

  8. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  9. Statistical Analysis of Hit/Miss Data (Preprint)

    DTIC Science & Technology

    2012-07-01

    HDBK-1823A, 2009). Other agencies and industries have also made use of this guidance (Gandossi et al., 2010) and (Drury et al., 2006). It should... better accounting of false call rates such that the POD curve doesn't converge to 0 for small flaw sizes. The difficulty with conventional methods... 2002. Drury, Ghylin, and Holness, Error Analysis and Threat Magnitude for Carry-on Bag Inspection, Proceedings of the Human Factors and Ergonomic

  10. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Monte Carlo Simulations Comparing Fisher Exact Test and Unequal Variances t Test for Analysis of Differences Between Groups in Brief Hospital Lengths of Stay.

    PubMed

    Dexter, Franklin; Bayman, Emine O; Dexter, Elisabeth U

    2017-12-01

    We examined type I and II error rates for analysis of (1) mean hospital length of stay (LOS) versus (2) percentage of hospital LOS that are overnight. These 2 end points are suitable for when LOS is treated as a secondary economic end point. We repeatedly resampled LOS for 5052 discharges of thoracoscopic wedge resections and lung lobectomy at 26 hospitals. Unequal variances t test (Welch method) and Fisher exact test both were conservative (ie, type I error rate less than nominal level). The Wilcoxon rank sum test was included as a comparator; the type I error rates did not differ from the nominal level of 0.05 or 0.01. Fisher exact test was more powerful than the unequal variances t test at detecting differences among hospitals; estimated odds ratio for obtaining P < .05 with Fisher exact test versus unequal variances t test = 1.94, with 95% confidence interval, 1.31-3.01. Fisher exact test and Wilcoxon-Mann-Whitney had comparable statistical power in terms of differentiating LOS between hospitals. For studies with LOS to be used as a secondary end point of economic interest, there is currently considerable interest in the planned analysis being for the percentage of patients suitable for ambulatory surgery (ie, hospital LOS equals 0 or 1 midnight). Our results show that there need not be a loss of statistical power when groups are compared using this binary end point, as compared with either Welch method or Wilcoxon rank sum test.
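
    The comparison is easy to reproduce in miniature (synthetic LOS data, not the study's 5052 discharges): dichotomize LOS at 0 or 1 midnight for Fisher's exact test, and run the Welch t test on the raw values:

```python
# Sketch: Welch t test on mean LOS vs. Fisher exact test on the proportion of
# "ambulatory-suitable" stays (LOS of 0 or 1 midnight).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
los_a = rng.poisson(1.2, 200)            # hospital A lengths of stay (midnights)
los_b = rng.poisson(1.8, 200)            # hospital B

# Welch (unequal variances) t test on the raw LOS values
t_p = stats.ttest_ind(los_a, los_b, equal_var=False).pvalue

# Fisher exact test on the dichotomized end point LOS <= 1
table = [[np.sum(los_a <= 1), np.sum(los_a > 1)],
         [np.sum(los_b <= 1), np.sum(los_b > 1)]]
f_p = stats.fisher_exact(table)[1]

print(f"Welch t test p = {t_p:.4f}; Fisher exact p = {f_p:.4f}")
```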

  12. Analysis of error-correction constraints in an optical disk.

    PubMed

    Roberts, J D; Ryley, A; Jones, D M; Burke, D

    1996-07-10

    The compact disk read-only memory (CD-ROM) is a mature storage medium with complex error control. It comprises four levels of Reed Solomon codes allied to a sequence of sophisticated interleaving strategies and 8:14 modulation coding. New storage media are being developed and introduced that place still further demands on signal processing for error correction. It is therefore appropriate to explore thoroughly the limit of existing strategies to assess future requirements. We describe a simulation of all stages of the CD-ROM coding, modulation, and decoding. The results of decoding the burst error of a prescribed number of modulation bits are discussed in detail. Measures of residual uncorrected error within a sector are displayed by C1, C2, P, and Q error counts and by the status of the final cyclic redundancy check (CRC). Where each data sector is encoded separately, it is shown that error-correction performance against burst errors depends critically on the position of the burst within a sector. The C1 error measures the burst length, whereas C2 errors reflect the burst position. The performance of Reed Solomon product codes is shown by the P and Q statistics. It is shown that synchronization loss is critical near the limits of error correction. An example is given of miscorrection that is identified by the CRC check.

  13. Analysis of error-correction constraints in an optical disk

    NASA Astrophysics Data System (ADS)

    Roberts, Jonathan D.; Ryley, Alan; Jones, David M.; Burke, David

    1996-07-01

    The compact disk read-only memory (CD-ROM) is a mature storage medium with complex error control. It comprises four levels of Reed Solomon codes allied to a sequence of sophisticated interleaving strategies and 8:14 modulation coding. New storage media are being developed and introduced that place still further demands on signal processing for error correction. It is therefore appropriate to explore thoroughly the limit of existing strategies to assess future requirements. We describe a simulation of all stages of the CD-ROM coding, modulation, and decoding. The results of decoding the burst error of a prescribed number of modulation bits are discussed in detail. Measures of residual uncorrected error within a sector are displayed by C1, C2, P, and Q error counts and by the status of the final cyclic redundancy check (CRC). Where each data sector is encoded separately, it is shown that error-correction performance against burst errors depends critically on the position of the burst within a sector. The C1 error measures the burst length, whereas C2 errors reflect the burst position. The performance of Reed Solomon product codes is shown by the P and Q statistics. It is shown that synchronization loss is critical near the limits of error correction. An example is given of miscorrection that is identified by the CRC check.

  14. Statistical methods for astronomical data with upper limits. II - Correlation and regression

    NASA Technical Reports Server (NTRS)

    Isobe, T.; Feigelson, E. D.; Nelson, P. I.

    1986-01-01

    Statistical methods for calculating correlations and regressions in bivariate censored data where the dependent variable can have upper or lower limits are presented. Cox's regression and the generalization of Kendall's rank correlation coefficient provide significance levels for correlations, and the EM algorithm, under the assumption of normally distributed errors, and its nonparametric analog using the Kaplan-Meier estimator, give estimates for the slope of a regression line. Monte Carlo simulations demonstrate that survival analysis is reliable in determining correlations between luminosities at different bands. Survival analysis is applied to CO emission in infrared galaxies, X-ray emission in radio galaxies, H-alpha emission in cooling cluster cores, and radio emission in Seyfert galaxies.
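
    The regression-with-limits idea can be sketched as a censored-normal (Tobit-style) maximum-likelihood fit, which the EM algorithm mentioned above solves iteratively; here a direct numerical optimization stands in, with simulated data and an assumed detection threshold:

        # Sketch: ML regression when some dependent values are upper limits,
        # assuming normal errors. Data and threshold are hypothetical.
        import numpy as np
        from scipy import optimize, stats

        rng = np.random.default_rng(1)
        x = rng.uniform(0, 10, 100)
        y_true = 1.0 + 0.5 * x + rng.normal(0, 1.0, 100)
        limit = 4.0                                 # detection threshold
        detected = y_true > limit                   # below threshold -> upper limit
        y = np.where(detected, y_true, limit)

        def negloglik(theta):
            a, b, log_s = theta
            s = np.exp(log_s)
            mu = a + b * x
            ll_det = stats.norm.logpdf(y[detected], mu[detected], s)
            # censored points: the true value lies below the reported limit
            ll_cen = stats.norm.logcdf(y[~detected], mu[~detected], s)
            return -(ll_det.sum() + ll_cen.sum())

        res = optimize.minimize(negloglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead")
        print("intercept, slope, sigma:", res.x[0], res.x[1], np.exp(res.x[2]))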

  15. Statistical Significance of Optical Map Alignments

    PubMed Central

    Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C.

    2012-01-01

    The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities. PMID:22506568

  16. Measurement-device-independent quantum key distribution with source state errors and statistical fluctuation

    NASA Astrophysics Data System (ADS)

    Jiang, Cong; Yu, Zong-Wen; Wang, Xiang-Bin

    2017-03-01

    We show how to calculate the secure final key rate in the four-intensity decoy-state measurement-device-independent quantum key distribution protocol with both source errors and statistical fluctuations with a certain failure probability. Our results rely only on the range of a few parameters in the source state. All imperfections in this protocol have been taken into consideration without assuming any specific error patterns of the source.

  17. Estimating Uncertainty in Long Term Total Ozone Records from Multiple Sources

    NASA Technical Reports Server (NTRS)

    Frith, Stacey M.; Stolarski, Richard S.; Kramarova, Natalya; McPeters, Richard D.

    2014-01-01

    Total ozone measurements derived from the TOMS and SBUV backscattered solar UV instrument series cover the period from late 1978 to the present. As the SBUV series of instruments comes to an end, we look to the 10 years of data from the AURA Ozone Monitoring Instrument (OMI) and two years of data from the Ozone Mapping Profiler Suite (OMPS) on board the Suomi National Polar-orbiting Partnership satellite to continue the record. When combining these records to construct a single long-term data set for analysis we must estimate the uncertainty in the record resulting from potential biases and drifts in the individual measurement records. In this study we present a Monte Carlo analysis used to estimate uncertainties in the Merged Ozone Dataset (MOD), constructed from the Version 8.6 SBUV2 series of instruments. We extend this analysis to incorporate OMI and OMPS total ozone data into the record and investigate the impact of multiple overlapping measurements on the estimated error. We also present an updated column ozone trend analysis and compare the size of statistical error (error from variability not explained by our linear regression model) to that from instrument uncertainty.

  18. Publication bias was not a good reason to discourage trials with low power.

    PubMed

    Borm, George F; den Heijer, Martin; Zielhuis, Gerhard A

    2009-01-01

    The objective was to investigate whether it is justified to discourage trials with less than 80% power. Trials with low power are unlikely to produce conclusive results, but their findings can be used by pooling them in a meta-analysis. However, such an analysis may be biased, because trials with low power are likely to have a nonsignificant result and are less likely to be published than trials with a statistically significant outcome. We simulated several series of studies with varying degrees of publication bias and then calculated the "real" one-sided type I error and the bias of meta-analyses with a "nominal" error rate (significance level) of 2.5%. In single trials, in which heterogeneity was set at zero, low, and high, the error rates were 2.3%, 4.7%, and 16.5%, respectively. In multiple trials with 80%-90% power and a publication rate of 90% when the results were nonsignificant, the error rates could be as high as 5.1%. When the power was 50% and the publication rate of non-significant results was 60%, the error rates did not exceed 5.3%, whereas the bias was at most 15% of the difference used in the power calculation. The impact of publication bias does not warrant the exclusion of trials with 50% power.
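
    A minimal simulation in the spirit of this study, assuming a simple selective-publication rule and a fixed-effect meta-analysis; all rates below are illustrative, not the paper's exact settings:

        # Sketch: simulate publication bias and measure the realized one-sided
        # type I error of a fixed-effect meta-analysis under the null.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)
        n_meta, trials_per_meta, n_per_arm = 2000, 10, 50
        pub_rate_nonsig = 0.6                       # publish 60% of null results
        alpha, rejections = 0.025, 0

        for _ in range(n_meta):
            effects, ses = [], []
            for _ in range(trials_per_meta):
                a = rng.normal(0, 1, n_per_arm)     # true effect is zero
                b = rng.normal(0, 1, n_per_arm)
                d = b.mean() - a.mean()
                se = np.sqrt(a.var(ddof=1)/n_per_arm + b.var(ddof=1)/n_per_arm)
                sig = d / se > stats.norm.ppf(1 - alpha)
                if sig or rng.random() < pub_rate_nonsig:
                    effects.append(d); ses.append(se)
            if effects:
                effects, ses = np.array(effects), np.array(ses)
                w = 1.0 / ses**2                    # inverse-variance weights
                z = np.sum(w * effects) / np.sqrt(np.sum(w))
                rejections += z > stats.norm.ppf(1 - alpha)

        print("realized one-sided type I error:", rejections / n_meta)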

  19. Combining empirical approaches and error modelling to enhance predictive uncertainty estimation in extrapolation for operational flood forecasting. Tests on flood events on the Loire basin, France.

    NASA Astrophysics Data System (ADS)

    Berthet, Lionel; Marty, Renaud; Bourgin, François; Viatgé, Julie; Piotte, Olivier; Perrin, Charles

    2017-04-01

    An increasing number of operational flood forecasting centres assess the predictive uncertainty associated with their forecasts and communicate it to the end users. This information can match the end-users' needs (i.e. prove to be useful for efficient crisis management) only if it is reliable: reliability is therefore a key quality for operational flood forecasts. In 2015, the French flood forecasting national and regional services (Vigicrues network; www.vigicrues.gouv.fr) implemented a framework to compute quantitative discharge and water level forecasts and to assess the predictive uncertainty. Among the possible technical options to achieve this goal, a statistical analysis of past forecasting errors of deterministic models has been selected (QUOIQUE method, Bourgin, 2014). It is a data-based and non-parametric approach based on as few assumptions as possible about the mathematical structure of the forecasting error. In particular, a very simple assumption is made regarding the predictive uncertainty distributions for large events outside the range of the calibration data: the multiplicative error distribution is assumed to be constant, whatever the magnitude of the flood. Indeed, the predictive distributions may not be reliable in extrapolation. However, estimating the predictive uncertainty for these rare events is crucial when major floods are of concern. In order to improve forecast reliability for major floods, an attempt is made to combine the operational strength of the empirical statistical analysis with a simple error model. Since the heteroscedasticity of forecast errors can considerably weaken the predictive reliability for large floods, this error modelling is based on the log-sinh transformation, which has been shown to significantly reduce the heteroscedasticity of the transformed error in a simulation context, even for flood peaks (Wang et al., 2012). Exploratory tests on some operational forecasts issued during the recent floods experienced in France (major spring floods in June 2016 on the Loire river tributaries and flash floods in fall 2016) will be shown and discussed. References Bourgin, F. (2014). How to assess the predictive uncertainty in hydrological modelling? An exploratory work on a large sample of watersheds, AgroParisTech. Wang, Q. J., Shrestha, D. L., Robertson, D. E. and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, W05514, doi:10.1029/2011WR010973
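
    For reference, the log-sinh transformation of Wang et al. (2012) and its inverse can be sketched as follows; the parameter values are placeholders, not calibrated values:

        # Sketch of the log-sinh variance-stabilizing transformation:
        # z = (1/b) * log(sinh(a + b*y)); approximately logarithmic for small y
        # and linear for large y, which damps heteroscedasticity.
        import numpy as np

        def log_sinh(y, a, b):
            return np.log(np.sinh(a + b * y)) / b

        def log_sinh_inv(z, a, b):
            return (np.arcsinh(np.exp(b * z)) - a) / b

        q = np.array([10.0, 100.0, 1000.0])         # hypothetical discharges
        z = log_sinh(q, a=0.01, b=0.001)
        assert np.allclose(log_sinh_inv(z, 0.01, 0.001), q)   # round trip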

  20. Understanding seasonal variability of uncertainty in hydrological prediction

    NASA Astrophysics Data System (ADS)

    Li, M.; Wang, Q. J.

    2012-04-01

    Understanding uncertainty in hydrological prediction can be highly valuable for improving the reliability of streamflow prediction. In this study, a monthly water balance model, WAPABA, is combined with error models in a Bayesian joint probability framework to investigate the seasonal dependency of the prediction error structure. A seasonal invariant error model, analogous to traditional time series analysis, uses constant parameters for model error and accounts for no seasonal variation. In contrast, a seasonal variant error model uses a different set of parameters for bias, variance and autocorrelation for each individual calendar month. Potential connections amongst model parameters from similar months are not considered within the seasonal variant model, which could result in over-fitting and over-parameterization. A hierarchical error model further applies some distributional restrictions on model parameters within a Bayesian hierarchical framework. An iterative algorithm is implemented to expedite the maximum a posteriori (MAP) estimation of the hierarchical error model. The three error models are applied to forecasting streamflow at a catchment in southeast Australia in a cross-validation analysis. This study also presents a number of statistical measures and graphical tools to compare the predictive skills of the different error models. From probability integral transform histograms and other diagnostic graphs, the hierarchical error model conforms better to reliability when compared to the seasonal invariant error model. The hierarchical error model also generally provides the most accurate mean prediction in terms of the Nash-Sutcliffe model efficiency coefficient and the best probabilistic prediction in terms of the continuous ranked probability score (CRPS). The model parameters of the seasonal variant error model are very sensitive to each cross-validation, while the hierarchical error model produces much more robust and reliable model parameters. Furthermore, the result of the hierarchical error model shows that most of the model parameters are not seasonally variant, except for the error bias. The seasonal variant error model is likely to use more parameters than necessary to maximize the posterior likelihood. The model's flexibility and robustness indicate that the hierarchical error model has great potential for future streamflow predictions.
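
    The probability integral transform (PIT) check used above to compare the error models can be sketched as follows: for a reliable forecast distribution F, the values F(obs) should be uniform on [0, 1], so their histogram should be roughly flat (data simulated):

        # Sketch: PIT values for a perfectly reliable Gaussian forecast.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(10)
        obs = rng.normal(0, 1, 1000)                # observations
        pit = stats.norm.cdf(obs, loc=0, scale=1)   # PIT under the forecast CDF
        hist, _ = np.histogram(pit, bins=10, range=(0, 1))
        print(hist)   # roughly flat counts indicate a reliable forecast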

  1. A two-factor error model for quantitative steganalysis

    NASA Astrophysics Data System (ADS)

    Böhme, Rainer; Ker, Andrew D.

    2006-02-01

    Quantitative steganalysis refers to the exercise not only of detecting the presence of hidden stego messages in carrier objects, but also of estimating the secret message length. This problem is well studied, with many detectors proposed but only a sparse analysis of errors in the estimators. A deep understanding of the error model, however, is a fundamental requirement for the assessment and comparison of different detection methods. This paper presents a rationale for a two-factor model for sources of error in quantitative steganalysis, and shows evidence from a dedicated large-scale nested experimental set-up with a total of more than 200 million attacks. Apart from general findings about the distribution functions found in both classes of errors, their respective weight is determined, and implications for statistical hypothesis tests in benchmarking scenarios or regression analyses are demonstrated. The results are based on a rigorous comparison of five different detection methods under many different external conditions, such as size of the carrier, previous JPEG compression, and colour channel selection. We include analyses demonstrating the effects of local variance and cover saturation on the different sources of error, as well as presenting the case for a relative bias model for between-image error.
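
    The two-factor idea (a between-image error component shared by all attacks on one image, plus within-image scatter) can be sketched as a one-way random-effects variance decomposition; the component sizes below are invented, not the paper's estimates:

        # Sketch: decompose estimation error into between-image and
        # within-image variance components via one-way random-effects ANOVA.
        import numpy as np

        rng = np.random.default_rng(11)
        n_images, n_embeddings = 100, 20
        image_bias = rng.normal(0, 0.02, n_images)           # between-image factor
        errors = image_bias[:, None] + rng.normal(0, 0.05, (n_images, n_embeddings))

        ms_between = n_embeddings * errors.mean(1).var(ddof=1)
        ms_within = errors.var(1, ddof=1).mean()
        var_between = (ms_between - ms_within) / n_embeddings
        print("between-image var:", var_between, "within-image var:", ms_within)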

  2. Statistical flaws in design and analysis of fertility treatment studies on cryopreservation raise doubts on the conclusions

    PubMed Central

    van Gelder, P.H.A.J.M.; Nijs, M.

    2011-01-01

    Decisions about pharmacotherapy are taken by medical doctors and authorities based on comparative studies on the use of medications. In studies on fertility treatments in particular, methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some of the typical statistical flaws, illustrated by a number of example studies which have been published in peer reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care. PMID:24753877

  3. Statistical flaws in design and analysis of fertility treatment studies on cryopreservation raise doubts on the conclusions.

    PubMed

    van Gelder, P H A J M; Nijs, M

    2011-01-01

    Decisions about pharmacotherapy are taken by medical doctors and authorities based on comparative studies on the use of medications. In studies on fertility treatments in particular, methodological quality is of utmost importance in the application of evidence-based medicine and systematic reviews. Nevertheless, flaws and omissions appear quite regularly in these types of studies. The current study aims to present an overview of some of the typical statistical flaws, illustrated by a number of example studies which have been published in peer reviewed journals. Based on an investigation of eleven randomly selected studies on fertility treatments with cryopreservation, it appeared that the methodological quality of these studies often did not fulfil the required statistical criteria. The following statistical flaws were identified: flaws in study design, patient selection, and units of analysis or in the definition of the primary endpoints. Other errors could be found in p-value and power calculations or in critical p-value definitions. Proper interpretation of the results and/or use of these study results in a meta-analysis should therefore be conducted with care.

  4. Moisture Forecast Bias Correction in GEOS DAS

    NASA Technical Reports Server (NTRS)

    Dee, D.

    1999-01-01

    Data assimilation methods rely on numerous assumptions about the errors involved in measuring and forecasting atmospheric fields. One of the more disturbing of these is that short-term model forecasts are assumed to be unbiased. In the case of atmospheric moisture, for example, observational evidence shows that the systematic component of errors in forecasts and analyses is often of the same order of magnitude as the random component. We have implemented a sequential algorithm for estimating forecast moisture bias from rawinsonde data in the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The algorithm is designed to remove the systematic component of analysis errors and can be easily incorporated in an existing statistical data assimilation system. We will present results of initial experiments that show a significant reduction of bias in the GEOS DAS moisture analyses.
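
    A toy version of sequential bias estimation from innovations (observation minus forecast) is sketched below; the gain gamma and the noise levels are assumptions for illustration, not GEOS DAS values:

        # Sketch: exponential-smoothing style update of a forecast bias
        # estimate from a stream of innovations.
        import numpy as np

        rng = np.random.default_rng(9)
        true_bias, gamma, b = 0.5, 0.1, 0.0
        for k in range(200):
            innovation = true_bias + rng.normal(0, 1)   # obs minus (biased) forecast
            b += gamma * (innovation - b)               # update the bias estimate
        print("estimated bias:", round(b, 3))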

  5. Identification of dynamic systems, theory and formulation

    NASA Technical Reports Server (NTRS)

    Maine, R. E.; Iliff, K. W.

    1985-01-01

    The problem of estimating parameters of dynamic systems is addressed in order to present the theoretical basis of system identification and parameter estimation in a manner that is complete and rigorous, yet understandable with minimal prerequisites. Maximum likelihood and related estimators are highlighted. The approach used requires familiarity with calculus, linear algebra, and probability, but does not require knowledge of stochastic processes or functional analysis. The treatment emphasizes unification of the various approaches to estimation; estimation in dynamic systems is treated as a direct outgrowth of static system theory. Topics covered include basic concepts and definitions; numerical optimization methods; probability; statistical estimators; estimation in static systems; stochastic processes; state estimation in dynamic systems; output error, filter error, and equation error methods of parameter estimation in dynamic systems; and the accuracy of the estimates.

  6. Eta Squared, Partial Eta Squared, and Misreporting of Effect Size in Communication Research.

    ERIC Educational Resources Information Center

    Levine, Timothy R.; Hullett, Craig R.

    2002-01-01

    Alerts communication researchers to potential errors stemming from the use of SPSS (Statistical Package for the Social Sciences) to obtain estimates of eta squared in analysis of variance (ANOVA). Strives to clarify issues concerning the development and appropriate use of eta squared and partial eta squared in ANOVA. Discusses the reporting of…
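
    The distinction at issue can be made concrete with the two formulas; in a one-way ANOVA they coincide, but with multiple factors partial eta squared is larger, which is why mislabeling it as eta squared inflates reported effect sizes (the sums of squares below are hypothetical):

        # Sketch: eta squared vs partial eta squared from ANOVA sums of squares.
        def eta_squared(ss_effect, ss_total):
            return ss_effect / ss_total

        def partial_eta_squared(ss_effect, ss_error):
            return ss_effect / (ss_effect + ss_error)

        # Hypothetical two-way ANOVA table values
        ss_a, ss_b, ss_err = 20.0, 30.0, 50.0
        ss_total = ss_a + ss_b + ss_err
        print(eta_squared(ss_a, ss_total))          # 0.20
        print(partial_eta_squared(ss_a, ss_err))    # ~0.286, the larger partial value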

  7. Fisher, Sir Ronald Aylmer (1890-1962)

    NASA Astrophysics Data System (ADS)

    Murdin, P.

    2000-11-01

    Statistician, born in London, England. After studying astronomy using AIRY's manual on the Theory of Errors he became interested in statistics, and laid the foundation of randomization in experimental design, the analysis of variance and the use of data in estimating the properties of the parent population from which it was drawn. Invented the maximum likelihood method for estimating from random ...

  8. NHEXAS PHASE I MARYLAND STUDY--STANDARD OPERATING PROCEDURE FOR EXPLORATORY DATA ANALYSIS AND SUMMARY STATISTICS (D05)

    EPA Science Inventory

    This SOP describes the methods and procedures for two types of QA procedures: spot checks of hand entered data, and QA procedures for co-located and split samples. The spot checks were used to determine whether the error rate goal for the input of hand entered data was being att...

  9. The extended statistical analysis of toxicity tests using standardised effect sizes (SESs): a comparison of nine published papers.

    PubMed

    Festing, Michael F W

    2014-01-01

    The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results, as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data pose problems due to the large number of statistical tests involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The authors' conclusions appear to be reached somewhat subjectively from the pattern of statistical significances, discounting those judged to be type I errors and ignoring any biomarker where the p-value is greater than 0.05. However, using standardised effect sizes (SESs), a range of graphical methods and an overall assessment of the mean absolute response can be produced. The approach is an extension, not a replacement, of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the original authors. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity was under-estimated.
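
    A sketch of the SES idea on simulated biomarker data; a permutation test stands in here for the paper's bootstrap comparison of mean absolute differences, and all group sizes and effects are invented:

        # Sketch: standardised effect sizes (SESs) for a battery of biomarkers,
        # with a permutation test of the mean absolute SES.
        import numpy as np

        rng = np.random.default_rng(3)
        n, n_markers = 10, 20
        control = rng.normal(0, 1, (n, n_markers))
        treated = rng.normal(0.3, 1, (n, n_markers))    # small shift on all markers

        def ses(a, b):
            pooled_sd = np.sqrt((a.var(0, ddof=1) + b.var(0, ddof=1)) / 2)
            return (b.mean(0) - a.mean(0)) / pooled_sd

        observed = np.mean(np.abs(ses(control, treated)))

        # Null distribution: permute group labels, recompute mean |SES|
        combined = np.vstack([control, treated])
        null = []
        for _ in range(2000):
            idx = rng.permutation(2 * n)
            null.append(np.mean(np.abs(ses(combined[idx[:n]], combined[idx[n:]]))))
        print("p =", np.mean(np.array(null) >= observed))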

  10. How to Create Automatically Graded Spreadsheets for Statistics Courses

    ERIC Educational Resources Information Center

    LoSchiavo, Frank M.

    2016-01-01

    Instructors often use spreadsheet software (e.g., Microsoft Excel) in their statistics courses so that students can gain experience conducting computerized analyses. Unfortunately, students tend to make several predictable errors when programming spreadsheets. Without immediate feedback, programming errors are likely to go undetected, and as a…

  11. Analysis of counting data: Development of the SATLAS Python package

    NASA Astrophysics Data System (ADS)

    Gins, W.; de Groote, R. P.; Bissell, M. L.; Granados Buitrago, C.; Ferrer, R.; Lynch, K. M.; Neyens, G.; Sels, S.

    2018-01-01

    For the analysis of low-statistics counting experiments, a traditional nonlinear least squares minimization routine may not always provide correct parameter and uncertainty estimates due to the assumptions inherent in the algorithm(s). In response to this, a user-friendly Python package (SATLAS) was written to provide an easy interface between the data and a variety of minimization algorithms which are suited for analyzing low, as well as high, statistics data. The advantage of this package is that it allows the user to define their own model function and then compare different minimization routines to determine the optimal parameter values and their respective (correlated) errors. Experimental validation of the different approaches in the package is done through analysis of hyperfine structure data of 203Fr gathered by the CRIS experiment at ISOLDE, CERN.
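
    The core issue, that least-squares assumptions break down for low counts, can be sketched by fitting the same peak model by Poisson maximum likelihood and by least squares; the model and data below are illustrative and do not use the SATLAS interface:

        # Sketch: Poisson MLE vs least squares on low-count spectrum data.
        import numpy as np
        from scipy import optimize

        rng = np.random.default_rng(4)
        x = np.linspace(-5, 5, 40)

        def peak(p, x):
            amp, mu, sig, bg = p
            return amp * np.exp(-0.5 * ((x - mu) / sig) ** 2) + bg

        counts = rng.poisson(peak([5.0, 0.0, 1.0, 0.5], x))   # low statistics

        def neg_poisson_loglik(p):
            lam = np.clip(peak(p, x), 1e-12, None)
            return np.sum(lam - counts * np.log(lam))          # Poisson NLL (up to const.)

        def sum_sq(p):
            return np.sum((counts - peak(p, x)) ** 2)          # least-squares objective

        p0 = [4.0, 0.1, 1.2, 0.4]
        mle = optimize.minimize(neg_poisson_loglik, p0, method="Nelder-Mead")
        lsq = optimize.minimize(sum_sq, p0, method="Nelder-Mead")
        print("Poisson MLE:", np.round(mle.x, 2), " LSQ:", np.round(lsq.x, 2))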

  12. Assessing colour-dependent occupation statistics inferred from galaxy group catalogues

    NASA Astrophysics Data System (ADS)

    Campbell, Duncan; van den Bosch, Frank C.; Hearin, Andrew; Padmanabhan, Nikhil; Berlind, Andreas; Mo, H. J.; Tinker, Jeremy; Yang, Xiaohu

    2015-09-01

    We investigate the ability of current implementations of galaxy group finders to recover colour-dependent halo occupation statistics. To test the fidelity of group catalogue inferred statistics, we run three different group finders used in the literature over a mock that includes galaxy colours in a realistic manner. Overall, the resulting mock group catalogues are remarkably similar, and most colour-dependent statistics are recovered with reasonable accuracy. However, it is also clear that certain systematic errors arise as a consequence of correlated errors in group membership determination, central/satellite designation, and halo mass assignment. We introduce a new statistic, the halo transition probability (HTP), which captures the combined impact of all these errors. As a rule of thumb, errors tend to equalize the properties of distinct galaxy populations (i.e. red versus blue galaxies or centrals versus satellites), and to result in inferred occupation statistics that are more accurate for red galaxies than for blue galaxies. A statistic that is particularly poorly recovered from the group catalogues is the red fraction of central galaxies as a function of halo mass. Group finders do a good job in recovering galactic conformity, but also have a tendency to introduce weak conformity when none is present. We conclude that proper inference of colour-dependent statistics from group catalogues is best achieved using forward modelling (i.e. running group finders over mock data) or by implementing a correction scheme based on the HTP, as long as the latter is not too strongly model dependent.

  13. Automatic pedicles detection using convolutional neural network in a 3D spine reconstruction from biplanar radiographs

    NASA Astrophysics Data System (ADS)

    Bakhous, Christine; Aubert, Benjamin; Vazquez, Carlos; Cresson, Thierry; Parent, Stefan; De Guise, Jacques

    2018-02-01

    The 3D analysis of spine deformities (scoliosis) has high potential for clinical diagnosis and treatment. In a biplanar radiograph context, 3D analysis requires a 3D reconstruction from a pair of 2D X-rays. Whether fully automatic, semi-automatic, or manual, this task is complex because of noise, structure superimposition, and the partial information available from a limited number of projections. Being involved in axial vertebral rotation (AVR), which is a fundamental clinical parameter for scoliosis diagnosis, pedicles are important landmarks for 3D spine modeling and pre-operative planning. In this paper, we focus on the extension of a fully-automatic 3D spine reconstruction method where the Vertebral Body Centers (VBCs) are automatically detected using a Convolutional Neural Network (CNN) and then regularized using a Statistical Shape Model (SSM) framework. In this global process, pedicles are inferred statistically during the SSM regularization. Our contribution is to add a CNN-based regression model for pedicle detection, allowing better pedicle localization and improving the estimation of clinical parameters (e.g. AVR, Cobb angle). Having 476 datasets, including healthy patients and Adolescent Idiopathic Scoliosis (AIS) cases with different scoliosis grades (Cobb angles up to 116°), we used 380 for training, 48 for testing, and 48 for validation. Adding the local CNN-based pedicle detection decreases the mean absolute error of the AVR by 10%. The 3D mean Euclidean distance error between detected pedicles and ground truth decreases by 17% and the maximum error by 19%. Moreover, a general improvement is observed in the 3D spine reconstruction, reflected in lower errors on the Cobb angle estimation.

  14. Errors in causal inference: an organizational schema for systematic error and random error.

    PubMed

    Suzuki, Etsuji; Tsuda, Toshihide; Mitsuhashi, Toshiharu; Mansournia, Mohammad Ali; Yamamoto, Eiji

    2016-11-01

    To provide an organizational schema for systematic error and random error in estimating causal measures, aimed at clarifying the concept of errors from the perspective of causal inference. We propose to divide systematic error into structural error and analytic error. With regard to random error, our schema shows its four major sources: nondeterministic counterfactuals, sampling variability, a mechanism that generates exposure events and measurement variability. Structural error is defined from the perspective of counterfactual reasoning and divided into nonexchangeability bias (which comprises confounding bias and selection bias) and measurement bias. Directed acyclic graphs are useful to illustrate this kind of error. Nonexchangeability bias implies a lack of "exchangeability" between the selected exposed and unexposed groups. A lack of exchangeability is not a primary concern of measurement bias, justifying its separation from confounding bias and selection bias. Many forms of analytic errors result from the small-sample properties of the estimator used and vanish asymptotically. Analytic error also results from wrong (misspecified) statistical models and inappropriate statistical methods. Our organizational schema is helpful for understanding the relationship between systematic error and random error from a previously less investigated aspect, enabling us to better understand the relationship between accuracy, validity, and precision. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Type I and Type II error concerns in fMRI research: re-balancing the scale

    PubMed Central

    Cunningham, William A.

    2009-01-01

    Statistical thresholding (i.e. P-values) in fMRI research has become increasingly conservative over the past decade in an attempt to diminish Type I errors (i.e. false alarms) to a level traditionally allowed in behavioral science research. In this article, we examine the unintended negative consequences of this single-minded devotion to Type I errors: increased Type II errors (i.e. missing true effects), a bias toward studying large rather than small effects, a bias toward observing sensory and motor processes rather than complex cognitive and affective processes and deficient meta-analyses. Power analyses indicate that the reductions in acceptable P-values over time are producing dramatic increases in the Type II error rate. Moreover, the push for a mapwide false discovery rate (FDR) of 0.05 is based on the assumption that this is the FDR in most behavioral research; however, this is an inaccurate assessment of the conventions in actual behavioral research. We report simulations demonstrating that combined intensity and cluster size thresholds such as P < 0.005 with a 10 voxel extent produce a desirable balance between Types I and II error rates. This joint threshold produces high but acceptable Type II error rates and produces a FDR that is comparable to the effective FDR in typical behavioral science articles (while a 20 voxel extent threshold produces an actual FDR of 0.05 with relatively common imaging parameters). We recommend a greater focus on replication and meta-analysis rather than emphasizing single studies as the unit of analysis for establishing scientific truth. From this perspective, Type I errors are self-erasing because they will not replicate, thus allowing for more lenient thresholding to avoid Type II errors. PMID:20035017
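
    The joint intensity-plus-extent threshold recommended above can be sketched with a connected-components pass over a simulated map; the P < 0.005 and 10-voxel values come from the article, everything else below is invented:

        # Sketch: voxelwise p < 0.005 threshold followed by a 10-voxel
        # cluster-extent threshold, using scipy.ndimage on a simulated map.
        import numpy as np
        from scipy import ndimage, stats

        rng = np.random.default_rng(5)
        tmap = rng.normal(0, 1, (20, 20, 20))       # null z-like map
        tmap[8:12, 8:12, 8:12] += 3.0               # embedded "activation"

        p = 1 - stats.norm.cdf(tmap)                # voxelwise one-sided p-values
        supra = p < 0.005                           # intensity threshold

        labels, n = ndimage.label(supra)            # connected clusters
        sizes = ndimage.sum(supra, labels, index=np.arange(1, n + 1))
        keep = np.isin(labels, np.where(sizes >= 10)[0] + 1)
        print("surviving voxels:", int(keep.sum()))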

  16. Toward the assimilation of biogeochemical data in the CMEMS BIOMER coupled physical-biogeochemical operational system

    NASA Astrophysics Data System (ADS)

    Lamouroux, Julien; Testut, Charles-Emmanuel; Lellouche, Jean-Michel; Perruche, Coralie; Paul, Julien

    2017-04-01

    The operational production of the data-assimilated biogeochemical state of the ocean is one of the challenging core projects of the Copernicus Marine Environment Monitoring Service. In that framework - and with the April 2018 CMEMS V4 release as a target - Mercator Ocean is in charge of improving the realism of its global ¼° BIOMER coupled physical-biogeochemical (NEMO/PISCES) simulations, analyses and re-analyses, and of developing an effective capacity to routinely estimate the biogeochemical state of the ocean, through the implementation of biogeochemical data assimilation. Primary objectives are to enhance the time representation of the seasonal cycle in the real time and reanalysis systems, and to provide better control of the production in the equatorial regions. The assimilation of BGC data will rely on a simplified version of the SEEK filter, where the error statistics do not evolve with the model dynamics. The associated forecast error covariances are based on the statistics of a collection of 3D ocean state anomalies. The anomalies are computed from a multi-year numerical experiment (free run without assimilation) with respect to a running mean in order to estimate the 7-day scale error on the ocean state at a given period of the year. These forecast error covariances thus rely on a fixed-basis, seasonally variable ensemble of anomalies. This methodology, which is currently implemented in the "blue" component of the CMEMS operational forecast system, is now being adapted to the biogeochemical part of the operational system. Regarding observations - and as a first step - the system shall rely on the CMEMS GlobColour Global Ocean surface chlorophyll concentration products, delivered in NRT. The objective of this poster is to provide a detailed overview of the implementation of the aforementioned data assimilation methodology in the CMEMS BIOMER forecasting system. Focus shall be put on (1) the assessment of the capabilities of this data assimilation methodology to provide satisfying statistics of the model variability errors (through space-time analysis of dedicated representers of satellite surface Chla observations), (2) the dedicated features of the data assimilation configuration that have been implemented so far (e.g. log-transformation of the analysis state, multivariate Chlorophyll-Nutrient control vector, etc.) and (3) the assessment of the performance of this future operational data assimilation configuration.

  17. Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2015-09-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.

  18. Comparing statistical and process-based flow duration curve models in ungauged basins and changing rain regimes

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2016-02-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.

  19. Real-Time Identification of Wheel Terrain Interaction Models for Enhanced Autonomous Vehicle Mobility

    DTIC Science & Technology

    2014-04-24

    [Figure residue from the original report excerpt; recoverable fragments: bar charts of position estimation error (cm; 0-10 and 0-12 scales) comparing "Color Statistics" (Angelova...) slip estimation error against average slip error ((Color_Statistics_Error) / Average_Slip_Error); a "Position Estimation Error: Global Pose" panel; and a note that pose and odometry data were collected at the following sites - Taylor, Gascola, Somerset, Fort Bliss and ...]

  20. Sample size determination in combinatorial chemistry.

    PubMed Central

    Zhao, P L; Zambias, R; Bolognese, J A; Boulton, D; Chapman, K

    1995-01-01

    Combinatorial chemistry is gaining wide appeal as a technique for generating molecular diversity. Among the many combinatorial protocols, the split/recombine method is quite popular and particularly efficient at generating large libraries of compounds. In this process, polymer beads are equally divided into a series of pools and each pool is treated with a unique fragment; then the beads are recombined, mixed to uniformity, and redivided equally into a new series of pools for the subsequent couplings. The deviation from the ideal equimolar distribution of the final products is assessed by a special overall relative error, which is shown to be related to the Pearson statistic. Although the split/recombine sampling scheme is quite different from those used in analysis of categorical data, the Pearson statistic is shown to still follow a χ² distribution. This result allows us to derive the required number of beads such that, with 99% confidence, the overall relative error is controlled to be less than a pregiven tolerable limit L1. In this paper, we also discuss another criterion, which determines the required number of beads so that, with 99% confidence, all individual relative errors are controlled to be less than a pregiven tolerable limit L2 (0 < L2 < 1). PMID:11607586
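
    The dependence of the overall relative error on bead count can be explored by simulating the multinomial split; the Pearson-type error definition below is a plausible reading of the abstract, not necessarily the paper's exact statistic:

        # Sketch: simulate a split into k equal pools and compute an overall
        # relative error; the error shrinks roughly as 1/sqrt(bead count).
        import numpy as np

        rng = np.random.default_rng(6)
        k = 20                                       # number of pools per split

        def overall_relative_error(n_beads):
            counts = rng.multinomial(n_beads, np.full(k, 1.0 / k))
            expected = n_beads / k
            return np.sqrt(np.sum((counts - expected) ** 2) / (k * expected**2))

        for n in (1_000, 10_000, 100_000):
            errs = [overall_relative_error(n) for _ in range(500)]
            print(n, "mean overall relative error:", np.round(np.mean(errs), 4))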

  1. Evaluating methods of correcting for multiple comparisons implemented in SPM12 in social neuroscience fMRI studies: an example from moral psychology.

    PubMed

    Han, Hyemin; Glenn, Andrea L

    2018-06-01

    In fMRI research, the goal of correcting for multiple comparisons is to identify areas of activity that reflect true effects, and thus would be expected to replicate in future studies. Finding an appropriate balance between trying to minimize false positives (Type I error) while not being too stringent and omitting true effects (Type II error) can be challenging. Furthermore, the advantages and disadvantages of these types of errors may differ for different areas of study. In many areas of social neuroscience that involve complex processes and considerable individual differences, such as the study of moral judgment, effects are typically smaller and statistical power weaker, leading to the suggestion that less stringent corrections that allow for more sensitivity may be beneficial and also result in more false positives. Using moral judgment fMRI data, we evaluated four commonly used methods for multiple comparison correction implemented in Statistical Parametric Mapping 12 by examining which method produced the most precise overlap with results from a meta-analysis of relevant studies and with results from nonparametric permutation analyses. We found that voxelwise thresholding with familywise error correction based on Random Field Theory provides a more precise overlap (i.e., without omitting too few regions or encompassing too many additional regions) than either clusterwise thresholding, Bonferroni correction, or false discovery rate correction methods.
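
    Two of the generic corrections compared in studies like this one, Bonferroni and false discovery rate (Benjamini-Hochberg), can be sketched in a software-independent way; the p-values below are made up, and the RFT-based familywise correction favored by the authors requires the SPM machinery and is not reproduced here:

        # Sketch: Bonferroni vs Benjamini-Hochberg FDR on a vector of p-values.
        import numpy as np

        def bonferroni(p, alpha=0.05):
            return p < alpha / len(p)

        def benjamini_hochberg(p, alpha=0.05):
            order = np.argsort(p)
            thresh = alpha * np.arange(1, len(p) + 1) / len(p)
            passed = np.where(p[order] <= thresh)[0]
            keep = np.zeros(len(p), bool)
            if passed.size:
                keep[order[:passed.max() + 1]] = True   # all up to the last pass
            return keep

        p = np.array([0.001, 0.008, 0.012, 0.03, 0.2, 0.6])
        print(bonferroni(p).sum(), benjamini_hochberg(p).sum())   # 1 vs 4 discoveries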

  2. Application of statistical machine translation to public health information: a feasibility study.

    PubMed

    Kirchhoff, Katrin; Turner, Anne M; Axelrod, Amittai; Saavedra, Francisco

    2011-01-01

    Accurate, understandable public health information is important for ensuring the health of the nation. The large portion of the US population with Limited English Proficiency is best served by translations of public-health information into other languages. However, a large number of health departments and primary care clinics face significant barriers to fulfilling federal mandates to provide multilingual materials to Limited English Proficiency individuals. This article presents a pilot study on the feasibility of using freely available statistical machine translation technology to translate health promotion materials. The authors gathered health-promotion materials in English from local and national public-health websites. Spanish versions were created by translating the documents using a freely available machine-translation website. Translations were rated for adequacy and fluency, analyzed for errors, manually corrected by a human posteditor, and compared with exclusively manual translations. Machine translation plus postediting took 15-53 min per document, compared to the reported days or even weeks for the standard translation process. A blind comparison of machine-assisted and human translations of six documents revealed overall equivalency between machine-translated and manually translated materials. The analysis of translation errors indicated that the most important errors were word-sense errors. The results indicate that machine translation plus postediting may be an effective method of producing multilingual health materials with equivalent quality but lower cost compared to manual translations.

  3. Author Self-disclosure Compared with Pharmaceutical Company Reporting of Physician Payments.

    PubMed

    Alhamoud, Hani A; Dudum, Ramzi; Young, Heather A; Choi, Brian G

    2016-01-01

    Industry manufacturers are required by the Sunshine Act to disclose payments to physicians. These data recently became publicly available, but some manufacturers prereleased their data since 2009. We tested the hypotheses that there would be discrepancies between manufacturers' and physicians' disclosures. The financial disclosures by authors of all 39 American College of Cardiology and American Heart Association guidelines between 2009 and 2012 were matched to the public disclosures of 15 pharmaceutical companies during that same period. Duplicate authors across guidelines were assessed independently. Per the guidelines, payments <$10,000 are modest and ≥$10,000 are significant. Agreement was determined using a κ statistic; Fisher's exact and Mann-Whitney tests were used to detect statistical significance. The overall agreement between author and company disclosure was poor (κ = 0.238). There was a significant difference in error rates of disclosure among companies and authors (P = .019). Of disclosures by authors, companies failed to match them with an error rate of 71.6%. Of disclosures by companies, authors failed to match them with an error rate of 54.7%. Our analysis shows a concerning level of disagreement between guideline authors' and pharmaceutical companies' disclosures. Without ability for physicians to challenge reports, it is unclear whether these discrepancies reflect undisclosed relationships with industry or errors in reporting, and caution should be advised in interpretation of data from the Sunshine Act. Copyright © 2016 Elsevier Inc. All rights reserved.
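
    For reference, the κ (kappa) statistic used to quantify author-company agreement is computed as below; the 2x2 counts are invented for illustration, not the study's data:

        # Sketch: Cohen's kappa from a 2x2 agreement table.
        import numpy as np

        table = np.array([[60, 25],    # author disclosed: company yes / no
                          [40, 75]])   # author not disclosed: company yes / no
        n = table.sum()
        po = np.trace(table) / n                       # observed agreement
        pe = (table.sum(0) @ table.sum(1)) / n**2      # chance agreement
        kappa = (po - pe) / (1 - pe)
        print(round(kappa, 3))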

  4. Application of statistical machine translation to public health information: a feasibility study

    PubMed Central

    Turner, Anne M; Axelrod, Amittai; Saavedra, Francisco

    2011-01-01

    Objective Accurate, understandable public health information is important for ensuring the health of the nation. The large portion of the US population with Limited English Proficiency is best served by translations of public-health information into other languages. However, a large number of health departments and primary care clinics face significant barriers to fulfilling federal mandates to provide multilingual materials to Limited English Proficiency individuals. This article presents a pilot study on the feasibility of using freely available statistical machine translation technology to translate health promotion materials. Design The authors gathered health-promotion materials in English from local and national public-health websites. Spanish versions were created by translating the documents using a freely available machine-translation website. Translations were rated for adequacy and fluency, analyzed for errors, manually corrected by a human posteditor, and compared with exclusively manual translations. Results Machine translation plus postediting took 15–53 min per document, compared to the reported days or even weeks for the standard translation process. A blind comparison of machine-assisted and human translations of six documents revealed overall equivalency between machine-translated and manually translated materials. The analysis of translation errors indicated that the most important errors were word-sense errors. Conclusion The results indicate that machine translation plus postediting may be an effective method of producing multilingual health materials with equivalent quality but lower cost compared to manual translations. PMID:21498805

  5. [Again review of research design and statistical methods of Chinese Journal of Cardiology].

    PubMed

    Kong, Qun-yu; Yu, Jin-ming; Jia, Gong-xian; Lin, Fan-li

    2012-11-01

    To re-evaluate the research design and the use of statistical methods in the Chinese Journal of Cardiology. The research designs and statistical methods in all of the original papers published in the Chinese Journal of Cardiology throughout 2011 were summarized, and the results were compared with the evaluation of 2008. (1) There was no difference between the two volumes in the distribution of research designs. Compared with the earlier volume, the use of survival regression and non-parametric tests increased, while the proportion of articles with no statistical analysis decreased. (2) The proportions of problem articles in the later volume were significantly lower than in the former: 6 (4%) with flaws in design, 5 (3%) with flaws in expression, and 9 (5%) with incomplete analyses. (3) The rate of correct use of analysis of variance increased, as did that of multi-group comparisons and tests of normality. The usage error rate changed from 17% to 25% (a difference without statistical significance), with errors attributable to neglect of the test of homogeneity of variance. The Chinese Journal of Cardiology shows many improvements, such as better regulation of design and statistics. More attention should be paid to the homogeneity of variance in future applications.

  6. Evaluation of the impact of observations on blended sea surface winds in a two-dimensional variational scheme using degrees of freedom

    NASA Astrophysics Data System (ADS)

    Wang, Ting; Xiang, Jie; Fei, Jianfang; Wang, Yi; Liu, Chunxia; Li, Yuanxiang

    2017-12-01

    This paper presents an evaluation of the observational impacts on blended sea surface winds from a two-dimensional variational data assimilation (2D-Var) scheme. We begin by briefly introducing the analysis sensitivity with respect to observations in variational data assimilation systems and its relationship with the degrees of freedom for signal (DFS), and then the DFS concept is applied to the 2D-Var sea surface wind blending scheme. Two methods, a priori and a posteriori, are used to estimate the DFS of the zonal (u) and meridional (v) components of winds in the 2D-Var blending scheme. The a posteriori method can obtain almost the same results as the a priori method. Because only by-products of the blending scheme are used for the a posteriori method, the computation time is reduced significantly. The magnitude of the DFS is critically related to the observational and background error statistics. Changing the observational and background error variances can affect the DFS value. Because the observation error variances are assumed to be uniform, the observational influence at each observational location is related to the background error variance, and the observations located at the place where there are larger background error variances have larger influences. The average observational influence of u and v with respect to the analysis is about 40%, implying that the background influence with respect to the analysis is about 60%.
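
    A toy version of the a priori computation, using the standard formula DFS = tr(HK) for the degrees of freedom for signal; the background and observation covariances below are illustrative choices, not those of the 2D-Var blending scheme:

        # Sketch: a priori DFS for a 1-D analysis with a correlated background.
        import numpy as np

        n, m = 50, 10
        x = np.linspace(0, 1, n)
        B = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)   # background covariance
        H = np.zeros((m, n))
        H[np.arange(m), np.arange(0, n, n // m)] = 1.0        # observe every 5th point
        R = 0.5 * np.eye(m)                                   # observation covariance

        K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)          # gain matrix
        dfs = np.trace(H @ K)
        print("DFS =", round(dfs, 2), "of", m, "observations")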

  7. New Approaches to Quantifying Transport Model Error in Atmospheric CO2 Simulations

    NASA Technical Reports Server (NTRS)

    Ott, L.; Pawson, S.; Zhu, Z.; Nielsen, J. E.; Collatz, G. J.; Gregg, W. W.

    2012-01-01

    In recent years, much progress has been made in observing CO2 distributions from space. However, the use of these observations to infer source/sink distributions in inversion studies continues to be complicated by difficulty in quantifying atmospheric transport model errors. We will present results from several different experiments designed to quantify different aspects of transport error using the Goddard Earth Observing System, Version 5 (GEOS-5) Atmospheric General Circulation Model (AGCM). In the first set of experiments, an ensemble of simulations is constructed using perturbations to parameters in the model's moist physics and turbulence parameterizations that control sub-grid scale transport of trace gases. Analysis of the ensemble spread and scales of temporal and spatial variability among the simulations allows insight into how parameterized, small-scale transport processes influence simulated CO2 distributions. In the second set of experiments, atmospheric tracers representing model error are constructed using observation minus analysis statistics from NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA). The goal of these simulations is to understand how errors in large scale dynamics are distributed, and how they propagate in space and time, affecting trace gas distributions. These simulations will also be compared to results from NASA's Carbon Monitoring System Flux Pilot Project that quantified the impact of uncertainty in satellite constrained CO2 flux estimates on atmospheric mixing ratios to assess the major factors governing uncertainty in global and regional trace gas distributions.

  8. How allele frequency and study design affect association test statistics with misrepresentation errors.

    PubMed

    Escott-Price, Valentina; Ghodsi, Mansoureh; Schmidt, Karl Michael

    2014-04-01

    We evaluate the effect of genotyping errors on the type-I error of a general association test based on genotypes, showing that, in the presence of errors in the case and control samples, the test statistic asymptotically follows a scaled non-central χ² distribution. We give explicit formulae for the scaling factor and non-centrality parameter for the symmetric allele-based genotyping error model and for additive and recessive disease models. They show how genotyping errors can lead to a significantly higher false-positive rate, growing with sample size, compared with the nominal significance levels. The strength of this effect depends very strongly on the population distribution of the genotype, with a pronounced effect in the case of rare alleles, and a great robustness against error in the case of large minor allele frequency. We also show how these results can be used to correct p-values.
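
    Given the scaling factor and non-centrality parameter from such formulae, the realized type I error can be computed directly; the values below are placeholders, and one degree of freedom is assumed purely for simplicity:

        # Sketch: realized type I error when the statistic follows a scaled
        # non-central chi-square instead of the nominal central chi-square.
        from scipy import stats

        alpha, df = 0.05, 1
        scale, ncp = 1.05, 0.5                       # would come from the paper's formulae
        crit = stats.chi2.ppf(1 - alpha, df)         # nominal critical value
        realized = stats.ncx2.sf(crit / scale, df, ncp)
        print(f"nominal {alpha}, realized {realized:.4f}")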

  9. Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.

    PubMed

    Monica, Stefania; Ferrari, Gianluigi

    2018-05-17

    Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.
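
    A minimal sketch of a linear statistical model of UWB distance error fitted by least squares, in the spirit of the approach above; the calibration distances and readings are invented:

        # Sketch: fit e(d) = a + b*d to calibration residuals, then use it
        # to correct new UWB range estimates.
        import numpy as np

        true_d = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # calibration distances (m)
        measured = np.array([1.08, 2.12, 3.20, 4.26, 5.31])   # hypothetical UWB estimates

        A = np.vstack([np.ones_like(measured), measured]).T
        a, b = np.linalg.lstsq(A, measured - true_d, rcond=None)[0]

        def correct(d_est):
            return d_est - (a + b * d_est)

        print("corrected 3.2 m reading:", round(correct(3.20), 3))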

  10. Quantitative analysis of the correlations in the Boltzmann-Grad limit for hard spheres

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pulvirenti, M.

    2014-12-09

    In this contribution I consider the problem of the validity of the Boltzmann equation for a system of hard spheres in the Boltzmann-Grad limit. I briefly review the results available nowadays, with a particular emphasis on the celebrated Lanford's validity theorem. Finally I present some recent results, obtained in collaboration with S. Simonella, concerning a quantitative analysis of the propagation of chaos. More precisely, we introduce a quantity (the correlation error) measuring how far a j-particle rescaled correlation function at time t (sufficiently small) is from full statistical independence. Roughly speaking, a correlation error of order k measures (in the context of the BBGKY hierarchy) the event in which k tagged particles form a recolliding group.

  11. Stratospheric Assimilation of Chemical Tracer Observations Using a Kalman Filter. Pt. 2; Chi-Square Validated Results and Analysis of Variance and Correlation Dynamics

    NASA Technical Reports Server (NTRS)

    Menard, Richard; Chang, Lang-Ping

    1998-01-01

    A Kalman filter system designed for the assimilation of limb-sounding observations of stratospheric chemical tracers, which has four tunable covariance parameters, was developed in Part I (Menard et al. 1998). The assimilation results of CH4 observations from the Cryogenic Limb Array Etalon Sounder instrument (CLAES) and the Halogen Occultation Experiment instrument (HALOE) on board the Upper Atmosphere Research Satellite are described in this paper. A robust chi-square criterion, which provides a statistical validation of the forecast and observational error covariances, was used to estimate the tunable variance parameters of the system. In particular, an estimate of the model error variance was obtained. The effect of model error on the forecast error variance became critical after only three days of assimilation of CLAES observations, although it took 14 days of forecast to double the initial error variance. We further found that the model error due to numerical discretization, as arising in the standard Kalman filter algorithm, is comparable in size to the physical model error due to wind and transport modeling errors together. Separate assimilations of CLAES and HALOE observations were compared to validate the state estimate away from the observed locations. A wave-breaking event that took place several thousand kilometers away from the HALOE observation locations was well captured by the Kalman filter owing to highly anisotropic forecast error correlations. The forecast error correlation in the assimilation of the CLAES observations was found to have a structure similar to that in pure forecast mode, except for smaller length scales. Finally, we have conducted an analysis of the variance and correlation dynamics to determine their relative importance in chemical tracer assimilation problems. Results show that the optimality of a tracer assimilation system depends, for the most part, on having flow-dependent error correlations rather than on evolving the error variance.
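
    The chi-square criterion can be sketched in its generic textbook form: the normalized innovation squared should average to the number of observations assimilated per step, and the covariance parameters are tuned until it does. This is the standard consistency check, not necessarily the exact criterion of Menard et al.

```python
# Generic chi-square innovation consistency check for a Kalman filter:
# E[v^T S^{-1} v] = m, the number of observations per step, when the
# forecast and observation error covariances are consistent.
import numpy as np

def chi2_diagnostic(innovations, S_list):
    """innovations: innovation vectors v_k; S_list: innovation covariances S_k."""
    vals = [v @ np.linalg.solve(S, v) for v, S in zip(innovations, S_list)]
    return np.mean(vals)

# Toy check with consistent covariances: the statistic should be close to m
rng = np.random.default_rng(1)
m = 3
innov = [rng.multivariate_normal(np.zeros(m), np.eye(m)) for _ in range(500)]
S = [np.eye(m)] * 500
print(chi2_diagnostic(innov, S), "should be close to", m)
```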

  12. An Empirical State Error Covariance Matrix for Batch State Estimation

    NASA Technical Reports Server (NTRS)

    Frisbee, Joseph H., Jr.

    2011-01-01

    State estimation techniques serve effectively to provide mean state estimates. However, the state error covariance matrices provided as part of these techniques inspire limited confidence in their ability to adequately describe the uncertainty in the estimated states. A specific problem with the traditional form of state error covariance matrices is that they represent only a mapping of the assumed observation error characteristics into the state space. Any errors that arise from other sources (environment modeling, precision, etc.) are not directly represented in a traditional, theoretical state error covariance matrix. Consider that an actual observation contains only measurement error and that an estimated observation contains all other errors, known and unknown. It then follows that a measurement residual (the difference between expected and observed measurements) contains all errors for that measurement. Therefore, a direct and appropriate inclusion of the actual measurement residuals in the state error covariance matrix results in an empirical state error covariance matrix that fully accounts for the error in the state estimate. By way of a literal reinterpretation of the equations involved in the weighted least squares estimation algorithm, it is possible to arrive at an appropriate, and formally correct, empirical state error covariance matrix. The first step of the method is to use the average form of the weighted measurement residual variance performance index rather than its usual total weighted residual form. Next, the solution to the normal equations is interpreted as the average of a collection of sample vectors drawn from a hypothetical parent population. From here, a standard statistical analysis shows directly how to determine the empirical state error covariance matrix. This matrix contains the total uncertainty in the state estimate, regardless of the source of the uncertainty. Also, in its most straightforward form, the technique only requires supplemental calculations to be added to existing batch algorithms. The generation of this direct, empirical form of the state error covariance matrix is independent of the dimensionality of the observations, and mixed degrees of freedom for an observation set are allowed. As with any simple, empirical sample variance problem, the presented approach offers an opportunity (at least in the case of weighted least squares) to investigate confidence interval estimates for the error covariance matrix elements. The diagonal (variance) terms of the error covariance matrix have a particularly simple form to associate with either a multiple degree of freedom chi-square distribution (more approximate) or with a gamma distribution (less approximate). The off-diagonal (covariance) terms of the matrix are less clear in their statistical behavior, but they still lend themselves to standard confidence interval error analysis; the distributional forms associated with them are more varied and, perhaps, more approximate than those associated with the diagonal terms. Results obtained using the proposed technique are presented for a simple weighted least squares sample problem: a two-observer triangulation problem with range-only measurements. Variations of this problem reflect an ideal case (perfect knowledge of the range errors) and a mismodeled case (incorrect knowledge of the range errors).
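
    A hedged sketch of the general idea, not Frisbee's exact derivation: one simple way to fold actual measurement residuals into a batch weighted-least-squares covariance is to scale the formal covariance (A^T W A)^-1 by the average weighted residual variance, so the result reflects all error sources present in the residuals.

```python
# Hedged sketch: empirically scaled batch WLS covariance. This is the common
# a-posteriori variance-factor scaling, offered in the spirit of the abstract
# rather than as the author's exact formulation.
import numpy as np

def empirical_wls_covariance(A, W, y):
    """A: design matrix (m x n), W: weight matrix (m x m), y: observations (m,)."""
    N = A.T @ W @ A                      # normal matrix
    x_hat = np.linalg.solve(N, A.T @ W @ y)
    r = y - A @ x_hat                    # measurement residuals (all error sources)
    m, n = A.shape
    s2 = (r @ W @ r) / (m - n)           # average weighted residual variance
    formal_cov = np.linalg.inv(N)        # maps assumed obs errors into state space
    return x_hat, s2 * formal_cov        # empirically scaled covariance

# Usage: x_hat, P_emp = empirical_wls_covariance(A, W, y)
```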

  13. Multivariate statistical analysis of a high rate biofilm process treating kraft mill bleach plant effluent.

    PubMed

    Goode, C; LeRoy, J; Allen, D G

    2007-01-01

    This study reports on a multivariate analysis of the moving bed biofilm reactor (MBBR) wastewater treatment system at a Canadian pulp mill. The modelling approach involved a data overview by principal component analysis (PCA) followed by partial least squares (PLS) modelling with the objective of explaining and predicting changes in the BOD output of the reactor. Over two years of data with 87 process measurements were used to build the models. Variables were collected from the MBBR control scheme as well as upstream in the bleach plant and in digestion. To account for process dynamics, a variable lagging approach was used for variables with significant temporal correlations. It was found that wood type pulped at the mill was a significant variable governing reactor performance. Other important variables included flow parameters, faults in the temperature or pH control of the reactor, and some potential indirect indicators of biomass activity (residual nitrogen and pH out). The most predictive model was found to have an RMSEP value of 606 kgBOD/d, representing a 14.5% average error. This was a good fit, given the measurement error of the BOD test. Overall, the statistical approach was effective in describing and predicting MBBR treatment performance.
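
    A minimal sketch of the modelling approach (PLS regression on lagged process variables, scored by RMSEP) on synthetic data; the variable names are hypothetical stand-ins for the mill's 87 process measurements.

```python
# Illustrative sketch (synthetic data, hypothetical variable names): PLS
# regression with lagged process variables, mirroring the approach above.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(2)
n = 300
flow = rng.normal(size=n)
temp_fault = rng.normal(size=n)
bod_in = rng.normal(size=n)

def lag(x, k):
    """Shift a series by k steps, padding the start (simple lagging scheme)."""
    return np.r_[np.full(k, x[:1].mean()), x[:-k]] if k else x

X = np.column_stack([flow, lag(flow, 1), temp_fault, bod_in, lag(bod_in, 1)])
y = 0.8 * lag(bod_in, 1) + 0.3 * flow + rng.normal(scale=0.2, size=n)

pls = PLSRegression(n_components=3).fit(X, y)
pred = pls.predict(X).ravel()
rmsep = np.sqrt(np.mean((y - pred) ** 2))  # cf. the paper's RMSEP criterion
print(f"RMSEP on training data: {rmsep:.3f}")
```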

  14. Qualitative fusion technique based on information poor system and its application to factor analysis for vibration of rolling bearings

    NASA Astrophysics Data System (ADS)

    Xia, Xintao; Wang, Zhongyu

    2008-10-01

    For statistical methods of system stability analysis, unknown probability distributions and small samples are difficult problems to resolve. Therefore, a novel method is proposed in this paper to resolve these problems. The method is independent of the probability distribution and is suitable for small-sample systems. After rearrangement of the original data series, the order difference and two polynomial membership functions are introduced to estimate the true value, the lower bound and the upper bound of the system using fuzzy-set theory. The empirical distribution function is then used to ensure a confidence level above 95%, and a degree of similarity is introduced to evaluate the stability of the system. Computer simulations examine stable systems with various probability distributions, unstable systems with linear and periodic systematic errors, and some mixed systems, validating the proposed method of stability analysis.

  15. Generating a Magellanic star cluster catalog with ASteCA

    NASA Astrophysics Data System (ADS)

    Perren, G. I.; Piatti, A. E.; Vázquez, R. A.

    2016-08-01

    An increasing number of software tools have been employed in recent years for the automated or semi-automated processing of astronomical data. The main advantages of using these tools over a standard by-eye analysis include speed (particularly for large databases), homogeneity, reproducibility, and precision. At the same time, they enable a statistically correct study of the uncertainties associated with the analysis, in contrast with manually set errors, or the still widespread practice of simply not assigning errors. We present a catalog comprising 210 star clusters located in the Large and Small Magellanic Clouds, observed with Washington photometry. Their fundamental parameters were estimated through a homogeneous, automated and completely unassisted process, via the Automated Stellar Cluster Analysis package (ASteCA). Our results are compared with two types of studies of these clusters: one where the photometry is the same, and another where the photometric system is different from that employed by ASteCA.

  16. [Medication error management climate and perception for system use according to construction of medication error prevention system].

    PubMed

    Kim, Myoung Soo

    2012-08-01

    The purpose of this cross-sectional study was to examine the current status of IT-based medication error prevention system construction and the relationships among system construction, medication error management climate, and perception of system use. The participants were 124 patient safety chief managers working for 124 hospitals with over 300 beds in Korea. The characteristics of the participants, the construction status and perception of four systems (electronic pharmacopoeia, electronic drug dosage calculation system, computer-based patient safety reporting, and bar-code system), and the medication error management climate were measured. The data were collected between June and August 2011. Descriptive statistics, partial Pearson correlation and MANCOVA were used for data analysis. Electronic pharmacopoeias were constructed in 67.7% of participating hospitals, computer-based patient safety reporting systems in 50.8%, and electronic drug dosage calculation systems in 32.3%. Bar-code systems showed the lowest construction rate, at 16.1% of Korean hospitals. Higher rates of construction of IT-based medication error prevention systems were associated with greater safety and a more positive error management climate. Supportive strategies for improving the perception of IT-based system use would encourage further system construction and more easily promote a positive error management climate.

  17. Uncertainty analysis of the Operational Simplified Surface Energy Balance (SSEBop) model at multiple flux tower sites

    USGS Publications Warehouse

    Chen, Mingshi; Senay, Gabriel B.; Singh, Ramesh K.; Verdin, James P.

    2016-01-01

    Evapotranspiration (ET) is an important component of the water cycle – ET from the land surface returns approximately 60% of the global precipitation back to the atmosphere. ET also plays an important role in energy transport among the biosphere, atmosphere, and hydrosphere. Current regional to global and daily to annual ET estimation relies mainly on surface energy balance (SEB) ET models or statistical and empirical methods driven by remote sensing data and various climatological databases. These models have uncertainties due to inevitable input errors, poorly defined parameters, and inadequate model structures. The eddy covariance measurements of water, energy, and carbon fluxes at the AmeriFlux tower sites provide an opportunity to assess these ET modeling uncertainties. In this study, we focused on uncertainty analysis of the Operational Simplified Surface Energy Balance (SSEBop) model for ET estimation at multiple AmeriFlux tower sites with diverse land cover characteristics and climatic conditions. The 8-day composite 1-km MODerate Resolution Imaging Spectroradiometer (MODIS) land surface temperature (LST) was used as the temperature input for the SSEBop algorithms; the other input data were taken from the AmeriFlux database. Statistical analysis indicated that the SSEBop model performed well, with an R2 of 0.86 between estimated ET and eddy covariance measurements at 42 AmeriFlux tower sites during 2001-2007. Encouragingly, the best performance was observed for croplands, where R2 was 0.92 with a root mean square error of 13 mm/month. The uncertainties or random errors from input variables and parameters of the SSEBop model led to monthly ET estimates with relative errors of less than 20% across multiple flux tower sites distributed across different biomes. This uncertainty of the SSEBop model lies within the error range of other SEB models, suggesting that the systematic error or bias of the SSEBop model is within the normal range. This finding implies that the simplified parameterization of the SSEBop model did not significantly affect the accuracy of the ET estimates while increasing the ease of model setup for operational applications. The sensitivity analysis indicated that the SSEBop model is most sensitive to the input variables land surface temperature (LST) and reference ET (ETo), and to the parameters differential temperature (dT) and maximum ET scalar (Kmax), particularly during the non-growing season and in dry areas. In summary, the uncertainty assessment verifies that the SSEBop model is a reliable and robust method for large-area ET estimation. The SSEBop model estimates can be further improved by reducing errors in two input variables (ETo and LST) and two key parameters (Kmax and dT).

  18. Advances in the meta-analysis of heterogeneous clinical trials I: The inverse variance heterogeneity model.

    PubMed

    Doi, Suhail A R; Barendregt, Jan J; Khan, Shahjahan; Thalib, Lukman; Williams, Gail M

    2015-11-01

    This article examines an improved alternative to the random effects (RE) model for meta-analysis of heterogeneous studies. It is shown that the known issues of underestimation of the statistical error and spuriously overconfident estimates with the RE model can be resolved by the use of an estimator under the fixed effect model assumption with a quasi-likelihood based variance structure - the IVhet model. Extensive simulations confirm that this estimator retains a correct coverage probability and a lower observed variance than the RE model estimator, regardless of heterogeneity. When the proposed IVhet method is applied to the controversial meta-analysis of intravenous magnesium for the prevention of mortality after myocardial infarction, the pooled OR is 1.01 (95% CI 0.71-1.46) which not only favors the larger studies but also indicates more uncertainty around the point estimate. In comparison, under the RE model the pooled OR is 0.71 (95% CI 0.57-0.89) which, given the simulation results, reflects underestimation of the statistical error. Given the compelling evidence generated, we recommend that the IVhet model replace both the FE and RE models. To facilitate this, it has been implemented into free meta-analysis software called MetaXL which can be downloaded from www.epigear.com. Copyright © 2015 Elsevier Inc. All rights reserved.
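
    The sketch below follows published descriptions of the IVhet estimator: fixed-effect (inverse variance) weights for the point estimate, with a variance that inflates each study's contribution by a DerSimonian-Laird tau-squared. MetaXL remains the reference implementation; treat this as an illustration.

```python
# Hedged sketch of an IVhet-style pooled estimate (illustrative data).
import numpy as np

def ivhet(y, v):
    """y: study effect sizes (e.g. log OR); v: within-study variances."""
    w = 1.0 / v
    theta = np.sum(w * y) / np.sum(w)          # fixed-effect point estimate
    # DerSimonian-Laird tau^2 for the heterogeneity adjustment
    Q = np.sum(w * (y - theta) ** 2)
    tau2 = max(0.0, (Q - (len(y) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    var = np.sum((w / np.sum(w)) ** 2 * (v + tau2))  # heterogeneity-inflated variance
    return theta, var

y = np.log(np.array([0.8, 1.1, 0.95, 1.3]))   # illustrative log odds ratios
v = np.array([0.05, 0.02, 0.04, 0.10])
theta, var = ivhet(y, v)
lo, hi = np.exp(theta - 1.96 * np.sqrt(var)), np.exp(theta + 1.96 * np.sqrt(var))
print(f"pooled OR = {np.exp(theta):.2f} (95% CI {lo:.2f}-{hi:.2f})")
```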

  19. Use of error grid analysis to evaluate acceptability of a point of care prothrombin time meter.

    PubMed

    Petersen, John R; Vonmarensdorf, Hans M; Weiss, Heidi L; Elghetany, M Tarek

    2010-02-01

    Statistical methods (linear regression, correlation analysis, etc.) are frequently employed in comparing methods in the central laboratory (CL). Assessing the acceptability of point of care testing (POCT) equipment, however, is more difficult because statistically significant biases may not have an impact on clinical care. We show how error grid (EG) analysis can be used to compare POCT PT INR with the CL. We compared results from 103 patients seen in an anti-coagulation clinic who were on Coumadin maintenance therapy, using fingerstick samples for POCT (Roche CoaguChek XS and S) and citrated venous blood samples for the CL (Stago STAR). To compare the clinical acceptability of results, we developed an EG with zones A, B, C and D. On second-order polynomial regression, POCT results correlate highly with the CL for the CoaguChek XS (R2 = 0.955) and CoaguChek S (R2 = 0.93), but this does not indicate whether POCT results are clinically interchangeable with the CL. Using the EG, it is readily apparent which levels can be considered clinically identical to the CL despite analytical bias. We have demonstrated the usefulness of EG analysis in determining the acceptability of POCT PT INR testing and how it can be used to determine cut-offs where differences in POCT results may impact clinical care. Copyright 2009 Elsevier B.V. All rights reserved.
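
    The zone classification at the heart of EG analysis can be sketched as below. The zone boundaries here are hypothetical placeholders; the paper derives its own clinically motivated zones A-D for PT INR.

```python
# Illustrative sketch only: classifying paired POCT/central-lab INR results
# into error grid zones. Zone rules and thresholds below are hypothetical.
def eg_zone(cl_inr, poct_inr, tol=0.4):
    """Return a zone label for one paired measurement (hypothetical rules)."""
    diff = abs(poct_inr - cl_inr)
    if diff <= tol:
        return "A"                       # clinically identical
    if diff <= 2 * tol:
        return "B"                       # unlikely to alter management
    # large discordance across a therapeutic range (e.g. 2.0-3.0) is worst
    therapeutic = 2.0 <= cl_inr <= 3.0
    return "D" if therapeutic else "C"

pairs = [(2.5, 2.6), (2.4, 3.3), (1.5, 2.9)]
print([eg_zone(cl, poct) for cl, poct in pairs])  # -> ['A', 'D', 'C']
```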

  20. Robustness of S1 statistic with Hodges-Lehmann for skewed distributions

    NASA Astrophysics Data System (ADS)

    Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping

    2016-10-01

    Analysis of variance (ANOVA) is a commonly used parametric method to test for differences in means among more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When the assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator, and the default scale estimator with the variance of Hodges-Lehmann and MADn, to produce two different test statistics for comparing groups. A bootstrap method was used for testing the hypotheses, since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA and Kruskal-Wallis. The proposed procedures show improvement over the original statistic, especially under extremely skewed distributions.
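
    For illustration, the sketch below computes the one-sample Hodges-Lehmann estimator (the median of the Walsh averages) and a pooled-resampling bootstrap p-value for a two-group location comparison. It conveys the flavor of the modified S1 procedure without reproducing its exact test statistic.

```python
# Sketch under stated assumptions: Hodges-Lehmann location estimates compared
# across two skewed groups, with a bootstrap null distribution.
import numpy as np
from itertools import combinations_with_replacement

def hodges_lehmann(x):
    """Median of the Walsh averages (x_i + x_j)/2 over all i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(x, 2)]
    return np.median(walsh)

rng = np.random.default_rng(3)
g1 = rng.lognormal(size=25)              # skewed group 1
g2 = rng.lognormal(mean=0.3, size=25)    # skewed group 2

obs = abs(hodges_lehmann(g1) - hodges_lehmann(g2))
pooled = np.concatenate([g1, g2])
boot = []
for _ in range(2000):                    # bootstrap under the null (pooled resampling)
    b1 = rng.choice(pooled, size=len(g1), replace=True)
    b2 = rng.choice(pooled, size=len(g2), replace=True)
    boot.append(abs(hodges_lehmann(b1) - hodges_lehmann(b2)))
print("bootstrap p-value:", np.mean(np.array(boot) >= obs))
```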

  1. Counteracting structural errors in ensemble forecast of influenza outbreaks.

    PubMed

    Pei, Sen; Shaman, Jeffrey

    2017-10-13

    For influenza forecasts generated using dynamical models, forecast inaccuracy is partly attributable to the nonlinear growth of error. As a consequence, quantification of the nonlinear error structure in current forecast models is needed so that this growth can be corrected and forecast skill improved. Here, we inspect the error growth of a compartmental influenza model and find that a robust error structure arises naturally from the nonlinear model dynamics. By counteracting these structural errors, diagnosed using error breeding, we develop a new forecast approach that combines dynamical error correction and statistical filtering techniques. In retrospective forecasts of historical influenza outbreaks for 95 US cities from 2003 to 2014, overall forecast accuracy for outbreak peak timing, peak intensity and attack rate is substantially improved for predicted lead times up to 10 weeks. This error growth correction method can be generalized to improve the forecast accuracy of other infectious disease dynamical models.

  2. A simulation of GPS and differential GPS sensors

    NASA Technical Reports Server (NTRS)

    Rankin, James M.

    1993-01-01

    The Global Positioning System (GPS) is a revolutionary advance in navigation. Users can determine latitude, longitude, and altitude by receiving range information from at least four satellites. The statistical accuracy of the user's position is directly proportional to the statistical accuracy of the range measurement. Range errors are caused by clock errors, ephemeris errors, atmospheric delays, multipath errors, and receiver noise. Selective Availability, which the military uses to intentionally degrade accuracy for non-authorized users, is a major error source. The proportionality constant relating position errors to range errors is the Dilution of Precision (DOP) which is a function of the satellite geometry. Receivers separated by relatively short distances have the same satellite and atmospheric errors. Differential GPS (DGPS) removes these errors by transmitting pseudorange corrections from a fixed receiver to a mobile receiver. The corrected pseudorange at the moving receiver is now corrupted only by errors from the receiver clock, multipath, and measurement noise. This paper describes a software package that models position errors for various GPS and DGPS systems. The error model is used in the Real-Time Simulator and Cockpit Technology workstation simulations at NASA-LaRC. The GPS/DGPS sensor can simulate enroute navigation, instrument approaches, or on-airport navigation.
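
    The DOP calculation described above is standard and compact: build the geometry matrix from unit line-of-sight vectors (plus a clock column) and take the trace of the inverse normal matrix. The satellite geometry below is illustrative.

```python
# Minimal DOP sketch: position error scales with range error through the
# Dilution of Precision, computed from satellite geometry.
import numpy as np

los = np.array([                 # line-of-sight vectors to four satellites
    [0.0,  0.53, 0.85],
    [0.7, -0.3,  0.65],
    [-0.6, 0.4,  0.69],
    [0.3,  0.8,  0.52],
])
los = los / np.linalg.norm(los, axis=1, keepdims=True)

G = np.hstack([los, np.ones((len(los), 1))])   # geometry matrix (x, y, z, clock)
Q = np.linalg.inv(G.T @ G)
gdop = np.sqrt(np.trace(Q))                    # geometric DOP
pdop = np.sqrt(np.trace(Q[:3, :3]))            # position DOP

sigma_range = 5.0                              # 1-sigma range error (m)
print(f"GDOP={gdop:.2f}, PDOP={pdop:.2f}, "
      f"1-sigma position error ~ {pdop * sigma_range:.1f} m")
```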

  3. [Evaluation on application of China Disease Prevention and Control Information System of Hydatid Disease Ⅰ Current status at the provincial level].

    PubMed

    Zhi-Hua, Zhang; Qing, Yu; Tian, Tian; Wei-Ping, Wu; Ning, Xiao

    2016-03-31

    To evaluate the application status of the China Disease Prevention and Control Information System of Hydatid Disease and summarize existing problems in order to guide system updates. A questionnaire was designed and distributed to Inner Mongolia, Sichuan, Tibet, Gansu, Qinghai, Ningxia, Xinjiang and the Xinjiang Production and Construction Corps to evaluate the application status of the system, supplemented by telephone interviews. The recovery rate of questionnaires was 87.5%. The statistics of closed questions showed that the national application rate of the system was 100%; 15.3% of respondents were low-frequency users, 57.1% believed the system was necessary, 28.6% considered it dispensable, and 14.3% believed it was totally unnecessary. The statistics of open-ended questions indicated that six endemic regions suggested increasing guidance and training; four endemic regions raised the issues of sharing information between the national infectious disease reporting system and the hydatid disease prevention and control information system, turning the monthly report into a quarterly report, and adding a statistics and analysis module; and three endemic regions considered that the system had logic errors and defects. The problems of the system mainly concern systemic deficiencies and logic errors, the lack of statistical parameters and a corresponding analysis module, and the lack of guidance and training, all of which limit the use of the system. These problems should therefore be resolved.

  4. A retrospective analysis of eye conditions among children attending St. John Eye Hospital, Hebron, Palestine.

    PubMed

    Banayot, Riyad G

    2016-04-05

    Eye diseases are important causes of medical consultations, with the spectrum varying in different regions. This hospital-based descriptive study aimed to determine the profile of childhood eye conditions at St. John Eye Hospital, a tertiary eye hospital serving Hebron, Palestine. Files of all new patients less than 16 years old who presented to St. John Eye Hospital-Hebron between January 2013 and December 2013 were retrospectively reviewed. Age at presentation, sex, and clinical diagnosis were extracted from medical records. Data were stored and analyzed using Wizard data analysis version 1.6.0 by Evan Miller. The chi-square test was used to compare variables and a p value of less than 0.05 was considered statistically significant. We evaluated the records of 1102 patients, with a female:male ratio of 1:1.1. Patients aged 0-5 years were the largest group (40.2%). Refractive errors were the most common ocular disorders seen (31.6%), followed by conjunctival diseases (23.7%) and strabismus and amblyopia (13.8%). Refractive errors were recorded significantly more frequently (p < 0.001) in the 11-15 year age group. Within the conjunctival diseases category, conjunctivitis and dry eye were significantly more prominent (p < 0.001) in the 6-10 year age group. Within the strabismus and amblyopia category, convergent strabismus was significantly more common in the youngest age group (0-5 years). The most common causes of ocular morbidity are largely treatable or preventable. These results suggest the need for awareness campaigns and early intervention programs.

  5. Dark Energy Survey Year 1 Results: Multi-Probe Methodology and Simulated Likelihood Analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krause, E.; et al.

    We present the methodology for and detail the implementation of the Dark Energy Survey (DES) 3x2pt DES Year 1 (Y1) analysis, which combines configuration-space two-point statistics from three different cosmological probes: cosmic shear, galaxy-galaxy lensing, and galaxy clustering, using data from the first year of DES observations. We have developed two independent modeling pipelines and describe the code validation process. We derive expressions for analytical real-space multi-probe covariances, and describe their validation with numerical simulations. We stress-test the inference pipelines in simulated likelihood analyses that vary 6-7 cosmology parameters plus 20 nuisance parameters and precisely resemble the analysis to be presented in the DES 3x2pt analysis paper, using a variety of simulated input data vectors with varying assumptions. We find that any disagreement between pipelines leads to changes in assigned likelihood $\Delta\chi^2 \le 0.045$ with respect to the statistical error of the DES Y1 data vector. We also find that angular binning and survey mask do not impact our analytic covariance at a significant level. We determine lower bounds on scales used for analysis of galaxy clustering (8 Mpc $h^{-1}$) and galaxy-galaxy lensing (12 Mpc $h^{-1}$) such that the impact of modeling uncertainties in the non-linear regime is well below statistical errors, and show that our analysis choices are robust against a variety of systematics. These tests demonstrate that we have a robust analysis pipeline that yields unbiased cosmological parameter inferences for the flagship 3x2pt DES Y1 analysis. We emphasize that the level of independent code development and subsequent code comparison as demonstrated in this paper is necessary to produce credible constraints from increasingly complex multi-probe analyses of current data.

  6. The effect of normalization of Partial Directed Coherence on the statistical assessment of connectivity patterns: a simulation study.

    PubMed

    Toppi, J; Petti, M; Vecchiato, G; Cincotti, F; Salinari, S; Mattia, D; Babiloni, F; Astolfi, L

    2013-01-01

    Partial Directed Coherence (PDC) is a spectral multivariate estimator of effective connectivity relying on the concept of Granger causality. Although its original definition derived directly from information theory, two modifications were introduced to provide better physiological interpretations of the estimated networks: i) normalization of the estimator according to rows, and ii) a squared transformation. In the present paper we investigate the effect of PDC normalization on the performance of the statistical validation process applied to connectivity patterns under different conditions of signal-to-noise ratio (SNR) and amount of data available for the analysis. Results of the statistical analysis revealed an effect of PDC normalization only on the percentages of type I and type II errors incurred when using the shuffling procedure for the assessment of connectivity patterns. The PDC formulation had no effect on the performance of validation carried out by means of the asymptotic statistic approach. Moreover, the percentages of both false positives and false negatives committed by the asymptotic statistic approach are always lower than those of the shuffling procedure, for each type of normalization.
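
    For reference, the sketch below computes the column-normalized PDC of Baccala and Sameshima from MVAR coefficients; the row-normalized variant discussed above simply sums over the other index in the denominator. The coefficient matrix is illustrative.

```python
# Sketch: column-normalized PDC from MVAR coefficients (original definition);
# the row-normalized variant changes only the denominator's summation axis.
import numpy as np

def pdc(A, f, fs=1.0):
    """A: MVAR coefficients, shape (p, N, N); returns |PDC| at frequency f."""
    p, N, _ = A.shape
    Abar = np.eye(N, dtype=complex)
    for r in range(1, p + 1):
        Abar -= A[r - 1] * np.exp(-2j * np.pi * f / fs * r)
    denom = np.sqrt((np.abs(Abar) ** 2).sum(axis=0, keepdims=True))  # column norm
    return np.abs(Abar) / denom

A = np.zeros((1, 2, 2))
A[0] = [[0.5, 0.0], [0.4, 0.3]]   # channel 1 drives channel 2
print(pdc(A, f=0.1))
```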

  7. The deficit of joint position sense in the chronic unstable ankle as measured by inversion angle replication error.

    PubMed

    Nakasa, Tomoyuki; Fukuhara, Kohei; Adachi, Nobuo; Ochi, Mitsuo

    2008-05-01

    Functional instability is defined as repeated ankle inversion sprains and a giving-way sensation. Previous studies have described damage to sensorimotor control after ankle sprain as a possible cause of functional instability. The aim of this study was to evaluate inversion angle replication errors in patients with functional instability after ankle sprain. The difference between the index angle and the replication angle was measured in 12 subjects with functional instability to evaluate the replication error. As a control group, the replication errors of 17 healthy volunteers were investigated. The side-to-side differences of the replication errors were compared between the two groups, and the relationship between the side-to-side differences of the replication errors and mechanical instability was statistically analyzed in the unstable group. The side-to-side difference of the replication errors was 1.0 +/- 0.7 degrees in the unstable group and 0.2 +/- 0.7 degrees in the control group, a statistically significant difference. The side-to-side differences of the replication errors in the unstable group did not correlate with anterior talar translation or talar tilt. Patients with functional instability thus had a deficit of joint position sense compared with healthy volunteers, and the replication error did not correlate with mechanical instability. Patients with functional instability should be treated appropriately even when mechanical instability is modest.

  8. The Brera Multiscale Wavelet ROSAT HRI Source Catalog. I. The Algorithm

    NASA Astrophysics Data System (ADS)

    Lazzati, Davide; Campana, Sergio; Rosati, Piero; Panzera, Maria Rosa; Tagliaferri, Gianpiero

    1999-10-01

    We present a new detection algorithm based on the wavelet transform for the analysis of high-energy astronomical images. The wavelet transform, because of its multiscale structure, is suited to the optimal detection of pointlike as well as extended sources, regardless of any loss of resolution with the off-axis angle. Sources are detected as significant enhancements in the wavelet space, after the subtraction of the nonflat components of the background. Detection thresholds are computed through Monte Carlo simulations in order to establish the expected number of spurious sources per field. The source characterization is performed through a multisource fitting in the wavelet space. The procedure is designed to correctly deal with very crowded fields, allowing for the simultaneous characterization of nearby sources. To obtain a fast and reliable estimate of the source parameters and related errors, we apply a novel decimation technique that, taking into account the correlation properties of the wavelet transform, extracts a subset of almost independent coefficients. We test the performance of this algorithm on synthetic fields, analyzing with particular care the characterization of sources in poor background situations, where the assumption of Gaussian statistics does not hold. In these cases, for which standard wavelet algorithms generally provide underestimated errors, we infer errors through a procedure that relies on robust basic statistics. Our algorithm is well suited to the analysis of images taken with the new generation of X-ray instruments equipped with CCD technology, which will produce images with very low background and/or high source density.

  9. Statistical plant set estimation using Schroeder-phased multisinusoidal input design

    NASA Technical Reports Server (NTRS)

    Bayard, D. S.

    1992-01-01

    A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
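
    A Schroeder-phased multisine is easy to generate: equal-amplitude harmonics with phases phi_k = -pi k(k-1)/K, which keeps the crest factor low while concentrating input energy at the chosen discrete frequencies. A minimal sketch:

```python
# Sketch: generating a Schroeder-phased multisine test signal.
import numpy as np

K, f0, fs, T = 20, 1.0, 200.0, 4.0       # harmonics, base freq (Hz), sample rate, duration
t = np.arange(0, T, 1 / fs)
phases = np.array([-np.pi * k * (k - 1) / K for k in range(1, K + 1)])
x = sum(np.cos(2 * np.pi * k * f0 * t + phases[k - 1]) for k in range(1, K + 1))

crest = np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))
print(f"crest factor: {crest:.2f}")      # far lower than the zero-phase case
```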

  10. Baseline estimation in flame's spectra by using neural networks and robust statistics

    NASA Astrophysics Data System (ADS)

    Garces, Hugo; Arias, Luis; Rojas, Alejandro

    2014-09-01

    This work presents a baseline estimation method for flame spectra based on a neural network, combining robust statistics with multivariate analysis to automatically identify measured wavelengths belonging to the continuous (baseline) feature for model adaptation, thereby overcoming the restriction of having to measure the target baseline for training. The main contributions of this paper are: to analyze a flame spectra database by computing Jolliffe statistics from principal component analysis, detecting wavelengths that are not correlated with most of the measured data and therefore correspond to the baseline; to systematically determine the optimal number of neurons in the hidden layers based on Akaike's Final Prediction Error; to estimate the baseline over the full wavelength range of the sampled spectra; and to train a neural network that generalizes the relation between measured and baseline spectra. The main application of this research is to compute total radiation with baseline information, allowing the combustion process state to be diagnosed for optimization in early stages.

  11. A procedure for the significance testing of unmodeled errors in GNSS observations

    NASA Astrophysics Data System (ADS)

    Li, Bofeng; Zhang, Zhetao; Shen, Yunzhong; Yang, Ling

    2018-01-01

    It is a crucial task to establish a precise mathematical model for global navigation satellite system (GNSS) observations in precise positioning. Due to the spatiotemporal complexity of, and limited knowledge on, systematic errors in GNSS observations, some residual systematic errors inevitably remain even after correction with empirical models and parameterization. These residual systematic errors are referred to as unmodeled errors. However, most existing studies mainly focus on handling the systematic errors that can be properly modeled and simply ignore the unmodeled errors that may actually exist. To further improve the accuracy and reliability of GNSS applications, such unmodeled errors must be handled, especially when they are significant. Therefore, a very first question is how to statistically validate the significance of unmodeled errors. In this research, we propose a procedure to examine the significance of these unmodeled errors by the combined use of hypothesis tests. With this testing procedure, three components of unmodeled errors, i.e., the nonstationary signal, the stationary signal and white noise, are identified. The procedure is tested using simulated data and real BeiDou datasets with varying error sources. The results show that the unmodeled errors can be discriminated by our procedure with approximately 90% confidence. The efficiency of the proposed procedure is further confirmed by applying time-domain Allan variance analysis and the frequency-domain fast Fourier transform. In summary, spatiotemporally correlated unmodeled errors commonly exist in GNSS observations and are mainly governed by residual atmospheric biases and multipath; their patterns may also be affected by the receiver.

  12. Phenotypic characterization of X-linked retinoschisis: Clinical, electroretinography, and optical coherence tomography variables

    PubMed Central

    Neriyanuri, Srividya; Dhandayuthapani, Sudha; Arunachalam, Jayamuruga Pandian; Raman, Rajiv

    2016-01-01

    Aims: To study the phenotypic characteristics of X-linked retinoschisis (XLRS) and report the clinical, electroretinogram (ERG), and optical coherence tomography (OCT) variables in Indian eyes. Design: A retrospective study. Materials and Methods: Medical records of 21 patients with retinoschisis who were genetically confirmed to have an RS1 mutation were reviewed. The phenotype characterization included the age of onset, best-corrected visual acuity, refractive error, fundus findings, OCT, and ERG. Statistical Analysis Used: Data from both eyes were used for analysis. P < 0.05 was considered statistically significant. Data were not normally distributed (P < 0.05, Shapiro-Wilk); hence, nonparametric tests were used for statistical analysis. Results: All patients were males, with a mean age at presentation of 9 years. Visual acuity was moderately impaired (median 0.6 logMAR, interquartile range: 0.47, 1) in these eyes, with a hyperopic refractive error of median +1.75 Ds (interquartile range: +0.50 Ds, +4.25 Ds). About 54.7% of the eyes had both foveal and peripheral schisis, isolated foveal schisis was seen in 28.5% of the eyes, and schisis with retinal detachment was seen in 16.6% of the eyes. The inner nuclear layer was most commonly involved in the schisis, followed by the outer nuclear and plexiform layers, as evident on OCT. On ERG, a- and b-wave amplitudes were significantly reduced in eyes with foveal and peripheral schisis compared to eyes with only foveal schisis (P < 0.05). Conclusions: XLRS shows phenotypic heterogeneity as evident on OCT, ERG, and clinical findings. PMID:27609164

  13. Component Analysis of Errors on PERSIANN Precipitation Estimates over Urmia Lake Basin, IRAN

    NASA Astrophysics Data System (ADS)

    Ghajarnia, N.; Daneshkar Arasteh, P.; Liaghat, A. M.; Araghinejad, S.

    2016-12-01

    In this study, the PERSIANN daily dataset is evaluated from 2000 to 2011 in 69 pixels over the Urmia Lake basin in northwest Iran. Different analytical approaches and indexes are used to examine PERSIANN's precision in detecting and estimating rainfall rate. The residuals are decomposed into Hit, Miss and FA (false alarm) estimation biases, while systematic and random error components of the continuous estimates are also analyzed seasonally and categorically. A new interpretation of estimation accuracy, named "reliability on PERSIANN estimations", is introduced, and the seasonal behavior of existing categorical/statistical measures and error components is analyzed over different rainfall rate categories. This study yields new insights into the nature of PERSIANN errors over the Urmia Lake basin, a semi-arid region in the Middle East, including the following: - The analyzed contingency table indexes indicate better detection precision during spring and fall. - A relatively constant level of error is generally observed among different categories. The range of precipitation estimates at different rainfall rate categories is nearly invariant, a sign of systematic error. - A low level of reliability is observed in PERSIANN estimations at different categories, mostly associated with high levels of FA error. However, as the rate of precipitation increases, the ability and precision of PERSIANN in rainfall detection also increase. - The systematic and random error decomposition in this area shows that PERSIANN has more difficulty in modeling the system and pattern of rainfall than bias due to rainfall uncertainties. The level of systematic error also increases considerably in heavier rainfalls. The error characteristics of PERSIANN vary by season with the conditions and rainfall patterns of that season, which shows the necessity of a seasonally differentiated approach to the calibration of this product. Overall, the error component analyses performed in this study can substantially help further local studies on post-calibration and bias reduction of PERSIANN estimations.
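
    The categorical scores and bias decomposition used above reduce to a rain/no-rain contingency table plus sums of residuals over the hit, miss, and false-alarm subsets. A sketch on synthetic data (the threshold and values are illustrative):

```python
# Illustrative sketch: contingency-table scores and residual decomposition
# for satellite precipitation estimates against a gauge reference.
import numpy as np

rng = np.random.default_rng(4)
gauge = np.maximum(rng.normal(2, 4, 2000), 0)         # reference rainfall (mm/day)
sat = np.maximum(gauge + rng.normal(0, 2, 2000), 0)   # satellite estimate

thr = 0.1                                             # rain/no-rain threshold
hit = np.sum((sat >= thr) & (gauge >= thr))
miss = np.sum((sat < thr) & (gauge >= thr))
fa = np.sum((sat >= thr) & (gauge < thr))

pod, far = hit / (hit + miss), fa / (hit + fa)        # detection, false-alarm ratio
csi = hit / (hit + miss + fa)                         # critical success index

# Residual decomposition into hit, miss and false-alarm biases
res = sat - gauge
hit_bias = res[(sat >= thr) & (gauge >= thr)].sum()
miss_bias = -gauge[(sat < thr) & (gauge >= thr)].sum()
fa_bias = sat[(sat >= thr) & (gauge < thr)].sum()
print(f"POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}; "
      f"biases: hit={hit_bias:.0f}, miss={miss_bias:.0f}, FA={fa_bias:.0f}")
```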

  14. An RSM Study of the Effects of Simulation Work and Metamodel Specification on the Statistical Quality of Metamodel Estimates

    DTIC Science & Technology

    1994-03-01

    optimize, and perform "what-if" analysis on a complicated simulation model of the greenhouse effect. Regression metamodels were applied to several modules of ... the large integrated assessment model of the greenhouse effect. In this study, the metamodels gave "acceptable forecast errors" and were shown to

  15. Rocks: A Concrete Activity That Introduces Normal Distribution, Sampling Error, Central Limit Theorem and True Score Theory

    ERIC Educational Resources Information Center

    Van Duzer, Eric

    2011-01-01

    This report introduces a short, hands-on activity that addresses a key challenge in teaching quantitative methods to students who lack confidence or experience with statistical analysis. Used near the beginning of the course, this activity helps students develop an intuitive insight regarding a number of abstract concepts which are key to…

  16. Determination of the refractive index of dehydrated cells by means of digital holographic microscopy

    NASA Astrophysics Data System (ADS)

    Belashov, A. V.; Zhikhoreva, A. A.; Bespalov, V. G.; Vasyutinskii, O. S.; Zhilinskaya, N. T.; Novik, V. I.; Semenova, I. V.

    2017-10-01

    Spatial distributions of the integral refractive index in dehydrated cells of human oral cavity epithelium are obtained by means of digital holographic microscopy, and mean refractive index of the cells is determined. The statistical analysis of the data obtained is carried out, and absolute errors of the method are estimated for different experimental conditions.

  17. Clinical outcomes of Transepithelial photorefractive keratectomy to treat low to moderate myopic astigmatism.

    PubMed

    Xi, Lei; Zhang, Chen; He, Yanling

    2018-05-09

    To evaluate the refractive and visual outcomes of transepithelial photorefractive keratectomy (TransPRK) in the treatment of low to moderate myopic astigmatism. This retrospective study enrolled a total of 47 eyes that had undergone TransPRK. Preoperative cylinder diopters ranged from -0.75 D to -2.25 D (mean -1.11 ± 0.40 D), and the sphere ranged from -1.50 D to -5.75 D. Visual outcomes and vector analysis of astigmatism, including the error ratio (ER), correction ratio (CR), error of magnitude (EM) and error of angle (EA), were evaluated. At 6 months after TransPRK, all eyes had an uncorrected distance visual acuity of 20/20 or better, no eyes lost ≥2 lines of corrected distance visual acuity (CDVA), and 93.6% had residual refractive cylinder within ±0.50 D of the intended correction. On vector analysis, the mean correction ratio for refractive cylinder was 1.03 ± 0.30, the mean error of magnitude was -0.04 ± 0.36, and the mean error of angle was 0.44° ± 7.42°; 80.9% of eyes had an axis shift within ±10°. The absolute astigmatic error of magnitude was statistically significantly correlated with the intended cylinder correction (r = 0.48, P < 0.01). TransPRK showed safe, effective and predictable results in the correction of low to moderate astigmatism and myopia.

  18. Uncertainty analysis technique for OMEGA Dante measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    May, M. J.; Widmann, K.; Sorce, C.

    2010-10-15

    The Dante is an 18-channel x-ray filtered diode array which records the spectrally and temporally resolved radiation flux from various targets (e.g., hohlraums) at x-ray energies between 50 eV and 10 keV. It is a main diagnostic installed on the OMEGA laser facility at the Laboratory for Laser Energetics, University of Rochester. The absolute flux is determined from the photometric calibration of the x-ray diodes, filters and mirrors, and an unfold algorithm. Understanding the errors on this absolute measurement is critical for understanding hohlraum energetics. We present a new method for quantifying the uncertainties on the determined flux using a Monte Carlo parameter variation technique. This technique combines the uncertainties in both the unfold algorithm and the error from the absolute calibration of each channel into a one-sigma Gaussian error function. One thousand test voltage sets are created using these error functions and processed by the unfold algorithm to produce individual spectra and fluxes. Statistical methods are applied to the resultant set of fluxes to estimate error bars on the measurements.
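
    The Monte Carlo parameter-variation idea reduces to: perturb each channel voltage by its one-sigma Gaussian error, push every perturbed set through the unfold, and take statistics over the resulting fluxes. In the sketch below, `unfold` is a hypothetical placeholder for the actual algorithm.

```python
# Hedged sketch of Monte Carlo parameter variation for flux uncertainties.
import numpy as np

def unfold(voltages):
    """Hypothetical placeholder: map 18 channel voltages to a total flux."""
    return voltages.sum()                 # the real unfold reconstructs a spectrum

rng = np.random.default_rng(5)
v_meas = rng.uniform(0.5, 2.0, 18)        # measured voltages, 18 Dante channels
sigma = 0.05 * v_meas                     # combined 1-sigma calibration+unfold error

fluxes = np.array([unfold(rng.normal(v_meas, sigma)) for _ in range(1000)])
print(f"flux = {fluxes.mean():.3f} +/- {fluxes.std(ddof=1):.3f} (1-sigma)")
```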

  19. Uncertainty Analysis Technique for OMEGA Dante Measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    May, M J; Widmann, K; Sorce, C

    2010-05-07

    The Dante is an 18-channel X-ray filtered diode array which records the spectrally and temporally resolved radiation flux from various targets (e.g., hohlraums) at X-ray energies between 50 eV and 10 keV. It is a main diagnostic installed on the OMEGA laser facility at the Laboratory for Laser Energetics, University of Rochester. The absolute flux is determined from the photometric calibration of the X-ray diodes, filters and mirrors, and an unfold algorithm. Understanding the errors on this absolute measurement is critical for understanding hohlraum energetics. We present a new method for quantifying the uncertainties on the determined flux using a Monte-Carlo parameter variation technique. This technique combines the uncertainties in both the unfold algorithm and the error from the absolute calibration of each channel into a one-sigma Gaussian error function. One thousand test voltage sets are created using these error functions and processed by the unfold algorithm to produce individual spectra and fluxes. Statistical methods are applied to the resultant set of fluxes to estimate error bars on the measurements.

  20. Statistical methods for launch vehicle guidance, navigation, and control (GN&C) system design and analysis

    NASA Astrophysics Data System (ADS)

    Rose, Michael Benjamin

    A novel trajectory and attitude control and navigation analysis tool for powered ascent is developed. The tool is capable of rapid trade-space analysis and is designed to ultimately reduce turnaround time for launch vehicle design, mission planning, and redesign work. It is streamlined to quickly determine trajectory and attitude control dispersions, propellant dispersions, orbit insertion dispersions, and navigation errors and their sensitivities to sensor errors, actuator execution uncertainties, and random disturbances. The tool is developed by applying both Monte Carlo and linear covariance analysis techniques to a closed-loop, launch vehicle guidance, navigation, and control (GN&C) system. The nonlinear dynamics and flight GN&C software models of a closed-loop, six-degree-of-freedom (6-DOF), Monte Carlo simulation are formulated and developed. The nominal reference trajectory (NRT) for the proposed lunar ascent trajectory is defined and generated. The Monte Carlo truth models and GN&C algorithms are linearized about the NRT, the linear covariance equations are formulated, and the linear covariance simulation is developed. The performance of the launch vehicle GN&C system is evaluated using both Monte Carlo and linear covariance techniques and their trajectory and attitude control dispersion, propellant dispersion, orbit insertion dispersion, and navigation error results are validated and compared. Statistical results from linear covariance analysis are generally within 10% of Monte Carlo results, and in most cases the differences are less than 5%. This is an excellent result given the many complex nonlinearities that are embedded in the ascent GN&C problem. Moreover, the real value of this tool lies in its speed, where the linear covariance simulation is 1036.62 times faster than the Monte Carlo simulation. Although the application and results presented are for a lunar, single-stage-to-orbit (SSTO), ascent vehicle, the tools, techniques, and mathematical formulations that are discussed are applicable to ascent on Earth or other planets as well as other rocket-powered systems such as sounding rockets and ballistic missiles.
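
    The linear covariance side of the comparison rests on a single matrix recursion: propagate the state error covariance through the dynamics linearized about the nominal reference trajectory, instead of running thousands of Monte Carlo trajectories. A generic sketch with illustrative matrices:

```python
# Minimal sketch of linear covariance propagation: one matrix recursion
# replaces many Monte Carlo trajectories. All matrices are illustrative.
import numpy as np

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])        # linearized dynamics about the NRT
Q = np.diag([1e-6, 1e-4])                    # process noise (random disturbances)
P = np.diag([1e-2, 1e-3])                    # initial state error covariance

for _ in range(600):                         # 60 s of flight
    P = F @ P @ F.T + Q                      # covariance propagation step

print("1-sigma dispersions:", np.sqrt(np.diag(P)))
```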
