Sample records for small statistical errors

  1. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.

    PubMed

    Lin, Johnny; Bentler, Peter M

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.

  2. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis

    PubMed Central

    Lin, Johnny; Bentler, Peter M.

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne’s asymptotically distribution-free method and Satorra Bentler’s mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler’s statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby’s study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic. PMID:23144511
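
    The record above concerns a mean-scaled statistic whose degrees of freedom are also adjusted from the skewness of the test statistic. The sketch below only shows, in a hedged way, how statistics of this general family are referred to a chi-square reference distribution once a scaling factor c and adjusted degrees of freedom d have been estimated; the numerical values of T, df, c and d are hypothetical, and the paper's actual third-moment estimation step is not implemented here.

```python
# Sketch: referring a scaled test statistic to a chi-square reference.
# Assumes the scaling factor c and the skewness-adjusted degrees of freedom d
# have already been estimated from the data; the values below are hypothetical.
from scipy.stats import chi2

def scaled_chi2_pvalue(T, df, c):
    """Mean-scaled (Satorra-Bentler-type) statistic: T/c referred to chi2(df)."""
    return chi2.sf(T / c, df)

def mean_df_adjusted_pvalue(T, c, d):
    """Mean- and df-adjusted statistic: T/c referred to chi2 with adjusted df d."""
    return chi2.sf(T / c, d)

T, df = 31.4, 10      # hypothetical ML chi-square statistic and nominal model df
c, d = 1.35, 7.8      # hypothetical scaling factor and adjusted df
print(scaled_chi2_pvalue(T, df, c))
print(mean_df_adjusted_pvalue(T, c, d))
```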

  3. SU-E-T-503: IMRT Optimization Using Monte Carlo Dose Engine: The Effect of Statistical Uncertainty.

    PubMed

    Tian, Z; Jia, X; Graves, Y; Uribe-Sanchez, A; Jiang, S

    2012-06-01

    With the development of ultra-fast GPU-based Monte Carlo (MC) dose engines, it becomes clinically realistic to compute the dose-deposition coefficients (DDC) for IMRT optimization using MC simulation. However, it is still time-consuming if we want to compute the DDC with small statistical uncertainty. This work studies the effects of the statistical error in the DDC matrix on IMRT optimization. The MC-computed DDC matrices are simulated here by adding statistical uncertainties at a desired level to the ones generated with a finite-size pencil beam algorithm. A statistical uncertainty model for MC dose calculation is employed. We adopt a penalty-based quadratic optimization model and a gradient descent method to optimize the fluence map and then recalculate the corresponding actual dose distribution using the noise-free DDC matrix. The impacts of DDC noise are assessed in terms of the deviation of the resulting dose distributions. We have also used a stochastic perturbation theory to theoretically estimate the statistical errors of dose distributions on a simplified optimization model. A head-and-neck case is used to investigate the perturbation to the IMRT plan due to MC's statistical uncertainty. The relative errors of the final dose distributions of the optimized IMRT are found to be much smaller than those in the DDC matrix, which is consistent with our theoretical estimation. When the history number is decreased from 10^8 to 10^6, the dose-volume histograms are still very similar to the error-free DVHs while the error in the DDC is about 3.8%. The results illustrate that the statistical errors in the DDC matrix have a relatively small effect on IMRT optimization in the dose domain. This indicates we can use a relatively small number of histories to obtain the DDC matrix with MC simulation within a reasonable amount of time, without considerably compromising the accuracy of the optimized treatment plan. This work is supported by Varian Medical Systems through a Master Research Agreement. © 2012 American Association of Physicists in Medicine.
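
    The toy sketch below illustrates the kind of experiment the abstract describes, not the authors' GPU/MC pipeline: relative Gaussian noise is added to a small hypothetical DDC matrix, a fluence map is optimized by projected gradient descent on a quadratic penalty, and the resulting dose is recomputed with the noise-free matrix so that the perturbation of the final dose can be compared with the noise level in the DDC. All sizes, dose targets and noise levels are hypothetical.

```python
# Minimal sketch (not the authors' pipeline): perturb a hypothetical DDC matrix,
# re-optimize the fluence map, and measure how much the actual (noise-free) dose
# of the optimized plan changes.
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_spots = 200, 50
A_true = rng.uniform(0.0, 1.0, (n_vox, n_spots))    # noise-free DDC matrix
d_target = rng.uniform(1.8, 2.2, n_vox)             # prescribed voxel doses

def optimize(A, d, iters=500):
    """Projected gradient descent on ||A x - d||^2 with x >= 0."""
    x = np.zeros(A.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # conservative step size
    for _ in range(iters):
        x = np.maximum(x - step * 2.0 * A.T @ (A @ x - d), 0.0)
    return x

dose_ref = A_true @ optimize(A_true, d_target)       # plan from the noise-free DDC
for rel_noise in (0.01, 0.04, 0.10):                 # relative noise levels in the DDC
    A_noisy = A_true * (1.0 + rel_noise * rng.standard_normal(A_true.shape))
    dose = A_true @ optimize(A_noisy, d_target)      # actual dose of the perturbed plan
    dev = np.linalg.norm(dose - dose_ref) / np.linalg.norm(dose_ref)
    print(f"DDC noise {rel_noise:.0%}: relative dose deviation {dev:.3%}")
```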

  4. Ensemble Kalman filters for dynamical systems with unresolved turbulence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grooms, Ian, E-mail: grooms@cims.nyu.edu; Lee, Yoonsang; Majda, Andrew J.

    Ensemble Kalman filters are developed for turbulent dynamical systems where the forecast model does not resolve all the active scales of motion. Coarse-resolution models are intended to predict the large-scale part of the true dynamics, but observations invariably include contributions from both the resolved large scales and the unresolved small scales. The error due to the contribution of unresolved scales to the observations, called ‘representation’ or ‘representativeness’ error, is often included as part of the observation error, in addition to the raw measurement error, when estimating the large-scale part of the system. It is here shown how stochastic superparameterization (a multiscale method for subgridscale parameterization) can be used to provide estimates of the statistics of the unresolved scales. In addition, a new framework is developed wherein small-scale statistics can be used to estimate both the resolved and unresolved components of the solution. The one-dimensional test problem from dispersive wave turbulence used here is computationally tractable yet is particularly difficult for filtering because of the non-Gaussian extreme event statistics and substantial small scale turbulence: a shallow energy spectrum proportional to k^(−5/6) (where k is the wavenumber) results in two-thirds of the climatological variance being carried by the unresolved small scales. Because the unresolved scales contain so much energy, filters that ignore the representation error fail utterly to provide meaningful estimates of the system state. Inclusion of a time-independent climatological estimate of the representation error in a standard framework leads to inaccurate estimates of the large-scale part of the signal; accurate estimates of the large scales are only achieved by using stochastic superparameterization to provide evolving, large-scale dependent predictions of the small-scale statistics. Again, because the unresolved scales contain so much energy, even an accurate estimate of the large-scale part of the system does not provide an accurate estimate of the true state. By providing simultaneous estimates of both the large- and small-scale parts of the solution, the new framework is able to provide accurate estimates of the true system state.
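
    For orientation only, the sketch below is a generic stochastic (perturbed-observation) ensemble Kalman filter analysis step; it is not the paper's superparameterization framework. It shows the one place where a representation-error covariance enters: it is added to the instrument-error covariance in R. All dimensions, the observation operator and the covariance magnitudes are hypothetical.

```python
# Generic stochastic EnKF analysis step, shown only to illustrate where a
# representation-error covariance is added to the instrument-error covariance.
import numpy as np

rng = np.random.default_rng(1)

def enkf_analysis(ensemble, y_obs, H, R):
    """ensemble: (n_state, n_members); H: (n_obs, n_state); R: (n_obs, n_obs)."""
    n_state, n_mem = ensemble.shape
    X_mean = ensemble.mean(axis=1, keepdims=True)
    A = ensemble - X_mean                                   # ensemble anomalies
    P = A @ A.T / (n_mem - 1)                               # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)            # Kalman gain
    # perturbed observations, one realization per member
    Y = y_obs[:, None] + rng.multivariate_normal(np.zeros(len(y_obs)), R, n_mem).T
    return ensemble + K @ (Y - H @ ensemble)

n_state, n_obs, n_mem = 40, 10, 20
ensemble = rng.standard_normal((n_state, n_mem))
H = np.zeros((n_obs, n_state))
H[np.arange(n_obs), np.arange(0, n_state, 4)] = 1.0         # observe every 4th variable
R_instrument = 0.1 * np.eye(n_obs)
R_representation = 0.5 * np.eye(n_obs)                      # unresolved-scale contribution
analysis = enkf_analysis(ensemble, rng.standard_normal(n_obs),
                         H, R_instrument + R_representation)
print(analysis.shape)
```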

  5. Adaptive Error Estimation in Linearized Ocean General Circulation Models

    NASA Technical Reports Server (NTRS)

    Chechelnitsky, Michael Y.

    1999-01-01

    Data assimilation methods are routinely used in oceanography. The statistics of the model and measurement errors need to be specified a priori. This study addresses the problem of estimating model and measurement error statistics from observations. We start by applying innovation-based methods of adaptive error estimation with low-dimensional models in the North Pacific (5-60 deg N, 132-252 deg E) to TOPEX/POSEIDON (T/P) sea level anomaly data, acoustic tomography data from the ATOC project, and the MIT General Circulation Model (GCM). A reduced state linear model that describes large scale internal (baroclinic) error dynamics is used. The methods are shown to be sensitive to the initial guess for the error statistics and the type of observations. A new off-line approach is developed, the covariance matching approach (CMA), where covariance matrices of model-data residuals are "matched" to their theoretical expectations using familiar least squares methods. This method uses observations directly instead of the innovations sequence and is shown to be related to the MT method and the method of Fu et al. (1993). Twin experiments using the same linearized MIT GCM suggest that altimetric data are ill-suited to the estimation of internal GCM errors, but that such estimates can in theory be obtained using acoustic data. The CMA is then applied to T/P sea level anomaly data and a linearization of a global GFDL GCM which uses two vertical modes. We show that the CMA method can be used with a global model and a global data set, and that the estimates of the error statistics are robust. We show that the fraction of the GCM-T/P residual variance explained by the model error is larger than that derived in Fukumori et al. (1999) with the method of Fu et al. (1993). Most of the model error is explained by the barotropic mode. However, we find that the impact of the change in the error statistics on the data assimilation estimates is very small. This is explained by the large representation error, i.e. the dominance of the mesoscale eddies in the T/P signal, which are not part of the 2° by 1° GCM. Therefore, the impact of the observations on the assimilation is very small even after the adjustment of the error statistics. This work demonstrates that simultaneous estimation of the model and measurement error statistics for data assimilation with global ocean data sets and linearized GCMs is possible. However, the error covariance estimation problem is in general highly underdetermined, much more so than the state estimation problem. In other words, there exists a very large number of statistical models that can be made consistent with the available data. Therefore, methods for obtaining quantitative error estimates, powerful though they may be, cannot replace physical insight. Used in the right context, as a tool for guiding the choice of a small number of model error parameters, covariance matching can be a useful addition to the repertory of tools available to oceanographers.

  6. Experiences from the testing of a theory for modelling groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Christensen, S.; Cooley, R.L.

    2002-01-01

    Usually, small-scale model error is present in groundwater modelling because the model only represents average system characteristics having the same form as the drift, and small-scale variability is neglected. These errors cause the true errors of a regression model to be correlated. Theory and an example show that the errors also contribute to bias in the estimates of model parameters. This bias originates from model nonlinearity. In spite of this bias, predictions of hydraulic head are nearly unbiased if the model intrinsic nonlinearity is small. Individual confidence and prediction intervals are accurate if the t-statistic is multiplied by a correction factor. The correction factor can be computed from the true error second moment matrix, which can be determined when the stochastic properties of the system characteristics are known.

  7. Experience gained in testing a theory for modelling groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Christensen, S.; Cooley, R.L.

    2002-01-01

    Usually, small-scale model error is present in groundwater modelling because the model only represents average system characteristics having the same form as the drift, and small-scale variability is neglected. These errors cause the true errors of a regression model to be correlated. Theory and an example show that the errors also contribute to bias in the estimates of model parameters. This bias originates from model nonlinearity. In spite of this bias, predictions of hydraulic head are nearly unbiased if the model intrinsic nonlinearity is small. Individual confidence and prediction intervals are accurate if the t-statistic is multiplied by a correction factor. The correction factor can be computed from the true error second moment matrix, which can be determined when the stochastic properties of the system characteristics are known.

  8. Interspecies scaling and prediction of human clearance: comparison of small- and macro-molecule drugs

    PubMed Central

    Huh, Yeamin; Smith, David E.; Feng, Meihau Rose

    2014-01-01

    Human clearance prediction for small- and macro-molecule drugs was evaluated and compared using various scaling methods and statistical analysis. Human clearance is generally well predicted using single or multiple species simple allometry for macro- and small-molecule drugs excreted renally. The prediction error is higher for hepatically eliminated small-molecules using single or multiple species simple allometry scaling, and it appears that the prediction error is mainly associated with drugs with low hepatic extraction ratio (Eh). The error in human clearance prediction for hepatically eliminated small-molecules was reduced using scaling methods with a correction of maximum life span (MLP) or brain weight (BRW). Human clearance of both small- and macro-molecule drugs is well predicted using the monkey liver blood flow method. Predictions using liver blood flow from other species did not work as well, especially for the small-molecule drugs. PMID:21892879

  9. Correcting for Optimistic Prediction in Small Data Sets

    PubMed Central

    Smith, Gordon C. S.; Seaman, Shaun R.; Wood, Angela M.; Royston, Patrick; White, Ian R.

    2014-01-01

    The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991–2003), Edinburgh (1999–2003), and Cambridge (1990–2006), as well as Scottish national pregnancy discharge data (2004–2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation. PMID:24966219
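
    As a concrete illustration of one of the adjustment approaches discussed above, the sketch below applies the standard bootstrap optimism correction (apparent AUC minus average optimism) to a logistic model fitted to simulated data. It is only the generic bootstrap procedure, not the authors' leave-pair-out cross-validation, and the data set, sample size and model are hypothetical.

```python
# Sketch of the standard bootstrap optimism correction for the C statistic (AUC).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 120, 5                                   # hypothetical small data set
X = rng.standard_normal((n, p))
y = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1]))))

def fit_auc(X_fit, y_fit, X_eval, y_eval):
    model = LogisticRegression(max_iter=1000).fit(X_fit, y_fit)
    return roc_auc_score(y_eval, model.predict_proba(X_eval)[:, 1])

apparent = fit_auc(X, y, X, y)                  # apparent (optimistic) C statistic
optimism = []
for _ in range(200):                            # bootstrap replicates
    idx = rng.integers(0, n, n)
    Xb, yb = X[idx], y[idx]
    if yb.min() == yb.max():                    # skip degenerate resamples
        continue
    optimism.append(fit_auc(Xb, yb, Xb, yb) - fit_auc(Xb, yb, X, y))
print("apparent C:", round(apparent, 3),
      "optimism-corrected C:", round(apparent - np.mean(optimism), 3))
```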

  10. The statistical significance of error probability as determined from decoding simulations for long codes

    NASA Technical Reports Server (NTRS)

    Massey, J. L.

    1976-01-01

    The very low error probability obtained with long error-correcting codes results in a very small number of observed errors in simulation studies of practical size and renders the usual confidence interval techniques inapplicable to the observed error probability. A natural extension of the notion of a 'confidence interval' is made and applied to such determinations of error probability by simulation. An example is included to show the surprisingly great significance of as few as two decoding errors in a very large number of decoding trials.
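
    The report develops its own extension of the confidence-interval notion; purely as background, the sketch below computes the standard exact (Clopper-Pearson) binomial interval for an error probability estimated from very few observed errors, e.g. two decoding errors in a hypothetical ten million trials, to show how strongly even a tiny error count constrains the estimate.

```python
# Standard exact (Clopper-Pearson) interval for a binomial error probability;
# the trial and error counts are hypothetical, and this is not the report's
# extended confidence-interval construction.
from scipy.stats import beta

def clopper_pearson(k, n, conf=0.95):
    alpha = 1.0 - conf
    lo = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    hi = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lo, hi

k, n = 2, 10**7                 # e.g. 2 decoding errors in 10 million trials
lo, hi = clopper_pearson(k, n)
print(f"point estimate {k/n:.1e}, 95% interval [{lo:.1e}, {hi:.1e}]")
```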

  11. An evaluation of the accuracy of small-area demographic estimates of population at risk and its effect on prevalence statistics

    PubMed Central

    2013-01-01

    Demographic estimates of population at risk often underpin epidemiologic research and public health surveillance efforts. In spite of their central importance to epidemiology and public-health practice, little previous attention has been paid to evaluating the magnitude of errors associated with such estimates or the sensitivity of epidemiologic statistics to these effects. In spite of the well-known observation that accuracy in demographic estimates declines as the size of the population to be estimated decreases, demographers continue to face pressure to produce estimates for increasingly fine-grained population characteristics at ever-smaller geographic scales. Unfortunately, little guidance on the magnitude of errors that can be expected in such estimates is currently available in the literature and available for consideration in small-area epidemiology. This paper attempts to fill this current gap by producing a Vintage 2010 set of single-year-of-age estimates for census tracts, then evaluating their accuracy and precision in light of the results of the 2010 Census. These estimates are produced and evaluated for 499 census tracts in New Mexico for single years of age from 0 to 21 and for each sex individually. The error distributions associated with these estimates are characterized statistically using non-parametric statistics including the median and 2.5th and 97.5th percentiles. The impact of these errors is considered through simulations in which observed and estimated 2010 population counts are used as alternative denominators and simulated event counts are used to compute a realistic range of prevalence values. The implications of the results of this study for small-area epidemiologic research in cancer and environmental health are considered. PMID:24359344

  12. Laser Velocimeter Measurements and Analysis in Turbulent Flows with Combustion. Part 2.

    DTIC Science & Technology

    1983-07-01

    sampling error for this sample size. Mean velocities and turbulence intensities were found to be statistically accurate to ±1% and 13%, respectively... Although the statistical error was found to be rather small (±1% for mean velocities and 13% for turbulence intensities), there can be additional... "Computational and Experimental Study of a Captive Annular Eddy," Journal of Fluid Mechanics, Vol. 28, pt. 1, pp. 43-63, 12 April 1967.

  13. Visual Performance on the Small Letter Contrast Test: Effects of Aging, Low Luminance and Refractive Error

    DTIC Science & Technology

    2000-08-01

    luminance performance and refractive error having comparable effects on... aviation, many aviators develop ametropias during their careers. We were... statistically (0.04 logMAR, p=0.01), but not clinically significant (<1/2 line)... the non-aviator group. Separate investigators at different research facilities... statistically significant (0.11 ± 0.1 logCS, t=4.0, p<0.001), yet there is significant overlap... sensitivity on the SLCT decreased for the aviator group at a

  14. Effect of Random Circuit Fabrication Errors on Small Signal Gain and Phase in Helix Traveling Wave Tubes

    NASA Astrophysics Data System (ADS)

    Pengvanich, P.; Chernin, D. P.; Lau, Y. Y.; Luginsland, J. W.; Gilgenbach, R. M.

    2007-11-01

    Motivated by the current interest in mm-wave and THz sources, which use miniature, difficult-to-fabricate circuit components, we evaluate the statistical effects of random fabrication errors on a helix traveling wave tube amplifier's small signal characteristics. The small signal theory is treated in a continuum model in which the electron beam is assumed to be monoenergetic, and axially symmetric about the helix axis. Perturbations that vary randomly along the beam axis are introduced in the dimensionless Pierce parameters b, the beam-wave velocity mismatch, C, the gain parameter, and d, the cold tube circuit loss. Our study shows, as expected, that perturbation in b dominates the other two. The extensive numerical data have been confirmed by our analytic theory. They show in particular that the standard deviation of the output phase is linearly proportional to standard deviation of the individual perturbations in b, C, and d. Simple formulas have been derived which yield the output phase variations in terms of the statistical random manufacturing errors. This work was supported by AFOSR and by ONR.

  15. Errors in causal inference: an organizational schema for systematic error and random error.

    PubMed

    Suzuki, Etsuji; Tsuda, Toshihide; Mitsuhashi, Toshiharu; Mansournia, Mohammad Ali; Yamamoto, Eiji

    2016-11-01

    To provide an organizational schema for systematic error and random error in estimating causal measures, aimed at clarifying the concept of errors from the perspective of causal inference. We propose to divide systematic error into structural error and analytic error. With regard to random error, our schema shows its four major sources: nondeterministic counterfactuals, sampling variability, a mechanism that generates exposure events and measurement variability. Structural error is defined from the perspective of counterfactual reasoning and divided into nonexchangeability bias (which comprises confounding bias and selection bias) and measurement bias. Directed acyclic graphs are useful to illustrate this kind of error. Nonexchangeability bias implies a lack of "exchangeability" between the selected exposed and unexposed groups. A lack of exchangeability is not a primary concern of measurement bias, justifying its separation from confounding bias and selection bias. Many forms of analytic errors result from the small-sample properties of the estimator used and vanish asymptotically. Analytic error also results from wrong (misspecified) statistical models and inappropriate statistical methods. Our organizational schema is helpful for understanding the relationship between systematic error and random error from a previously less investigated aspect, enabling us to better understand the relationship between accuracy, validity, and precision. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Systematic Biases in Parameter Estimation of Binary Black-Hole Mergers

    NASA Technical Reports Server (NTRS)

    Littenberg, Tyson B.; Baker, John G.; Buonanno, Alessandra; Kelly, Bernard J.

    2012-01-01

    Parameter estimation of binary-black-hole merger events in gravitational-wave data relies on matched filtering techniques, which, in turn, depend on accurate model waveforms. Here we characterize the systematic biases introduced in measuring astrophysical parameters of binary black holes by applying the currently most accurate effective-one-body templates to simulated data containing non-spinning numerical-relativity waveforms. For advanced ground-based detectors, we find that the systematic biases are well within the statistical error for realistic signal-to-noise ratios (SNR). These biases grow to be comparable to the statistical errors at high signal-to-noise ratios for ground-based instruments (SNR approximately 50) but never dominate the error budget. At the much larger signal-to-noise ratios expected for space-based detectors, these biases will become large compared to the statistical errors but are small enough (at most a few percent in the black-hole masses) that we expect they should not affect broad astrophysical conclusions that may be drawn from the data.

  17. Comparison of rate one-half, equivalent constraint length 24, binary convolutional codes for use with sequential decoding on the deep-space channel

    NASA Technical Reports Server (NTRS)

    Massey, J. L.

    1976-01-01

    Virtually all previously-suggested rate 1/2 binary convolutional codes with KE = 24 are compared. Their distance properties are given; and their performance, both in computation and in error probability, with sequential decoding on the deep-space channel is determined by simulation. Recommendations are made both for the choice of a specific KE = 24 code as well as for codes to be included in future coding standards for the deep-space channel. A new result given in this report is a method for determining the statistical significance of error probability data when the error probability is so small that it is not feasible to perform enough decoding simulations to obtain more than a very small number of decoding errors.

  18. Error floor behavior study of LDPC codes for concatenated codes design

    NASA Astrophysics Data System (ADS)

    Chen, Weigang; Yin, Liuguo; Lu, Jianhua

    2007-11-01

    Error floor behavior of low-density parity-check (LDPC) codes using quantized decoding algorithms is statistically studied with experimental results on a hardware evaluation platform. The results present the distribution of the residual errors after decoding failure and reveal that the number of residual error bits in a codeword is usually very small using quantized sum-product (SP) algorithm. Therefore, LDPC code may serve as the inner code in a concatenated coding system with a high code rate outer code and thus an ultra low error floor can be achieved. This conclusion is also verified by the experimental results.

  19. Impact of spot charge inaccuracies in IMPT treatments.

    PubMed

    Kraan, Aafke C; Depauw, Nicolas; Clasie, Ben; Giunta, Marina; Madden, Tom; Kooy, Hanne M

    2017-08-01

    Spot charge is one parameter of the pencil-beam scanning dose delivery system whose accuracy is typically high but whose required accuracy has not been investigated. In this work we quantify the dose impact of spot charge inaccuracies on the dose distribution in patients. Knowing the effect of charge errors is relevant for conventional proton machines, as well as for new generation proton machines, where ensuring accurate charge may be challenging. Through perturbation of spot charge in treatment plans for seven patients and a phantom, we evaluated the dose impact of absolute (up to 5×10^6 protons) and relative (up to 30%) charge errors. We investigated the dependence on beam width by studying scenarios with small, medium and large beam sizes. Treatment plan statistics included the Γ passing rate, dose-volume histograms and dose differences. The allowable absolute charge error for small spot plans was about 2×10^6 protons. Larger limits would be allowed if larger spots were used. For relative errors, the maximum allowable error was about 13%, 8% and 6% for small, medium and large spots, respectively. Dose distributions turned out to be surprisingly robust against random spot charge perturbation. Our study suggests that ensuring spot charge errors as small as 1-2%, as is commonly aimed at in conventional proton therapy machines, is clinically not strictly needed. © 2017 American Association of Physicists in Medicine.

  20. Quantitative evaluation of statistical errors in small-angle X-ray scattering measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sedlak, Steffen M.; Bruetzel, Linda K.; Lipfert, Jan

    A new model is proposed for the measurement errors incurred in typical small-angle X-ray scattering (SAXS) experiments, which takes into account the setup geometry and physics of the measurement process. The model accurately captures the experimentally determined errors from a large range of synchrotron and in-house anode-based measurements. Its most general formulation gives for the variance of the buffer-subtracted SAXS intensity σ²(q) = [I(q) + const.]/(kq), where I(q) is the scattering intensity as a function of the momentum transfer q; k and const. are fitting parameters that are characteristic of the experimental setup. The model gives a concrete procedure for calculating realistic measurement errors for simulated SAXS profiles. In addition, the results provide guidelines for optimizing SAXS measurements, which are in line with established procedures for SAXS experiments, and enable a quantitative evaluation of measurement errors.
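
    The quoted variance model can be used directly to add synthetic noise to a simulated profile, as in the sketch below. The sphere form factor stands in for I(q), and the values of the setup-specific fitting parameters k and const. are hypothetical.

```python
# Use the quoted error model sigma^2(q) = [I(q) + const]/(k*q) to add synthetic
# measurement noise to a toy SAXS profile; k and const are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
q = np.linspace(0.01, 0.5, 500)                 # momentum transfer (1/Angstrom)
R = 20.0                                        # sphere radius for the toy profile
x = q * R
I = (3 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2 * 1e4   # sphere form factor as I(q)

k, const = 5e3, 50.0                            # hypothetical setup parameters
sigma = np.sqrt((I + const) / (k * q))          # model standard deviation per q bin
I_noisy = I + rng.normal(0.0, sigma)            # simulated noisy profile
print(I_noisy[:5], sigma[:5])
```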

  1. Error, Power, and Blind Sentinels: The Statistics of Seagrass Monitoring

    PubMed Central

    Schultz, Stewart T.; Kruschel, Claudia; Bakran-Petricioli, Tatjana; Petricioli, Donat

    2015-01-01

    We derive statistical properties of standard methods for monitoring of habitat cover worldwide, and criticize them in the context of mandated seagrass monitoring programs, as exemplified by Posidonia oceanica in the Mediterranean Sea. We report the novel result that cartographic methods with non-trivial classification errors are generally incapable of reliably detecting habitat cover losses less than about 30 to 50%, and the field labor required to increase their precision can be orders of magnitude higher than that required to estimate habitat loss directly in a field campaign. We derive a universal utility threshold of classification error in habitat maps that represents the minimum habitat map accuracy above which direct methods are superior. Widespread government reliance on blind-sentinel methods for monitoring seafloor can obscure the gradual and currently ongoing losses of benthic resources until the time has long passed for meaningful management intervention. We find two classes of methods with very high statistical power for detecting small habitat cover losses: 1) fixed-plot direct methods, which are over 100 times as efficient as direct random-plot methods in a variable habitat mosaic; and 2) remote methods with very low classification error such as geospatial underwater videography, which is an emerging, low-cost, non-destructive method for documenting small changes at millimeter visual resolution. General adoption of these methods and their further development will require a fundamental cultural change in conservation and management bodies towards the recognition and promotion of requirements of minimal statistical power and precision in the development of international goals for monitoring these valuable resources and the ecological services they provide. PMID:26367863

  2. Evaluating concentration estimation errors in ELISA microarray experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daly, Don S.; White, Amanda M.; Varnum, Susan M.

    Enzyme-linked immunosorbent assay (ELISA) is a standard immunoassay to predict a protein concentration in a sample. Deploying ELISA in a microarray format permits simultaneous prediction of the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Evaluating prediction error is critical to interpreting biological significance and improving the ELISA microarray process. Evaluating prediction error must be automated to realize a reliable high-throughput ELISA microarray system. Methods: In this paper, we present a statistical method based on propagation of error to evaluate prediction errors in the ELISA microarray process. Although propagation of error is central to this method, it is effective only when comparable data are available. Therefore, we briefly discuss the roles of experimental design, data screening, normalization and statistical diagnostics when evaluating ELISA microarray prediction errors. We use an ELISA microarray investigation of breast cancer biomarkers to illustrate the evaluation of prediction errors. The illustration begins with a description of the design and resulting data, followed by a brief discussion of data screening and normalization. In our illustration, we fit a standard curve to the screened and normalized data, review the modeling diagnostics, and apply propagation of error.
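
    To make the propagation-of-error idea concrete, the sketch below applies the delta method to inverse prediction from a fitted standard curve. For simplicity it uses a straight-line curve rather than the four-parameter logistic usually fitted to ELISA data, and all measurements are hypothetical; it is not the authors' full procedure.

```python
# Delta-method (propagation of error) for a concentration predicted from a
# fitted standard curve; linear curve and hypothetical data for illustration.
import numpy as np

conc = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])      # known standards
signal = np.array([2.1, 3.9, 8.2, 15.8, 32.5, 63.0])   # measured responses

(m, b), cov = np.polyfit(conc, signal, 1, cov=True)     # signal ~ m*conc + b

y0, var_y0 = 20.0, 1.5**2        # new sample's signal and its replicate variance
c_hat = (y0 - b) / m             # inverse prediction of the concentration

# gradient of c = (y0 - b)/m with respect to (m, b, y0)
g = np.array([-(y0 - b) / m**2, -1.0 / m, 1.0 / m])
cov_full = np.zeros((3, 3))
cov_full[:2, :2] = cov           # uncertainty of the fitted curve
cov_full[2, 2] = var_y0          # uncertainty of the new measurement
var_c = g @ cov_full @ g
print(f"c = {c_hat:.2f} +/- {np.sqrt(var_c):.2f}")
```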

  3. A Hands-On Exercise Improves Understanding of the Standard Error of the Mean

    ERIC Educational Resources Information Center

    Ryan, Robert S.

    2006-01-01

    One of the most difficult concepts for statistics students is the standard error of the mean. To improve understanding of this concept, 1 group of students used a hands-on procedure to sample from small populations representing either a true or false null hypothesis. The distribution of 120 sample means (n = 3) from each population had standard…
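
    A quick numerical version of this classroom exercise is sketched below: the standard deviation of 120 sample means of size n = 3 is compared with σ/√n. The small "population" used here is hypothetical.

```python
# Simulate the hands-on exercise: SD of many sample means (n = 3) vs sigma/sqrt(n).
import numpy as np

rng = np.random.default_rng(0)
population = np.array([2, 3, 3, 4, 4, 4, 5, 5, 6, 7], dtype=float)  # hypothetical
sigma = population.std()                          # population standard deviation

means = [rng.choice(population, size=3, replace=True).mean() for _ in range(120)]
print("SD of 120 sample means:", np.std(means))
print("sigma / sqrt(3)       :", sigma / np.sqrt(3))
```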

  4. Does size matter? Statistical limits of paleomagnetic field reconstruction from small rock specimens

    NASA Astrophysics Data System (ADS)

    Berndt, Thomas; Muxworthy, Adrian R.; Fabian, Karl

    2016-01-01

    As samples of ever decreasing sizes are being studied paleomagnetically, care has to be taken that the underlying assumptions of statistical thermodynamics (Maxwell-Boltzmann statistics) are being met. Here we determine how many grains and how large a magnetic moment a sample needs to have to be able to accurately record an ambient field. It is found that for samples with a thermoremanent magnetic moment larger than 10^-11 Am^2 the assumption of a sufficiently large number of grains is usually satisfied. Standard 25 mm diameter paleomagnetic samples usually contain enough magnetic grains such that statistical errors are negligible, but "single silicate crystal" works on, for example, zircon, plagioclase, and olivine crystals are approaching the limits of what is physically possible, leading to statistical errors in both the angular deviation and paleointensity that are comparable to other sources of error. The reliability of nanopaleomagnetic imaging techniques capable of resolving individual grains (used, for example, to study the cloudy zone in meteorites), however, is questionable due to the limited area of the material covered.

  5. A General Linear Method for Equating with Small Samples

    ERIC Educational Resources Information Center

    Albano, Anthony D.

    2015-01-01

    Research on equating with small samples has shown that methods with stronger assumptions and fewer statistical estimates can lead to decreased error in the estimated equating function. This article introduces a new approach to linear observed-score equating, one which provides flexible control over how form difficulty is assumed versus estimated…

  6. Small sample mediation testing: misplaced confidence in bootstrapped confidence intervals.

    PubMed

    Koopman, Joel; Howe, Michael; Hollenbeck, John R; Sin, Hock-Peng

    2015-01-01

    Bootstrapping is an analytical tool commonly used in psychology to test the statistical significance of the indirect effect in mediation models. Bootstrapping proponents have particularly advocated for its use for samples of 20-80 cases. This advocacy has been heeded, especially in the Journal of Applied Psychology, as researchers are increasingly utilizing bootstrapping to test mediation with samples in this range. We discuss reasons to be concerned with this escalation, and in a simulation study focused specifically on this range of sample sizes, we demonstrate not only that bootstrapping has insufficient statistical power to provide a rigorous hypothesis test in most conditions but also that bootstrapping has a tendency to exhibit an inflated Type I error rate. We then extend our simulations to investigate an alternative empirical resampling method as well as a Bayesian approach and demonstrate that they exhibit comparable statistical power to bootstrapping in small samples without the associated inflated Type I error. Implications for researchers testing mediation hypotheses in small samples are presented. For researchers wishing to use these methods in their own research, we have provided R syntax in the online supplemental materials. (c) 2015 APA, all rights reserved.
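
    For reference, the sketch below is a minimal percentile bootstrap of the indirect effect a*b in a simple X → M → Y mediation model, i.e. the procedure whose small-sample behaviour the article examines; it is not the alternative resampling or Bayesian method the authors propose. The data are simulated and the effect sizes are hypothetical.

```python
# Minimal percentile bootstrap of the indirect effect a*b in X -> M -> Y.
import numpy as np

rng = np.random.default_rng(0)
n = 40                                         # sample size in the 20-80 range
X = rng.standard_normal(n)
M = 0.4 * X + rng.standard_normal(n)           # a-path = 0.4 (hypothetical)
Y = 0.4 * M + rng.standard_normal(n)           # b-path = 0.4 (hypothetical)

def indirect(X, M, Y):
    a = np.polyfit(X, M, 1)[0]                 # slope of M on X
    design = np.column_stack([np.ones_like(X), X, M])
    b = np.linalg.lstsq(design, Y, rcond=None)[0][2]   # slope of Y on M given X
    return a * b

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(indirect(X[idx], M[idx], Y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect {indirect(X, M, Y):.3f}, 95% percentile CI [{lo:.3f}, {hi:.3f}]")
```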

  7. Comparing a single case to a control group - Applying linear mixed effects models to repeated measures data.

    PubMed

    Huber, Stefan; Klein, Elise; Moeller, Korbinian; Willmes, Klaus

    2015-10-01

    In neuropsychological research, single-cases are often compared with a small control sample. Crawford and colleagues developed inferential methods (i.e., the modified t-test) for such a research design. In the present article, we suggest an extension of the methods of Crawford and colleagues employing linear mixed models (LMM). We first show that a t-test for the significance of a dummy coded predictor variable in a linear regression is equivalent to the modified t-test of Crawford and colleagues. As an extension to this idea, we then generalized the modified t-test to repeated measures data by using LMMs to compare the performance difference in two conditions observed in a single participant to that of a small control group. The performance of LMMs regarding Type I error rates and statistical power were tested based on Monte-Carlo simulations. We found that starting with about 15-20 participants in the control sample Type I error rates were close to the nominal Type I error rate using the Satterthwaite approximation for the degrees of freedom. Moreover, statistical power was acceptable. Therefore, we conclude that LMMs can be applied successfully to statistically evaluate performance differences between a single-case and a control sample. Copyright © 2015 Elsevier Ltd. All rights reserved.
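
    The modified t-test referenced above has a simple closed form (Crawford & Howell, 1998), sketched below with hypothetical scores; the LMM extension proposed in the article requires a mixed-model package and is not shown here.

```python
# Modified t-test for comparing a single case with a small control sample
# (Crawford & Howell, 1998); the scores are hypothetical.
import numpy as np
from scipy import stats

def crawford_howell(case_score, controls):
    controls = np.asarray(controls, dtype=float)
    n = len(controls)
    t = (case_score - controls.mean()) / (controls.std(ddof=1) * np.sqrt((n + 1) / n))
    p_two_sided = 2 * stats.t.sf(abs(t), df=n - 1)
    return t, p_two_sided

controls = [12, 14, 15, 13, 16, 14, 15, 13, 14, 15]   # hypothetical control sample
print(crawford_howell(8, controls))
```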

  8. Explanation of Two Anomalous Results in Statistical Mediation Analysis.

    PubMed

    Fritz, Matthew S; Taylor, Aaron B; Mackinnon, David P

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.

  9. Deriving Color-Color Transformations for VRI Photometry

    NASA Astrophysics Data System (ADS)

    Taylor, B. J.; Joner, M. D.

    2006-12-01

    In this paper, transformations between Cousins R-I and other indices are considered. New transformations to Cousins V-R and Johnson V-K are derived, a published transformation involving T1-T2 on the Washington system is rederived, and the basis for a transformation involving b-y is considered. In addition, a statistically rigorous procedure for deriving such transformations is presented and discussed in detail. Highlights of the discussion include (1) the need for statistical analysis when least-squares relations are determined and interpreted, (2) the permitted forms and best forms for such relations, (3) the essential role played by accidental errors, (4) the decision process for selecting terms to appear in the relations, (5) the use of plots of residuals, (6) detection of influential data, (7) a protocol for assessing systematic effects from absorption features and other sources, (8) the reasons for avoiding extrapolation of the relations, (9) a protocol for ensuring uniformity in data used to determine the relations, and (10) the derivation and testing of the accidental errors of those data. To put the last of these subjects in perspective, it is shown that rms errors for VRI photometry have been as small as 6 mmag for more than three decades and that standard errors for quantities derived from such photometry can be as small as 1 mmag or less.

  10. Accuracy assessment in the Large Area Crop Inventory Experiment

    NASA Technical Reports Server (NTRS)

    Houston, A. G.; Pitts, D. E.; Feiveson, A. H.; Badhwar, G.; Ferguson, M.; Hsu, E.; Potter, J.; Chhikara, R.; Rader, M.; Ahlers, C.

    1979-01-01

    The Accuracy Assessment System (AAS) of the Large Area Crop Inventory Experiment (LACIE) was responsible for determining the accuracy and reliability of LACIE estimates of wheat production, area, and yield, made at regular intervals throughout the crop season, and for investigating the various LACIE error sources, quantifying these errors, and relating them to their causes. Some results of using the AAS during the three years of LACIE are reviewed. As the program culminated, AAS was able not only to meet the goal of obtaining accurate statistical estimates of sampling and classification accuracy, but also the goal of evaluating component labeling errors. Furthermore, the ground-truth data processing matured from collecting data for one crop (small grains) to collecting, quality-checking, and archiving data for all crops in a LACIE small segment.

  11. Empirical Correction to the Likelihood Ratio Statistic for Structural Equation Modeling with Many Variables.

    PubMed

    Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu

    2015-06-01

    Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of a SEM model is T_ML, a slight modification to the likelihood ratio statistic. Under the normality assumption, T_ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T_ML rejects the correct model too often when p is not too small. Various corrections to T_ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T_ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T_ML, and they control type I errors reasonably well whenever N ≥ max(50,2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T_ML as reported in the literature, and they perform well.

  12. Statistical model specification and power: recommendations on the use of test-qualified pooling in analysis of experimental data

    PubMed Central

    Colegrave, Nick

    2017-01-01

    A common approach to the analysis of experimental data across much of the biological sciences is test-qualified pooling. Here non-significant terms are dropped from a statistical model, effectively pooling the variation associated with each removed term with the error term used to test hypotheses (or estimate effect sizes). This pooling is only carried out if statistical testing on the basis of applying that data to a previous more complicated model provides motivation for this model simplification; hence the pooling is test-qualified. In pooling, the researcher increases the degrees of freedom of the error term with the aim of increasing statistical power to test their hypotheses of interest. Despite this approach being widely adopted and explicitly recommended by some of the most widely cited statistical textbooks aimed at biologists, here we argue that (except in highly specialized circumstances that we can identify) the hoped-for improvement in statistical power will be small or non-existent, and there is likely to be much reduced reliability of the statistical procedures through deviation of type I error rates from nominal levels. We thus call for greatly reduced use of test-qualified pooling across experimental biology, more careful justification of any use that continues, and a different philosophy for initial selection of statistical models in the light of this change in procedure. PMID:28330912

  13. Tests of Independence in Contingency Tables with Small Samples: A Comparison of Statistical Power.

    ERIC Educational Resources Information Center

    Parshall, Cynthia G.; Kromrey, Jeffrey D.

    1996-01-01

    Power and Type I error rates were estimated for contingency tables with small sample sizes for the following four types of tests: (1) Pearson's chi-square; (2) chi-square with Yates's continuity correction; (3) the likelihood ratio test; and (4) Fisher's Exact Test. Various marginal distributions, sample sizes, and effect sizes were examined. (SLD)
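
    The four tests compared in that study can all be run from scipy on a small contingency table, as sketched below; the table counts are hypothetical.

```python
# Pearson chi-square, Yates-corrected chi-square, likelihood ratio (G) test,
# and Fisher's exact test on one small (hypothetical) 2x2 table.
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

table = np.array([[8, 2],
                  [3, 7]])

chi2, p, _, _ = chi2_contingency(table, correction=False)
print("Pearson chi-square:        ", round(p, 4))
chi2, p, _, _ = chi2_contingency(table, correction=True)
print("Yates-corrected chi-square:", round(p, 4))
g, p, _, _ = chi2_contingency(table, correction=False, lambda_="log-likelihood")
print("Likelihood ratio (G) test: ", round(p, 4))
print("Fisher's exact test:       ", round(fisher_exact(table)[1], 4))
```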

  14. Statistical Properties of SEE Rate Calculation in the Limits of Large and Small Event Counts

    NASA Technical Reports Server (NTRS)

    Ladbury, Ray

    2007-01-01

    This viewgraph presentation reviews the statistical properties of single event effects (SEE) rate calculations. The goal of SEE rate calculation is to bound the SEE rate, though the question is by how much. The presentation covers: (1) understanding errors on SEE cross sections, (2) methodology: maximum likelihood and confidence contours, (3) tests with simulated data, and (4) applications.
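
    As background for bounding rates from small event counts, the sketch below gives the textbook upper confidence limit on a Poisson mean via the chi-square relationship (with zero events at 95% it reproduces the familiar "rule of three"). This is a generic bound, not the presentation's maximum-likelihood/confidence-contour method, and the fluence value is hypothetical.

```python
# Textbook upper confidence bound on an SEE cross section from a small event count.
from scipy.stats import chi2

def poisson_upper_limit(k, conf=0.95):
    """Upper limit on the Poisson mean given k observed events."""
    return 0.5 * chi2.ppf(conf, 2 * (k + 1))

k_events = 0
fluence = 1e10                    # particles/cm^2 delivered in the test (hypothetical)
sigma_upper = poisson_upper_limit(k_events) / fluence
print(f"95% upper limit on cross section: {sigma_upper:.2e} cm^2")
```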

  15. Dealing with AFLP genotyping errors to reveal genetic structure in Plukenetia volubilis (Euphorbiaceae) in the Peruvian Amazon

    PubMed Central

    Vašek, Jakub; Viehmannová, Iva; Ocelák, Martin; Cachique Huansi, Danter; Vejl, Pavel

    2017-01-01

    An analysis of the population structure and genetic diversity for any organism often depends on one or more molecular marker techniques. Nonetheless, these techniques are not absolutely reliable because of various sources of errors arising during the genotyping process. Thus, a complex analysis of genotyping error was carried out with the AFLP method in 169 samples of the oil seed plant Plukenetia volubilis L. from small isolated subpopulations in the Peruvian Amazon. Samples were collected in nine localities from the region of San Martin. Analysis was done in eight datasets with a genotyping error from 0 to 5%. Using eleven primer combinations, 102 to 275 markers were obtained according to the dataset. It was found that it is only possible to obtain the most reliable and robust results through a multiple-level filtering process. Genotyping error and software set up influence both the estimation of population structure and genetic diversity, where in our case population number (K) varied between 2–9 depending on the dataset and statistical method used. Surprisingly, discrepancies in K number were caused more by statistical approaches than by genotyping errors themselves. However, for estimation of genetic diversity, the degree of genotyping error was critical because descriptive parameters (He, FST, PLP 5%) varied substantially (by at least 25%). Due to low gene flow, P. volubilis mostly consists of small isolated subpopulations (ΦPT = 0.252–0.323) with some degree of admixture given by socio-economic connectivity among the sites; a direct link between the genetic and geographic distances was not confirmed. The study illustrates the successful application of AFLP to infer genetic structure in non-model plants. PMID:28910307

  16. Improving the analysis of composite endpoints in rare disease trials.

    PubMed

    McMenamin, Martina; Berglind, Anna; Wason, James M S

    2018-05-22

    Composite endpoints are recommended in rare diseases to increase power and/or to sufficiently capture complexity. Often, they are in the form of responder indices which contain a mixture of continuous and binary components. Analyses of these outcomes typically treat them as binary, thus only using the dichotomisations of continuous components. The augmented binary method offers a more efficient alternative and is therefore especially useful for rare diseases. Previous work has indicated the method may have poorer statistical properties when the sample size is small. Here we investigate small sample properties and implement small sample corrections. We re-sample from a previous trial with sample sizes varying from 30 to 80. We apply the standard binary and augmented binary methods and determine the power, type I error rate, coverage and average confidence interval width for each of the estimators. We implement Firth's adjustment for the binary component models and a small sample variance correction for the generalized estimating equations, applying the small sample adjusted methods to each sub-sample as before for comparison. For the log-odds treatment effect the power of the augmented binary method is 20-55% compared to 12-20% for the standard binary method. Both methods have approximately nominal type I error rates. The difference in response probabilities exhibit similar power but both unadjusted methods demonstrate type I error rates of 6-8%. The small sample corrected methods have approximately nominal type I error rates. On both scales, the reduction in average confidence interval width when using the adjusted augmented binary method is 17-18%. This is equivalent to requiring a 32% smaller sample size to achieve the same statistical power. The augmented binary method with small sample corrections provides a substantial improvement for rare disease trials using composite endpoints. We recommend the use of the method for the primary analysis in relevant rare disease trials. We emphasise that the method should be used alongside other efforts in improving the quality of evidence generated from rare disease trials rather than replace them.

  17. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0) and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.

  18. Round-off errors in cutting plane algorithms based on the revised simplex procedure

    NASA Technical Reports Server (NTRS)

    Moore, J. E.

    1973-01-01

    This report statistically analyzes computational round-off errors associated with the cutting plane approach to solving linear integer programming problems. Cutting plane methods require that the inverse of a sequence of matrices be computed. The problem basically reduces to one of minimizing round-off errors in the sequence of inverses. Two procedures for minimizing this problem are presented, and their influence on error accumulation is statistically analyzed. One procedure employs a very small tolerance factor to round computed values to zero. The other procedure is a numerical analysis technique for reinverting or improving the approximate inverse of a matrix. The results indicated that round-off accumulation can be effectively minimized by employing a tolerance factor which reflects the number of significant digits carried for each calculation and by applying the reinversion procedure once to each computed inverse. If 18 significant digits plus an exponent are carried for each variable during computations, then a tolerance value of 0.1 x 10 to the minus 12th power is reasonable.
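
    The sketch below illustrates the two ideas in the abstract: rounding tiny computed values to zero with the quoted tolerance, and improving an approximate inverse with an iterative refinement step. A Newton-Schulz iteration is used here as one standard numerical-analysis technique of the kind described; the report's exact reinversion procedure may differ.

```python
# Tolerance-based rounding to zero plus one Newton-Schulz refinement step for an
# approximate matrix inverse (a plausible stand-in for the report's procedure).
import numpy as np

TOL = 0.1e-12                      # tolerance factor quoted in the abstract

def round_small(M, tol=TOL):
    """Set entries with magnitude below the tolerance to exactly zero."""
    M = M.copy()
    M[np.abs(M) < tol] = 0.0
    return M

def refine_inverse(A, X, steps=1):
    """Newton-Schulz iteration X <- X (2I - A X) improves an approximate inverse."""
    I = np.eye(A.shape[0])
    for _ in range(steps):
        X = X @ (2 * I - A @ X)
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
X = np.linalg.inv(A) + 1e-8 * rng.standard_normal((6, 6))   # slightly corrupted inverse
print("residual before refinement:", np.linalg.norm(A @ X - np.eye(6)))
X_ref = round_small(refine_inverse(A, X))
print("residual after refinement: ", np.linalg.norm(A @ X_ref - np.eye(6)))
```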

  19. Ephemeris data and error analysis in support of a Comet Encke intercept mission

    NASA Technical Reports Server (NTRS)

    Yeomans, D. K.

    1974-01-01

    Utilizing an orbit determination based upon 65 observations over the 1961 - 1973 interval, ephemeris data were generated for the 1976-77, 1980-81 and 1983-84 apparitions of short period comet Encke. For the 1980-81 apparition, results from a statistical error analysis are outlined. All ephemeris and error analysis computations include the effects of planetary perturbations as well as the nongravitational accelerations introduced by the outgassing cometary nucleus. In 1980, excellent observing conditions and a close approach of comet Encke to the earth permit relatively small uncertainties in the cometary position errors and provide an excellent opportunity for a close flyby of a physically interesting comet.

  20. Comparison of HSPF and PRMS model simulated flows using different temporal and spatial scales in the Black Hills, South Dakota

    USGS Publications Warehouse

    Chalise, D. R.; Haj, Adel E.; Fontaine, T.A.

    2018-01-01

    The Hydrological Simulation Program-Fortran (HSPF) [Hydrological Simulation Program Fortran version 12.2 (Computer software). USEPA, Washington, DC] and the Precipitation-Runoff Modeling System (PRMS) [Precipitation Runoff Modeling System version 4.0 (Computer software). USGS, Reston, VA] models are semidistributed, deterministic hydrological tools for simulating the impacts of precipitation, land use, and climate on basin hydrology and streamflow. Both models have been applied independently to many watersheds across the United States. This paper reports the statistical results assessing various temporal (daily, monthly, and annual) and spatial (small versus large watershed) scale biases in HSPF and PRMS simulations using two watersheds in the Black Hills, South Dakota. The Nash-Sutcliffe efficiency (NSE), Pearson correlation coefficient (r), and coefficient of determination (R²) statistics for the daily, monthly, and annual flows were used to evaluate the models’ performance. Results from the HSPF models showed that the HSPF consistently simulated the annual flows for both large and small basins better than the monthly and daily flows, and the simulated flows for the small watershed better than flows for the large watershed. In comparison, the PRMS model results show that the PRMS simulated the monthly flows for both the large and small watersheds better than the daily and annual flows, and the range of statistical error in the PRMS models was greater than that in the HSPF models. Moreover, it can be concluded that the statistical error in the HSPF and PRMS daily, monthly, and annual flow estimates for watersheds in the Black Hills was influenced by both temporal and spatial scale variability.
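
    The three goodness-of-fit statistics used in this comparison are easy to compute from paired observed and simulated flow series, as sketched below with hypothetical values; R² is taken here as the square of the Pearson correlation, the usual convention in such model evaluations.

```python
# Nash-Sutcliffe efficiency, Pearson r, and R^2 for hypothetical flow series.
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pearson_r(obs, sim):
    return np.corrcoef(obs, sim)[0, 1]

obs = np.array([1.2, 3.4, 2.8, 5.1, 4.0, 2.2, 1.9, 6.3])   # hypothetical observed flows
sim = np.array([1.0, 3.0, 3.1, 4.6, 4.4, 2.5, 2.0, 5.8])   # hypothetical simulated flows

r = pearson_r(obs, sim)
print("NSE:", round(nse(obs, sim), 3))
print("r  :", round(r, 3))
print("R^2:", round(r ** 2, 3))
```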

  1. Comparing interval estimates for small sample ordinal CFA models

    PubMed Central

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002

  2. Comparing interval estimates for small sample ordinal CFA models.

    PubMed

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.
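    The coverage and bias checks described above amount to counting how often an interval contains the true parameter and on which side it misses. A minimal sketch follows (Python/SciPy, not from the study, using a simple t-interval for a mean rather than an ordinal CFA factor correlation; the true value, sample size, and replicate count are illustrative).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_mean, n, reps = 0.5, 20, 5000
hits = above = below = 0
for _ in range(reps):
    x = rng.normal(true_mean, 1.0, size=n)
    lo, hi = stats.t.interval(0.95, df=n - 1, loc=x.mean(), scale=stats.sem(x))
    if lo <= true_mean <= hi:
        hits += 1
    elif lo > true_mean:
        above += 1          # interval entirely above the true value (positive bias)
    else:
        below += 1
print(f"coverage={hits/reps:.3f}  misses above={above}  misses below={below}")
```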

  3. Robust functional statistics applied to Probability Density Function shape screening of sEMG data.

    PubMed

    Boudaoud, S; Rix, H; Al Harrach, M; Marin, F

    2014-01-01

    Recent studies have pointed out possible shape modifications of the Probability Density Function (PDF) of surface electromyographical (sEMG) data in several contexts, such as fatigue and increasing muscle force. Following this idea, criteria have been proposed to monitor these shape modifications, mainly using High Order Statistics (HOS) parameters such as skewness and kurtosis. In experimental conditions, these parameters must be estimated from small sample sizes, which induces errors in the estimated HOS parameters and hinders real-time, precise sEMG PDF shape monitoring. Recently, a functional formalism, the Core Shape Model (CSM), has been used to analyse shape modifications of PDF curves. In this work, taking inspiration from the CSM method, robust functional statistics are proposed to emulate both skewness and kurtosis behaviors. These functional statistics combine kernel density estimation and PDF shape distances to evaluate shape modifications even in the presence of small sample sizes. The proposed statistics are then tested, using Monte Carlo simulations, on both normal and log-normal PDFs that mimic the observed sEMG PDF shape behavior during muscle contraction. According to the obtained results, the functional statistics appear more robust than HOS parameters to small-sample-size effects and more accurate for sEMG PDF shape screening applications.
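    The exact functional statistics are defined in the paper; the sketch below (Python/SciPy, illustrative only) conveys the general idea of recomputing skewness- and kurtosis-like quantities from a kernel density estimate of the PDF rather than from raw sample moments, which are noisy for small samples. The sample, its distribution, and the bandwidth choice are assumptions.

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

rng = np.random.default_rng(2)
semg_like = rng.lognormal(mean=0.0, sigma=0.4, size=100)   # small, skewed sample

# Classical HOS estimates, known to be noisy at this sample size
print("sample skew/kurtosis:", stats.skew(semg_like), stats.kurtosis(semg_like))

# The same quantities recomputed from a Gaussian kernel density estimate
kde = stats.gaussian_kde(semg_like)
x = np.linspace(semg_like.min() - 1.0, semg_like.max() + 1.0, 2000)
pdf = kde(x)
pdf /= trapezoid(pdf, x)
mu = trapezoid(x * pdf, x)
var = trapezoid((x - mu) ** 2 * pdf, x)
skew_kde = trapezoid((x - mu) ** 3 * pdf, x) / var ** 1.5
kurt_kde = trapezoid((x - mu) ** 4 * pdf, x) / var ** 2 - 3.0
print("KDE-based skew/kurtosis:", skew_kde, kurt_kde)
```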

  4. Power, effects, confidence, and significance: an investigation of statistical practices in nursing research.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2014-05-01

    To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
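    For readers who want to reproduce the two headline quantities, the sketch below (Python/statsmodels, with invented numbers rather than data from the review) computes the power of an unpaired t test to detect small, medium, and large standardized effects, and the experiment-wise Type I error rate 1 - (1 - alpha)^k for k independent null tests.

```python
from statsmodels.stats.power import TTestIndPower

n_per_group, alpha, n_tests = 60, 0.05, 40   # illustrative study and test count
power = TTestIndPower()
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    p = power.power(effect_size=d, nobs1=n_per_group, alpha=alpha, ratio=1.0)
    print(f"power to detect a {label} effect (d={d}): {p:.2f}")

# Experiment-wise Type I error rate if all k tests are independent and null
print("experiment-wise alpha for", n_tests, "tests:", round(1 - (1 - alpha) ** n_tests, 3))
```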

  5. Residue frequencies and pairing preferences at protein-protein interfaces.

    PubMed

    Glaser, F; Steinberg, D M; Vakser, I A; Ben-Tal, N

    2001-05-01

    We used a nonredundant set of 621 protein-protein interfaces of known high-resolution structure to derive residue composition and residue-residue contact preferences. The residue composition at the interfaces, in entire proteins and in whole genomes correlates well, indicating the statistical strength of the data set. Differences between amino acid distributions were observed for interfaces with buried surface area of less than 1,000 Å² versus interfaces with area of more than 5,000 Å². Hydrophobic residues were abundant in large interfaces while polar residues were more abundant in small interfaces. The largest residue-residue preferences at the interface were recorded for interactions between pairs of large hydrophobic residues, such as Trp and Leu, and the smallest preferences for pairs of small residues, such as Gly and Ala. On average, contacts between pairs of hydrophobic and polar residues were unfavorable, and the charged residues tended to pair subject to charge complementarity, in agreement with previous reports. A bootstrap procedure, lacking from previous studies, was used for error estimation. It showed that the statistical errors in the set of pairing preferences are generally small; the average standard error is approximately 0.2, i.e., about 8% of the average value of the pairwise index (2.9). However, for a few pairs (e.g., Ser-Ser and Glu-Asp) the standard error is larger in magnitude than the pairing index, which makes it impossible to tell whether contact formation is favorable or unfavorable. The results are interpreted using physicochemical factors and their implications for the energetics of complex formation and for protein docking are discussed. Proteins 2001;43:89-102. Copyright 2001 Wiley-Liss, Inc.
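    The bootstrap error estimate amounts to resampling the pooled contact list with replacement and recomputing the pairing index each time. The sketch below (Python/NumPy) uses a simple observed-over-expected pair frequency as a stand-in for the paper's pairwise index; the residue set, contact counts, and index definition are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
residues = ["LEU", "TRP", "SER", "GLU", "ASP"]
# Illustrative list of residue-residue contacts pooled over interfaces
contacts = [tuple(rng.choice(residues, 2)) for _ in range(2000)]

def pairing_index(contact_list, pair=("LEU", "TRP")):
    """Observed over expected frequency of one residue pair (stand-in index)."""
    arr = np.asarray(contact_list)
    obs = np.mean([set(c) == set(pair) for c in arr])
    p1 = np.mean(arr == pair[0])          # per-slot frequency of the first residue
    p2 = np.mean(arr == pair[1])
    expected = 2 * p1 * p2                # expected pair frequency under independence
    return obs / expected

idx = np.arange(len(contacts))
boot = []
for _ in range(500):
    resampled = [contacts[i] for i in rng.choice(idx, size=len(idx), replace=True)]
    boot.append(pairing_index(resampled))
print(f"pairing index = {pairing_index(contacts):.2f} +/- {np.std(boot):.2f} (bootstrap SE)")
```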

  6. Simultaneous Control of Error Rates in fMRI Data Analysis

    PubMed Central

    Kang, Hakmook; Blume, Jeffrey; Ombao, Hernando; Badre, David

    2015-01-01

    The key idea of statistical hypothesis testing is to fix, and thereby control, the Type I error (false positive) rate across samples of any size. Multiple comparisons inflate the global (family-wise) Type I error rate and the traditional solution to maintaining control of the error rate is to increase the local (comparison-wise) Type II error (false negative) rates. However, in the analysis of human brain imaging data, the number of comparisons is so large that this solution breaks down: the local Type II error rate ends up being so large that scientifically meaningful analysis is precluded. Here we propose a novel solution to this problem: allow the Type I error rate to converge to zero along with the Type II error rate. It works because when the Type I error rate per comparison is very small, the accumulation (or global) Type I error rate is also small. This solution is achieved by employing the Likelihood paradigm, which uses likelihood ratios to measure the strength of evidence on a voxel-by-voxel basis. In this paper, we provide theoretical and empirical justification for a likelihood approach to the analysis of human brain imaging data. In addition, we present extensive simulations that show the likelihood approach is viable, leading to ‘cleaner’-looking brain maps and operational superiority (lower average error rates). Finally, we include a case study on cognitive control related activation in the prefrontal cortex of the human brain. PMID:26272730

  7. Investigation of Error Patterns in Geographical Databases

    NASA Technical Reports Server (NTRS)

    Dryer, David; Jacobs, Derya A.; Karayaz, Gamze; Gronbech, Chris; Jones, Denise R. (Technical Monitor)

    2002-01-01

    The objective of the research conducted in this project is to develop a methodology to investigate the accuracy of Airport Safety Modeling Data (ASMD) using statistical, visualization, and Artificial Neural Network (ANN) techniques. Such a methodology can contribute to answering the following research questions: Over a representative sampling of ASMD databases, can statistical error analysis techniques be accurately learned and replicated by ANN modeling techniques? This representative ASMD sample should include numerous airports and a variety of terrain characterizations. Is it possible to identify and automate the recognition of patterns of error related to geographical features? Do such patterns of error relate to specific geographical features, such as elevation or terrain slope? Is it possible to combine the errors in small regions into an error prediction for a larger region? What are the data density reduction implications of this work? ASMD may be used as the source of terrain data for a synthetic visual system to be used in the cockpit of aircraft when visual reference to ground features is not possible during conditions of marginal weather or reduced visibility. In this research, United States Geologic Survey (USGS) digital elevation model (DEM) data has been selected as the benchmark. Artificial Neural Networks (ANNS) have been used and tested as alternate methods in place of the statistical methods in similar problems. They often perform better in pattern recognition, prediction and classification and categorization problems. Many studies show that when the data is complex and noisy, the accuracy of ANN models is generally higher than those of comparable traditional methods.

  8. Comparison of Moderate- to High-Astigmatism Corrections Using WaveFront-Guided Laser In Situ Keratomileusis and Small-Incision Lenticule Extraction.

    PubMed

    Zhang, Jiamei; Wang, Yan; Chen, Xiaoqin

    2016-04-01

    To evaluate and compare refractive outcomes of moderate- and high-astigmatism correction after wavefront-guided laser in situ keratomileusis (LASIK) and small-incision lenticule extraction (SMILE). This comparative study enrolled a total of 64 eyes that had undergone SMILE (42 eyes) and wavefront-guided LASIK (22 eyes). Preoperative cylindrical diopters were ≤-2.25 D in moderate- and >-2.25 D in high-astigmatism subgroups. The refractive results were analyzed based on the Alpins vector method that included target-induced astigmatism, surgically induced astigmatism, difference vector, correction index, index of success, magnitude of error, angle of error, and flattening index. All subjects completed the 3-month follow-up. No significant differences were found in the target-induced astigmatism, surgically induced astigmatism, and difference vector between SMILE and wavefront-guided LASIK. However, the average angle of error value was -1.00 ± 3.16 after wavefront-guided LASIK and 1.22 ± 3.85 after SMILE with statistical significance (P < 0.05). The absolute angle of error value was statistically correlated with difference vector and index of success after both procedures. In the moderate-astigmatism group, correction index was 1.04 ± 0.15 after wavefront-guided LASIK and 0.88 ± 0.15 after SMILE (P < 0.05). However, in the high-astigmatism group, correction index was 0.87 ± 0.13 after wavefront-guided LASIK and 0.88 ± 0.12 after SMILE (P = 0.889). Both procedures showed preferable outcomes in the correction of moderate and high astigmatism. However, high astigmatism was undercorrected after both procedures. Axial error of astigmatic correction may be one of the potential factors for the undercorrection.

  9. Probabilistic performance estimators for computational chemistry methods: The empirical cumulative distribution function of absolute errors

    NASA Astrophysics Data System (ADS)

    Pernot, Pascal; Savin, Andreas

    2018-06-01

    Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, the distributions of model errors being neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
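    The two statistics advocated above are straightforward to compute from the empirical distribution of absolute errors, as in this sketch (Python/NumPy; the error sample, threshold, and confidence level are illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)
abs_errors = np.abs(rng.standard_t(df=3, size=500) * 0.8)   # illustrative benchmark errors

# (1) probability that a new calculation has an absolute error below a chosen threshold
threshold = 1.0                      # units of the benchmarked quantity, user-chosen
p_below = np.mean(abs_errors < threshold)

# (2) maximal error amplitude expected at a chosen high confidence level
q95 = np.quantile(abs_errors, 0.95)

# standard error of the ECDF-based probability, which shrinks with dataset size
se_p = np.sqrt(p_below * (1 - p_below) / len(abs_errors))
print(f"P(|err| < {threshold}) = {p_below:.2f} +/- {se_p:.2f};  Q95 = {q95:.2f}")
```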

  10. Optimization of the Hartmann-Shack microlens array

    NASA Astrophysics Data System (ADS)

    de Oliveira, Otávio Gomes; de Lima Monteiro, Davies William

    2011-04-01

    In this work we propose to optimize the microlens-array geometry for a Hartmann-Shack wavefront sensor. The optimization makes possible that regular microlens arrays with a larger number of microlenses are replaced by arrays with fewer microlenses located at optimal sampling positions, with no increase in the reconstruction error. The goal is to propose a straightforward and widely accessible numerical method to calculate an optimized microlens array for a known aberration statistics. The optimization comprises the minimization of the wavefront reconstruction error and/or the number of necessary microlenses in the array. We numerically generate, sample and reconstruct the wavefront, and use a genetic algorithm to discover the optimal array geometry. Within an ophthalmological context, as a case study, we demonstrate that an array with only 10 suitably located microlenses can be used to produce reconstruction errors as small as those of a 36-microlens regular array. The same optimization procedure can be employed for any application where the wavefront statistics is known.

  11. Topology of Neutral Hydrogen within the Small Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Chepurnov, A.; Gordon, J.; Lazarian, A.; Stanimirovic, S.

    2008-12-01

    In this paper, genus statistics have been applied to an H I column density map of the Small Magellanic Cloud in order to study its topology. To learn how topology changes with the scale of the system, we provide topology studies for column density maps at varying resolutions. To evaluate the statistical error of the genus, we randomly reassign the phases of the Fourier modes while keeping the amplitudes. We find that at the smallest scales studied (40 pc <= λ <= 80 pc), the genus shift is negative in all regions, implying a clump topology. At the larger scales (110 pc <= λ <= 250 pc), the topology shift is detected to be negative (a "meatball" topology) in four cases and positive (a "swiss cheese" topology) in two cases. In four regions, there is no statistically significant topology shift at large scales.
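    The error-estimation step (keeping the Fourier amplitudes while randomizing the phases) can be sketched as below (Python/NumPy/SciPy). The statistic evaluated here is only a crude clump count standing in for the genus, and the column-density map is synthetic; the point of the sketch is the construction of phase-randomized surrogates.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
col_density = rng.gamma(2.0, 1.0, size=(128, 128))    # synthetic stand-in for an H I map

def phase_randomized(field, rng):
    """Surrogate map with the original Fourier amplitudes but random phases."""
    amp = np.abs(np.fft.rfft2(field))
    random_phase = np.angle(np.fft.rfft2(rng.normal(size=field.shape)))
    return np.fft.irfft2(amp * np.exp(1j * random_phase), s=field.shape)

def clump_count(field, nu=1.0):
    """Crude topology proxy: connected regions above a nu-sigma threshold."""
    _, n = ndimage.label(field > field.mean() + nu * field.std())
    return n

observed = clump_count(col_density)
surrogates = [clump_count(phase_randomized(col_density, rng)) for _ in range(200)]
print(f"observed clumps: {observed}, surrogate spread (1 sigma): {np.std(surrogates):.1f}")
```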

  12. Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin.

    PubMed

    Sagers, Jason D; Knobles, David P

    2014-06-01

    Addressed is the statistical inference of the sound-speed depth profile of a thick soft seabed from broadband sound propagation data recorded in the Gulf of Oman Basin in 1977. The acoustic data are in the form of time series signals recorded on a sparse vertical line array and generated by explosive sources deployed along a 280 km track. The acoustic data offer a unique opportunity to study a deep-water bottom-limited thickly sedimented environment because of the large number of time series measurements, very low seabed attenuation, and auxiliary measurements. A maximum entropy method is employed to obtain a conditional posterior probability distribution (PPD) for the sound-speed ratio and the near-surface sound-speed gradient. The multiple data samples allow for a determination of the average error constraint value required to uniquely specify the PPD for each data sample. Two complicating features of the statistical inference study are addressed: (1) the need to develop an error function that can both utilize the measured multipath arrival structure and mitigate the effects of data errors and (2) the effect of small bathymetric slopes on the structure of the bottom interacting arrivals.

  13. Adaptation of a Fast Optimal Interpolation Algorithm to the Mapping of Oceangraphic Data

    NASA Technical Reports Server (NTRS)

    Menemenlis, Dimitris; Fieguth, Paul; Wunsch, Carl; Willsky, Alan

    1997-01-01

    A fast, recently developed, multiscale optimal interpolation algorithm has been adapted to the mapping of hydrographic and other oceanographic data. This algorithm produces solution and error estimates which are consistent with those obtained from exact least squares methods, but at a small fraction of the computational cost. Problems whose solution would be completely impractical using exact least squares, that is, problems with tens or hundreds of thousands of measurements and estimation grid points, can easily be solved on a small workstation using the multiscale algorithm. In contrast to methods previously proposed for solving large least squares problems, our approach provides estimation error statistics while permitting long-range correlations, using all measurements, and permitting arbitrary measurement locations. The multiscale algorithm itself, published elsewhere, is not the focus of this paper. However, the algorithm requires statistical models having a very particular multiscale structure; it is the development of a class of multiscale statistical models, appropriate for oceanographic mapping problems, with which we concern ourselves in this paper. The approach is illustrated by mapping temperature in the northeastern Pacific. The number of hydrographic stations is kept deliberately small to show that multiscale and exact least squares results are comparable. A portion of the data were not used in the analysis; these data serve to test the multiscale estimates. A major advantage of the present approach is the ability to repeat the estimation procedure a large number of times for sensitivity studies, parameter estimation, and model testing. We have made available by anonymous Ftp a set of MATLAB-callable routines which implement the multiscale algorithm and the statistical models developed in this paper.

  14. The Influence of Observation Errors on Analysis Error and Forecast Skill Investigated with an Observing System Simulation Experiment

    NASA Technical Reports Server (NTRS)

    Prive, N. C.; Errico, R. M.; Tai, K.-S.

    2013-01-01

    The Global Modeling and Assimilation Office (GMAO) observing system simulation experiment (OSSE) framework is used to explore the response of analysis error and forecast skill to observation quality. In an OSSE, synthetic observations may be created that have much smaller error than real observations, and precisely quantified error may be applied to these synthetic observations. Three experiments are performed in which synthetic observations with magnitudes of applied observation error that vary from zero to twice the estimated realistic error are ingested into the Goddard Earth Observing System Model (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation for a one-month period representing July. The analysis increment and observation innovation are strongly impacted by observation error, with much larger variances for increased observation error. The analysis quality is degraded by increased observation error, but the change in root-mean-square error of the analysis state is small relative to the total analysis error. Surprisingly, in the 120-hour forecast, increased observation error yields only a slight decline in forecast skill in the extratropics and no discernible degradation of forecast skill in the tropics.

  15. Empirical best linear unbiased prediction method for small areas with restricted maximum likelihood and bootstrap procedure to estimate the average of household expenditure per capita in Banjar Regency

    NASA Astrophysics Data System (ADS)

    Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho

    2017-03-01

    So far, most of the data published by Statistics Indonesia (BPS), the provider of national statistics, are still limited to the district level. Sample sizes at smaller area levels are often insufficient, so direct estimation of poverty indicators produces high standard errors and the resulting analyses are unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with auxiliary data is required. One method often used for this purpose is Small Area Estimation (SAE). Among the many SAE methods, one is Empirical Best Linear Unbiased Prediction (EBLUP). The EBLUP method with the maximum likelihood (ML) procedure does not account for the loss of degrees of freedom incurred by estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling the average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (mean square error) in order to compare the accuracy of the EBLUP method with that of direct estimation. Results show that the EBLUP method reduces the MSE in small area estimation.
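    A minimal illustration of REML-based prediction for small areas is sketched below with statsmodels (not the paper's implementation: a full analysis would follow an area-level Fay-Herriot formulation with survey sampling variances and a bootstrap MSE step). The data, covariate, and area structure are simulated for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
areas = np.repeat(np.arange(30), 8)                  # 30 small areas, 8 households each
aux = rng.normal(size=areas.size)                    # auxiliary covariate (e.g. census data)
expenditure = (2.0 + 0.5 * aux
               + rng.normal(0, 0.3, 30)[areas]       # area-level random effect
               + rng.normal(0, 1.0, areas.size))     # household-level noise
data = pd.DataFrame({"area": areas, "aux": aux, "expenditure": expenditure})

# Random-intercept model fitted by REML; the predicted random effects play the
# role of the EBLUP adjustments to the regression-synthetic estimates.
fit = smf.mixedlm("expenditure ~ aux", data, groups=data["area"]).fit(reml=True)
area_means = data.groupby("area")["aux"].mean()
re = pd.Series({k: v.iloc[0] for k, v in fit.random_effects.items()})
eblup = fit.fe_params["Intercept"] + fit.fe_params["aux"] * area_means + re
print(eblup.head())
```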

  16. Estimating error statistics for Chambon-la-Forêt observatory definitive data

    NASA Astrophysics Data System (ADS)

    Lesur, Vincent; Heumez, Benoît; Telali, Abdelkader; Lalanne, Xavier; Soloviev, Anatoly

    2017-08-01

    We propose a new algorithm for calibrating definitive observatory data with the goal of providing users with estimates of the data error standard deviations (SDs). The algorithm has been implemented and tested using Chambon-la-Forêt observatory (CLF) data. The calibration process uses all available data. It is set as a large, weakly non-linear, inverse problem that ultimately provides estimates of baseline values in three orthogonal directions, together with their expected standard deviations. For this inverse problem, absolute data error statistics are estimated from two series of absolute measurements made within a day. Similarly, variometer data error statistics are derived by comparing variometer data time series between different pairs of instruments over few years. The comparisons of these time series led us to use an autoregressive process of order 1 (AR1 process) as a prior for the baselines. Therefore the obtained baselines do not vary smoothly in time. They have relatively small SDs, well below 300 pT when absolute data are recorded twice a week - i.e. within the daily to weekly measures recommended by INTERMAGNET. The algorithm was tested against the process traditionally used to derive baselines at CLF observatory, suggesting that statistics are less favourable when this latter process is used. Finally, two sets of definitive data were calibrated using the new algorithm. Their comparison shows that the definitive data SDs are less than 400 pT and may be slightly overestimated by our process: an indication that more work is required to have proper estimates of absolute data error statistics. For magnetic field modelling, the results show that even on isolated sites like CLF observatory, there are very localised signals over a large span of temporal frequencies that can be as large as 1 nT. The SDs reported here encompass signals of a few hundred metres and less than a day wavelengths.

  17. Testing non-inferiority of a new treatment in three-arm clinical trials with binary endpoints.

    PubMed

    Tang, Nian-Sheng; Yu, Bin; Tang, Man-Lai

    2014-12-18

    A two-arm non-inferiority trial without a placebo is usually adopted to demonstrate that an experimental treatment is not worse than a reference treatment by a small pre-specified non-inferiority margin due to ethical concerns. Selection of the non-inferiority margin and establishment of assay sensitivity are two major issues in the design, analysis and interpretation for two-arm non-inferiority trials. Alternatively, a three-arm non-inferiority clinical trial including a placebo is usually conducted to assess the assay sensitivity and internal validity of a trial. Recently, some large-sample approaches have been developed to assess the non-inferiority of a new treatment based on the three-arm trial design. However, these methods behave badly with small sample sizes in the three arms. This manuscript aims to develop some reliable small-sample methods to test three-arm non-inferiority. Saddlepoint approximation, exact and approximate unconditional, and bootstrap-resampling methods are developed to calculate p-values of the Wald-type, score and likelihood ratio tests. Simulation studies are conducted to evaluate their performance in terms of type I error rate and power. Our empirical results show that the saddlepoint approximation method generally behaves better than the asymptotic method based on the Wald-type test statistic. For small sample sizes, approximate unconditional and bootstrap-resampling methods based on the score test statistic perform better in the sense that their corresponding type I error rates are generally closer to the prespecified nominal level than those of other test procedures. Both approximate unconditional and bootstrap-resampling test procedures based on the score test statistic are generally recommended for three-arm non-inferiority trials with binary outcomes.

  18. Developing stochastic model of thrust and flight dynamics for small UAVs

    NASA Astrophysics Data System (ADS)

    Tjhai, Chandra

    This thesis presents a stochastic thrust model and an aerodynamic model for small propeller-driven UAVs whose power plant is a small electric motor. First, a model is developed that relates the thrust generated by a small propeller driven by an electric motor to the throttle setting and commanded engine RPM. A perturbation of this model is then used to relate the uncertainty in the commanded throttle and engine RPM to the error in the predicted thrust. Such a stochastic model is indispensable in the design of state estimation and control systems for UAVs, where the performance requirements of the systems are specified in stochastic terms. It is shown that thrust prediction models for small UAVs are not simple, explicit functions relating throttle input and RPM command to the thrust generated. Rather, they are non-linear, iterative procedures that depend on a geometric description of the propeller and a mathematical model of the motor. A detailed derivation of the iterative procedure is presented, and the impact of errors that arise from inaccurate propeller and motor descriptions is discussed. Validation results from a series of wind tunnel tests are presented. The results show favorable statistical agreement between the thrust uncertainty predicted by the model and the errors measured in the wind tunnel. The uncertainty model of the aircraft aerodynamic coefficients, developed from the wind tunnel experiments, is discussed at the end of this thesis.

  19. Bias and heteroscedastic memory error in self-reported health behavior: an investigation using covariance structure analysis

    PubMed Central

    Kupek, Emil

    2002-01-01

    Background: Frequent use of self-reports for investigating recent and past behavior in medical research requires statistical techniques capable of analyzing complex sources of bias associated with this methodology. In particular, although decreasing accuracy in recalling more distant past events is commonplace, the bias due to the differential memory errors that result has rarely been modeled statistically. Methods: Covariance structure analysis was used to estimate the recall error of the self-reported number of sexual partners for past periods of varying duration, and its implication for the bias. Results: The results indicated increasing levels of inaccuracy for reports about the more distant past. Considerable positive bias was found for a small fraction of respondents who reported ten or more partners in the last year, last two years and last five years. This is consistent with the effect of heteroscedastic random error, where the majority of partners had been acquired in the more distant past and were therefore recalled less accurately than the partners acquired closer to the time of interviewing. Conclusions: Memory errors of this type depend on the salience of the events recalled and are likely to be present in many areas of health research based on self-reported behavior. PMID:12435276

  20. Variability of rainfall over small areas

    NASA Technical Reports Server (NTRS)

    Runnels, R. C.

    1983-01-01

    A preliminary investigation was made to determine estimates of the number of raingauges needed in order to measure the variability of rainfall in time and space over small areas (approximately 40 sq miles). The literature on rainfall variability was examined and the types of empirical relationships used to relate rainfall variations to meteorological and catchment-area characteristics were considered. Relations between the coefficient of variation and areal-mean rainfall and area have been used by several investigators. These parameters seemed reasonable ones to use in any future study of rainfall variations. From a knowledge of an appropriate coefficient of variation (determined by the above-mentioned relations), the number of rain gauges needed for the precise determination of areal-mean rainfall may be calculated by statistical estimation theory. The number of gauges needed to measure the coefficient of variation over a 40 sq mile area, with varying degrees of error, was found to range from 264 (10% error, mean precipitation = 0.1 in) to about 2 (100% error, mean precipitation = 0.1 in).
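    Statistical estimation theory gives the gauge count via n ≈ (z·CV/E)², where CV is the coefficient of variation, E the acceptable relative error of the areal mean, and z the normal quantile for the chosen confidence. The sketch below (Python/SciPy) is illustrative only: it assumes independent gauges and a normal approximation, and does not use the report's actual CV, so it will not reproduce the 264-to-2 range quoted above.

```python
from math import ceil
from scipy.stats import norm

def gauges_needed(cv, rel_error, confidence=0.95):
    """Gauges needed so the areal-mean estimate is within rel_error of the true
    mean with the given confidence, assuming independent gauge readings."""
    z = norm.ppf(0.5 + confidence / 2.0)
    return ceil((z * cv / rel_error) ** 2)

cv = 0.9   # illustrative coefficient of variation for a 40 sq-mile area
for err in (0.10, 0.25, 0.50, 1.00):
    print(f"relative error {err:>4.0%}: {gauges_needed(cv, err)} gauges")
```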

  1. Drought Persistence Errors in Global Climate Models

    NASA Astrophysics Data System (ADS)

    Moon, H.; Gudmundsson, L.; Seneviratne, S. I.

    2018-04-01

    The persistence of drought events largely determines the severity of socioeconomic and ecological impacts, but the capability of current global climate models (GCMs) to simulate such events is subject to large uncertainties. In this study, the representation of drought persistence in GCMs is assessed by comparing state-of-the-art GCM simulations to observation-based data sets. To do so, we consider dry-to-dry transition probabilities at monthly and annual scales as estimates of drought persistence, where a dry status is defined as a negative precipitation anomaly. Though there is a substantial spread in the drought persistence bias, most of the simulations show systematic underestimation of drought persistence at the global scale. Subsequently, we analyzed the degree to which (i) inaccurate observations, (ii) differences among models, (iii) internal climate variability, and (iv) uncertainty of the employed statistical methods contribute to the spread in drought persistence errors using an analysis of variance approach. The results show that at the monthly scale, model uncertainty and observational uncertainty dominate, while the contribution from internal variability is small in most cases. At the annual scale, the spread of the drought persistence error is dominated by the statistical estimation error of drought persistence, indicating that the partitioning of the error is impaired by the limited number of considered time steps. These findings reveal systematic errors in the representation of drought persistence in current GCMs and suggest directions for further model improvement.
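    As a reference for the persistence metric, the sketch below (Python/NumPy, with synthetic data) estimates the monthly dry-to-dry transition probability, where a month is "dry" when its precipitation anomaly is negative; a real analysis would compute anomalies relative to each calendar month's climatology.

```python
import numpy as np

rng = np.random.default_rng(6)
precip = rng.gamma(2.0, 30.0, size=600)      # illustrative monthly precipitation series
anomaly = precip - precip.mean()             # negative anomaly marks a "dry" month

dry = anomaly < 0
# Dry-to-dry transition probability: P(dry at t+1 | dry at t)
p_dd = np.sum(dry[1:] & dry[:-1]) / np.sum(dry[:-1])
print(f"monthly dry-to-dry transition probability: {p_dd:.2f}")
```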

  2. Does raising type 1 error rate improve power to detect interactions in linear regression models? A simulation study.

    PubMed

    Durand, Casey P

    2013-01-01

    Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model. A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations. In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified. Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.
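    The simulation logic can be sketched in a few lines (Python/statsmodels, not the study's code): generate data with a known interaction effect, fit the regression, and record how often the interaction term's p-value falls below a given alpha. Effect sizes, sample size, and replicate counts here are illustrative.

```python
import numpy as np
import statsmodels.api as sm

def interaction_power(beta_int=0.2, n=200, alpha=0.05, reps=1000, seed=7):
    """Monte Carlo power to detect a continuous-by-dichotomous interaction."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        g = rng.integers(0, 2, size=n)
        y = 0.3 * x + 0.3 * g + beta_int * x * g + rng.normal(size=n)
        X = sm.add_constant(np.column_stack([x, g, x * g]))
        p_int = sm.OLS(y, X).fit().pvalues[-1]   # p-value of the interaction term
        hits += p_int < alpha
    return hits / reps

for a in (0.05, 0.10, 0.20):     # elevating alpha buys only modest power
    print(f"alpha={a:.2f}  power={interaction_power(alpha=a):.2f}")
```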

  3. Low power and type II errors in recent ophthalmology research.

    PubMed

    Khan, Zainab; Milko, Jordan; Iqbal, Munir; Masri, Moness; Almeida, David R P

    2016-10-01

    To investigate the power of unpaired t tests in prospective, randomized controlled trials when these tests failed to detect a statistically significant difference and to determine the frequency of type II errors. Systematic review and meta-analysis. We examined all prospective, randomized controlled trials published between 2010 and 2012 in 4 major ophthalmology journals (Archives of Ophthalmology, British Journal of Ophthalmology, Ophthalmology, and American Journal of Ophthalmology). Studies that used unpaired t tests were included. Power was calculated using the number of subjects in each group, standard deviations, and α = 0.05. The difference between control and experimental means was set to be (1) 20% and (2) 50% of the absolute value of the control's initial conditions. Power and Precision version 4.0 software was used to carry out calculations. Finally, the proportion of articles with type II errors was calculated. β = 0.3 was set as the largest acceptable value for the probability of type II errors. In total, 280 articles were screened. Final analysis included 50 prospective, randomized controlled trials using unpaired t tests. The median power of tests to detect a 50% difference between means was 0.9 and was the same for all 4 journals regardless of the statistical significance of the test. The median power of tests to detect a 20% difference between means ranged from 0.26 to 0.9 for the 4 journals. The median power of these tests to detect a 50% and 20% difference between means was 0.9 and 0.5 for tests that did not achieve statistical significance. A total of 14% and 57% of articles with negative unpaired t tests contained results with β > 0.3 when power was calculated for differences between means of 50% and 20%, respectively. A large portion of studies demonstrate high probabilities of type II errors when detecting small differences between means. The power to detect small differences between means varies across journals. It is, therefore, worthwhile for authors to mention the minimum clinically important difference for individual studies. Journals can consider publishing statistical guidelines for authors to use. Day-to-day clinical decisions rely heavily on the evidence base formed by the plethora of studies available to clinicians. Prospective, randomized controlled clinical trials are highly regarded as robust studies and are used to make important clinical decisions that directly affect patient care. The quality of study designs and statistical methods in major clinical journals is improving over time, and researchers and journals are being more attentive to the statistical methodologies incorporated in studies. The results of well-designed ophthalmic studies with robust methodologies, therefore, have the ability to modify the ways in which diseases are managed. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
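    The power calculation used in the review can be approximated as follows (Python/statsmodels; the control mean, standard deviation, and group size are invented for illustration): a difference equal to 20% or 50% of the control mean is converted to a standardized effect and passed to a two-sample t-test power routine.

```python
from statsmodels.stats.power import TTestIndPower

control_mean, sd, n_per_group = 10.0, 4.0, 30   # invented trial arm for illustration
analysis = TTestIndPower()
for frac in (0.20, 0.50):
    d = (frac * control_mean) / sd              # relative difference expressed as Cohen's d
    p = analysis.power(effect_size=d, nobs1=n_per_group, alpha=0.05)
    print(f"{frac:.0%} difference: power = {p:.2f}, Type II error beta = {1 - p:.2f}")
```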

  4. On the assessment of the added value of new predictive biomarkers.

    PubMed

    Chen, Weijie; Samuelson, Frank W; Gallas, Brandon D; Kang, Le; Sahiner, Berkman; Petrick, Nicholas

    2013-07-29

    The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.

  5. Multi-Reader ROC studies with Split-Plot Designs: A Comparison of Statistical Methods

    PubMed Central

    Obuchowski, Nancy A.; Gallas, Brandon D.; Hillis, Stephen L.

    2012-01-01

    Rationale and Objectives: Multi-reader imaging trials often use a factorial design, where study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of the design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper we compare three methods of analysis for the split-plot design. Materials and Methods: Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean ANOVA approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power and confidence interval coverage of the three test statistics. Results: The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% CIs falls close to the nominal coverage for small and large sample sizes. Conclusions: The split-plot MRMC study design can be statistically efficient compared with the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rate, similar power, and nominal CI coverage, are available for this study design. PMID:23122570

  6. Multi-reader ROC studies with split-plot designs: a comparison of statistical methods.

    PubMed

    Obuchowski, Nancy A; Gallas, Brandon D; Hillis, Stephen L

    2012-12-01

    Multireader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of this design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper, the authors compare three methods of analysis for the split-plot design. Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean analysis-of-variance approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% confidence intervals falls close to the nominal coverage for small and large sample sizes. The split-plot multireader, multicase study design can be statistically efficient compared to the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal confidence interval coverage, are available for this study design. Copyright © 2012 AUR. All rights reserved.

  7. Statistical inference involving binomial and negative binomial parameters.

    PubMed

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2009-05-01

    Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.

  8. Improvements in GRACE Gravity Fields Using Regularization

    NASA Astrophysics Data System (ADS)

    Save, H.; Bettadpur, S.; Tapley, B. D.

    2008-12-01

    The unconstrained global gravity field models derived from GRACE are susceptible to systematic errors that show up as broad "stripes" aligned in a North-South direction on the global maps of mass flux. These errors are believed to be a consequence of both systematic and random errors in the data that are amplified by the nature of the gravity field inverse problem. These errors impede scientific exploitation of the GRACE data products, and limit the realizable spatial resolution of the GRACE global gravity fields in certain regions. We use regularization techniques to reduce these "stripe" errors in the gravity field products. The regularization criteria are designed such that there is no attenuation of the signal and that the solutions fit the observations as well as an unconstrained solution. We have used a computationally inexpensive method, normally referred to as "L-ribbon", to find the regularization parameter. This paper discusses the characteristics and statistics of a 5-year time-series of regularized gravity field solutions. The solutions show markedly reduced stripes, are of uniformly good quality over time, and leave little or no systematic observation residuals, which is a frequent consequence of signal suppression from regularization. Up to degree 14, the signal in regularized solution shows correlation greater than 0.8 with the un-regularized CSR Release-04 solutions. Signals from large-amplitude and small-spatial extent events - such as the Great Sumatra Andaman Earthquake of 2004 - are visible in the global solutions without using special post-facto error reduction techniques employed previously in the literature. Hydrological signals as small as 5 cm water-layer equivalent in the small river basins, like Indus and Nile for example, are clearly evident, in contrast to noisy estimates from RL04. The residual variability over the oceans relative to a seasonal fit is small except at higher latitudes, and is evident without the need for de-striping or spatial smoothing.

  9. Changes in Rod and Frame Test Scores Recorded in Schoolchildren during Development – A Longitudinal Study

    PubMed Central

    Bagust, Jeff; Docherty, Sharon; Haynes, Wayne; Telford, Richard; Isableu, Brice

    2013-01-01

    The Rod and Frame Test has been used to assess the degree to which subjects rely on the visual frame of reference to perceive vertical (visual field dependence- independence perceptual style). Early investigations found children exhibited a wide range of alignment errors, which reduced as they matured. These studies used a mechanical Rod and Frame system, and presented only mean values of grouped data. The current study also considered changes in individual performance. Changes in rod alignment accuracy in 419 school children were measured using a computer-based Rod and Frame test. Each child was tested at school Grade 2 and retested in Grades 4 and 6. The results confirmed that children displayed a wide range of alignment errors, which decreased with age but did not reach the expected adult values. Although most children showed a decrease in frame dependency over the 4 years of the study, almost 20% had increased alignment errors suggesting that they were becoming more frame-dependent. Plots of individual variation (SD) against mean error allowed the sample to be divided into 4 groups; the majority with small errors and SDs; a group with small SDs, but alignments clustering around the frame angle of 18°; a group showing large errors in the opposite direction to the frame tilt; and a small number with large SDs whose alignment appeared to be random. The errors in the last 3 groups could largely be explained by alignment of the rod to different aspects of the frame. At corresponding ages females exhibited larger alignment errors than males although this did not reach statistical significance. This study confirms that children rely more heavily on the visual frame of reference for processing spatial orientation cues. Most become less frame-dependent as they mature, but there are considerable individual differences. PMID:23724139

  10. Radar error statistics for the space shuttle

    NASA Technical Reports Server (NTRS)

    Lear, W. M.

    1979-01-01

    Radar error statistics of C-band and S-band that are recommended for use with the groundtracking programs to process space shuttle tracking data are presented. The statistics are divided into two parts: bias error statistics, using the subscript B, and high frequency error statistics, using the subscript q. Bias errors may be slowly varying to constant. High frequency random errors (noise) are rapidly varying and may or may not be correlated from sample to sample. Bias errors were mainly due to hardware defects and to errors in correction for atmospheric refraction effects. High frequency noise was mainly due to hardware and due to atmospheric scintillation. Three types of atmospheric scintillation were identified: horizontal, vertical, and line of sight. This was the first time that horizontal and line of sight scintillations were identified.

  11. Impact of Sample Size and Variability on the Power and Type I Error Rates of Equivalence Tests: A Simulation Study

    ERIC Educational Resources Information Center

    Rusticus, Shayna A.; Lovato, Chris Y.

    2014-01-01

    The question of equivalence between two or more groups is frequently of interest to many applied researchers. Equivalence testing is a statistical method designed to provide evidence that groups are comparable by demonstrating that the mean differences found between groups are small enough that they are considered practically unimportant. Few…

  12. Measurement uncertainty and feasibility study of a flush airdata system for a hypersonic flight experiment

    NASA Technical Reports Server (NTRS)

    Whitmore, Stephen A.; Moes, Timothy R.

    1994-01-01

    Presented is a feasibility and error analysis for a hypersonic flush airdata system on a hypersonic flight experiment (HYFLITE). HYFLITE heating loads make intrusive airdata measurement impractical. Although this analysis is specifically for the HYFLITE vehicle and trajectory, the problems analyzed are generally applicable to hypersonic vehicles. A layout of the flush-port matrix is shown. Surface pressures are related to airdata parameters using a simple aerodynamic model. The model is linearized using small perturbations and inverted using nonlinear least-squares. Effects of various error sources on the overall uncertainty are evaluated using an error simulation. Error sources modeled include boundary-layer/viscous interactions, pneumatic lag, thermal transpiration in the sensor pressure tubing, misalignment in the matrix layout, thermal warping of the vehicle nose, sampling resolution, and transducer error. Using simulated pressure data for input to the estimation algorithm, effects caused by various error sources are analyzed by comparing estimator outputs with the original trajectory. To obtain ensemble averages, the simulation is run repeatedly and output statistics are compiled. Output errors resulting from the various error sources are presented as a function of Mach number. Final uncertainties with all modeled error sources included are presented as a function of Mach number.

  13. Statistics for X-chromosome associations.

    PubMed

    Özbek, Umut; Lin, Hui-Min; Lin, Yan; Weeks, Daniel E; Chen, Wei; Shaffer, John R; Purcell, Shaun M; Feingold, Eleanor

    2018-06-13

    In a genome-wide association study (GWAS), association between genotype and phenotype at autosomal loci is generally tested by regression models. However, X-chromosome data are often excluded from published analyses of autosomes because of the difference between males and females in number of X chromosomes. Failure to analyze X-chromosome data at all is obviously less than ideal, and can lead to missed discoveries. Even when X-chromosome data are included, they are often analyzed with suboptimal statistics. Several mathematically sensible statistics for X-chromosome association have been proposed. The optimality of these statistics, however, is based on very specific simple genetic models. In addition, while previous simulation studies of these statistics have been informative, they have focused on single-marker tests and have not considered the types of error that occur even under the null hypothesis when the entire X chromosome is scanned. In this study, we comprehensively tested several X-chromosome association statistics using simulation studies that include the entire chromosome. We also considered a wide range of trait models for sex differences and phenotypic effects of X inactivation. We found that models that do not incorporate a sex effect can have large type I error in some cases. We also found that many of the best statistics perform well even when there are modest deviations, such as trait variance differences between the sexes or small sex differences in allele frequencies, from assumptions. © 2018 WILEY PERIODICALS, INC.

  14. Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

    DOE PAGES

    Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

    2017-07-14

    A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (e.g., random forests and LASSO) to map a large set of inexpensively computed “error indicators” (i.e., features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed by simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (e.g., time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.
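
    A minimal sketch of the locality-plus-local-regression idea on synthetic data: cluster the error-indicator features, fit one random forest per cluster to predict the surrogate-model error, and use the prediction to correct a surrogate QoI (use 1 above). Feature and error values are placeholders, not the paper's flow-simulation data.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    features = rng.standard_normal((5000, 10))      # inexpensive "error indicators"
    error = np.sin(features[:, 0]) + 0.1 * rng.standard_normal(5000)  # computed QoI error

    # Determine regression-model locality by clustering, then fit one local model per cluster.
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
    local_model = {
        k: RandomForestRegressor(n_estimators=200, random_state=0).fit(
            features[km.labels_ == k], error[km.labels_ == k])
        for k in range(3)
    }

    def predict_error(feature_vec):
        k = int(km.predict(feature_vec.reshape(1, -1))[0])
        return local_model[k].predict(feature_vec.reshape(1, -1))[0]

    # Use 1: correct a surrogate QoI prediction at a time instance.
    surrogate_qoi = 1.23                            # hypothetical surrogate prediction
    corrected = surrogate_qoi + predict_error(rng.standard_normal(10))
    print(corrected)
    ```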

  15. Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

    A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (e.g., random forests and LASSO) to map a large set of inexpensively computed “error indicators” (i.e., features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed by simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (e.g., time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.

  16. Peripheral refraction and image blur in four meridians in emmetropes and myopes.

    PubMed

    Shen, Jie; Spors, Frank; Egan, Donald; Liu, Chunming

    2018-01-01

    The peripheral refractive error of the human eye has been hypothesized to be a major stimulus for the development of its central refractive error. The purpose of this study was to investigate the changes in the peripheral refractive error across horizontal, vertical and two diagonal meridians in emmetropic and low, moderate and high myopic adults. Thirty-four adult subjects were recruited and aberrations were measured using a modified commercial aberrometer. We then computed the refractive error in power vector notation from second-order Zernike terms. Statistical analysis was performed to evaluate the statistical differences in refractive error profiles between the subject groups and across all measured visual field meridians. Small amounts of relative myopic shift were observed in emmetropic and low myopic subjects. However, moderate and high myopic subjects exhibited a relative hyperopic shift in all four meridians. The astigmatic components J0 and J45 showed quadratic or linear changes depending on the visual field meridian. Peripheral sphero-cylindrical retinal image blur increased in emmetropic eyes in most of the measured visual fields. The findings indicate an overall emmetropic or slightly relative myopic periphery (spherical or oblate retinal shape) formed in emmetropes and low myopes, while moderate and high myopes form a relative hyperopic periphery (prolate, or less oblate, retinal shape). In general, human emmetropic eyes demonstrate a higher amount of peripheral retinal image blur.
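
    A hedged sketch of computing power vectors from second-order Zernike coefficients, assuming the standard conversion (coefficients in micrometres, pupil radius in millimetres, output in dioptres); the authors' exact pipeline may differ.

    ```python
    import numpy as np

    def power_vectors(c2m2, c20, c22, r_mm):
        """Sphere-equivalent M and astigmatic components J0, J45 from Zernike terms."""
        M   = -c20  * 4 * np.sqrt(3) / r_mm**2
        J0  = -c22  * 2 * np.sqrt(6) / r_mm**2
        J45 = -c2m2 * 2 * np.sqrt(6) / r_mm**2
        return M, J0, J45

    # Example coefficients (micrometres) over a 5 mm pupil (radius 2.5 mm).
    print(power_vectors(0.05, -1.2, 0.10, 2.5))
    ```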

  17. Trans-dimensional inversion of microtremor array dispersion data with hierarchical autoregressive error models

    NASA Astrophysics Data System (ADS)

    Dettmer, Jan; Molnar, Sheri; Steininger, Gavin; Dosso, Stan E.; Cassidy, John F.

    2012-02-01

    This paper applies a general trans-dimensional Bayesian inference methodology and hierarchical autoregressive data-error models to the inversion of microtremor array dispersion data for shear wave velocity (vs) structure. This approach accounts for the limited knowledge of the optimal earth model parametrization (e.g. the number of layers in the vs profile) and of the data-error statistics in the resulting vs parameter uncertainty estimates. The assumed earth model parametrization influences estimates of parameter values and uncertainties due to different parametrizations leading to different ranges of data predictions. The support of the data for a particular model is often non-unique and several parametrizations may be supported. A trans-dimensional formulation accounts for this non-uniqueness by including a model-indexing parameter as an unknown so that groups of models (identified by the indexing parameter) are considered in the results. The earth model is parametrized in terms of a partition model with interfaces given over a depth-range of interest. In this work, the number of interfaces (layers) in the partition model represents the trans-dimensional model indexing. In addition, serial data-error correlations are addressed by augmenting the geophysical forward model with a hierarchical autoregressive error model that can account for a wide range of error processes with a small number of parameters. Hence, the limited knowledge about the true statistical distribution of data errors is also accounted for in the earth model parameter estimates, resulting in more realistic uncertainties and parameter values. Hierarchical autoregressive error models do not rely on point estimates of the model vector to estimate data-error statistics, and have no requirement for computing the inverse or determinant of a data-error covariance matrix. This approach is particularly useful for trans-dimensional inverse problems, as point estimates may not be representative of the state space that spans multiple subspaces of different dimensionalities. The order of the autoregressive process required to fit the data is determined here by posterior residual-sample examination and statistical tests. Inference for earth model parameters is carried out on the trans-dimensional posterior probability distribution by considering ensembles of parameter vectors. In particular, vs uncertainty estimates are obtained by marginalizing the trans-dimensional posterior distribution in terms of vs-profile marginal distributions. The methodology is applied to microtremor array dispersion data collected at two sites with significantly different geology in British Columbia, Canada. At both sites, results show excellent agreement with estimates from invasive measurements.
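
    A minimal sketch of the autoregressive data-error idea: the residual is whitened with an AR(1) coefficient treated as an unknown, so no data-error covariance matrix has to be built or inverted. The likelihood form and parameter values are illustrative, not the paper's full hierarchical model.

    ```python
    import numpy as np

    def log_likelihood_ar1(residual, a, sigma):
        """Gaussian log-likelihood of residuals under an AR(1) error process."""
        innov = residual[1:] - a * residual[:-1]     # whitened innovations
        n = innov.size
        return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(innov**2) / sigma**2

    # Correlated synthetic residuals favour a > 0 over an uncorrelated error model.
    rng = np.random.default_rng(0)
    e = np.zeros(200)
    for t in range(1, 200):
        e[t] = 0.7 * e[t - 1] + 0.1 * rng.standard_normal()
    print(log_likelihood_ar1(e, 0.7, 0.1) > log_likelihood_ar1(e, 0.0, 0.1))  # True
    ```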

  18. Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain

    PubMed Central

    Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young

    2010-01-01

    Background Statistical analysis is essential in regard to obtaining objective reliability for medical research. However, medical researchers do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention to improve the statistical quality of the journal. Methods All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only in applying statistical procedures but also in the reviewing process to improve the value of the article. PMID:20552071

  19. Can Family Planning Service Statistics Be Used to Track Population-Level Outcomes?

    PubMed

    Magnani, Robert J; Ross, John; Williamson, Jessica; Weinberger, Michelle

    2018-03-21

    The need for annual family planning program tracking data under the Family Planning 2020 (FP2020) initiative has contributed to renewed interest in family planning service statistics as a potential data source for annual estimates of the modern contraceptive prevalence rate (mCPR). We sought to assess (1) how well a set of commonly recorded data elements in routine service statistics systems could, with some fairly simple adjustments, track key population-level outcome indicators, and (2) whether some data elements performed better than others. We used data from 22 countries in Africa and Asia to analyze 3 data elements collected from service statistics: (1) number of contraceptive commodities distributed to clients, (2) number of family planning service visits, and (3) number of current contraceptive users. Data quality was assessed via analysis of mean square errors, using the United Nations Population Division World Contraceptive Use annual mCPR estimates as the "gold standard." We also examined the magnitude of several components of measurement error: (1) variance, (2) level bias, and (3) slope (or trend) bias. Our results indicate modest levels of tracking error for data on commodities to clients (7%) and service visits (10%), and somewhat higher error rates for data on current users (19%). Variance and slope bias were relatively small for all data elements. Level bias was by far the largest contributor to tracking error. Paired comparisons of data elements in countries that collected at least 2 of the 3 data elements indicated a modest advantage of data on commodities to clients. None of the data elements considered was sufficiently accurate to be used to produce reliable stand-alone annual estimates of mCPR. However, the relatively low levels of variance and slope bias indicate that trends calculated from these 3 data elements can be productively used in conjunction with the Family Planning Estimation Tool (FPET) currently used to produce annual mCPR tracking estimates for FP2020. © Magnani et al.
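
    A hedged sketch of splitting tracking error into level bias, slope (trend) bias, and variance by regressing the residuals against time; this mirrors the decomposition described above but is not the authors' exact mean-square-error formulation, and the numbers are invented.

    ```python
    import numpy as np

    def decompose_tracking_error(estimate, gold, years):
        resid = np.asarray(estimate) - np.asarray(gold)
        t = np.asarray(years) - np.mean(years)
        slope, level = np.polyfit(t, resid, 1)       # trend and mean of the residuals
        variance = np.var(resid - (level + slope * t))
        return {"level_bias": level, "slope_bias": slope, "variance": variance}

    years = np.arange(2012, 2018)
    gold = np.array([20.0, 21.0, 22.1, 23.0, 24.2, 25.1])   # "gold standard" mCPR (%)
    est  = np.array([22.5, 23.4, 24.8, 25.4, 26.9, 27.8])   # service-statistics estimate
    print(decompose_tracking_error(est, gold, years))
    ```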

  20. Parameter Estimation in Astronomy with Poisson-Distributed Data. I. The χ²_γ Statistic

    NASA Technical Reports Server (NTRS)

    Mighell, Kenneth J.

    1999-01-01

    Applying the standard weighted mean formula, [Σ_i n_i σ_i^{-2}] / [Σ_i σ_i^{-2}], to determine the weighted mean of data, n_i, drawn from a Poisson distribution will, on average, underestimate the true mean by approximately 1 for all true mean values larger than about 3 when the common assumption is made that the error of the i-th observation is σ_i = max(√n_i, 1). This small but statistically significant offset explains the long-known observation that chi-square minimization techniques which use the modified Neyman's χ² statistic, χ²_N ≡ Σ_i (n_i − y_i)² / max(n_i, 1), to compare Poisson-distributed data with model values, y_i, will typically predict a total number of counts that underestimates the true total by about 1 count per bin. Based on my finding that the weighted mean of data drawn from a Poisson distribution can be determined using the formula [Σ_i (n_i + min(n_i, 1))(n_i + 1)^{-1}] / [Σ_i (n_i + 1)^{-1}], I propose that a new χ² statistic, χ²_γ ≡ Σ_i (n_i + min(n_i, 1) − y_i)² / (n_i + 1), should always be used to analyze Poisson-distributed data in preference to the modified Neyman's χ² statistic. I demonstrate the power and usefulness of χ²_γ minimization by using two statistical fitting techniques and five χ² statistics to analyze simulated X-ray power-law 15-channel spectra with large and small counts per bin. I show that χ²_γ minimization with the Levenberg-Marquardt or Powell's method can produce excellent results (mean slope errors of less than about 3%) with spectra having as few as 25 total counts.
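
    A worked sketch of the two statistics for Poisson-distributed data with a constant model value y, showing the downward bias of the modified Neyman statistic and its removal by χ²_γ.

    ```python
    import numpy as np

    def chi2_neyman(n, y):
        return np.sum((n - y) ** 2 / np.maximum(n, 1))

    def chi2_gamma(n, y):
        return np.sum((n + np.minimum(n, 1) - y) ** 2 / (n + 1))

    rng = np.random.default_rng(0)
    true_mean = 5.0
    n = rng.poisson(true_mean, size=100_000)
    grid = np.linspace(3.5, 6.5, 601)

    best_N = grid[np.argmin([chi2_neyman(n, y) for y in grid])]
    best_g = grid[np.argmin([chi2_gamma(n, y) for y in grid])]
    print(best_N, best_g)   # the chi2_N minimum sits well below 5, chi2_gamma near 5
    ```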

  1. Calculating the free energy of transfer of small solutes into a model lipid membrane: Comparison between metadynamics and umbrella sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bochicchio, Davide; Panizon, Emanuele; Ferrando, Riccardo

    2015-10-14

    We compare the performance of two well-established computational algorithms for the calculation of free-energy landscapes of biomolecular systems, umbrella sampling and metadynamics. We look at benchmark systems composed of polyethylene and polypropylene oligomers interacting with lipid (phosphatidylcholine) membranes, aiming at the calculation of the oligomer water-membrane free energy of transfer. We model our test systems at two different levels of description, united-atom and coarse-grained. We provide optimized parameters for the two methods at both resolutions. We devote special attention to the analysis of statistical errors in the two different methods and propose a general procedure for the error estimation in metadynamics simulations. Metadynamics and umbrella sampling yield the same estimates for the water-membrane free energy profile, but metadynamics can be more efficient, providing lower statistical uncertainties within the same simulation time.

  2. Putting Meaning Back Into the Mean: A Comment on the Misuse of Elementary Statistics in a Sample of Manuscripts Submitted to Clinical Therapeutics.

    PubMed

    Forrester, Janet E

    2015-12-01

    Errors in the statistical presentation and analyses of data in the medical literature remain common despite efforts to improve the review process, including the creation of guidelines for authors and the use of statistical reviewers. This article discusses common elementary statistical errors seen in manuscripts recently submitted to Clinical Therapeutics and describes some ways in which authors and reviewers can identify errors and thus correct them before publication. A nonsystematic sample of manuscripts submitted to Clinical Therapeutics over the past year was examined for elementary statistical errors. Clinical Therapeutics has many of the same errors that reportedly exist in other journals. Authors require additional guidance to avoid elementary statistical errors and incentives to use the guidance. Implementation of reporting guidelines for authors and reviewers by journals such as Clinical Therapeutics may be a good approach to reduce the rate of statistical errors. Copyright © 2015 Elsevier HS Journals, Inc. All rights reserved.

  3. BATMAN: Bayesian Technique for Multi-image Analysis

    NASA Astrophysics Data System (ADS)

    Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.

    2017-04-01

    This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical data set containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real integral-field spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in the presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of 2. Our analysis reveals that the algorithm prioritizes conservation of all the statistically significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BaTMAn is not to be used as a 'black box' to improve the signal-to-noise ratio, but as a new approach to characterize spatially resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.
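
    A minimal sketch of the merging criterion on two scalar elements: merge when the values agree within a chosen number of combined sigmas, and combine them by inverse-variance weighting. BaTMAn's actual Bayesian consistency test is richer than this threshold rule.

    ```python
    import numpy as np

    def consistent(value1, err1, value2, err2, n_sigma=2.0):
        """True if the two measurements agree within n_sigma combined errors."""
        return abs(value1 - value2) <= n_sigma * np.hypot(err1, err2)

    def merge(value1, err1, value2, err2):
        """Inverse-variance weighted combination of two consistent elements."""
        w1, w2 = 1.0 / err1**2, 1.0 / err2**2
        value = (w1 * value1 + w2 * value2) / (w1 + w2)
        return value, 1.0 / np.sqrt(w1 + w2)

    print(consistent(10.0, 1.0, 11.5, 1.0))   # True  -> merge these elements
    print(merge(10.0, 1.0, 11.5, 1.0))        # (10.75, ~0.71)
    ```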

  4. On the Calculation of Uncertainty Statistics with Error Bounds for CFD Calculations Containing Random Parameters and Fields

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2016-01-01

    This chapter discusses the ongoing development of combined uncertainty and error bound estimates for computational fluid dynamics (CFD) calculations subject to imposed random parameters and random fields. An objective of this work is the construction of computable error bound formulas for output uncertainty statistics that guide CFD practitioners in systematically determining how accurately CFD realizations should be approximated and how accurately uncertainty statistics should be approximated for output quantities of interest. Formal error bounds formulas for moment statistics that properly account for the presence of numerical errors in CFD calculations and numerical quadrature errors in the calculation of moment statistics have been previously presented in [8]. In this past work, hierarchical node-nested dense and sparse tensor product quadratures are used to calculate moment statistics integrals. In the present work, a framework has been developed that exploits the hierarchical structure of these quadratures in order to simplify the calculation of an estimate of the quadrature error needed in error bound formulas. When signed estimates of realization error are available, this signed error may also be used to estimate output quantity of interest probability densities as a means to assess the impact of realization error on these density estimates. Numerical results are presented for CFD problems with uncertainty to demonstrate the capabilities of this framework.

  5. A Study on Mutil-Scale Background Error Covariances in 3D-Var Data Assimilation

    NASA Astrophysics Data System (ADS)

    Zhang, Xubin; Tan, Zhe-Min

    2017-04-01

    The construction of background error covariances is a key component of three-dimensional variational data assimilation. In numerical weather prediction there are background errors at different scales, and interactions among them. However, the influence of these errors and their interactions cannot be represented in the background error covariance statistics when estimated by the leading methods. So, it is necessary to construct background error covariances influenced by multi-scale interactions among errors. With the NMC method, this article first estimates the background error covariances at given model-resolution scales. The information of errors whose scales are larger or smaller than the given ones is then introduced, using different nesting techniques, to estimate the corresponding covariances. The comparisons of three background error covariance statistics influenced by information of errors at different scales reveal that the background error variances increase, particularly at large scales and higher levels, when the information of larger-scale errors is introduced through the lateral boundary condition provided by a lower-resolution model. On the other hand, the variances decrease at medium scales at the higher levels, while they show slight improvement at lower levels in the nested domain, especially at medium and small scales, when the information of smaller-scale errors is introduced by nesting a higher-resolution model. In addition, the introduction of information of larger- (smaller-) scale errors leads to larger (smaller) horizontal and vertical correlation scales of background errors. Considering the multivariate correlations, the Ekman coupling increases (decreases) with the information of larger- (smaller-) scale errors included, whereas the geostrophic coupling in the free atmosphere weakens in both situations. The three covariances obtained above are used in a data assimilation and model forecast system, and analysis-forecast cycles for a period of 1 month are conducted. Through the comparison of both analyses and forecasts from this system, it is found that the trends for variation in analysis increments with information of different scale errors introduced are consistent with those for variation in variances and correlations of background errors. In particular, introduction of smaller-scale errors leads to larger-amplitude analysis increments for winds at medium scales at the heights of both the high- and low-level jets. Analysis increments for both temperature and humidity are greater at the corresponding scales at middle and upper levels under this circumstance. These analysis increments improve the intensity of the jet-convection system, which includes jets at different levels and the coupling between them associated with latent heat release, and these changes in the analyses contribute to better forecasts for winds and temperature in the corresponding areas. When smaller-scale errors are included, analysis increments for humidity increase significantly at large scales at lower levels to moisten the southern analyses. This humidification helps correct the dry bias there and eventually improves the forecast skill for humidity. Moreover, inclusion of larger- (smaller-) scale errors is beneficial for the forecast quality of heavy (light) precipitation at large (small) scales, due to the amplification (diminution) of intensity and area in precipitation forecasts, but tends to overestimate (underestimate) light (heavy) precipitation.
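
    A minimal sketch of the NMC-style estimate referred to above: background error covariances are approximated from differences between forecasts of different lead times valid at the same time. The scaling factor and the synthetic forecasts are placeholders.

    ```python
    import numpy as np

    def nmc_covariance(forecast_48h, forecast_24h, scale=1.0):
        """forecast_*: arrays of shape (n_samples, n_state) valid at the same times."""
        diff = forecast_48h - forecast_24h
        diff = diff - diff.mean(axis=0)
        return scale * (diff.T @ diff) / (diff.shape[0] - 1)

    rng = np.random.default_rng(0)
    f24 = rng.standard_normal((200, 50))
    f48 = f24 + 0.3 * rng.standard_normal((200, 50))    # synthetic lagged forecasts
    B = nmc_covariance(f48, f24)
    print(B.shape, np.mean(np.diag(B)))                  # ~0.09 background variance
    ```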

  6. Evaluation of Methods Used for Estimating Selected Streamflow Statistics, and Flood Frequency and Magnitude, for Small Basins in North Coastal California

    USGS Publications Warehouse

    Mann, Michael P.; Rizzardo, Jule; Satkowski, Richard

    2004-01-01

    Accurate streamflow statistics are essential to water resource agencies involved in both science and decision-making. When long-term streamflow data are lacking at a site, estimation techniques are often employed to generate streamflow statistics. However, procedures for accurately estimating streamflow statistics often are lacking. When estimation procedures are developed, they often are not evaluated properly before being applied. Use of unevaluated or underevaluated flow-statistic estimation techniques can result in improper water-resources decision-making. The California State Water Resources Control Board (SWRCB) uses two key techniques, a modified rational equation and drainage basin area-ratio transfer, to estimate streamflow statistics at ungaged locations. These techniques have been implemented to varying degrees, but have not been formally evaluated. For estimating peak flows at the 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals, the SWRCB uses the U.S. Geological Survey's (USGS) regional peak-flow equations. In this study, done cooperatively by the USGS and SWRCB, the SWRCB estimated several flow statistics at 40 USGS streamflow gaging stations in the north coast region of California. The SWRCB estimates were made without reference to USGS flow data. The USGS used the streamflow data provided by the 40 stations to generate flow statistics that could be compared with SWRCB estimates for accuracy. While some SWRCB estimates compared favorably with USGS statistics, results were subject to varying degrees of error over the region. Flow-based estimation techniques generally performed better than rain-based methods, especially for estimation of December 15 to March 31 mean daily flows. The USGS peak-flow equations also performed well, but tended to underestimate peak flows. The USGS equations performed within reported error bounds, but will require updating in the future as peak-flow data sets grow larger. Little correlation was discovered between estimation errors and geographic locations or various basin characteristics. However, for 25-percentile year mean-daily-flow estimates for December 15 to March 31, the greatest estimation errors were at east San Francisco Bay area stations with mean annual precipitation less than or equal to 30 inches, and estimated 2-year/24-hour rainfall intensity less than 3 inches.
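
    A hedged sketch of the drainage-basin area-ratio transfer mentioned above: flow at an ungaged site is scaled from a gaged site by the ratio of drainage areas raised to an exponent (often taken near 1; the exponent and the numbers here are illustrative).

    ```python
    def area_ratio_transfer(q_gaged, area_gaged_mi2, area_ungaged_mi2, exponent=1.0):
        """Transfer a flow statistic from a gaged basin to a nearby ungaged basin."""
        return q_gaged * (area_ungaged_mi2 / area_gaged_mi2) ** exponent

    # e.g. a 2-year peak flow of 1200 cfs at a 50 mi^2 gaged basin, transferred
    # to a 20 mi^2 ungaged basin:
    print(area_ratio_transfer(1200.0, 50.0, 20.0))   # 480 cfs with exponent 1
    ```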

  7. Qualitative fusion technique based on information poor system and its application to factor analysis for vibration of rolling bearings

    NASA Astrophysics Data System (ADS)

    Xia, Xintao; Wang, Zhongyu

    2008-10-01

    For some methods of stability analysis of a system using statistics, it is difficult to resolve the problems of unknown probability distribution and small sample. Therefore, a novel method is proposed in this paper to resolve these problems. This method is independent of probability distribution, and is useful for small sample systems. After rearrangement of the original data series, the order difference and two polynomial membership functions are introduced to estimate the true value, the lower bound and the upper bound of the system using fuzzy-set theory. The empirical distribution function is then investigated to ensure a confidence level above 95%, and the degree of similarity is presented to evaluate the stability of the system. Cases of computer simulation investigate stable systems with various probability distributions, unstable systems with linear systematic errors and periodic systematic errors, and some mixed systems. The method of analysis for systematic stability is thus validated.

  8. Measurement time and statistics for a noise thermometer with a synthetic-noise reference

    NASA Astrophysics Data System (ADS)

    White, D. R.; Benz, S. P.; Labenski, J. R.; Nam, S. W.; Qu, J. F.; Rogalla, H.; Tew, W. L.

    2008-08-01

    This paper describes methods for reducing the statistical uncertainty in measurements made by noise thermometers using digital cross-correlators and, in particular, for thermometers using pseudo-random noise for the reference signal. First, a discrete-frequency expression for the correlation bandwidth for conventional noise thermometers is derived. It is shown how an alternative frequency-domain computation can be used to eliminate the spectral response of the correlator and increase the correlation bandwidth. The corresponding expressions for the uncertainty in the measurement of pseudo-random noise in the presence of uncorrelated thermal noise are then derived. The measurement uncertainty in this case is less than that for true thermal-noise measurements. For pseudo-random sources generating a frequency comb, an additional small reduction in uncertainty is possible, but at the cost of increasing the thermometer's sensitivity to non-linearity errors. A procedure is described for allocating integration times to further reduce the total uncertainty in temperature measurements. Finally, an important systematic error arising from the calculation of ratios of statistical variables is described.

  9. A Monte Carlo Study of Levene's Test of Homogeneity of Variance: Empirical Frequencies of Type I Error in Normal Distributions.

    ERIC Educational Resources Information Center

    Neel, John H.; Stallings, William M.

    An influential statistics text recommends a Levene test for homogeneity of variance. A recent note suggests that Levene's test is upwardly biased for small samples. Another report shows inflated alpha estimates and low power. Neither study utilized more than two sample sizes. This Monte Carlo study involved sampling from a normal population for…
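
    A minimal sketch of this kind of Monte Carlo check: sample equal-variance normal groups repeatedly, apply Levene's test, and compare the empirical rejection rate with the nominal alpha. Group sizes and replicate counts are illustrative.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha, n_reps, n_per_group = 0.05, 5000, 10
    rejections = 0
    for _ in range(n_reps):
        g1 = rng.standard_normal(n_per_group)   # equal variances, so H0 is true
        g2 = rng.standard_normal(n_per_group)
        if stats.levene(g1, g2).pvalue < alpha:
            rejections += 1
    print(f"empirical type I error: {rejections / n_reps:.3f}")   # compare with 0.05
    ```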

  10. Errors in statistical decision making Chapter 2 in Applied Statistics in Agricultural, Biological, and Environmental Sciences

    USDA-ARS?s Scientific Manuscript database

    Agronomic and Environmental research experiments result in data that are analyzed using statistical methods. These data are unavoidably accompanied by uncertainty. Decisions about hypotheses, based on statistical analyses of these data are therefore subject to error. This error is of three types,...

  11. Combined Uncertainty and A-Posteriori Error Bound Estimates for General CFD Calculations: Theory and Software Implementation

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2014-01-01

    This workshop presentation discusses the design and implementation of numerical methods for the quantification of statistical uncertainty, including a-posteriori error bounds, for output quantities computed using CFD methods. Hydrodynamic realizations often contain numerical error arising from finite-dimensional approximation (e.g. numerical methods using grids, basis functions, particles) and statistical uncertainty arising from incomplete information and/or statistical characterization of model parameters and random fields. The first task at hand is to derive formal error bounds for statistics given realizations containing finite-dimensional numerical error [1]. The error in computed output statistics contains contributions from both realization error and the error resulting from the calculation of statistics integrals using a numerical method. A second task is to devise computable a-posteriori error bounds by numerically approximating all terms arising in the error bound estimates. For the same reason that CFD calculations including error bounds but omitting uncertainty modeling are only of limited value, CFD calculations including uncertainty modeling but omitting error bounds are only of limited value. To gain maximum value from CFD calculations, a general software package for uncertainty quantification with quantified error bounds has been developed at NASA. The package provides implementations for a suite of numerical methods used in uncertainty quantification: dense tensorization basis methods [3] and a subscale recovery variant [1] for non-smooth data, sparse tensorization methods [2] utilizing node-nested hierarchies, and sampling methods [4] for high-dimensional random variable spaces.

  12. Effect of cephalometer misalignment on calculations of facial asymmetry.

    PubMed

    Lee, Ki-Heon; Hwang, Hyeon-Shik; Curry, Sean; Boyd, Robert L; Norris, Kevin; Baumrind, Sheldon

    2007-07-01

    In this study, we evaluated errors introduced into the interpretation of facial asymmetry on posteroanterior (PA) cephalograms due to malpositioning of the x-ray emitter focal spot. We tested the hypothesis that horizontal displacements of the emitter from its ideal position would produce systematic displacements of skull landmarks that could be fully accounted for by the rules of projective geometry alone. A representative dry skull with 22 metal markers was used to generate a series of PA images from different emitter positions by using a fully calibrated stereo cephalometer. Empirical measurements of the resulting cephalograms were compared with mathematical predictions based solely on geometric rules. The empirical measurements matched the mathematical predictions within the limits of measurement error (x̄ = 0.23 mm), thus supporting the hypothesis. Based upon this finding, we generated a completely symmetrical mathematical skull and calculated the expected errors for focal spots of several different magnitudes. Quantitative data were computed for focal spot displacements of different magnitudes. Misalignment of the x-ray emitter focal spot introduces systematic errors into the interpretation of facial asymmetry on PA cephalograms. For misalignments of less than 20 mm, the effect is small in individual cases. However, misalignments as small as 10 mm can introduce spurious statistical findings of significant asymmetry when mean values for large groups of PA images are evaluated.

  13. Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

    PubMed

    Sztepanacz, Jacqueline L; Blows, Mark W

    2017-07-01

    The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices has been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not to be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error as genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine if the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.
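
    A hedged sketch of the scaling-and-centering step for a white Wishart matrix, using Johnstone-style constants and an approximate TW1 critical value quoted from memory (~0.98 at the 5% level); treat both as assumptions rather than exact values.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    p, n = 20, 200
    X = rng.standard_normal((p, n))
    l1 = np.linalg.eigvalsh(X @ X.T).max()          # leading eigenvalue, null (white) data

    # Assumed centering/scaling constants for the largest eigenvalue of X X^T.
    mu = (np.sqrt(n - 1) + np.sqrt(p)) ** 2
    sigma = (np.sqrt(n - 1) + np.sqrt(p)) * (1 / np.sqrt(n - 1) + 1 / np.sqrt(p)) ** (1 / 3)
    tw_stat = (l1 - mu) / sigma
    print(tw_stat, tw_stat > 0.98)   # exceeding ~0.98 would suggest more than sampling error
    ```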

  14. Accounting for model error in Bayesian solutions to hydrogeophysical inverse problems using a local basis approach

    NASA Astrophysics Data System (ADS)

    Irving, J.; Koepke, C.; Elsheikh, A. H.

    2017-12-01

    Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward process model linking subsurface parameters to measured data, which is typically assumed to be known perfectly in the inversion procedure. However, in order to make the stochastic solution of the inverse problem computationally tractable using, for example, Markov-chain-Monte-Carlo (MCMC) methods, fast approximations of the forward model are commonly employed. This introduces model error into the problem, which has the potential to significantly bias posterior statistics and hamper data integration efforts if not properly accounted for. Here, we present a new methodology for addressing the issue of model error in Bayesian solutions to hydrogeophysical inverse problems that is geared towards the common case where these errors cannot be effectively characterized globally through some parametric statistical distribution or locally based on interpolation between a small number of computed realizations. Rather than focusing on the construction of a global or local error model, we instead work towards identification of the model-error component of the residual through a projection-based approach. In this regard, pairs of approximate and detailed model runs are stored in a dictionary that grows at a specified rate during the MCMC inversion procedure. At each iteration, a local model-error basis is constructed for the current test set of model parameters using the K-nearest neighbour entries in the dictionary, which is then used to separate the model error from the other error sources before computing the likelihood of the proposed set of model parameters. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar traveltime data for three different subsurface parameterizations of varying complexity. The synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed in the inversion procedure. In each case, the developed model-error approach enables us to remove posterior bias and obtain a more realistic characterization of uncertainty.
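
    A minimal sketch of the local-basis correction: take the K nearest dictionary entries in parameter space, build an orthonormal basis from their model-error snapshots, and project that component out of the residual before the likelihood evaluation. Array shapes and names are illustrative.

    ```python
    import numpy as np

    def model_error_corrected_residual(residual, params, dict_params, dict_errors, K=5):
        # K nearest neighbours in parameter space
        d = np.linalg.norm(dict_params - params, axis=1)
        nearest = np.argsort(d)[:K]
        # orthonormal basis spanning the local model-error snapshots
        Q, _ = np.linalg.qr(dict_errors[nearest].T)
        return residual - Q @ (Q.T @ residual)       # remove the model-error component

    rng = np.random.default_rng(0)
    dict_params = rng.standard_normal((50, 3))       # stored parameter vectors
    dict_errors = rng.standard_normal((50, 120))     # detailed-minus-approximate data
    resid = rng.standard_normal(120)
    corrected = model_error_corrected_residual(resid, rng.standard_normal(3),
                                               dict_params, dict_errors)
    print(np.linalg.norm(resid), np.linalg.norm(corrected))
    ```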

  15. Dealing with dietary measurement error in nutritional cohort studies.

    PubMed

    Freedman, Laurence S; Schatzkin, Arthur; Midthune, Douglas; Kipnis, Victor

    2011-07-20

    Dietary measurement error creates serious challenges to reliably discovering new diet-disease associations in nutritional cohort studies. Such error causes substantial underestimation of relative risks and reduction of statistical power for detecting associations. On the basis of data from the Observing Protein and Energy Nutrition Study, we recommend the following approaches to deal with these problems. Regarding data analysis of cohort studies using food-frequency questionnaires, we recommend 1) using energy adjustment for relative risk estimation; 2) reporting estimates adjusted for measurement error along with the usual relative risk estimates, whenever possible (this requires data from a relevant, preferably internal, validation study in which participants report intakes using both the main instrument and a more detailed reference instrument such as a 24-hour recall or multiple-day food record); 3) performing statistical adjustment of relative risks, based on such validation data, if they exist, using univariate (only for energy-adjusted intakes such as densities or residuals) or multivariate regression calibration. We note that whereas unadjusted relative risk estimates are biased toward the null value, statistical significance tests of unadjusted relative risk estimates are approximately valid. Regarding study design, we recommend increasing the sample size to remedy loss of power; however, it is important to understand that this will often be an incomplete solution because the attenuated signal may be too small to distinguish from unmeasured confounding in the model relating disease to reported intake. Future work should be devoted to alleviating the problem of signal attenuation, possibly through the use of improved self-report instruments or by combining dietary biomarkers with self-report instruments.
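
    A hedged sketch of univariate regression calibration: estimate the attenuation factor from validation data (questionnaire vs reference instrument) and divide the naive log relative risk by it. The data and the relative-risk value are synthetic.

    ```python
    import numpy as np

    def attenuation_factor(Q, R):
        """Slope from regressing the reference measure R on the questionnaire Q."""
        Q, R = np.asarray(Q), np.asarray(R)
        return np.cov(Q, R)[0, 1] / np.var(Q, ddof=1)

    rng = np.random.default_rng(0)
    truth = rng.standard_normal(500)
    R = truth + 0.3 * rng.standard_normal(500)     # reference instrument (e.g. 24-h recall)
    Q = truth + 1.0 * rng.standard_normal(500)     # food-frequency questionnaire

    lam = attenuation_factor(Q, R)
    beta_naive = 0.10                              # hypothetical log relative risk per unit Q
    print(lam, beta_naive / lam)                   # deattenuated estimate
    ```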

  16. The Effect of Photon Statistics and Pulse Shaping on the Performance of the Wiener Filter Crystal Identification Algorithm Applied to LabPET Phoswich Detectors

    NASA Astrophysics Data System (ADS)

    Yousefzadeh, Hoorvash Camilia; Lecomte, Roger; Fontaine, Réjean

    2012-06-01

    A fast Wiener filter-based crystal identification (WFCI) algorithm was recently developed to discriminate crystals with close scintillation decay times in phoswich detectors. Despite the promising performance of WFCI, the influence of various physical factors and electrical noise sources of the data acquisition chain (DAQ) on the crystal identification process was not fully investigated. This paper examines the effect of different noise sources, such as photon statistics, avalanche photodiode (APD) excess multiplication noise, and front-end electronic noise, as well as the influence of different shaping filters on the performance of the WFCI algorithm. To this end, a PET-like signal simulator based on a model of the LabPET DAQ, a small animal APD-based digital PET scanner, was developed. Simulated signals were generated under various noise conditions with CR-RC shapers of order 1, 3, and 5 having different time constants (τ). Applying the WFCI algorithm to these simulated signals showed that the non-stationary Poisson photon statistics is the main contributor to the identification error of WFCI algorithm. A shaping filter of order 1 with τ = 50 ns yielded the best WFCI performance (error 1%), while a longer shaping time of τ = 100 ns slightly degraded the WFCI performance (error 3%). Filters of higher orders with fast shaping time constants (10-33 ns) also produced good WFCI results (error 1.4% to 1.6%). This study shows the advantage of the pulse simulator in evaluating various DAQ conditions and confirms the influence of the detection chain on the WFCI performance.

  17. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
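
    A minimal sketch of converting the logistic-model probabilities to a yes/no forecast at a threshold and scoring with the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD, hit rate minus false-alarm rate), on synthetic data.

    ```python
    import numpy as np

    def pc_and_hkd(prob, observed, threshold):
        pred = prob >= threshold
        hits = np.sum(pred & observed)
        misses = np.sum(~pred & observed)
        false_alarms = np.sum(pred & ~observed)
        correct_neg = np.sum(~pred & ~observed)
        pc = (hits + correct_neg) / observed.size
        hkd = hits / (hits + misses) - false_alarms / (false_alarms + correct_neg)
        return pc, hkd

    rng = np.random.default_rng(0)
    prob = rng.uniform(0, 1, 5000)                      # model probability of occurrence
    observed = rng.uniform(0, 1, 5000) < prob           # synthetic "contrail occurred"
    print(pc_and_hkd(prob, observed, 0.5))              # threshold 0.5
    print(pc_and_hkd(prob, observed, observed.mean()))  # climatological threshold
    ```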

  18. Accounting for model error in Bayesian solutions to hydrogeophysical inverse problems using a local basis approach

    NASA Astrophysics Data System (ADS)

    Köpke, Corinna; Irving, James; Elsheikh, Ahmed H.

    2018-06-01

    Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward model linking subsurface physical properties to measured data, which is typically assumed to be perfectly known in the inversion procedure. However, to make the stochastic solution of the inverse problem computationally tractable using methods such as Markov-chain-Monte-Carlo (MCMC), fast approximations of the forward model are commonly employed. This gives rise to model error, which has the potential to significantly bias posterior statistics if not properly accounted for. Here, we present a new methodology for dealing with the model error arising from the use of approximate forward solvers in Bayesian solutions to hydrogeophysical inverse problems. Our approach is geared towards the common case where this error cannot be (i) effectively characterized through some parametric statistical distribution; or (ii) estimated by interpolating between a small number of computed model-error realizations. To this end, we focus on identification and removal of the model-error component of the residual during MCMC using a projection-based approach, whereby the orthogonal basis employed for the projection is derived in each iteration from the K-nearest-neighboring entries in a model-error dictionary. The latter is constructed during the inversion and grows at a specified rate as the iterations proceed. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar travel-time data considering three different subsurface parameterizations of varying complexity. Synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed for their inversion. In each case, our developed approach enables us to remove posterior bias and obtain a more realistic characterization of uncertainty.

  19. Statistics of the epoch of reionization 21-cm signal - I. Power spectrum error-covariance

    NASA Astrophysics Data System (ADS)

    Mondal, Rajesh; Bharadwaj, Somnath; Majumdar, Suman

    2016-02-01

    The non-Gaussian nature of the epoch of reionization (EoR) 21-cm signal has a significant impact on the error variance of its power spectrum P(k). We have used a large ensemble of seminumerical simulations and an analytical model to estimate the effect of this non-Gaussianity on the entire error-covariance matrix C_ij. Our analytical model shows that C_ij has contributions from two sources. One is the usual variance for a Gaussian random field, which scales inversely with the number of modes that go into the estimation of P(k). The other is the trispectrum of the signal. Using the simulated 21-cm Signal Ensemble, an ensemble of the Randomized Signal, and Ensembles of Gaussian Random Ensembles, we have quantified the effect of the trispectrum on the error variance C_ii. We find that its relative contribution is comparable to or larger than that of the Gaussian term for the k range 0.3 ≤ k ≤ 1.0 Mpc^-1, and can be even ~200 times larger at k ~ 5 Mpc^-1. We also establish that the off-diagonal terms of C_ij have statistically significant non-zero values which arise purely from the trispectrum. This further signifies that the errors in different k modes are not independent. We find a strong correlation between the errors at large k values (≥0.5 Mpc^-1), and a weak correlation between the smallest and largest k values. There is also a small anticorrelation between the errors in the smallest and intermediate k values. These results are relevant for the k range that will be probed by the current and upcoming EoR 21-cm experiments.

  20. Effect of the image resolution on the statistical descriptors of heterogeneous media.

    PubMed

    Ledesma-Alonso, René; Barbosa, Romeli; Ortegón, Jaime

    2018-02-01

    The characterization and reconstruction of heterogeneous materials, such as porous media and electrode materials, involve the application of image processing methods to data acquired by scanning electron microscopy or other microscopy techniques. Among them, binarization and decimation are critical in order to compute the correlation functions that characterize the microstructure of the above-mentioned materials. In this study, we present a theoretical analysis of the effects of the image-size reduction, due to the progressive and sequential decimation of the original image. Three different decimation procedures (random, bilinear, and bicubic) were implemented and their consequences on the discrete correlation functions (two-point, line-path, and pore-size distribution) and the coarseness (derived from the local volume fraction) are reported and analyzed. The chosen statistical descriptors (correlation functions and coarseness) are typically employed to characterize and reconstruct heterogeneous materials. A normalization for each of the correlation functions has been performed. When the loss of statistical information has not been significant for a decimated image, its normalized correlation function is forecast by the trend of the original image (reference function). In contrast, when the decimated image does not hold statistical evidence of the original one, the normalized correlation function diverts from the reference function. Moreover, the equally weighted sum of the average of the squared difference, between the discrete correlation functions of the decimated images and the reference functions, leads to a definition of an overall error. During the first stages of the gradual decimation, the error remains relatively small and independent of the decimation procedure. Above a threshold defined by the correlation length of the reference function, the error becomes a function of the number of decimation steps. At this stage, some statistical information is lost and the error becomes dependent on the decimation procedure. These results may help us to restrict the amount of information that one can afford to lose during a decimation process, in order to reduce the computational and memory cost, when one aims to diminish the time consumed by a characterization or reconstruction technique, yet maintaining the statistical quality of the digitized sample.
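
    A hedged sketch of estimating the two-point probability S2(r) of a binary image by sampling axis-aligned pixel pairs, and comparing the original image with a crude 2x decimation (note that a lag of r pixels in the decimated image corresponds to 2r pixels in the original).

    ```python
    import numpy as np

    def two_point(img, r, n_samples=20000, seed=0):
        """Monte Carlo estimate of S2(r) along the horizontal axis of a 0/1 image."""
        rng = np.random.default_rng(seed)
        H, W = img.shape
        y = rng.integers(0, H, n_samples)
        x = rng.integers(0, W - r, n_samples)
        return np.mean(img[y, x] & img[y, x + r])

    rng = np.random.default_rng(0)
    img = rng.random((512, 512)) < 0.4
    img = img & np.roll(img, 1, axis=1)          # introduce short-range correlation
    decimated = img[::2, ::2]                    # crude 2x decimation by striding

    for r in (1, 2, 4):
        print(r, two_point(img.astype(np.uint8), r),
              two_point(decimated.astype(np.uint8), r))
    ```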

  1. Effect of the image resolution on the statistical descriptors of heterogeneous media

    NASA Astrophysics Data System (ADS)

    Ledesma-Alonso, René; Barbosa, Romeli; Ortegón, Jaime

    2018-02-01

    The characterization and reconstruction of heterogeneous materials, such as porous media and electrode materials, involve the application of image processing methods to data acquired by scanning electron microscopy or other microscopy techniques. Among them, binarization and decimation are critical in order to compute the correlation functions that characterize the microstructure of the above-mentioned materials. In this study, we present a theoretical analysis of the effects of the image-size reduction, due to the progressive and sequential decimation of the original image. Three different decimation procedures (random, bilinear, and bicubic) were implemented and their consequences on the discrete correlation functions (two-point, line-path, and pore-size distribution) and the coarseness (derived from the local volume fraction) are reported and analyzed. The chosen statistical descriptors (correlation functions and coarseness) are typically employed to characterize and reconstruct heterogeneous materials. A normalization for each of the correlation functions has been performed. When the loss of statistical information has not been significant for a decimated image, its normalized correlation function is forecast by the trend of the original image (reference function). In contrast, when the decimated image does not hold statistical evidence of the original one, the normalized correlation function diverts from the reference function. Moreover, the equally weighted sum of the average of the squared difference, between the discrete correlation functions of the decimated images and the reference functions, leads to a definition of an overall error. During the first stages of the gradual decimation, the error remains relatively small and independent of the decimation procedure. Above a threshold defined by the correlation length of the reference function, the error becomes a function of the number of decimation steps. At this stage, some statistical information is lost and the error becomes dependent on the decimation procedure. These results may help us to restrict the amount of information that one can afford to lose during a decimation process, in order to reduce the computational and memory cost, when one aims to diminish the time consumed by a characterization or reconstruction technique, yet maintaining the statistical quality of the digitized sample.

  2. Towards Accurate Modelling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.

    2018-04-01

    Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter halos. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the "accurate" regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard ΛCDM + halo model against the clustering of SDSS DR7 galaxies. Specifically, we use the projected correlation function, group multiplicity function and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir halos) matches the clustering of low luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the "standard" halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.

  3. Running coupling constant from lattice studies of gluon and ghost propagators

    NASA Astrophysics Data System (ADS)

    Cucchieri, A.; Mendes, T.

    2004-12-01

    We present a numerical study of the running coupling constant in four-dimensional pure-SU(2) lattice gauge theory. The running coupling is evaluated by fitting data for the gluon and ghost propagators in minimal Landau gauge. Following Refs. [1, 2], the fitting formulae are obtained by a simultaneous integration of the β function and of a function coinciding with the anomalous dimension of the propagator in the momentum subtraction scheme. We consider these formulae at three and four loops. The fitting method works well, especially for the ghost case, for which statistical error and hyper-cubic effects are very small. Our present result for Λ_MS is 200 +60/−40 MeV, where the error is purely systematic. We are currently extending this analysis to five loops in order to reduce this systematic error.

  4. Application of Intra-Oral Dental Scanners in the Digital Workflow of Implantology

    PubMed Central

    van der Meer, Wicher J.; Andriessen, Frank S.; Wismeijer, Daniel; Ren, Yijin

    2012-01-01

    Intra-oral scanners will play a central role in digital dentistry in the near future. In this study the accuracy of three intra-oral scanners was compared. Materials and methods: A master model made of stone was fitted with three high precision manufactured PEEK cylinders and scanned with three intra-oral scanners: the CEREC (Sirona), the iTero (Cadent) and the Lava COS (3M). The digital files were imported into software, and the distances between the centres of the cylinders and the angulations between the cylinders were assessed. These values were compared to the measurements made on a high accuracy 3D scan of the master model. Results: The distance errors were the smallest and most consistent for the Lava COS. The distance errors for the Cerec were the largest and least consistent. All the angulation errors were small. Conclusions: The Lava COS in combination with a high accuracy scanning protocol resulted in the smallest and most consistent errors of all three scanners tested when considering mean distance errors in full arch impressions both in absolute values and in consistency for both measured distances. For the mean angulation errors, the Lava COS had the smallest errors between cylinders 1–2 and the largest errors between cylinders 1–3, although the absolute difference with the smallest mean value (iTero) was very small (0.0529°). An expected increase in distance and/or angular errors over the length of the arch due to an accumulation of registration errors of the patched 3D surfaces could be observed in this study design, but the effects were statistically not significant. Clinical relevance: For making impressions of implant cases for digital workflows, the most accurate scanner with the scanning protocol that will ensure the most accurate digital impression should be used. In our study model that was the Lava COS with the high accuracy scanning protocol. PMID:22937030

  5. Moments of inclination error distribution computer program

    NASA Technical Reports Server (NTRS)

    Myler, T. R.

    1981-01-01

    A FORTRAN coded computer program is described which calculates orbital inclination error statistics using a closed-form solution. This solution uses a data base of trajectory errors from actual flights to predict the orbital inclination error statistics. The Scott flight history data base consists of orbit insertion errors in the trajectory parameters - altitude, velocity, flight path angle, flight azimuth, latitude and longitude. The methods used to generate the error statistics are of general interest since they have other applications. Program theory, user instructions, output definitions, subroutine descriptions and detailed FORTRAN coding information are included.

  6. Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: a Monte Carlo study.

    PubMed

    Chou, C P; Bentler, P M; Satorra, A

    1991-11-01

    Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.

  7. Generic Feature Selection with Short Fat Data

    PubMed Central

    Clarke, B.; Chu, J.-H.

    2014-01-01

    Consider a regression problem in which there are many more explanatory variables than data points, i.e., p ≫ n. Essentially, without reducing the number of variables inference is impossible. So, we group the p explanatory variables into blocks by clustering, evaluate statistics on the blocks and then regress the response on these statistics under a penalized error criterion to obtain estimates of the regression coefficients. We examine the performance of this approach for a variety of choices of n, p, classes of statistics, clustering algorithms, penalty terms, and data types. When n is not large, the discrimination over number of statistics is weak, but computations suggest regressing on approximately [n/K] statistics where K is the number of blocks formed by a clustering algorithm. Small deviations from this are observed when the blocks of variables are of very different sizes. Larger deviations are observed when the penalty term is an Lq norm with high enough q. PMID:25346546
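    The block-then-regress idea can be sketched in a few lines: cluster the p columns into K blocks, summarize each block with a statistic such as the block mean, and regress the response on those K statistics under a penalty. This is a hedged illustration using scikit-learn with synthetic data; the choice of K-means, block means, and an L1 penalty is one arbitrary configuration among the many compared in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p, K = 40, 500, 10                      # "short fat" data: p >> n
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:20] = 1.0                            # only a small group of variables matters
y = X @ beta + rng.normal(scale=0.5, size=n)

# 1. cluster the p columns (variables) into K blocks
blocks = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X.T).labels_

# 2. evaluate a statistic on each block (here, the block mean per observation)
Z = np.column_stack([X[:, blocks == k].mean(axis=1) for k in range(K)])

# 3. regress the response on the block statistics under a penalized criterion
fit = Lasso(alpha=0.05).fit(Z, y)
print("block-level coefficients:", np.round(fit.coef_, 2))
```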

  8. Model Error Estimation for the CPTEC Eta Model

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; daSilva, Arlindo

    1999-01-01

    Statistical data assimilation systems require the specification of forecast and observation error statistics. Forecast error is due to model imperfections and differences between the initial condition and the actual state of the atmosphere. Practical four-dimensional variational (4D-Var) methods try to fit the forecast state to the observations and assume that the model error is negligible. Here, with a number of simplifying assumptions, a framework is developed for isolating the model error given the forecast error at two lead-times. Two definitions are proposed for the Talagrand ratio tau, the fraction of the forecast error due to model error rather than initial condition error. Data from the CPTEC Eta Model running operationally over South America are used to calculate forecast error statistics and lower bounds for tau.

  9. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670

  10. Adapt-Mix: learning local genetic correlation structure improves summary statistics-based analyses

    PubMed Central

    Park, Danny S.; Brown, Brielin; Eng, Celeste; Huntsman, Scott; Hu, Donglei; Torgerson, Dara G.; Burchard, Esteban G.; Zaitlen, Noah

    2015-01-01

    Motivation: Approaches to identifying new risk loci, training risk prediction models, imputing untyped variants and fine-mapping causal variants from summary statistics of genome-wide association studies are playing an increasingly important role in the human genetics community. Current summary statistics-based methods rely on global ‘best guess’ reference panels to model the genetic correlation structure of the dataset being studied. This approach, especially in admixed populations, has the potential to produce misleading results, ignores variation in local structure and is not feasible when appropriate reference panels are missing or small. Here, we develop a method, Adapt-Mix, that combines information across all available reference panels to produce estimates of local genetic correlation structure for summary statistics-based methods in arbitrary populations. Results: We applied Adapt-Mix to estimate the genetic correlation structure of both admixed and non-admixed individuals using simulated and real data. We evaluated our method by measuring the performance of two summary statistics-based methods: imputation and joint-testing. When using our method as opposed to the current standard of ‘best guess’ reference panels, we observed a 28% decrease in mean-squared error for imputation and a 73.7% decrease in mean-squared error for joint-testing. Availability and implementation: Our method is publicly available in a software package called ADAPT-Mix available at https://github.com/dpark27/adapt_mix. Contact: noah.zaitlen@ucsf.edu PMID:26072481

  11. Improved techniques for predicting spacecraft power

    NASA Technical Reports Server (NTRS)

    Chmielewski, A. B.

    1987-01-01

    Radioisotope Thermoelectric Generators (RTGs) are going to supply power for the NASA Galileo and Ulysses spacecraft now scheduled to be launched in 1989 and 1990. The duration of the Galileo mission is expected to be over 8 years. This brings the total RTG lifetime to 13 years. In 13 years, the RTG power drops more than 20 percent leaving a very small power margin over what is consumed by the spacecraft. Thus it is very important to accurately predict the RTG performance and be able to assess the magnitude of errors involved. The paper lists all the error sources involved in the RTG power predictions and describes a statistical method for calculating the tolerance.

  12. Probabilistic global maps of the CO2 column at daily and monthly scales from sparse satellite measurements

    NASA Astrophysics Data System (ADS)

    Chevallier, Frédéric; Broquet, Grégoire; Pierangelo, Clémence; Crisp, David

    2017-07-01

    The column-average dry air-mole fraction of carbon dioxide in the atmosphere (XCO2) is measured by scattered satellite measurements like those from the Orbiting Carbon Observatory (OCO-2). We show that global continuous maps of XCO2 (corresponding to level 3 of the satellite data) at daily or coarser temporal resolution can be inferred from these data with a Kalman filter built on a model of persistence. Our application of this approach on 2 years of OCO-2 retrievals indicates that the filter provides better information than a climatology of XCO2 at both daily and monthly scales. Provided that the assigned observation uncertainty statistics are tuned in each grid cell of the XCO2 maps from an objective method (based on consistency diagnostics), the errors predicted by the filter at daily and monthly scales represent the true error statistics reasonably well, except for a bias in the high latitudes of the winter hemisphere and a lack of resolution (i.e., a too small discrimination skill) of the predicted error standard deviations. Due to the sparse satellite sampling, the broad-scale patterns of XCO2 described by the filter seem to lag behind the real signals by a few weeks. Finally, the filter offers interesting insights into the quality of the retrievals, both in terms of random and systematic errors.
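    A stripped-down version of the idea, for a single grid cell, is a scalar Kalman filter whose forecast model is pure persistence: the state is carried forward unchanged with some process variance, and it is updated only on days with a retrieval. The sketch below uses synthetic data and placeholder variances; it is not the tuned configuration described in the abstract.

```python
import numpy as np

def persistence_kalman(obs, obs_var, q=0.1, x0=400.0, p0=25.0):
    """Scalar Kalman filter for one grid cell. The forecast model is pure
    persistence (x_k = x_{k-1} plus noise of variance q); `obs` may contain
    NaN on days without a usable satellite retrieval."""
    x, p = x0, p0
    analysis, variance = [], []
    for z, r in zip(obs, obs_var):
        p = p + q                          # forecast step: persistence
        if not np.isnan(z):                # analysis step only when data exist
            k = p / (p + r)                # Kalman gain
            x = x + k * (z - x)
            p = (1.0 - k) * p
        analysis.append(x)
        variance.append(p)
    return np.array(analysis), np.array(variance)

# synthetic example: a slow seasonal XCO2-like signal, sparsely sampled with noise
rng = np.random.default_rng(2)
days = np.arange(365)
truth = 400.0 + 2.0 * np.sin(2 * np.pi * days / 365)
obs = np.full(365, np.nan)
sampled = rng.choice(365, size=40, replace=False)
obs[sampled] = truth[sampled] + rng.normal(scale=1.0, size=40)

xa, pa = persistence_kalman(obs, obs_var=np.full(365, 1.0))
print("mean absolute analysis error:", round(float(np.mean(np.abs(xa - truth))), 2))
```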

  13. Impact of numerical choices on water conservation in the E3SM Atmosphere Model Version 1 (EAM V1)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.

    The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations for sea level rise projection. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods for fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model is negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in the new model results in a very thin model layer at the Earth’s surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for this model.

  14. Impact of numerical choices on water conservation in the E3SM Atmosphere Model version 1 (EAMv1)

    NASA Astrophysics Data System (ADS)

    Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.; Wan, Hui; Leung, Ruby; Ma, Po-Lun; Golaz, Jean-Christophe; Wolfe, Jon; Lin, Wuyin; Singh, Balwinder; Burrows, Susannah; Yoon, Jin-Ho; Wang, Hailong; Qian, Yun; Tang, Qi; Caldwell, Peter; Xie, Shaocheng

    2018-06-01

    The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods for fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model becomes negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors in early V1 versions decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in V1 results in a very thin model layer at the Earth's surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for V1.

  15. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation

    NASA Astrophysics Data System (ADS)

    Reis, D. S.; Stedinger, J. R.; Martins, E. S.

    2005-10-01

    This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
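    The flavour of the approach can be illustrated with a generic Bayesian GLS sketch: for each candidate model-error variance on a grid, form the full covariance (model error plus known sampling covariance), solve the GLS normal equations, and accumulate the marginal posterior of the model-error variance with the regression coefficients integrated out. The priors, grid, and synthetic data below are placeholders and do not reproduce the quasi-analytic scheme or the regional skew application of the paper.

```python
import numpy as np

def gls_posterior_grid(y, X, V, sigma2_grid):
    """Unnormalized posterior of the model-error variance in the GLS model
    y = X b + model error + sampling error, with known sampling covariance V.
    A flat prior on b and on the model-error variance is assumed (a sketch,
    not the paper's prior choice)."""
    logpost, betas = [], []
    for s2 in sigma2_grid:
        Lam = s2 * np.eye(len(y)) + V
        Li = np.linalg.inv(Lam)
        XtLiX = X.T @ Li @ X
        b = np.linalg.solve(XtLiX, X.T @ Li @ y)
        r = y - X @ b
        _, ld_Lam = np.linalg.slogdet(Lam)
        _, ld_XtLiX = np.linalg.slogdet(XtLiX)
        # log marginal density of y given s2, with b integrated out analytically
        logpost.append(-0.5 * (ld_Lam + ld_XtLiX + r @ Li @ r))
        betas.append(b)
    w = np.exp(np.array(logpost) - max(logpost))
    return w / w.sum(), np.array(betas)

# tiny synthetic regional data set: 30 sites, one basin characteristic
rng = np.random.default_rng(4)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
V = np.diag(rng.uniform(0.02, 0.2, size=n))        # at-site sampling variances
y = (X @ np.array([0.3, 0.1])
     + rng.normal(scale=np.sqrt(0.05), size=n)     # true model-error variance 0.05
     + rng.multivariate_normal(np.zeros(n), V))

grid = np.linspace(1e-4, 0.3, 60)
w, betas = gls_posterior_grid(y, X, V, grid)
print("posterior mean of model-error variance:", round(float(w @ grid), 3))
```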

  16. Selected Streamflow Statistics and Regression Equations for Predicting Statistics at Stream Locations in Monroe County, Pennsylvania

    USGS Publications Warehouse

    Thompson, Ronald E.; Hoffman, Scott A.

    2006-01-01

    A suite of 28 streamflow statistics, ranging from extreme low to high flows, was computed for 17 continuous-record streamflow-gaging stations and predicted for 20 partial-record stations in Monroe County and contiguous counties in north-eastern Pennsylvania. The predicted statistics for the partial-record stations were based on regression analyses relating intermittent flow measurements made at the partial-record stations indexed to concurrent daily mean flows at continuous-record stations during base-flow conditions. The same statistics also were predicted for 134 ungaged stream locations in Monroe County on the basis of regression analyses relating the statistics to GIS-determined basin characteristics for the continuous-record station drainage areas. The prediction methodology for developing the regression equations used to estimate statistics was developed for estimating low-flow frequencies. This study and a companion study found that the methodology also has application potential for predicting intermediate- and high-flow statistics. The statistics included mean monthly flows, mean annual flow, 7-day low flows for three recurrence intervals, nine flow durations, mean annual base flow, and annual mean base flows for two recurrence intervals. Low standard errors of prediction and high coefficients of determination (R2) indicated good results in using the regression equations to predict the statistics. Regression equations for the larger flow statistics tended to have lower standard errors of prediction and higher coefficients of determination (R2) than equations for the smaller flow statistics. The report discusses the methodologies used in determining the statistics and the limitations of the statistics and the equations used to predict the statistics. Caution is indicated in using the predicted statistics for small drainage area situations. Study results constitute input needed by water-resource managers in Monroe County for planning purposes and evaluation of water-resources availability.

  17. Testing for Granger Causality in the Frequency Domain: A Phase Resampling Method.

    PubMed

    Liu, Siwei; Molenaar, Peter

    2016-01-01

    This article introduces phase resampling, an existing but rarely used surrogate data method for making statistical inferences of Granger causality in frequency domain time series analysis. Granger causality testing is essential for establishing causal relations among variables in multivariate dynamic processes. However, testing for Granger causality in the frequency domain is challenging due to the nonlinear relation between frequency domain measures (e.g., partial directed coherence, generalized partial directed coherence) and time domain data. Through a simulation study, we demonstrate that phase resampling is a general and robust method for making statistical inferences even with short time series. With Gaussian data, phase resampling yields satisfactory type I and type II error rates in all but one condition we examine: when a small effect size is combined with an insufficient number of data points. Violations of normality lead to slightly higher error rates but are mostly within acceptable ranges. We illustrate the utility of phase resampling with two empirical examples involving multivariate electroencephalography (EEG) and skin conductance data.
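    A minimal version of the surrogate idea is sketched below: randomize the Fourier phases of each series (preserving each power spectrum while destroying cross-dependence), recompute a frequency-domain dependence statistic on many surrogates, and compare the observed value with that null distribution. Mean squared coherence is used here as a convenient stand-in for partial directed coherence, and all settings are illustrative.

```python
import numpy as np
from scipy.signal import coherence

def phase_randomize(x, rng):
    """Surrogate series with the same power spectrum as x but random phases."""
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.size)
    phases[0] = 0.0                      # keep the zero-frequency term real
    if n % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist term real
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=n)

def dependence_stat(a, b):
    """Mean squared coherence (stand-in for a PDC-type frequency-domain measure)."""
    _, c = coherence(a, b, nperseg=128)
    return c.mean()

rng = np.random.default_rng(5)
n = 1024
x = rng.normal(size=n)
y = np.roll(x, 2) + 0.5 * rng.normal(size=n)   # y lags x: genuine dependence

observed = dependence_stat(x, y)
null = [dependence_stat(phase_randomize(x, rng), phase_randomize(y, rng))
        for _ in range(200)]
p_value = (1 + sum(s >= observed for s in null)) / (1 + len(null))
print(f"observed statistic = {observed:.3f}, surrogate p-value = {p_value:.3f}")
```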

  18. Hospital inpatient self-administration of medicine programmes: a critical literature review.

    PubMed

    Wright, Julia; Emerson, Angela; Stephens, Martin; Lennan, Elaine

    2006-06-01

    The Department of Health, pharmaceutical and nursing bodies have advocated the benefits of self-administration programmes (SAPs), but their implementation within UK hospitals has been limited. Perceived barriers are: anticipated increased workload, insufficient resources and patient safety concerns. This review aims to discover if benefits of SAPs are supported in the literature in relation to risk and resource implications. Electronic databases were searched up to March 2004. Published English language articles that described and evaluated implementation of an SAP were included. Outcomes reported were: compliance measures, errors, knowledge, patient satisfaction, and nursing and pharmacy time. Most of the 51 papers reviewed had methodological flaws. SAPs varied widely in content and structure. Twelve studies (10 controlled) measured compliance by tablet counts. Of 7 studies subjected to statistical analysis, four demonstrated a significant difference in compliance between SAP and controls. Eight studies (5 controlled) measured errors as an outcome. Of the two evaluated statistically, only one demonstrated significantly fewer medication errors in the SAP group than in controls. Seventeen papers (11 controlled) studied the effect of SAPs on patients' medication knowledge. Ten of the 11 statistically analysed studies showed that SAP participants knew significantly more about some aspects of their medication than did controls. Seventeen studies (5 controlled) measured patient satisfaction. Two studies were statistically analysed and these studies suggested that patients were satisfied and preferred SAP. Seven papers studied pharmacy time and three studied nursing time, but results were not compared to controls. The paucity of well-designed studies, flawed methodology and inadequate reporting in many papers make conclusions hard to draw. Conclusive evidence that SAPs improve compliance was not provided. Although patients participating in SAPs make errors, small numbers of patients are often responsible for a large number of errors. Whilst most studies suggest that SAPs increase patients' knowledge in part, it is difficult to separate out the effect of the educational component of many SAPs. Most patients who participated in SAPs were satisfied with their care and many would choose to take part in a SAP in the future. No studies measured the total resource requirement of implementing and maintaining a SAP.

  19. Relationship between cutoff frequency and accuracy in time-interval photon statistics applied to oscillating signals

    NASA Astrophysics Data System (ADS)

    Rebolledo, M. A.; Martinez-Betorz, J. A.

    1989-04-01

    In this paper the accuracy in the determination of the period of an oscillating signal, when obtained from the photon statistics time-interval probability, is studied as a function of the precision (the inverse of the cutoff frequency of the photon counting system) with which time intervals are measured. The results are obtained by means of an experiment with a square-wave signal, where the Fourier or square-wave transforms of the time-interval probability are measured. It is found that for values of the frequency of the signal near the cutoff frequency the errors in the period are small.

  20. Refractive errors in patients with newly diagnosed diabetes mellitus.

    PubMed

    Yarbağ, Abdülhekim; Yazar, Hayrullah; Akdoğan, Mehmet; Pekgör, Ahmet; Kaleli, Suleyman

    2015-01-01

    Diabetes mellitus is a complex metabolic disorder that involves the small blood vessels, often causing widespread tissue damage, including changes in the refractive error of the eye. In patients with newly diagnosed diabetes mellitus who have unstable blood glucose levels, refraction measurements may be misleading. We aimed to investigate refraction in patients who were recently diagnosed with diabetes and treated at our centre. This prospective study was performed from February 2013 to January 2014. Patients were diagnosed with diabetes mellitus using laboratory biochemical tests and clinical examination. Venous fasting plasma glucose (fpg) levels were measured along with refractive errors. Two measurements were taken: initially and after four weeks. The difference between the initial and final refractive measurements was evaluated. Our patients were 100 males and 30 females who had been newly diagnosed with type II DM. The refractive and fpg levels were measured twice in all patients. The average values of the initial measurements were as follows: fpg level, 415 mg/dl; average refractive value, +2.5 D (dioptres). The average end-of-period measurements were: fpg, 203 mg/dl; average refractive value, +0.75 D. The difference between the initial and four-week fpg measurements was statistically significant (p<0.05), the relationship between changes in fpg and changes in spectacle prescription was statistically significant (p<0.05), and the resolution of blurred vision (a success rate greater than 50%) was also statistically significant (p<0.05). No effects of age or sex were detected in any of these results (p>0.05). Refractive error is affected in patients with newly diagnosed diabetes mellitus; therefore, plasma glucose levels should be considered in the selection of glasses.

  1. Satellite Sampling and Retrieval Errors in Regional Monthly Rain Estimates from TMI AMSR-E, SSM/I, AMSU-B and the TRMM PR

    NASA Technical Reports Server (NTRS)

    Fisher, Brad; Wolff, David B.

    2010-01-01

    Passive and active microwave rain sensors onboard earth-orbiting satellites estimate monthly rainfall from the instantaneous rain statistics collected during satellite overpasses. It is well known that climate-scale rain estimates from meteorological satellites incur sampling errors resulting from the process of discrete temporal sampling and statistical averaging. Sampling and retrieval errors ultimately become entangled in the estimation of the mean monthly rain rate. The sampling component of the error budget effectively introduces statistical noise into climate-scale rain estimates that obscure the error component associated with the instantaneous rain retrieval. Estimating the accuracy of the retrievals on monthly scales therefore necessitates a decomposition of the total error budget into sampling and retrieval error quantities. This paper presents results from a statistical evaluation of the sampling and retrieval errors for five different space-borne rain sensors on board nine orbiting satellites. Using an error decomposition methodology developed by one of the authors, sampling and retrieval errors were estimated at 0.25° resolution within 150 km of ground-based weather radars located at Kwajalein, Marshall Islands and Melbourne, Florida. Error and bias statistics were calculated according to the land, ocean and coast classifications of the surface terrain mask developed for the Goddard Profiling (GPROF) rain algorithm. Variations in the comparative error statistics are attributed to various factors related to differences in the swath geometry of each rain sensor, the orbital and instrument characteristics of the satellite and the regional climatology. The most significant result from this study found that each of the satellites incurred negative long-term oceanic retrieval biases of 10 to 30%.

  2. Statistical Analysis of Hit/Miss Data (Preprint)

    DTIC Science & Technology

    2012-07-01

    HDBK-1823A, 2009). Other agencies and industries have also made use of this guidance (Gandossi et al., 2010) and (Drury et al., 2006). It should ... better accounting of false call rates such that the POD curve doesn’t converge to 0 for small flaw sizes. The difficulty with conventional methods ... 2002. Drury, Ghylin, and Holness, Error Analysis and Threat Magnitude for Carry-on Bag Inspection, Proceedings of the Human Factors and Ergonomic

  3. Applicability of Infrared Photorefraction for Measurement of Accommodation in Awake-Behaving Normal and Strabismic Monkeys

    PubMed Central

    Bossong, Heather; Swann, Michelle; Glasser, Adrian; Das, Vallabh E.

    2010-01-01

    Purpose: This study was designed to use infrared photorefraction to measure accommodation in awake-behaving normal and strabismic monkeys and describe properties of photorefraction calibrations in these monkeys. Methods: Ophthalmic trial lenses were used to calibrate the slope of pupil vertical pixel intensity profile measurements that were made with a custom-built infrared photorefractor. Day to day variability in photorefraction calibration curves, variability in calibration coefficients due to misalignment of the photorefractor Purkinje image and the center of the pupil, and variability in refractive error due to off-axis measurements were evaluated. Results: The linear range of calibration of the photorefractor was found for ophthalmic lenses ranging from –1 D to +4 D. Calibration coefficients were different across monkeys tested (two strabismic, one normal) but were similar for each monkey over different experimental days. In both normal and strabismic monkeys, small misalignment of the photorefractor Purkinje image with the center of the pupil resulted in only small changes in calibration coefficients, which were not statistically significant (P > 0.05). Off-axis measurement of refractive error was also small in the normal and strabismic monkeys (~1 D to 2 D) as long as the magnitude of misalignment was <10°. Conclusions: Remote infrared photorefraction is suitable for measuring accommodation in awake, behaving normal, and strabismic monkeys. Specific challenges posed by the strabismic monkeys, such as possible misalignment of the photorefractor Purkinje image and the center of the pupil during either calibration or measurement of accommodation, which may arise due to unsteady fixation or small eye movements including nystagmus, result in only small changes in measured refractive error. PMID:19029024

  4. Estimation of flood-frequency characteristics of small urban streams in North Carolina

    USGS Publications Warehouse

    Robbins, J.C.; Pope, B.F.

    1996-01-01

    A statewide study was conducted to develop methods for estimating the magnitude and frequency of floods of small urban streams in North Carolina. This type of information is critical in the design of bridges, culverts and water-control structures, establishment of flood-insurance rates and flood-plain regulation, and for other uses by urban planners and engineers. Concurrent records of rainfall and runoff data collected in small urban basins were used to calibrate rainfall-runoff models. Historic rainfall records were used with the calibrated models to synthesize a long-term record of annual peak discharges. The synthesized record of annual peak discharges was used in a statistical analysis to determine flood-frequency distributions. These frequency distributions were used with distributions from previous investigations to develop a database for 32 small urban basins in the Blue Ridge-Piedmont, Sand Hills, and Coastal Plain hydrologic areas. The study basins ranged in size from 0.04 to 41.0 square miles. Data describing the size and shape of the basin, level of urban development, and climate and rural flood characteristics also were included in the database. Estimation equations were developed by relating flood-frequency characteristics to basin characteristics in a generalized least-squares regression analysis. The most significant basin characteristics are drainage area, impervious area, and rural flood discharge. The model error and prediction errors for the estimating equations were less than those for the national flood-frequency equations previously reported. Resulting equations, which have prediction errors generally less than 40 percent, can be used to estimate flood-peak discharges for 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals for small urban basins across the State assuming negligible, sustainable, in-channel detention or basin storage.

  5. Towards accurate modelling of galaxy clustering on small scales: testing the standard ΛCDM + halo model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.

    2018-07-01

    Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter haloes. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the `accurate' regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard Λ cold dark matter (ΛCDM) + halo model against the clustering of Sloan Digital Sky Survey (SDSS) seventh data release (DR7) galaxies. Specifically, we use the projected correlation function, group multiplicity function, and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir haloes) matches the clustering of low-luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the `standard' halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.

  6. Inference With Difference-in-Differences With a Small Number of Groups: A Review, Simulation Study, and Empirical Application Using SHARE Data.

    PubMed

    Rokicki, Slawa; Cohen, Jessica; Fink, Günther; Salomon, Joshua A; Landrum, Mary Beth

    2018-01-01

    Difference-in-differences (DID) estimation has become increasingly popular as an approach to evaluate the effect of a group-level policy on individual-level outcomes. Several statistical methodologies have been proposed to correct for the within-group correlation of model errors resulting from the clustering of data. Little is known about how well these corrections perform with the often small number of groups observed in health research using longitudinal data. First, we review the most commonly used modeling solutions in DID estimation for panel data, including generalized estimating equations (GEE), permutation tests, clustered standard errors (CSE), wild cluster bootstrapping, and aggregation. Second, we compare the empirical coverage rates and power of these methods using a Monte Carlo simulation study in scenarios in which we vary the degree of error correlation, the group size balance, and the proportion of treated groups. Third, we provide an empirical example using the Survey of Health, Ageing, and Retirement in Europe. When the number of groups is small, CSE are systematically biased downwards in scenarios when data are unbalanced or when there is a low proportion of treated groups. This can result in over-rejection of the null even when data are composed of up to 50 groups. Aggregation, permutation tests, bias-adjusted GEE, and wild cluster bootstrap produce coverage rates close to the nominal rate for almost all scenarios, though GEE may suffer from low power. In DID estimation with a small number of groups, analysis using aggregation, permutation tests, wild cluster bootstrap, or bias-adjusted GEE is recommended.
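    Of the corrections listed above, the group-level permutation test is the easiest to sketch. The toy example below aggregates data to group-by-period means, computes the DID contrast, and builds a reference distribution by reassigning which groups are labelled treated. It is a stylized illustration with synthetic data, not the simulation design of the paper.

```python
import numpy as np

def did_estimate(pre, post, treated):
    """DID on group-by-period means: (post - pre) change, treated minus control."""
    change = post - pre
    return change[treated].mean() - change[~treated].mean()

rng = np.random.default_rng(6)
G = 8                                         # deliberately small number of groups
treated = np.zeros(G, dtype=bool)
treated[:3] = True
pre = rng.normal(size=G)
post = pre + 0.4 * treated + rng.normal(scale=0.3, size=G)  # true effect = 0.4

observed = did_estimate(pre, post, treated)

# permutation test: reshuffle the treated label across groups, keeping its count
n_perm, exceed = 5000, 0
for _ in range(n_perm):
    perm = np.zeros(G, dtype=bool)
    perm[rng.choice(G, size=treated.sum(), replace=False)] = True
    if abs(did_estimate(pre, post, perm)) >= abs(observed):
        exceed += 1
p_value = (exceed + 1) / (n_perm + 1)
print(f"DID estimate = {observed:.2f}, permutation p-value = {p_value:.3f}")
```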

  7. Statistical Techniques for Assessing water‐quality effects of BMPs

    USGS Publications Warehouse

    Walker, John F.

    1994-01-01

    Little has been published on the effectiveness of various management practices in small rural lakes and streams at the watershed scale. In this study, statistical techniques were used to test for changes in water‐quality data from watersheds where best management practices (BMPs) were implemented. Reductions in data variability due to climate and seasonality were accomplished through the use of regression methods. This study discusses the merits of using storm‐mass‐transport data as a means of improving the ability to detect BMP effects on stream‐water quality. Statistical techniques were applied to suspended‐sediment records from three rural watersheds in Illinois for the period 1981–84. None of the techniques identified changes in suspended sediment, primarily because of the small degree of BMP implementation and because of potential errors introduced through the estimation of storm‐mass transport. A Monte Carlo sensitivity analysis was used to determine the level of discrete change that could be detected for each watershed. In all cases, the use of regressions improved the ability to detect trends.

  8. Predicting protein concentrations with ELISA microarray assays, monotonic splines and Monte Carlo simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daly, Don S.; Anderson, Kevin K.; White, Amanda M.

    Background: A microarray of enzyme-linked immunosorbent assays, or ELISA microarray, predicts simultaneously the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Making sound biological inferences as well as improving the ELISA microarray process require both concentration predictions and creditable estimates of their errors. Methods: We present a statistical method based on monotonic spline statistical models, penalized constrained least squares fitting (PCLS) and Monte Carlo simulation (MC) to predict concentrations and estimate prediction errors in ELISA microarray. PCLS restrains the flexible spline to a fit of assay intensity that is a monotone function of protein concentration. With MC, both modeling and measurement errors are combined to estimate prediction error. The spline/PCLS/MC method is compared to a common method using simulated and real ELISA microarray data sets. Results: In contrast to the rigid logistic model, the flexible spline model gave credible fits in almost all test cases including troublesome cases with left and/or right censoring, or other asymmetries. For the real data sets, 61% of the spline predictions were more accurate than their comparable logistic predictions; especially the spline predictions at the extremes of the prediction curve. The relative errors of 50% of comparable spline and logistic predictions differed by less than 20%. Monte Carlo simulation rendered acceptable asymmetric prediction intervals for both spline and logistic models while propagation of error produced symmetric intervals that diverged unrealistically as the standard curves approached horizontal asymptotes. Conclusions: The spline/PCLS/MC method is a flexible, robust alternative to a logistic/NLS/propagation-of-error method to reliably predict protein concentrations and estimate their errors. The spline method simplifies model selection and fitting, and reliably estimates believable prediction errors. For the 50% of the real data sets fit well by both methods, spline and logistic predictions are practically indistinguishable, varying in accuracy by less than 15%. The spline method may be useful when automated prediction across simultaneous assays of numerous proteins must be applied routinely with minimal user intervention.
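    The general shape of the workflow, a monotone standard-curve fit plus Monte Carlo error propagation, can be sketched with off-the-shelf tools. The example below substitutes isotonic regression for the penalized constrained spline and combines curve-refit noise with measurement noise on a new sample by brute-force resampling; the noise levels and curve are synthetic placeholders.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(7)

# synthetic standard curve: intensity rises monotonically with log-concentration
log_conc = np.linspace(-2.0, 2.0, 25)
true_intensity = 1.0 / (1.0 + np.exp(-2.0 * log_conc))
intensity = true_intensity + rng.normal(scale=0.03, size=log_conc.size)

# monotone fit (stand-in for the penalized constrained least-squares spline)
iso = IsotonicRegression(increasing=True, out_of_bounds="clip").fit(log_conc, intensity)

def invert(curve_values, obs_intensity):
    """Invert the monotone standard curve by interpolation."""
    return np.interp(obs_intensity, curve_values, log_conc)

obs = 0.62                                   # intensity measured on a new sample
point_estimate = invert(iso.predict(log_conc), obs)

# Monte Carlo: jointly resample curve noise and measurement noise on the sample
draws = []
for _ in range(1000):
    noisy_curve = true_intensity + rng.normal(scale=0.03, size=log_conc.size)
    refit = IsotonicRegression(increasing=True, out_of_bounds="clip").fit(log_conc, noisy_curve)
    draws.append(invert(refit.predict(log_conc), obs + rng.normal(scale=0.03)))
lo, hi = np.percentile(draws, [2.5, 97.5])
print(f"log-concentration = {point_estimate:.2f}, 95% interval [{lo:.2f}, {hi:.2f}]")
```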

  9. Consistency errors in p-values reported in Spanish psychology journals.

    PubMed

    Caperos, José Manuel; Pardo, Antonio

    2013-01-01

    Recent reviews have drawn attention to frequent consistency errors when reporting statistical results. We have reviewed the statistical results reported in 186 articles published in four Spanish psychology journals. Of these articles, 102 contained at least one of the statistics selected for our study: Fisher-F, Student-t and Pearson-χ2. Out of the 1,212 complete statistics reviewed, 12.2% presented a consistency error, meaning that the reported p-value did not correspond to the reported value of the statistic and its degrees of freedom. In 2.3% of the cases, the correct calculation would have led to a different conclusion than the reported one. In terms of articles, 48% included at least one consistency error, and 17.6% would have to change at least one conclusion. In meta-analytical terms, with a focus on effect size, consistency errors can be considered substantial in 9.5% of the cases. These results imply a need to improve the quality and precision with which statistical results are reported in Spanish psychology journals.
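    The consistency check itself is mechanical and easy to automate: recompute the p-value from the reported statistic and degrees of freedom and compare it with the reported p-value, in the spirit of statcheck-style tools. The sketch below uses scipy and an arbitrary tolerance; the numbers in the examples are made up.

```python
from scipy import stats

def check_consistency(test, value, df, reported_p, tol=0.01):
    """Recompute the p-value implied by a reported statistic and its degrees
    of freedom, and flag whether it matches the reported p-value."""
    if test == "F":
        recomputed = stats.f.sf(value, *df)          # df = (df1, df2)
    elif test == "t":
        recomputed = 2 * stats.t.sf(abs(value), df)  # two-sided
    elif test == "chi2":
        recomputed = stats.chi2.sf(value, df)
    else:
        raise ValueError(f"unknown test: {test}")
    return recomputed, abs(recomputed - reported_p) <= tol

# hypothetical reports: the first is consistent, the second is not
print(check_consistency("t", 2.10, 28, 0.045))       # recomputed p ~ 0.045
print(check_consistency("F", 4.50, (2, 40), 0.200))  # recomputed p ~ 0.017
```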

  10. Light meson form factors at high Q2 from lattice QCD

    NASA Astrophysics Data System (ADS)

    Koponen, Jonna; Zimermmane-Santos, André; Davies, Christine; Lepage, G. Peter; Lytle, Andrew

    2018-03-01

    Measurements and theoretical calculations of meson form factors are essential for our understanding of internal hadron structure and QCD, the dynamics that bind the quarks in hadrons. The pion electromagnetic form factor has been measured at small space-like momentum transfer |q2| < 0.3 GeV2 by pion scattering from atomic electrons and at values up to 2.5 GeV2 by scattering electrons from the pion cloud around a proton. On the other hand, in the limit of very large (or infinite) Q2 = -q2, perturbation theory is applicable. This leaves a gap in the intermediate Q2 where the form factors are not known. As a part of their 12 GeV upgrade Jefferson Lab will measure pion and kaon form factors in this intermediate region, up to Q2 of 6 GeV2. This is then an ideal opportunity for lattice QCD to make an accurate prediction ahead of the experimental results. Lattice QCD provides a from-first-principles approach to calculate form factors, and the challenge here is to control the statistical and systematic uncertainties as errors grow when going to higher Q2 values. Here we report on a calculation that tests the method using an ηs meson, a 'heavy pion' made of strange quarks, and also present preliminary results for kaon and pion form factors. We use the nf = 2 + 1 + 1 ensembles made by the MILC collaboration and Highly Improved Staggered Quarks, which allows us to obtain high statistics. The HISQ action is also designed to have small discretisation errors. Using several light quark masses and lattice spacings allows us to control the chiral and continuum extrapolation and keep systematic errors in check.

  11. Statistical analysis of modeling error in structural dynamic systems

    NASA Technical Reports Server (NTRS)

    Hasselman, T. K.; Chrostowski, J. D.

    1990-01-01

    The paper presents a generic statistical model of the (total) modeling error for conventional space structures in their launch configuration. Modeling error is defined as the difference between analytical prediction and experimental measurement. It is represented by the differences between predicted and measured real eigenvalues and eigenvectors. Comparisons are made between pre-test and post-test models. Total modeling error is then subdivided into measurement error, experimental error and 'pure' modeling error, and comparisons made between measurement error and total modeling error. The generic statistical model presented in this paper is based on the first four global (primary structure) modes of four different structures belonging to the generic category of Conventional Space Structures (specifically excluding large truss-type space structures). As such, it may be used to evaluate the uncertainty of predicted mode shapes and frequencies, sinusoidal response, or the transient response of other structures belonging to the same generic category.

  12. Sampling errors in the estimation of empirical orthogonal functions. [for climatology studies

    NASA Technical Reports Server (NTRS)

    North, G. R.; Bell, T. L.; Cahalan, R. F.; Moeng, F. J.

    1982-01-01

    Empirical Orthogonal Functions (EOF's), eigenvectors of the spatial cross-covariance matrix of a meteorological field, are reviewed with special attention given to the necessary weighting factors for gridded data and the sampling errors incurred when too small a sample is available. The geographical shape of an EOF shows large intersample variability when its associated eigenvalue is 'close' to a neighboring one. A rule of thumb indicating when an EOF is likely to be subject to large sampling fluctuations is presented. An explicit example, based on the statistics of the 500 mb geopotential height field, displays large intersample variability in the EOF's for sample sizes of a few hundred independent realizations, a size seldom exceeded by meteorological data sets.
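    The rule of thumb from this line of work is usually quoted as follows: the sampling error of an eigenvalue estimated from N independent realizations is roughly λ√(2/N), and when that error is comparable to the gap to a neighbouring eigenvalue the corresponding EOFs are prone to mixing. The sketch below applies that criterion to a synthetic field; the data and setup are illustrative, not taken from the 500 mb example in the abstract.

```python
import numpy as np

def eof_sampling_check(field, n_samples):
    """Eigenvalues of the spatial covariance matrix, their rule-of-thumb
    sampling errors (lambda * sqrt(2/N)), and flags for neighbouring pairs
    whose error overlaps the eigenvalue gap (EOFs likely to mix)."""
    lam = np.sort(np.linalg.eigvalsh(np.cov(field, rowvar=False)))[::-1]
    err = lam * np.sqrt(2.0 / n_samples)
    mixes = [err[k] >= lam[k] - lam[k + 1] for k in range(len(lam) - 1)]
    return lam, err, mixes

rng = np.random.default_rng(8)
N, n_grid = 100, 12                       # a few hundred realizations or fewer
mixing_matrix = rng.normal(size=(n_grid, n_grid))
field = rng.normal(size=(N, n_grid)) @ mixing_matrix * 0.1 + rng.normal(size=(N, n_grid))

lam, err, mixes = eof_sampling_check(field, N)
print("leading eigenvalues:   ", np.round(lam[:4], 2))
print("rule-of-thumb errors:  ", np.round(err[:4], 2))
print("pairs prone to mixing: ", [k for k, m in enumerate(mixes[:4]) if m])
```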

  13. Statistical error in simulations of Poisson processes: Example of diffusion in solids

    NASA Astrophysics Data System (ADS)

    Nilsson, Johan O.; Leetmaa, Mikael; Vekilova, Olga Yu.; Simak, Sergei I.; Skorodumova, Natalia V.

    2016-08-01

    Simulations of diffusion in solids often produce poor statistics of diffusion events. We present an analytical expression for the statistical error in ion conductivity obtained in such simulations. The error expression is not restricted to any computational method in particular, but is valid in the context of simulations of Poisson processes in general. This analytical error expression is verified numerically for the case of Gd-doped ceria by running a large number of kinetic Monte Carlo calculations.
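    The paper's analytical expression is not reproduced in the abstract, but the underlying intuition is the familiar Poisson counting result: the relative statistical error of a rate estimated from N events scales like 1/√N. The sketch below checks that scaling with a toy ensemble of independent "simulations"; it is a generic illustration, not the derived formula for ion conductivity.

```python
import numpy as np

rng = np.random.default_rng(9)
true_rate = 0.5        # diffusion-event rate per unit time (arbitrary units)
t_sim = 200.0          # simulated time per run, so N_events ~ true_rate * t_sim

# ensemble of independent runs, each estimating the rate from its event count
counts = rng.poisson(true_rate * t_sim, size=2000)
rates = counts / t_sim

empirical = rates.std() / rates.mean()
predicted = 1.0 / np.sqrt(true_rate * t_sim)   # 1/sqrt(expected event count)
print(f"empirical relative error: {empirical:.3f}")
print(f"1/sqrt(N) prediction:     {predicted:.3f}")
```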

  14. A minimalist approach to bias estimation for passive sensor measurements with targets of opportunity

    NASA Astrophysics Data System (ADS)

    Belfadel, Djedjiga; Osborne, Richard W.; Bar-Shalom, Yaakov

    2013-09-01

    In order to carry out data fusion, registration error correction is crucial in multisensor systems. This requires estimation of the sensor measurement biases. It is important to correct for these bias errors so that the multiple sensor measurements and/or tracks can be referenced as accurately as possible to a common tracking coordinate system. This paper provides a solution for bias estimation for the minimum number of passive sensors (two), when only targets of opportunity are available. The sensor measurements are assumed time-coincident (synchronous) and perfectly associated. Since these sensors provide only line of sight (LOS) measurements, the formation of a single composite Cartesian measurement obtained from fusing the LOS measurements from different sensors is needed to avoid the need for nonlinear filtering. We evaluate the Cramer-Rao Lower Bound (CRLB) on the covariance of the bias estimate, i.e., the quantification of the available information about the biases. Statistical tests on the results of simulations show that this method is statistically efficient, even for small sample sizes (as few as two sensors and six points on the trajectory of a single target of opportunity). We also show that the RMS position error is significantly improved with bias estimation compared with the target position estimation using the original biased measurements.

  15. Standard Errors and Confidence Intervals of Norm Statistics for Educational and Psychological Tests.

    PubMed

    Oosterhuis, Hannah E M; van der Ark, L Andries; Sijtsma, Klaas

    2016-11-14

    Norm statistics allow for the interpretation of scores on psychological and educational tests, by relating the test score of an individual test taker to the test scores of individuals belonging to the same gender, age, or education groups, et cetera. Given the uncertainty due to sampling error, one would expect researchers to report standard errors for norm statistics. In practice, standard errors are seldom reported; they are either unavailable or derived under strong distributional assumptions that may not be realistic for test scores. We derived standard errors for four norm statistics (standard deviation, percentile ranks, stanine boundaries and Z-scores) under the mild assumption that the test scores are multinomially distributed. A simulation study showed that the standard errors were unbiased and that corresponding Wald-based confidence intervals had good coverage. Finally, we discuss the possibilities for applying the standard errors in practical test use in education and psychology. The procedure is provided via the R function check.norms, which is available in the mokken package.

  16. Resampling-based Methods in Single and Multiple Testing for Equality of Covariance/Correlation Matrices

    PubMed Central

    Yang, Yang; DeGruttola, Victor

    2016-01-01

    Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients. PMID:22740584

  17. Resampling-based methods in single and multiple testing for equality of covariance/correlation matrices.

    PubMed

    Yang, Yang; DeGruttola, Victor

    2012-06-22

    Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients.

  18. Challenges of Big Data Analysis.

    PubMed

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features impact paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence sets and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

  19. Challenges of Big Data Analysis

    PubMed Central

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-01-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinct and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence sets and point out that exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469
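    The "spurious correlation" point is easy to demonstrate numerically. The short sketch below (illustrative only, with an arbitrary sample size and arbitrary dimensions) shows that the largest sample correlation between an outcome and completely independent noise predictors grows as the number of predictors increases while the sample size stays fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60                                   # fixed, modest sample size

for p in (10, 1_000, 10_000):
    y = rng.standard_normal(n)           # outcome
    X = rng.standard_normal((n, p))      # p predictors, all pure noise, independent of y
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    r = Xc.T @ yc / n                    # Pearson correlation of each predictor with y
    print(f"p = {p:>6}: max |r| = {np.abs(r).max():.2f}")
# the maximum spurious correlation grows steadily with p even though no real signal exists
```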

  20. Teaching Statistics Online Using "Excel"

    ERIC Educational Resources Information Center

    Jerome, Lawrence

    2011-01-01

    As anyone who has taught or taken a statistics course knows, statistical calculations can be tedious and error-prone, with the details of a calculation sometimes distracting students from understanding the larger concepts. Traditional statistics courses typically use scientific calculators, which can relieve some of the tedium and errors but…

  1. Uncertainty Analysis and Order-by-Order Optimization of Chiral Nuclear Interactions

    DOE PAGES

    Carlsson, Boris; Forssen, Christian; Fahlin Strömberg, D.; ...

    2016-02-24

    Chiral effective field theory (χEFT) provides a systematic approach to describe low-energy nuclear forces. Moreover, χEFT is able to provide well-founded estimates of statistical and systematic uncertainties, although this unique advantage has not yet been fully exploited. We fill this gap by performing an optimization and statistical analysis of all the low-energy constants (LECs) up to next-to-next-to-leading order. Our optimization protocol corresponds to a simultaneous fit to scattering and bound-state observables in the pion-nucleon, nucleon-nucleon, and few-nucleon sectors, thereby utilizing the full model capabilities of χEFT. Finally, we study the effect on other observables by demonstrating forward-error-propagation methods that can easily be adopted by future works. We employ mathematical optimization and implement automatic differentiation to attain efficient and machine-precise first- and second-order derivatives of the objective function with respect to the LECs. This is also vital for the regression analysis. We use power-counting arguments to estimate the systematic uncertainty that is inherent to χEFT, and we construct chiral interactions at different orders with quantified uncertainties. Statistical error propagation is compared with Monte Carlo sampling, showing that statistical errors are in general small compared to systematic ones. In conclusion, we find that a simultaneous fit to different sets of data is critical to (i) identify the optimal set of LECs, (ii) capture all relevant correlations, (iii) reduce the statistical uncertainty, and (iv) attain order-by-order convergence in χEFT. Furthermore, certain systematic uncertainties in the few-nucleon sector are shown to be substantially magnified in the many-body sector, in particular when varying the cutoff in the chiral potentials. The methodology and results presented in this paper open a new frontier for uncertainty quantification in ab initio nuclear theory.

  2. The heterogeneity statistic I(2) can be biased in small meta-analyses.

    PubMed

    von Hippel, Paul T

    2015-04-14

    Estimated effects vary across studies, partly because of random sampling error and partly because of heterogeneity. In meta-analysis, the fraction of variance that is due to heterogeneity is estimated by the statistic I(2). We calculate the bias of I(2), focusing on the situation where the number of studies in the meta-analysis is small. Small meta-analyses are common; in the Cochrane Library, the median number of studies per meta-analysis is 7 or fewer. We use Mathematica software to calculate the expectation and bias of I(2). I(2) has a substantial bias when the number of studies is small. The bias is positive when the true fraction of heterogeneity is small, but the bias is typically negative when the true fraction of heterogeneity is large. For example, with 7 studies and no true heterogeneity, I(2) will overestimate heterogeneity by an average of 12 percentage points, but with 7 studies and 80 percent true heterogeneity, I(2) can underestimate heterogeneity by an average of 28 percentage points. Biases of 12-28 percentage points are not trivial when one considers that, in the Cochrane Library, the median I(2) estimate is 21 percent. The point estimate I(2) should be interpreted cautiously when a meta-analysis has few studies. In small meta-analyses, confidence intervals should supplement or replace the biased point estimate I(2).
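    A quick simulation makes the reported bias plausible. The sketch below (with an arbitrary, illustrative choice of within-study variance) computes I^2 = max(0, (Q - df)/Q) for meta-analyses of seven studies generated with no true heterogeneity; the average I^2 comes out well above zero, consistent with the upward bias described above.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 7                          # number of studies, typical of Cochrane meta-analyses
v = np.full(k, 0.04)           # within-study variances (SE = 0.2), an illustrative choice
w = 1.0 / v

def i_squared(y):
    mu = np.sum(w * y) / np.sum(w)        # fixed-effect pooled estimate
    q = np.sum(w * (y - mu) ** 2)         # Cochran's Q
    return max(0.0, (q - (k - 1)) / q)    # I^2 = (Q - df) / Q, truncated at zero

# simulate meta-analyses with zero true heterogeneity: all studies share one true effect
sims = [i_squared(rng.normal(0.0, np.sqrt(v))) for _ in range(20_000)]
print(f"mean I^2 under zero true heterogeneity: {np.mean(sims):.2f}")
# the average sits well above zero, reflecting the upward bias with few studies
```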

  3. Modeling and characterization of multipath in global navigation satellite system ranging signals

    NASA Astrophysics Data System (ADS)

    Weiss, Jan Peter

    The Global Positioning System (GPS) provides position, velocity, and time information to users anywhere near the Earth, in real time and regardless of weather conditions. Since the system became operational, improvements in many areas have reduced systematic errors affecting GPS measurements such that multipath, defined as any signal taking a path other than the direct one, has become a significant, if not dominant, error source for many applications. This dissertation utilizes several approaches to characterize and model multipath errors in GPS measurements. Multipath errors in GPS ranging signals are characterized for several receiver systems and environments. Experimental P(Y) code multipath data are analyzed for ground stations with multipath levels ranging from minimal to severe, a C-12 turboprop, an F-18 jet, and an aircraft carrier. Comparisons between receivers utilizing single patch antennas and multi-element arrays are also made. In general, the results show significant reductions in multipath with antenna array processing, although large errors can occur even with this kind of equipment. Analysis of airborne platform multipath shows that the errors tend to be small in magnitude because the size of the aircraft limits the geometric delay of multipath signals, and high in frequency because aircraft dynamics cause rapid variations in geometric delay. A comprehensive multipath model is developed and validated. The model integrates 3D structure models, satellite ephemerides, electromagnetic ray-tracing algorithms, and detailed antenna and receiver models to predict multipath errors. Validation is performed by comparing experimental and simulated multipath via overall error statistics, per-satellite time histories, and frequency content analysis. The validation environments include two urban buildings, an F-18, an aircraft carrier, and a rural area where terrain multipath dominates. The validated models are used to identify multipath sources, characterize signal properties, evaluate additional antenna and receiver tracking configurations, and estimate the reflection coefficients of multipath-producing surfaces. Dynamic models for an F-18 landing on an aircraft carrier correlate aircraft dynamics to multipath frequency content; the model also characterizes the separate contributions of multipath due to the aircraft, ship, and ocean to the overall error statistics. Finally, reflection coefficients for multipath produced by terrain are estimated via a least-squares algorithm.

  4. Effects of 2D and 3D Error Fields on the SAS Divertor Magnetic Topology

    NASA Astrophysics Data System (ADS)

    Trevisan, G. L.; Lao, L. L.; Strait, E. J.; Guo, H. Y.; Wu, W.; Evans, T. E.

    2016-10-01

    The successful design of plasma-facing components in fusion experiments is of paramount importance in both the operation of future reactors and in the modification of operating machines. Indeed, the Small Angle Slot (SAS) divertor concept, proposed for application on the DIII-D experiment, combines a small incident angle at the plasma strike point with a progressively opening slot, so as to better control heat flux and erosion in high-performance tokamak plasmas. Uncertainty quantification of the error fields expected around the striking point provides additional useful information in both the design and the modeling phases of the new divertor, in part due to the particular geometric requirement of the striking flux surfaces. The presented work involves both 2D and 3D magnetic error field analysis on the SAS strike point carried out using the EFIT code for 2D equilibrium reconstruction, V3POST for vacuum 3D computations and the OMFIT integrated modeling framework for data analysis. An uncertainty in the magnetic probes' signals is found to propagate non-linearly as an uncertainty in the striking point and angle, which can be quantified through statistical analysis to yield robust estimates. Work supported by contracts DE-FG02-95ER54309 and DE-FC02-04ER54698.

  5. An Extended Objective Evaluation of the 29-km Eta Model for Weather Support to the United States Space Program

    NASA Technical Reports Server (NTRS)

    Nutter, Paul; Manobianco, John

    1998-01-01

    This report describes the Applied Meteorology Unit's objective verification of the National Centers for Environmental Prediction 29-km eta model during separate warm and cool season periods from May 1996 through January 1998. The verification of surface and upper-air point forecasts was performed at three selected stations important for 45th Weather Squadron, Spaceflight Meteorology Group, and National Weather Service, Melbourne operational weather concerns. The statistical evaluation identified model biases that may result from inadequate parameterization of physical processes. Since model biases are relatively small compared to the random error component, most of the total model error results from day-to-day variability in the forecasts and/or observations. To some extent, these nonsystematic errors reflect the variability in point observations that sample spatial and temporal scales of atmospheric phenomena that cannot be resolved by the model. On average, Meso-Eta point forecasts provide useful guidance for predicting the evolution of the larger scale environment. A more substantial challenge facing model users in real time is the discrimination of nonsystematic errors that tend to inflate the total forecast error. It is important that model users maintain awareness of ongoing model changes. Such changes are likely to modify the basic error characteristics, particularly near the surface.

  6. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study.

    PubMed

    Nour-Eldein, Hebatallah

    2016-01-01

    Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. To determine the statistical methods and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of FM articles. Inferential methods were recorded in 62/66 (93.9%) of FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-test 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, the deficiencies included no prior sample size calculation in 19/60 (31.7%), application of the wrong statistical test in 17/60 (28.3%), incomplete documentation of statistics in 59/60 (98.3%), reporting a P value without the test statistic in 32/60 (53.3%), no confidence interval reported with effect size measures in 12/60 (20.0%), use of the mean (standard deviation) to describe ordinal/non-normal data in 8/60 (13.3%), and interpretation errors, mainly conclusions unsupported by the study data, in 5/60 (8.3%). Inferential statistics were used in the majority of FM articles. Data analysis and reporting of statistics are areas for improvement in FM research articles.

  7. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study

    PubMed Central

    Nour-Eldein, Hebatallah

    2016-01-01

    Background: Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. Objectives: To determine the statistical methods and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. Methods: This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of FM articles. Results: Inferential methods were recorded in 62/66 (93.9%) of FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-test 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, the deficiencies included no prior sample size calculation in 19/60 (31.7%), application of the wrong statistical test in 17/60 (28.3%), incomplete documentation of statistics in 59/60 (98.3%), reporting a P value without the test statistic in 32/60 (53.3%), no confidence interval reported with effect size measures in 12/60 (20.0%), use of the mean (standard deviation) to describe ordinal/non-normal data in 8/60 (13.3%), and interpretation errors, mainly conclusions unsupported by the study data, in 5/60 (8.3%). Conclusion: Inferential statistics were used in the majority of FM articles. Data analysis and reporting of statistics are areas for improvement in FM research articles. PMID:27453839

  8. Experimental design and statistical methods for improved hit detection in high-throughput screening.

    PubMed

    Malo, Nathalie; Hanley, James A; Carlile, Graeme; Liu, Jing; Pelletier, Jerry; Thomas, David; Nadon, Robert

    2010-09-01

    Identification of active compounds in high-throughput screening (HTS) contexts can be substantially improved by applying classical experimental design and statistical inference principles to all phases of HTS studies. The authors present both experimental and simulated data to illustrate how true-positive rates can be maximized without increasing false-positive rates by the following analytical process. First, the use of robust data preprocessing methods reduces unwanted variation by removing row, column, and plate biases. Second, replicate measurements allow estimation of the magnitude of the remaining random error and the use of formal statistical models to benchmark putative hits relative to what is expected by chance. Receiver Operating Characteristic (ROC) analyses revealed superior power for data preprocessed by a trimmed-mean polish method combined with the RVM t-test, particularly for small- to moderate-sized biological hits.
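    As a rough illustration of the preprocessing step described above, the sketch below implements a trimmed-mean polish that iteratively removes row and column biases from a plate of readings. It is a simplified stand-in for the authors' pipeline (the RVM t-test and replicate handling are not shown), and the plate layout, trimming fraction, and spiked "hit" are all invented for illustration.

```python
import numpy as np
from scipy.stats import trim_mean

def trimmed_mean_polish(plate, trim=0.1, n_iter=10):
    """Iteratively remove row and column biases estimated by trimmed means,
    a robust analogue of median polish for plate-based HTS readings."""
    resid = np.asarray(plate, dtype=float).copy()
    for _ in range(n_iter):
        resid -= trim_mean(resid, trim, axis=1)[:, None]   # row effects
        resid -= trim_mean(resid, trim, axis=0)[None, :]   # column effects
    return resid

# toy 8 x 12 plate: additive column gradient (edge effect) plus one spiked "hit"
rng = np.random.default_rng(0)
plate = rng.normal(0.0, 1.0, (8, 12)) + np.linspace(0.0, 3.0, 12)[None, :]
plate[3, 5] += 6.0
corrected = trimmed_mean_polish(plate)
print(np.unravel_index(np.argmax(corrected), corrected.shape))   # the spiked well stands out
```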

  9. Impacts of uncertainties in European gridded precipitation observations on regional climate analysis.

    PubMed

    Prein, Andreas F; Gobiet, Andreas

    2017-01-01

    Gridded precipitation data sets are frequently used to evaluate climate models or to remove model output biases. Although precipitation data are error prone due to the high spatio-temporal variability of precipitation and due to considerable measurement errors, relatively few attempts have been made to account for observational uncertainty in model evaluation or in bias correction studies. In this study, we compare three types of European daily data sets featuring two Pan-European data sets and a set that combines eight very high-resolution station-based regional data sets. Furthermore, we investigate seven widely used, larger scale global data sets. Our results demonstrate that the differences between these data sets have the same magnitude as precipitation errors found in regional climate models. Therefore, including observational uncertainties is essential for climate studies, climate model evaluation, and statistical post-processing. Following our results, we suggest the following guidelines for regional precipitation assessments. (1) Include multiple observational data sets from different sources (e.g. station, satellite, reanalysis based) to estimate observational uncertainties. (2) Use data sets with high station densities to minimize the effect of precipitation undersampling (may induce about 60% error in data sparse regions). The information content of a gridded data set is mainly related to its underlying station density and not to its grid spacing. (3) Consider undercatch errors of up to 80% in high latitudes and mountainous regions. (4) Analyses of small-scale features and extremes are especially uncertain in gridded data sets. For higher confidence, use climate-mean and larger scale statistics. In conclusion, neglecting observational uncertainties potentially misguides climate model development and can severely affect the results of climate change impact assessments.

  10. Power of mental health nursing research: a statistical analysis of studies in the International Journal of Mental Health Nursing.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2013-02-01

    Having sufficient power to detect effect sizes of an expected magnitude is a core consideration when designing studies in which inferential statistics will be used. The main aim of this study was to investigate the statistical power in studies published in the International Journal of Mental Health Nursing. From volumes 19 (2010) and 20 (2011) of the journal, studies were analysed for their power to detect small, medium, and large effect sizes, according to Cohen's guidelines. The power of the 23 studies included in this review to detect small, medium, and large effects was 0.34, 0.79, and 0.94, respectively. In 90% of papers, no adjustments for experiment-wise error were reported. With a median of nine inferential tests per paper, the mean experiment-wise error rate was 0.51. A priori power analyses were only reported in 17% of studies. Although effect sizes for correlations and regressions were routinely reported, effect sizes for other tests (χ(2)-tests, t-tests, ANOVA/MANOVA) were largely absent from the papers. All types of effect sizes were infrequently interpreted. Researchers are strongly encouraged to conduct power analyses when designing studies, and to avoid scattergun approaches to data analysis (i.e. undertaking large numbers of tests in the hope of finding 'significant' results). Because reviewing effect sizes is essential for determining the clinical significance of study findings, researchers would better serve the field of mental health nursing if they reported and interpreted effect sizes. © 2012 The Authors. International Journal of Mental Health Nursing © 2012 Australian College of Mental Health Nurses Inc.
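    The experiment-wise error rate and power figures discussed above can be reproduced in outline with a few lines of code. The sketch below assumes independent tests at alpha = 0.05 and a two-sample t-test design with 30 participants per group (both assumptions are mine, for illustration); the effect sizes follow Cohen's conventional small, medium, and large values.

```python
import numpy as np
from scipy.stats import ttest_ind

alpha, m = 0.05, 9                      # nine independent tests at the 5% level
print(f"experiment-wise error rate: {1 - (1 - alpha) ** m:.2f}")   # ~0.37 under independence

# Monte Carlo power of a two-sample t-test with 30 participants per group
rng = np.random.default_rng(0)
n, reps = 30, 5_000
for d in (0.2, 0.5, 0.8):               # Cohen's small, medium, large effects
    hits = sum(
        ttest_ind(rng.normal(0.0, 1.0, n), rng.normal(d, 1.0, n)).pvalue < alpha
        for _ in range(reps)
    )
    print(f"d = {d}: power ~ {hits / reps:.2f}")
```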

  11. Small convolution kernels for high-fidelity image restoration

    NASA Technical Reports Server (NTRS)

    Reichenbach, Stephen E.; Park, Stephen K.

    1991-01-01

    An algorithm is developed for computing the mean-square-optimal values for small, image-restoration kernels. The algorithm is based on a comprehensive, end-to-end imaging system model that accounts for the important components of the imaging process: the statistics of the scene, the point-spread function of the image-gathering device, sampling effects, noise, and display reconstruction. Subject to constraints on the spatial support of the kernel, the algorithm generates the kernel values that restore the image with maximum fidelity, that is, the kernel minimizes the expected mean-square restoration error. The algorithm is consistent with the derivation of the spatially unconstrained Wiener filter, but leads to a small, spatially constrained kernel that, unlike the unconstrained filter, can be efficiently implemented by convolution. Simulation experiments demonstrate that for a wide range of imaging systems these small kernels can restore images with fidelity comparable to images restored with the unconstrained Wiener filter.

  12. Accounting for measurement error: a critical but often overlooked process.

    PubMed

    Harris, Edward F; Smith, Richard N

    2009-12-01

    Due to instrument imprecision and human inconsistencies, measurements are not free of error. Technical error of measurement (TEM) is the variability encountered between dimensions when the same specimens are measured at multiple sessions. A goal of a data collection regimen is to minimise TEM. The few studies that actually quantify TEM, regardless of discipline, report that it is substantial and can affect results and inferences. This paper reviews some statistical approaches for identifying and controlling TEM. Statistically, TEM is part of the residual ('unexplained') variance in a statistical test, so accounting for TEM, which requires repeated measurements, enhances the chances of finding a statistically significant difference if one exists. The aim of this paper was to review and discuss common statistical designs relating to types of error and statistical approaches to error accountability. This paper addresses issues of landmark location, validity, technical and systematic error, analysis of variance, scaled measures and correlation coefficients in order to guide the reader towards correct identification of true experimental differences. Researchers commonly infer characteristics about populations from comparatively restricted study samples. Most inferences are statistical and, aside from concerns about adequate accounting for known sources of variation with the research design, an important source of variability is measurement error. Variability in locating landmarks that define variables is obvious in odontometrics, cephalometrics and anthropometry, but the same concerns about measurement accuracy and precision extend to all disciplines. With increasing accessibility to computer-assisted methods of data collection, the ease of incorporating repeated measures into statistical designs has improved. Accounting for this technical source of variation increases the chance of finding biologically true differences when they exist.
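    One common formulation of TEM for duplicate measurement sessions is Dahlberg's, TEM = sqrt(sum(d_i^2) / (2N)), often reported alongside a relative TEM expressed as a percentage of the grand mean. The sketch below, with made-up measurements, shows that calculation only; the paper discusses a broader family of designs (ANOVA-based approaches, correlation coefficients) that are not reproduced here.

```python
import numpy as np

def technical_error_of_measurement(session1, session2):
    """Dahlberg-style TEM for duplicate measurements of the same specimens:
    TEM = sqrt(sum(d_i^2) / (2N)), plus the relative TEM (% of the grand mean)."""
    m1, m2 = np.asarray(session1, float), np.asarray(session2, float)
    d = m1 - m2
    tem = np.sqrt(np.sum(d ** 2) / (2 * len(d)))
    rel_tem = 100.0 * tem / np.mean(np.concatenate([m1, m2]))
    return tem, rel_tem

# made-up crown diameters (mm) measured at two sessions
s1 = [10.2, 9.8, 11.1, 10.5, 9.9]
s2 = [10.3, 9.7, 11.0, 10.7, 9.9]
print(technical_error_of_measurement(s1, s2))
```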

  13. Analysis of uncertainties and convergence of the statistical quantities in turbulent wall-bounded flows by means of a physically based criterion

    NASA Astrophysics Data System (ADS)

    Andrade, João Rodrigo; Martins, Ramon Silva; Thompson, Roney Leon; Mompean, Gilmar; da Silveira Neto, Aristeu

    2018-04-01

    The present paper provides an analysis of the statistical uncertainties associated with direct numerical simulation (DNS) results and experimental data for turbulent channel and pipe flows, showing a new physically based quantification of these errors, to improve the determination of the statistical deviations between DNSs and experiments. The analysis is carried out using a recently proposed criterion by Thompson et al. ["A methodology to evaluate statistical errors in DNS data of plane channel flows," Comput. Fluids 130, 1-7 (2016)] for fully turbulent plane channel flows, where the mean velocity error is estimated by considering the Reynolds stress tensor, and using the balance of the mean force equation. It also presents how the residual error evolves in time for a DNS of a plane channel flow, and the influence of the Reynolds number on its convergence rate. The root mean square of the residual error is shown in order to capture a single quantitative value of the error associated with the dimensionless averaging time. The evolution in time of the error norm is compared with the final error provided by DNS data of similar Reynolds numbers available in the literature. A direct consequence of this approach is that it was possible to compare different numerical results and experimental data, providing an improved understanding of the convergence of the statistical quantities in turbulent wall-bounded flows.

  14. Network problem threshold

    NASA Technical Reports Server (NTRS)

    Gejji, Raghvendra, R.

    1992-01-01

    Network transmission errors such as collisions, CRC errors, misalignment, etc. are statistical in nature. Although errors can vary randomly, a high level of errors does indicate specific network problems, e.g. equipment failure. In this project, we have studied the random nature of collisions theoretically as well as by gathering statistics, and established a numerical threshold above which a network problem is indicated with high probability.
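    The abstract does not give the report's exact statistical model, but one simple way to set such a threshold is to assume that error counts per monitoring interval are Poisson distributed under healthy operation and to flag intervals whose counts exceed a high quantile. The sketch below uses an assumed baseline rate and false-alarm probability purely for illustration.

```python
from scipy.stats import poisson

# Hypothetical alarm threshold for collision counts per monitoring interval,
# assuming counts under healthy operation are Poisson distributed (an assumption;
# the report's exact model is not stated in the abstract).
baseline_rate = 40.0        # assumed mean collisions per interval on a healthy network
false_alarm_prob = 1e-3     # tolerated chance of flagging a healthy interval

threshold = int(poisson.ppf(1 - false_alarm_prob, baseline_rate)) + 1
print(f"flag a network problem when collisions per interval >= {threshold}")
# by construction, P(count >= threshold | healthy network) <= 1e-3
```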

  15. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    PubMed

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p value, no matter how many loci. Copyright © 2013 S. Karger AG, Basel.

  16. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p-value, no matter how many loci. PMID:23594495

  17. Error and its meaning in forensic science.

    PubMed

    Christensen, Angi M; Crowder, Christian M; Ousley, Stephen D; Houck, Max M

    2014-01-01

    The discussion of "error" has gained momentum in forensic science in the wake of the Daubert guidelines and has intensified with the National Academy of Sciences' Report. Error has many different meanings, and too often, forensic practitioners themselves as well as the courts misunderstand scientific error and statistical error rates, often confusing them with practitioner error (or mistakes). Here, we present an overview of these concepts as they pertain to forensic science applications, discussing the difference between practitioner error (including mistakes), instrument error, statistical error, and method error. We urge forensic practitioners to ensure that potential sources of error and method limitations are understood and clearly communicated and advocate that the legal community be informed regarding the differences between interobserver errors, uncertainty, variation, and mistakes. © 2013 American Academy of Forensic Sciences.

  18. Statistical improvement in detection level of gravitational microlensing events from their light curves

    NASA Astrophysics Data System (ADS)

    Ibrahim, Ichsan; Malasan, Hakim L.; Kunjaya, Chatief; Timur Jaelani, Anton; Puannandra Putri, Gerhana; Djamal, Mitra

    2018-04-01

    In astronomy, the brightness of a source is typically expressed in terms of magnitude. Conventionally, the magnitude is defined by the logarithm of the received flux. This relationship is known as the Pogson formula. For received flux with a small signal-to-noise ratio (S/N), however, the formula gives a large magnitude error. We investigate whether the use of the Inverse Hyperbolic Sine function (hereafter referred to as the Asinh magnitude) in the modified formulae could allow for an alternative calculation of magnitudes for small S/N flux, and whether the new approach is better for representing the brightness of that region. We study the possibility of increasing the detection level of gravitational microlensing using 40 selected microlensing light curves from the 2013 and 2014 seasons and by using the Asinh magnitude. Photometric data of the selected events are obtained from the Optical Gravitational Lensing Experiment (OGLE). We found that utilization of the Asinh magnitude makes the events brighter compared to using the logarithmic magnitude, with an average of about 3.42 × 10⁻² magnitude, and an average difference in error between the logarithmic and the Asinh magnitude of about 2.21 × 10⁻² magnitude. The microlensing events OB140847 and OB140885 are found to have the largest difference values among the selected events. Using a Gaussian fit to find the peak for OB140847 and OB140885, we conclude statistically that the Asinh magnitude gives better mean squared values of the regression and narrower residual histograms than the Pogson magnitude. Based on these results, we also attempt to propose a limit in magnitude value for which use of the Asinh magnitude is optimal with small S/N data.
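    For reference, the two magnitude definitions being compared can be written down in a few lines. The sketch below uses the Lupton, Gunn and Szalay (1999) form of the asinh magnitude with an assumed softening parameter b (in practice chosen near the flux noise level of the survey, not taken from this paper); it shows that the asinh magnitude stays finite for zero or negative measured flux, where the Pogson magnitude breaks down.

```python
import numpy as np

def pogson_mag(flux):
    """Conventional logarithmic (Pogson) magnitude; flux in units of the zero-point flux."""
    return -2.5 * np.log10(flux)

def asinh_mag(flux, b):
    """Asinh magnitude in the Lupton, Gunn & Szalay (1999) form;
    b is a dimensionless softening parameter, typically set near the 1-sigma flux noise."""
    return -2.5 / np.log(10.0) * (np.arcsinh(flux / (2.0 * b)) + np.log(b))

b = 1e-10                                        # assumed softening; survey dependent
flux = np.array([1e-9, 1e-10, 0.0, -5e-11])      # low-S/N fluxes, including zero and negative
with np.errstate(divide="ignore", invalid="ignore"):
    print(pogson_mag(flux))                      # inf / nan once the flux reaches zero or below
print(asinh_mag(flux, b))                        # stays finite and well behaved everywhere
```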

  19. Death Certification Errors and the Effect on Mortality Statistics.

    PubMed

    McGivern, Lauri; Shulman, Leanne; Carney, Jan K; Shapiro, Steven; Bundock, Elizabeth

    Errors in cause and manner of death on death certificates are common and affect families, mortality statistics, and public health research. The primary objective of this study was to characterize errors in the cause and manner of death on death certificates completed by non-Medical Examiners. A secondary objective was to determine the effects of errors on national mortality statistics. We retrospectively compared 601 death certificates completed between July 1, 2015, and January 31, 2016, from the Vermont Electronic Death Registration System with clinical summaries from medical records. Medical Examiners, blinded to original certificates, reviewed summaries, generated mock certificates, and compared mock certificates with original certificates. They then graded errors using a scale from 1 to 4 (higher numbers indicated increased impact on interpretation of the cause) to determine the prevalence of minor and major errors. They also compared International Classification of Diseases, 10th Revision (ICD-10) codes on original certificates with those on mock certificates. Of 601 original death certificates, 319 (53%) had errors; 305 (51%) had major errors; and 59 (10%) had minor errors. We found no significant differences by certifier type (physician vs nonphysician). We did find significant differences in major errors in place of death ( P < .001). Certificates for deaths occurring in hospitals were more likely to have major errors than certificates for deaths occurring at a private residence (59% vs 39%, P < .001). A total of 580 (93%) death certificates had a change in ICD-10 codes between the original and mock certificates, of which 348 (60%) had a change in the underlying cause-of-death code. Error rates on death certificates in Vermont are high and extend to ICD-10 coding, thereby affecting national mortality statistics. Surveillance and certifier education must expand beyond local and state efforts. Simplifying and standardizing underlying literal text for cause of death may improve accuracy, decrease coding errors, and improve national mortality statistics.

  20. Quantitative analysis of the correlations in the Boltzmann-Grad limit for hard spheres

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pulvirenti, M.

    2014-12-09

    In this contribution I consider the problem of the validity of the Boltzmann equation for a system of hard spheres in the Boltzmann-Grad limit. I briefly review the results available nowadays, with a particular emphasis on Lanford's celebrated validity theorem. Finally, I present some recent results, obtained in collaboration with S. Simonella, concerning a quantitative analysis of the propagation of chaos. More precisely, we introduce a quantity (the correlation error) measuring how far a j-particle rescaled correlation function at time t (sufficiently small) is from full statistical independence. Roughly speaking, a correlation error of order k measures (in the context of the BBGKY hierarchy) the event in which k tagged particles form a recolliding group.

  1. The (mis)reporting of statistical results in psychology journals.

    PubMed

    Bakker, Marjan; Wicherts, Jelte M

    2011-09-01

    In order to study the prevalence, nature (direction), and causes of reporting errors in psychology, we checked the consistency of reported test statistics, degrees of freedom, and p values in a random sample of high- and low-impact psychology journals. In a second study, we established the generality of reporting errors in a random sample of recent psychological articles. Our results, on the basis of 281 articles, indicate that around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers' expectations. We classified the most common errors and contacted authors to shed light on the origins of the errors.
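    The consistency checks behind such studies can be automated. The sketch below is a simplified, hypothetical checker (not the authors' tool): it recomputes a two-tailed p value from a reported t statistic and its degrees of freedom and flags reported p values that are inconsistent, including gross inconsistencies that flip the significance decision. The tolerance assumes p values reported to two decimal places.

```python
from scipy import stats

def check_t_report(t, df, reported_p, alpha=0.05, tol=0.005):
    """Recompute the two-tailed p value from a reported t statistic and df, then flag
    inconsistencies; tol = 0.005 assumes p values reported to two decimal places."""
    p = 2.0 * stats.t.sf(abs(t), df)
    inconsistent = abs(p - reported_p) > tol
    decision_error = inconsistent and ((p < alpha) != (reported_p < alpha))
    return round(p, 4), inconsistent, decision_error

# reported results in the style "t(28) = 2.14, p = .04" and "t(28) = 1.70, p = .04"
print(check_t_report(2.14, 28, 0.04))   # consistent with the reported p (within rounding)
print(check_t_report(1.70, 28, 0.04))   # inconsistent, and it flips the significance decision
```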

  2. Chamber measurement of surface-atmosphere trace gas exchange: Numerical evaluation of dependence on soil, interfacial layer, and source/sink properties

    NASA Astrophysics Data System (ADS)

    Hutchinson, G. L.; Livingston, G. P.; Healy, R. W.; Striegl, R. G.

    2000-04-01

    We employed a three-dimensional finite difference gas diffusion model to simulate the performance of chambers used to measure surface-atmosphere trace gas exchange. We found that systematic errors often result from conventional chamber design and deployment protocols, as well as key assumptions behind the estimation of trace gas exchange rates from observed concentration data. Specifically, our simulations showed that (1) when a chamber significantly alters atmospheric mixing processes operating near the soil surface, it also nearly instantaneously enhances or suppresses the postdeployment gas exchange rate, (2) any change resulting in greater soil gas diffusivity, or greater partitioning of the diffusing gas to solid or liquid soil fractions, increases the potential for chamber-induced measurement error, and (3) all such errors are independent of the magnitude, kinetics, and/or distribution of trace gas sources, but greater for trace gas sinks with the same initial absolute flux. Finally, and most importantly, we found that our results apply to steady state as well as non-steady-state chambers, because the slow rate of gas diffusion in soil inhibits recovery of the former from their initial non-steady-state condition. Over a range of representative conditions, the error in steady state chamber estimates of the trace gas flux varied from -30 to +32%, while estimates computed by linear regression from non-steady-state chamber concentrations were 2 to 31% too small. Although such errors are relatively small in comparison to the temporal and spatial variability characteristic of trace gas exchange, they bias the summary statistics for each experiment as well as larger scale trace gas flux estimates based on them.

  3. Chamber measurement of surface-atmosphere trace gas exchange--Numerical evaluation of dependence on soil interfacial layer, and source/sink products

    USGS Publications Warehouse

    Hutchinson, G.L.; Livingston, G.P.; Healy, R.W.; Striegl, Robert G.

    2000-01-01

    We employed a three-dimensional finite difference gas diffusion model to simulate the performance of chambers used to measure surface-atmosphere trace gas exchange. We found that systematic errors often result from conventional chamber design and deployment protocols, as well as key assumptions behind the estimation of trace gas exchange rates from observed concentration data. Specifically, our simulations showed that (1) when a chamber significantly alters atmospheric mixing processes operating near the soil surface, it also nearly instantaneously enhances or suppresses the postdeployment gas exchange rate, (2) any change resulting in greater soil gas diffusivity, or greater partitioning of the diffusing gas to solid or liquid soil fractions, increases the potential for chamber-induced measurement error, and (3) all such errors are independent of the magnitude, kinetics, and/or distribution of trace gas sources, but greater for trace gas sinks with the same initial absolute flux. Finally, and most importantly, we found that our results apply to steady state as well as non-steady-state chambers, because the slow rate of gas diffusion in soil inhibits recovery of the former from their initial non-steady-state condition. Over a range of representative conditions, the error in steady state chamber estimates of the trace gas flux varied from -30 to +32%, while estimates computed by linear regression from non-steady-state chamber concentrations were 2 to 31% too small. Although such errors are relatively small in comparison to the temporal and spatial variability characteristic of trace gas exchange, they bias the summary statistics for each experiment as well as larger scale trace gas flux estimates based on them.

  4. Evaluation of solvation free energies for small molecules with the AMOEBA polarizable force field

    PubMed Central

    Mohamed, Noor Asidah; Bradshaw, Richard T.

    2016-01-01

    The effects of electronic polarization in biomolecular interactions will differ depending on the local dielectric constant of the environment, such as in solvent, DNA, proteins, and membranes. Here the performance of the AMOEBA polarizable force field is evaluated under nonaqueous conditions by calculating the solvation free energies of small molecules in four common organic solvents. Results are compared with experimental data and equivalent simulations performed with the GAFF pairwise‐additive force field. Although AMOEBA results give mean errors close to “chemical accuracy,” GAFF performs surprisingly well, with statistically significantly more accurate results than AMOEBA in some solvents. However, for both models, free energies calculated in chloroform show worst agreement to experiment and individual solutes are consistently poor performers, suggesting non‐potential‐specific errors also contribute to inaccuracy. Scope for the improvement of both potentials remains limited by the lack of high quality experimental data across multiple solvents, particularly those of high dielectric constant. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. PMID:27757978

  5. [The problem of small "n" and big "P" in neuropsycho-pharmacology, or how to keep the rate of false discoveries under control].

    PubMed

    Petschner, Péter; Bagdy, György; Tóthfalusi, Laszló

    2015-03-01

    One of the characteristics of many methods used in neuropsychopharmacology is that a large number of parameters (P) are measured in relatively few subjects (n). Functional magnetic resonance imaging, electroencephalography (EEG) and genomic studies are typical examples. For example, one microarray chip can contain thousands of probes. Therefore, in studies using microarray chips, P may be several thousand-fold larger than n. Statistical analysis of such studies is a challenging task, and they are referred to in the statistical literature as the small "n", big "P" problem. The problem has many facets, including the controversies associated with multiple hypothesis testing. A typical scenario in this context is when two or more groups are compared by the individual attributes. If the increased classification error due to the multiple testing is neglected, then several highly significant differences will be discovered. But in reality, some of these significant differences are coincidental, not reproducible findings. Several methods were proposed to solve this problem. In this review we discuss two of the proposed solutions, algorithms to compare sets and statistical hypothesis tests controlling the false discovery rate.
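    One of the standard ways to control the false discovery rate in this small-n, big-P setting is the Benjamini-Hochberg step-up procedure. The sketch below implements it directly and applies it to a synthetic microarray-style example; the mix of null and alternative p values is invented for illustration.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: boolean mask of hypotheses rejected
    while controlling the false discovery rate at level q."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        last = np.max(np.nonzero(passed)[0])     # largest rank i with p_(i) <= q*i/m
        reject[order[:last + 1]] = True          # reject everything up to that rank
    return reject

# synthetic microarray-style screen: 9,900 null probes, 100 with a real effect
rng = np.random.default_rng(0)
p_null = rng.uniform(size=9_900)
p_alt = rng.beta(0.1, 5.0, size=100)             # alternatives concentrated near zero
rejected = benjamini_hochberg(np.concatenate([p_null, p_alt]), q=0.05)
print(rejected.sum(), "discoveries,", int(rejected[:9_900].sum()), "of them false")
```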

  6. Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science.

    PubMed

    Veldkamp, Coosje L S; Nuijten, Michèle B; Dominguez-Alvarez, Linda; van Assen, Marcel A L M; Wicherts, Jelte M

    2014-01-01

    Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this 'co-piloting' currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.

  7. Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science

    PubMed Central

    Veldkamp, Coosje L. S.; Nuijten, Michèle B.; Dominguez-Alvarez, Linda; van Assen, Marcel A. L. M.; Wicherts, Jelte M.

    2014-01-01

    Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this ‘co-piloting’ currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors. PMID:25493918

  8. Calculating Statistical Orbit Distributions Using GEO Optical Observations with the Michigan Orbital Debris Survey Telescope (MODEST)

    NASA Technical Reports Server (NTRS)

    Matney, M.; Barker, E.; Seitzer, P.; Abercromby, K. J.; Rodriquez, H. M.

    2006-01-01

    NASA's Orbital Debris measurements program has a goal to characterize the small debris environment in the geosynchronous Earth-orbit (GEO) region using optical telescopes ("small" refers to objects too small to catalog and track with current systems). Traditionally, observations of GEO and near-GEO objects involve following the object with the telescope long enough to obtain an orbit suitable for tracking purposes. Telescopes operating in survey mode, however, randomly observe objects that pass through their field of view. Typically, these short-arc observations are inadequate to obtain detailed orbits, but can be used to estimate approximate circular orbit elements (semimajor axis, inclination, and ascending node). From this information, it should be possible to make statistical inferences about the orbital distributions of the GEO population bright enough to be observed by the system. The Michigan Orbital Debris Survey Telescope (MODEST) has been making such statistical surveys of the GEO region for four years. During that time, the telescope has made enough observations in enough areas of the GEO belt to have had nearly complete coverage. That means that almost all objects in all possible orbits in the GEO and near-GEO region had a non-zero chance of being observed. Some regions (such as those near zero inclination) have had good coverage, while others are poorly covered. Nevertheless, it is possible to remove these statistical biases and reconstruct the orbit populations within the limits of sampling error. In this paper, these statistical techniques and assumptions are described, and the techniques are applied to the current MODEST data set to arrive at our best estimate of the GEO orbit population distribution.

  9. Phase Error Correction in Time-Averaged 3D Phase Contrast Magnetic Resonance Imaging of the Cerebral Vasculature

    PubMed Central

    MacDonald, M. Ethan; Forkert, Nils D.; Pike, G. Bruce; Frayne, Richard

    2016-01-01

    Purpose Volume flow rate (VFR) measurements based on phase contrast (PC)-magnetic resonance (MR) imaging datasets have spatially varying bias due to eddy current induced phase errors. The purpose of this study was to assess the impact of phase errors in time averaged PC-MR imaging of the cerebral vasculature and explore the effects of three common correction schemes (local bias correction (LBC), local polynomial correction (LPC), and whole brain polynomial correction (WBPC)). Methods Measurements of the eddy current induced phase error from a static phantom were first obtained. In thirty healthy human subjects, the methods were then assessed in background tissue to determine if local phase offsets could be removed. Finally, the techniques were used to correct VFR measurements in cerebral vessels and compared statistically. Results In the phantom, phase error was measured to be <2.1 ml/s per pixel and the bias was reduced with the correction schemes. In background tissue, the bias was significantly reduced, by 65.6% (LBC), 58.4% (LPC) and 47.7% (WBPC) (p < 0.001 across all schemes). Correction did not lead to significantly different VFR measurements in the vessels (p = 0.997). In the vessel measurements, the three correction schemes led to flow measurement differences of -0.04 ± 0.05 ml/s, 0.09 ± 0.16 ml/s, and -0.02 ± 0.06 ml/s. Although there was an improvement in background measurements with correction, there was no statistical difference between the three correction schemes (p = 0.242 in background and p = 0.738 in vessels). Conclusions While eddy current induced phase errors can vary between hardware and sequence configurations, our results showed that the impact is small in a typical brain PC-MR protocol and does not have a significant effect on VFR measurements in cerebral vessels. PMID:26910600

  10. Estimating true human and animal host source contribution in quantitative microbial source tracking using the Monte Carlo method.

    PubMed

    Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan

    2010-09-01

    Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
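    The following is a deliberately simplified sketch of the Monte Carlo idea only, not the paper's full Law-of-Total-Probability model: the observed marker concentration is treated as the true signal attenuated by imperfect assay sensitivity, inflated by cross-reaction with non-target sources, and perturbed by log-scale measurement error, with all three sampled from assumed distributions (standing in for distributions fitted to reference fecal samples) to yield a distribution of true concentrations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 50_000

# Simplified illustration (not the paper's full model): the qPCR signal is treated as
#   observed = (sensitivity * true + cross_reactivity * nontarget) x measurement error,
# and the unknowns are sampled from assumed distributions.
observed = 5.0e4        # host-associated marker, copies per 100 mL (assumed measurement)
nontarget = 2.0e5       # universal fecal marker concentration (assumed known)

sensitivity = rng.beta(80, 20, n_draws)       # fraction of true signal recovered (assumed prior)
cross_react = rng.beta(2, 200, n_draws)       # false-positive yield per unit of non-target signal
log_error = rng.normal(0.0, 0.1, n_draws)     # qPCR precision error on the log10 scale

true_conc = np.clip(observed * 10 ** log_error - cross_react * nontarget, 0.0, None) / sensitivity
lo, med, hi = np.percentile(true_conc, [2.5, 50, 97.5])
print(f"true concentration ~ {med:.3g} (95% interval {lo:.3g} to {hi:.3g})")
```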

  11. Evaluation of NMME temperature and precipitation bias and forecast skill for South Asia

    NASA Astrophysics Data System (ADS)

    Cash, Benjamin A.; Manganello, Julia V.; Kinter, James L.

    2017-08-01

    Systematic error and forecast skill for temperature and precipitation in two regions of Southern Asia are investigated using hindcasts initialized May 1 from the North American Multi-Model Ensemble. We focus on two contiguous but geographically and dynamically diverse regions: the Extended Indian Monsoon Rainfall (70-100E, 10-30 N) and the nearby mountainous area of Pakistan and Afghanistan (60-75E, 23-39 N). Forecast skill is assessed using the Sign test framework, a rigorous statistical method that can be applied to non-Gaussian variables such as precipitation and to different ensemble sizes without introducing bias. We find that models show significant systematic error in both precipitation and temperature for both regions. The multi-model ensemble mean (MMEM) consistently yields the lowest systematic error and the highest forecast skill for both regions and variables. However, we also find that the MMEM consistently provides a statistically significant increase in skill over climatology only in the first month of the forecast. While the MMEM tends to provide higher overall skill than climatology later in the forecast, the differences are not significant at the 95% level. We also find that MMEMs constructed with a relatively small number of ensemble members per model can equal or outperform MMEMs constructed with more members in skill. This suggests some ensemble members either provide no contribution to overall skill or even detract from it.
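    A minimal version of the sign test used here compares paired absolute forecast errors and asks whether the model beats the reference more often than chance, via a one-sided binomial test. The sketch below uses synthetic errors and ignores the ensemble-size handling the authors rely on, so it is only an outline of the framework.

```python
import numpy as np
from scipy.stats import binomtest

def sign_test_skill(err_model, err_reference):
    """One-sided sign test: does the model produce the smaller absolute error
    more often than expected by chance? Ties are dropped before testing."""
    a, b = np.abs(err_model), np.abs(err_reference)
    keep = a != b
    wins, n = int(np.sum(a[keep] < b[keep])), int(keep.sum())
    return wins, n, binomtest(wins, n, 0.5, alternative="greater").pvalue

# synthetic absolute errors for 30 hindcast cases (illustrative data only)
rng = np.random.default_rng(0)
err_climatology = rng.gamma(2.0, 1.0, 30)
err_mme = 0.8 * rng.gamma(2.0, 1.0, 30)     # multi-model mean, modestly better on average
print(sign_test_skill(err_mme, err_climatology))
```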

  12. Quantum error-correction failure distributions: Comparison of coherent and stochastic error models

    NASA Astrophysics Data System (ADS)

    Barnes, Jeff P.; Trout, Colin J.; Lucarelli, Dennis; Clader, B. D.

    2017-06-01

    We compare failure distributions of quantum error correction circuits for stochastic errors and coherent errors. We utilize a fully coherent simulation of a fault-tolerant quantum error correcting circuit for a d = 3 Steane and surface code. We find that the output distributions are markedly different for the two error models, showing that no simple mapping between the two error models exists. Coherent errors create very broad and heavy-tailed failure distributions. This suggests that they are susceptible to outlier events and that mean statistics, such as pseudothreshold estimates, may not provide the key figure of merit. This provides further statistical insight into why coherent errors can be so harmful for quantum error correction. These output probability distributions may also provide a useful metric that can be utilized when optimizing quantum error correcting codes and decoding procedures for purely coherent errors.

  13. Tipping points in the arctic: eyeballing or statistical significance?

    PubMed

    Carstensen, Jacob; Weydmann, Agata

    2012-02-01

    Arctic ecosystems have experienced and are projected to experience continued large increases in temperature and declines in sea ice cover. It has been hypothesized that small changes in ecosystem drivers can fundamentally alter ecosystem functioning, and that this might be particularly pronounced for Arctic ecosystems. We present a suite of simple statistical analyses to identify changes in the statistical properties of data, emphasizing that changes in the standard error should be considered in addition to changes in mean properties. The methods are exemplified using sea ice extent, and suggest that the loss rate of sea ice accelerated by a factor of ~5 in 1996, as reported in other studies, but that increases in random fluctuations, an early warning signal, were observed as early as 1990. We recommend employing the proposed methods more systematically when analyzing tipping points, to document the effects of climate change in the Arctic.
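
    The kind of check advocated here, detecting a change in the random fluctuations in addition to a change in the mean, can be sketched as a variance-ratio test on detrended residuals. The series and the 1990 change point below are simulated stand-ins, not the sea ice record.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(2)

      # Synthetic "sea-ice-like" series: linear decline whose year-to-year
      # fluctuations grow after an assumed change point (illustrative only).
      years = np.arange(1979, 2012)
      noise_sd = np.where(years < 1990, 0.3, 0.6)
      series = 8.0 - 0.05 * (years - years[0]) + rng.normal(0, noise_sd)

      # Compare residual variance before/after a candidate change point
      # using detrended residuals and a two-sample F test.
      resid = series - np.polyval(np.polyfit(years, series, 1), years)
      before, after = resid[years < 1990], resid[years >= 1990]
      F = after.var(ddof=1) / before.var(ddof=1)
      p = stats.f.sf(F, len(after) - 1, len(before) - 1)
      print(f"variance ratio F = {F:.2f}, one-sided p = {p:.3f}")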

  14. Sampling Errors in Monthly Rainfall Totals for TRMM and SSM/I, Based on Statistics of Retrieved Rain Rates and Simple Models

    NASA Technical Reports Server (NTRS)

    Bell, Thomas L.; Kundu, Prasun K.; Einaudi, Franco (Technical Monitor)

    2000-01-01

    Estimates from TRMM satellite data of monthly total rainfall over an area are subject to substantial sampling errors due to the limited number of visits to the area by the satellite during the month. Quantitative comparisons of TRMM averages with data collected by other satellites and by ground-based systems require some estimate of the size of this sampling error. A method of estimating this sampling error based on the actual statistics of the TRMM observations and on some modeling work has been developed. "Sampling error" in TRMM monthly averages is defined here relative to the monthly total a hypothetical satellite permanently stationed above the area would have reported. "Sampling error" therefore includes contributions from the random and systematic errors introduced by the satellite remote sensing system. As part of our long-term goal of providing error estimates for each grid point accessible to the TRMM instruments, sampling error estimates for TRMM based on rain retrievals from TRMM microwave (TMI) data are compared for different times of the year and different oceanic areas (to minimize changes in the statistics due to algorithmic differences over land and ocean). Changes in the sampling error estimates arising from changes in rain statistics, due 1) to evolution of the official algorithms used to process the data and 2) to differences from other remote sensing systems such as the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave/Imager (SSM/I), are analyzed.

  15. [Character of refractive errors in population study performed by the Area Military Medical Commission in Lodz].

    PubMed

    Nowak, Michał S; Goś, Roman; Smigielski, Janusz

    2008-01-01

    To determine the prevalence of refractive errors in the population, a retrospective review of medical examinations for entry to military service from the Area Military Medical Commission in Lodz was conducted. Ophthalmic examinations were performed and the results were analyzed statistically. Statistical analysis revealed that refractive errors occurred in 21.68% of the population. The most common refractive error was myopia. 1) The most common ocular diseases are refractive errors, especially myopia (21.68% in total). 2) Refractive surgery and contact lenses should be allowed as possible corrections of refractive errors for military service.

  16. Normality Tests for Statistical Analysis: A Guide for Non-Statisticians

    PubMed Central

    Ghasemi, Asghar; Zahediasl, Saleh

    2012-01-01

    Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS. PMID:23843808
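
    Although the commentary walks through normality checks in SPSS, the same checks are available in open-source tools; the snippet below is an illustrative Python equivalent using SciPy on a made-up sample.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(3)
      sample = rng.normal(loc=100, scale=15, size=40)   # hypothetical measurements

      # Shapiro-Wilk is usually preferred for small-to-moderate samples;
      # D'Agostino's K^2 test uses skewness and kurtosis.
      w_stat, w_p = stats.shapiro(sample)
      k2_stat, k2_p = stats.normaltest(sample)
      print(f"Shapiro-Wilk: W = {w_stat:.3f}, p = {w_p:.3f}")
      print(f"D'Agostino K2: stat = {k2_stat:.3f}, p = {k2_p:.3f}")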

  17. Mass measurement errors of Fourier-transform mass spectrometry (FTMS): distribution, recalibration, and application.

    PubMed

    Zhang, Jiyang; Ma, Jie; Dou, Lei; Wu, Songfeng; Qian, Xiaohong; Xie, Hongwei; Zhu, Yunping; He, Fuchu

    2009-02-01

    The hybrid linear trap quadrupole Fourier-transform (LTQ-FT) ion cyclotron resonance mass spectrometer, an instrument with high accuracy and resolution, is widely used in the identification and quantification of peptides and proteins. However, time-dependent errors in the system may lead to deterioration of the accuracy of these instruments, negatively influencing the determination of the mass error tolerance (MET) in database searches. Here, a comprehensive discussion of LTQ/FT precursor ion mass error is provided. On the basis of an investigation of the mass error distribution, we propose an improved recalibration formula and introduce a new tool, FTDR (Fourier-transform data recalibration), that employs a graphic user interface (GUI) for automatic calibration. It was found that the calibration could adjust the mass error distribution to more closely approximate a normal distribution and reduce the standard deviation (SD). Consequently, we present a new strategy, LDSF (Large MET database search and small MET filtration), for database search MET specification and validation of database search results. As the name implies, a large-MET database search is conducted and the search results are then filtered using the statistical MET estimated from high-confidence results. By applying this strategy to a standard protein data set and a complex data set, we demonstrate the LDSF can significantly improve the sensitivity of the result validation procedure.

  18. Error-Transparent Quantum Gates for Small Logical Qubit Architectures

    NASA Astrophysics Data System (ADS)

    Kapit, Eliot

    2018-02-01

    One of the largest obstacles to building a quantum computer is gate error, where the physical evolution of the state of a qubit or group of qubits during a gate operation does not match the intended unitary transformation. Gate error stems from a combination of control errors and random single qubit errors from interaction with the environment. While great strides have been made in mitigating control errors, intrinsic qubit error remains a serious problem that limits gate fidelity in modern qubit architectures. Simultaneously, recent developments of small error-corrected logical qubit devices promise significant increases in logical state lifetime, but translating those improvements into increases in gate fidelity is a complex challenge. In this Letter, we construct protocols for gates on and between small logical qubit devices which inherit the parent device's tolerance to single qubit errors which occur at any time before or during the gate. We consider two such devices, a passive implementation of the three-qubit bit flip code, and the author's own [E. Kapit, Phys. Rev. Lett. 116, 150501 (2016), 10.1103/PhysRevLett.116.150501] very small logical qubit (VSLQ) design, and propose error-tolerant gate sets for both. The effective logical gate error rate in these models displays superlinear error reduction with linear increases in single qubit lifetime, proving that passive error correction is capable of increasing gate fidelity. Using a standard phenomenological noise model for superconducting qubits, we demonstrate a realistic, universal one- and two-qubit gate set for the VSLQ, with error rates an order of magnitude lower than those for same-duration operations on single qubits or pairs of qubits. These developments further suggest that incorporating small logical qubits into a measurement based code could substantially improve code performance.

  19. Demographics of the gay and lesbian population in the United States: evidence from available systematic data sources.

    PubMed

    Black, D; Gates, G; Sanders, S; Taylor, L

    2000-05-01

    This work provides an overview of standard social science data sources that now allow some systematic study of the gay and lesbian population in the United States. For each data source, we consider how sexual orientation can be defined, and we note the potential sample sizes. We give special attention to the important problem of measurement error, especially the extent to which individuals recorded as gay and lesbian are indeed recorded correctly. Our concern is that because gays and lesbians constitute a relatively small fraction of the population, modest measurement problems could lead to serious errors in inference. In examining gays and lesbians in multiple data sets we also achieve a second objective: We provide a set of statistics about this population that is relevant to several current policy debates.

  20. Modified SPC for short run test and measurement process in multi-stations

    NASA Astrophysics Data System (ADS)

    Koh, C. K.; Chin, J. F.; Kamaruddin, S.

    2018-03-01

    Due to short production runs and the measurement error inherent in electronic test and measurement (T&M) processes, continuous quality monitoring through real-time statistical process control (SPC) is challenging. Industry practice allows the installation of a guard band, based on measurement uncertainty, to reduce the width of the acceptance limit as an indirect way to compensate for measurement errors. This paper presents a new SPC model combining a modified guard band with control charts (a Z-bar chart and a W chart) for short runs in multi-station T&M processes. The proposed model standardizes the observed value using the measurement target (T) and the rationed measurement uncertainty (U). An S-factor (Sf) is introduced into the control limits to improve sensitivity in detecting small shifts. The model was embedded in an automated quality control system and verified with an industrial case study.
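
    A rough sketch of the standardization idea: observations from several stations with different targets and uncertainties are mapped onto one chart, and the control limits are tightened by an S-factor. The data, the placement of the S-factor, and the limit values are assumptions for illustration; the paper's exact chart statistics (Z-bar and W) may differ.

      import numpy as np

      rng = np.random.default_rng(4)

      # Hypothetical short-run measurements from several stations, each with its
      # own target T and measurement uncertainty U (values are illustrative).
      T = np.array([10.0, 25.0, 5.0])
      U = np.array([0.2, 0.5, 0.1])
      x = T + rng.normal(0, U * 0.8, size=(8, 3))   # 8 runs x 3 stations

      # Standardize so that different targets/uncertainties share one chart.
      z = (x - T) / U

      # Conventional 3-sigma limits, tightened by an assumed S-factor to improve
      # sensitivity to small shifts (the exact form of S_f in the paper may differ).
      s_factor = 0.8
      ucl, lcl = 3 * s_factor, -3 * s_factor
      out_of_control = (z > ucl) | (z < lcl)
      print("standardized values:\n", np.round(z, 2))
      print("out-of-control flags:\n", out_of_control)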

  1. Physical Validation of TRMM TMI and PR Monthly Rain Products Over Oklahoma

    NASA Technical Reports Server (NTRS)

    Fisher, Brad L.

    2004-01-01

    The Tropical Rainfall Measuring Mission (TRMM) provides monthly rainfall estimates using data collected by the TRMM satellite. These estimates cover a substantial fraction of the earth's surface. The physical validation of TRMM estimates involves corroborating the accuracy of spaceborne estimates of areal rainfall by inferring errors and biases from ground-based rain estimates. The TRMM error budget consists of two major sources of error: retrieval and sampling. Sampling errors are intrinsic to the process of estimating monthly rainfall and occur because the satellite extrapolates monthly rainfall from a small subset of measurements collected only during satellite overpasses. Retrieval errors, on the other hand, are related to the process of collecting measurements while the satellite is overhead. One of the big challenges confronting the TRMM validation effort is how to best estimate these two main components of the TRMM error budget, which are not easily decoupled. This four-year study computed bulk sampling and retrieval errors for the TRMM microwave imager (TMI) and the precipitation radar (PR) by applying a technique that sub-samples gauge data at TRMM overpass times. Gridded monthly rain estimates are then computed from the monthly bulk statistics of the collected samples, providing a sensor-dependent gauge rain estimate that is assumed to include a TRMM-equivalent sampling error. The sub-sampled gauge rain estimates are then used in conjunction with the monthly satellite and gauge (without sub-sampling) estimates to decouple retrieval and sampling errors. The computed mean sampling errors for the TMI and PR were 5.9% and 7.7%, respectively, in good agreement with theoretical predictions. The PR year-to-year retrieval biases exceeded corresponding TMI biases, but it was found that these differences were partially due to negative TMI biases during cold months and positive TMI biases during warm months.

  2. Common Scientific and Statistical Errors in Obesity Research

    PubMed Central

    George, Brandon J.; Beasley, T. Mark; Brown, Andrew W.; Dawson, John; Dimova, Rositsa; Divers, Jasmin; Goldsby, TaShauna U.; Heo, Moonseong; Kaiser, Kathryn A.; Keith, Scott; Kim, Mimi Y.; Li, Peng; Mehta, Tapan; Oakes, J. Michael; Skinner, Asheley; Stuart, Elizabeth; Allison, David B.

    2015-01-01

    We identify 10 common errors and problems in the statistical analysis, design, interpretation, and reporting of obesity research and discuss how they can be avoided. The 10 topics are: 1) misinterpretation of statistical significance, 2) inappropriate testing against baseline values, 3) excessive and undisclosed multiple testing and “p-value hacking,” 4) mishandling of clustering in cluster randomized trials, 5) misconceptions about nonparametric tests, 6) mishandling of missing data, 7) miscalculation of effect sizes, 8) ignoring regression to the mean, 9) ignoring confirmation bias, and 10) insufficient statistical reporting. We hope that discussion of these errors can improve the quality of obesity research by helping researchers to implement proper statistical practice and to know when to seek the help of a statistician. PMID:27028280

  3. Low relative error in consumer-grade GPS units make them ideal for measuring small-scale animal movement patterns

    PubMed Central

    Severns, Paul M.

    2015-01-01

    Consumer-grade GPS units are a staple of modern field ecology, but the relatively large error radii reported by manufacturers (up to 10 m) ostensibly preclude their utility in measuring fine-scale movement of small animals such as insects. Here we demonstrate that for data collected at fine spatio-temporal scales, these devices can produce exceptionally accurate data on step-length and movement patterns of small animals. With an understanding of the properties of GPS error and how it arises, it is possible, using a simple field protocol, to use consumer-grade GPS units to collect step-length data for the movement of small animals that introduces a median error as small as 11 cm. These small error rates were measured in controlled observations of real butterfly movement. Similar conclusions were reached using a ground-truth test track prepared with a field tape and compass and subsequently measured 20 times using the same methodology as the butterfly tracking. Median error in the ground-truth track was slightly higher than in the field data, mostly between 20 and 30 cm, but even for the smallest ground-truth step (70 cm), this is still a signal-to-noise ratio of 3:1, and for steps of 3 m or more, the ratio is greater than 10:1. Such small errors relative to the movements being measured make these inexpensive units useful for measuring insect and other small animal movements on small to intermediate scales with budgets orders of magnitude lower than survey-grade units used in past studies. As an additional advantage, these units are simpler to operate, and insect or other small animal trackways can be collected more quickly than either survey-grade units or more traditional ruler/grid approaches. PMID:26312190

  4. Low relative error in consumer-grade GPS units make them ideal for measuring small-scale animal movement patterns.

    PubMed

    Breed, Greg A; Severns, Paul M

    2015-01-01

    Consumer-grade GPS units are a staple of modern field ecology, but the relatively large error radii reported by manufacturers (up to 10 m) ostensibly preclude their utility in measuring fine-scale movement of small animals such as insects. Here we demonstrate that for data collected at fine spatio-temporal scales, these devices can produce exceptionally accurate data on step-length and movement patterns of small animals. With an understanding of the properties of GPS error and how it arises, it is possible, using a simple field protocol, to use consumer-grade GPS units to collect step-length data for the movement of small animals that introduces a median error as small as 11 cm. These small error rates were measured in controlled observations of real butterfly movement. Similar conclusions were reached using a ground-truth test track prepared with a field tape and compass and subsequently measured 20 times using the same methodology as the butterfly tracking. Median error in the ground-truth track was slightly higher than in the field data, mostly between 20 and 30 cm, but even for the smallest ground-truth step (70 cm), this is still a signal-to-noise ratio of 3:1, and for steps of 3 m or more, the ratio is greater than 10:1. Such small errors relative to the movements being measured make these inexpensive units useful for measuring insect and other small animal movements on small to intermediate scales with budgets orders of magnitude lower than survey-grade units used in past studies. As an additional advantage, these units are simpler to operate, and insect or other small animal trackways can be collected more quickly than either survey-grade units or more traditional ruler/grid approaches.

  5. On P values and effect modification.

    PubMed

    Mayer, Martin

    2017-12-01

    A crucial element of evidence-based healthcare is the sound understanding and use of statistics. As part of instilling sound statistical knowledge and practice, it seems useful to highlight instances of unsound statistical reasoning or practice, not merely in captious or vitriolic spirit, but rather, to use such error as a springboard for edification by giving tangibility to the concepts at hand and highlighting the importance of avoiding such error. This article aims to provide an instructive overview of two key statistical concepts: effect modification and P values. A recent article published in the Journal of the American College of Cardiology on side effects related to statin therapy offers a notable example of errors in understanding effect modification and P values, and although not so critical as to entirely invalidate the article, the errors still demand considerable scrutiny and correction. In doing so, this article serves as an instructive overview of the statistical concepts of effect modification and P values. Judicious handling of statistics is imperative to avoid muddying their utility. This article contributes to the body of literature aiming to improve the use of statistics, which in turn will help facilitate evidence appraisal, synthesis, translation, and application.

  6. The p-Value You Can't Buy.

    PubMed

    Demidenko, Eugene

    2016-01-02

    There is growing frustration with the concept of the p-value. Besides having an ambiguous interpretation, the p-value can be made as small as desired by increasing the sample size, n. The p-value is outdated and does not make sense with big data: everything becomes statistically significant. The root of the problem with the p-value is in the mean comparison. We argue that statistical uncertainty should be measured on the individual, not the group, level. Consequently, standard deviation (SD), not standard error (SE), error bars should be used to graphically present the data on two groups. We introduce a new measure based on the discrimination of individuals/objects from two groups, and call it the D-value. The D-value can be viewed as the n-of-1 p-value because it is computed in the same way as p while letting n equal 1. We show how the D-value is related to discrimination probability and the area above the receiver operating characteristic (ROC) curve. The D-value has a clear interpretation as the proportion of patients who get worse after the treatment, and as such facilitates weighing up the likelihood of events under different scenarios. [Received January 2015. Revised June 2015.].
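
    One way to see the discrimination idea behind the D-value is to compute the probability that a randomly chosen treated subject does better than a randomly chosen control subject; the sketch below uses simulated outcomes and is not the authors' exact formula.

      import numpy as np

      rng = np.random.default_rng(5)

      # Hypothetical outcomes (lower = better) for control and treatment groups.
      control = rng.normal(50, 10, 200)
      treated = rng.normal(46, 10, 200)

      # Discrimination probability: chance that a randomly chosen treated subject
      # has a better (lower) outcome than a randomly chosen control subject.
      # This is the quantity the D-value is built around (a sketch, not the paper's formula).
      pairs_better = (treated[:, None] < control[None, :]).mean()
      print(f"P(treated better than control) = {pairs_better:.3f}")
      print(f"proportion of treated expected to do worse = {1 - pairs_better:.3f}")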

  7. Multiple Hypothesis Testing for Experimental Gingivitis Based on Wilcoxon Signed Rank Statistics

    PubMed Central

    Preisser, John S.; Sen, Pranab K.; Offenbacher, Steven

    2011-01-01

    Dental research often involves repeated multivariate outcomes on a small number of subjects for which there is interest in identifying outcomes that exhibit change in their levels over time as well as to characterize the nature of that change. In particular, periodontal research often involves the analysis of molecular mediators of inflammation for which multivariate parametric methods are highly sensitive to outliers and deviations from Gaussian assumptions. In such settings, nonparametric methods may be favored over parametric ones. Additionally, there is a need for statistical methods that control an overall error rate for multiple hypothesis testing. We review univariate and multivariate nonparametric hypothesis tests and apply them to longitudinal data to assess changes over time in 31 biomarkers measured from the gingival crevicular fluid in 22 subjects whereby gingivitis was induced by temporarily withholding tooth brushing. To identify biomarkers that can be induced to change, multivariate Wilcoxon signed rank tests for a set of four summary measures based upon area under the curve are applied for each biomarker and compared to their univariate counterparts. Multiple hypothesis testing methods with choice of control of the false discovery rate or strong control of the family-wise error rate are examined. PMID:21984957
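
    A minimal sketch of the univariate side of this workflow: a Wilcoxon signed rank test per biomarker followed by Benjamini-Hochberg control of the false discovery rate. The paired data are simulated, and the multivariate signed rank tests on area-under-the-curve summaries described in the paper are not reproduced here.

      import numpy as np
      from scipy.stats import wilcoxon

      rng = np.random.default_rng(6)
      n_subjects, n_biomarkers = 22, 31

      # Hypothetical paired biomarker levels at baseline and at peak gingivitis.
      baseline = rng.lognormal(0.0, 0.5, size=(n_subjects, n_biomarkers))
      induced = baseline * rng.lognormal(0.1, 0.4, size=(n_subjects, n_biomarkers))

      # One univariate Wilcoxon signed rank test per biomarker.
      p_values = np.array([wilcoxon(induced[:, j], baseline[:, j]).pvalue
                           for j in range(n_biomarkers)])

      # Benjamini-Hochberg control of the false discovery rate at q = 0.05.
      q = 0.05
      order = np.argsort(p_values)
      thresh = q * np.arange(1, n_biomarkers + 1) / n_biomarkers
      passed = p_values[order] <= thresh
      k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
      rejected = np.zeros(n_biomarkers, dtype=bool)
      rejected[order[:k]] = True
      print(f"biomarkers flagged as changed (BH, q={q}): {int(rejected.sum())} of {n_biomarkers}")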

  8. Bias estimation for moving optical sensor measurements with targets of opportunity

    NASA Astrophysics Data System (ADS)

    Belfadel, Djedjiga; Osborne, Richard W.; Bar-Shalom, Yaakov

    2014-06-01

    Integration of space based sensors into a Ballistic Missile Defense System (BMDS) allows for detection and tracking of threats over a larger area than ground based sensors [1]. This paper examines the effect of sensor bias error on the tracking quality of a Space Tracking and Surveillance System (STSS) for the highly non-linear problem of tracking a ballistic missile. The STSS constellation consists of two or more satellites (on known trajectories) for tracking ballistic targets. Each satellite is equipped with an IR sensor that provides azimuth and elevation to the target. The tracking problem is made more difficult due to a constant or slowly varying bias error present in each sensor's line of sight measurements. It is important to correct for these bias errors so that the multiple sensor measurements and/or tracks can be referenced as accurately as possible to a common tracking coordinate system. The measurements provided by these sensors are assumed time-coincident (synchronous) and perfectly associated. The line of sight (LOS) measurements from the sensors can be fused into measurements which are the Cartesian target position, i.e., linear in the target state. We evaluate the Cramér-Rao Lower Bound (CRLB) on the covariance of the bias estimates, which serves as a quantification of the available information about the biases. Statistical tests on the results of simulations show that this method is statistically efficient, even for small sample sizes (as few as two sensors and six points on the (unknown) trajectory of a single target of opportunity). We also show that the RMS position error is significantly improved with bias estimation compared with the target position estimation using the original biased measurements.

  9. Type I and Type II error concerns in fMRI research: re-balancing the scale

    PubMed Central

    Cunningham, William A.

    2009-01-01

    Statistical thresholding (i.e. P-values) in fMRI research has become increasingly conservative over the past decade in an attempt to diminish Type I errors (i.e. false alarms) to a level traditionally allowed in behavioral science research. In this article, we examine the unintended negative consequences of this single-minded devotion to Type I errors: increased Type II errors (i.e. missing true effects), a bias toward studying large rather than small effects, a bias toward observing sensory and motor processes rather than complex cognitive and affective processes and deficient meta-analyses. Power analyses indicate that the reductions in acceptable P-values over time are producing dramatic increases in the Type II error rate. Moreover, the push for a mapwide false discovery rate (FDR) of 0.05 is based on the assumption that this is the FDR in most behavioral research; however, this is an inaccurate assessment of the conventions in actual behavioral research. We report simulations demonstrating that combined intensity and cluster size thresholds such as P < 0.005 with a 10 voxel extent produce a desirable balance between Types I and II error rates. This joint threshold produces high but acceptable Type II error rates and produces a FDR that is comparable to the effective FDR in typical behavioral science articles (while a 20 voxel extent threshold produces an actual FDR of 0.05 with relatively common imaging parameters). We recommend a greater focus on replication and meta-analysis rather than emphasizing single studies as the unit of analysis for establishing scientific truth. From this perspective, Type I errors are self-erasing because they will not replicate, thus allowing for more lenient thresholding to avoid Type II errors. PMID:20035017

  10. A Complementary Note to 'A Lag-1 Smoother Approach to System-Error Estimation': The Intrinsic Limitations of Residual Diagnostics

    NASA Technical Reports Server (NTRS)

    Todling, Ricardo

    2015-01-01

    Recently, this author studied an approach to the estimation of system error based on combining observation residuals derived from a sequential filter and fixed lag-1 smoother. While extending the methodology to a variational formulation, experimenting with simple models and making sure consistency was found between the sequential and variational formulations, the limitations of the residual-based approach came clearly to the surface. This note uses the sequential assimilation application to simple nonlinear dynamics to highlight the issue. Only when some of the underlying error statistics are assumed known is it possible to estimate the unknown component. In general, when considerable uncertainties exist in the underlying statistics as a whole, attempts to obtain separate estimates of the various error covariances are bound to lead to misrepresentation of errors. The conclusions are particularly relevant to present-day attempts to estimate observation-error correlations from observation residual statistics. A brief illustration of the issue is also provided by comparing estimates of error correlations derived from a quasi-operational assimilation system and a corresponding Observing System Simulation Experiments framework.

  11. Limitations of Airway Dimension Measurement on Images Obtained Using Multi-Detector Row Computed Tomography

    PubMed Central

    Oguma, Tsuyoshi; Hirai, Toyohiro; Niimi, Akio; Matsumoto, Hisako; Muro, Shigeo; Shigematsu, Michio; Nishimura, Takashi; Kubo, Yoshiro; Mishima, Michiaki

    2013-01-01

    Objectives (a) To assess the effects of computed tomography (CT) scanners, scanning conditions, airway size, and phantom composition on airway dimension measurement and (b) to investigate the limitations of accurate quantitative assessment of small airways using CT images. Methods An airway phantom, which was constructed using various types of material and with various tube sizes, was scanned using four CT scanner types under different conditions to calculate airway dimensions, luminal area (Ai), and the wall area percentage (WA%). To investigate the limitations of accurate airway dimension measurement, we then developed a second airway phantom with a thinner tube wall, and compared the clinical CT images of healthy subjects with the phantom images scanned using the same CT scanner. The study using clinical CT images was approved by the local ethics committee, and written informed consent was obtained from all subjects. Data were statistically analyzed using one-way ANOVA. Results Errors noted in airway dimension measurement were greater in the tube of small inner radius made of material with a high CT density and on images reconstructed by body algorithm (p<0.001), and there was some variation in error among CT scanners under different fields of view. Airway wall thickness had the maximum effect on the accuracy of measurements with all CT scanners under all scanning conditions, and the magnitude of errors for WA% and Ai varied depending on wall thickness when airways of <1.0-mm wall thickness were measured. Conclusions The parameters of airway dimensions measured were affected by airway size, reconstruction algorithm, composition of the airway phantom, and CT scanner types. In dimension measurement of small airways with wall thickness of <1.0 mm, the accuracy of measurement according to quantitative CT parameters can decrease as the walls become thinner. PMID:24116105

  12. Empirical performance of interpolation techniques in risk-neutral density (RND) estimation

    NASA Astrophysics Data System (ADS)

    Bahaludin, H.; Abdullah, M. H.

    2017-03-01

    The objective of this study is to evaluate the empirical performance of interpolation techniques in risk-neutral density (RND) estimation. Firstly, the empirical performance is evaluated using statistical analysis based on the implied mean and the implied variance of the RND. Secondly, the interpolation performance is measured based on pricing error. We propose using the leave-one-out cross-validation (LOOCV) pricing error for interpolation selection purposes. The statistical analyses indicate that there are statistical differences between the interpolation techniques: second-order polynomial, fourth-order polynomial and smoothing spline. The LOOCV pricing error results show that interpolation using a fourth-order polynomial provides the best fit to option prices, as it has the lowest pricing error.

  13. Statistics of the radiated field of a space-to-earth microwave power transfer system

    NASA Technical Reports Server (NTRS)

    Stevens, G. H.; Leininger, G.

    1976-01-01

    Statistics such as the average power density pattern, the variance of the power density pattern, and the variance of the beam pointing error are related to hardware parameters such as transmitter rms phase error and rms amplitude error. A limitation on the spectral width of the phase reference for phase control was also established. A 1 km diameter transmitter appears feasible provided the total rms insertion phase errors of the phase control modules do not exceed 10 deg, amplitude errors do not exceed 10% rms, and the phase reference spectral width does not exceed approximately 3 kHz. Under these conditions the expected radiation pattern is virtually the same as the error-free pattern, and the rms beam pointing error would be insignificant (approximately 10 meters).

  14. Predictors of Errors of Novice Java Programmers

    ERIC Educational Resources Information Center

    Bringula, Rex P.; Manabat, Geecee Maybelline A.; Tolentino, Miguel Angelo A.; Torres, Edmon L.

    2012-01-01

    This descriptive study determined which of the sources of errors would predict the errors committed by novice Java programmers. Descriptive statistics revealed that the respondents perceived that they committed the identified eighteen errors infrequently. Thought error was perceived to be the main source of error during the laboratory programming…

  15. Syndromic surveillance for health information system failures: a feasibility study.

    PubMed

    Ong, Mei-Sing; Magrabi, Farah; Coiera, Enrico

    2013-05-01

    To explore the applicability of a syndromic surveillance method to the early detection of health information technology (HIT) system failures. A syndromic surveillance system was developed to monitor a laboratory information system at a tertiary hospital. Four indices were monitored: (1) total laboratory records being created; (2) total records with missing results; (3) average serum potassium results; and (4) total duplicated tests on a patient. The goal was to detect HIT system failures causing: data loss at the record level; data loss at the field level; erroneous data; and unintended duplication of data. Time-series models of the indices were constructed, and statistical process control charts were used to detect unexpected behaviors. The ability of the models to detect HIT system failures was evaluated using simulated failures, each lasting for 24 h, with error rates ranging from 1% to 35%. In detecting data loss at the record level, the model achieved a sensitivity of 0.26 when the simulated error rate was 1%, while maintaining a specificity of 0.98. Detection performance improved with increasing error rates, achieving a perfect sensitivity when the error rate was 35%. In the detection of missing results, erroneous serum potassium results and unintended repetition of tests, perfect sensitivity was attained when the error rate was as small as 5%. Decreasing the error rate to 1% resulted in a drop in sensitivity to 0.65-0.85. Syndromic surveillance methods can potentially be applied to monitor HIT systems, to facilitate the early detection of failures.
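
    A toy version of the monitoring idea for the first index (total records created): fit a baseline from a training window and flag days that fall below a lower control limit. The Poisson counts, the 20% record-loss failure, and the simple Shewhart-style limit are illustrative assumptions; the study fits time-series models before charting.

      import numpy as np

      rng = np.random.default_rng(7)

      # Hypothetical daily counts of laboratory records; a failure starting on
      # day 60 silently drops 20% of records (illustrative, not the study data).
      days = 90
      baseline_rate = 500
      counts = rng.poisson(baseline_rate, days).astype(float)
      counts[60:] = rng.poisson(baseline_rate * 0.8, days - 60)

      # Simple control chart: flag days falling below mean - 3*SD of a
      # training window.
      train = counts[:50]
      lcl = train.mean() - 3 * train.std(ddof=1)
      alarms = np.nonzero(counts < lcl)[0]
      print(f"lower control limit = {lcl:.0f}")
      print(f"first alarm on day: {alarms[0] if alarms.size else 'none'}")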

  16. An Artificial Intelligence Approach to Analyzing Student Errors in Statistics.

    ERIC Educational Resources Information Center

    Sebrechts, Marc M.; Schooler, Lael J.

    1987-01-01

    Describes the development of an artificial intelligence system called GIDE that analyzes student errors in statistics problems by inferring the students' intentions. Learning strategies involved in problem solving are discussed and the inclusion of goal structures is explained. (LRW)

  17. Statistical error model for a solar electric propulsion thrust subsystem

    NASA Technical Reports Server (NTRS)

    Bantell, M. H.

    1973-01-01

    The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.

  18. Empirical investigation into depth-resolution of Magnetotelluric data

    NASA Astrophysics Data System (ADS)

    Piana Agostinetti, N.; Ogaya, X.

    2017-12-01

    We investigate the depth-resolution of MT data by comparing reconstructed 1D resistivity profiles with measured resistivity and lithostratigraphy from borehole data. Inversion of MT data has been widely used to reconstruct the 1D fine-layered resistivity structure beneath an isolated Magnetotelluric (MT) station. Uncorrelated noise is generally assumed to be associated with MT data. However, incorrect assumptions about error statistics have been shown to strongly bias the results obtained in geophysical inversions. In particular, the number of resolved layers at depth strongly depends on the error statistics. In this study, we applied a trans-dimensional MCMC algorithm to reconstruct the 1D resistivity profile near the location of a 1500 m-deep borehole, using MT data. We solve the MT inverse problem imposing different models for the error statistics associated with the MT data. Following a Hierarchical Bayes approach, we also inverted for the hyper-parameters associated with each error statistics model. Preliminary results indicate that assuming uncorrelated noise leads to a number of resolved layers larger than expected from the retrieved lithostratigraphy. Moreover, comparison with the inversion of synthetic resistivity data obtained from the "true" resistivity stratification measured along the borehole shows that a consistent number of resistivity layers can be obtained using a Gaussian model for the error statistics with substantial correlation length.

  19. Generating a Magellanic star cluster catalog with ASteCA

    NASA Astrophysics Data System (ADS)

    Perren, G. I.; Piatti, A. E.; Vázquez, R. A.

    2016-08-01

    An increasing number of software tools have been employed in recent years for the automated or semi-automated processing of astronomical data. The main advantages of using these tools over a standard by-eye analysis include speed (particularly for large databases), homogeneity, reproducibility, and precision. At the same time, they enable a statistically correct study of the uncertainties associated with the analysis, in contrast with manually set errors or the still widespread practice of simply not assigning errors. We present a catalog comprising 210 star clusters located in the Large and Small Magellanic Clouds, observed with Washington photometry. Their fundamental parameters were estimated through a homogeneous, automated and completely unassisted process, via the Automated Stellar Cluster Analysis package (ASteCA). Our results are compared with two types of studies of these clusters: one where the photometry is the same, and another where the photometric system is different from that employed by ASteCA.

  20. Testing the significance of a correlation with nonnormal data: comparison of Pearson, Spearman, transformation, and resampling approaches.

    PubMed

    Bishara, Anthony J; Hittner, James B

    2012-09-01

    It is well known that when data are nonnormally distributed, a test of the significance of Pearson's r may inflate Type I error rates and reduce power. Statistics textbooks and the simulation literature provide several alternatives to Pearson's correlation. However, the relative performance of these alternatives has been unclear. Two simulation studies were conducted to compare 12 methods, including Pearson, Spearman's rank-order, transformation, and resampling approaches. With most sample sizes (n ≥ 20), Type I and Type II error rates were minimized by transforming the data to a normal shape prior to assessing the Pearson correlation. Among transformation approaches, a general purpose rank-based inverse normal transformation (i.e., transformation to rankit scores) was most beneficial. However, when samples were both small (n ≤ 10) and extremely nonnormal, the permutation test often outperformed other alternatives, including various bootstrap tests.
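
    A small sketch of the transformation approach the simulations favored: convert each variable to rankit scores with a rank-based inverse normal transformation and then apply Pearson's test. The skewed sample below is simulated; the exact rank offset used in the study may differ from the constant assumed here.

      import numpy as np
      from scipy import stats

      def rankit(x):
          # Rank-based inverse normal (rankit-style) transformation.
          ranks = stats.rankdata(x)
          return stats.norm.ppf((ranks - 0.5) / len(x))

      rng = np.random.default_rng(8)
      x = rng.exponential(1.0, 50)                   # skewed predictor
      y = 0.5 * x + rng.exponential(1.0, 50)         # skewed, correlated outcome

      r_raw, p_raw = stats.pearsonr(x, y)
      r_tr, p_tr = stats.pearsonr(rankit(x), rankit(y))
      print(f"Pearson on raw data:         r = {r_raw:.2f}, p = {p_raw:.4f}")
      print(f"Pearson after rankit scores: r = {r_tr:.2f}, p = {p_tr:.4f}")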

  1. Calculating radiotherapy margins based on Bayesian modelling of patient specific random errors

    NASA Astrophysics Data System (ADS)

    Herschtal, A.; te Marvelde, L.; Mengersen, K.; Hosseinifard, Z.; Foroudi, F.; Devereux, T.; Pham, D.; Ball, D.; Greer, P. B.; Pichler, P.; Eade, T.; Kneebone, A.; Bell, L.; Caine, H.; Hindson, B.; Kron, T.

    2015-02-01

    Collected real-life clinical target volume (CTV) displacement data show that some patients undergoing external beam radiotherapy (EBRT) demonstrate significantly more fraction-to-fraction variability in their displacement (‘random error’) than others. This contrasts with the common assumption made by historical recipes for margin estimation for EBRT, that the random error is constant across patients. In this work we present statistical models of CTV displacements in which random errors are characterised by an inverse gamma (IG) distribution in order to assess the impact of random error variability on CTV-to-PTV margin widths, for eight real world patient cohorts from four institutions, and for different sites of malignancy. We considered a variety of clinical treatment requirements and penumbral widths. The eight cohorts consisted of a total of 874 patients and 27 391 treatment sessions. Compared to a traditional margin recipe that assumes constant random errors across patients, for a typical 4 mm penumbral width, the IG based margin model mandates that in order to satisfy the common clinical requirement that 90% of patients receive at least 95% of prescribed RT dose to the entire CTV, margins be increased by a median of 10% (range over the eight cohorts -19% to +35%). This substantially reduces the proportion of patients for whom margins are too small to satisfy clinical requirements.
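
    A hedged sketch of the core idea: draw per-patient random-error variances from an inverse gamma distribution and compare a margin based on a single pooled sigma with one chosen so that 90% of patients are covered. The IG parameters and the 0.7-sigma margin rule are stand-ins for the paper's fitted distributions and full dose-coverage criterion.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(9)

      # Per-patient random-error variances drawn from an inverse gamma
      # distribution (shape/scale values are illustrative, not fitted to the
      # cohorts in the paper).
      alpha, beta = 4.0, 9.0            # IG shape and scale, mm^2 units
      sigma2 = stats.invgamma.rvs(alpha, scale=beta, size=10_000, random_state=rng)
      sigma = np.sqrt(sigma2)

      # Stand-in margin rule: treat 0.7*sigma as the random-error margin each
      # patient would need (the paper uses a full dose-coverage criterion).
      per_patient_margin = 0.7 * sigma
      m_constant = 0.7 * sigma.mean()                  # classic "one sigma fits all"
      m_90 = np.percentile(per_patient_margin, 90)     # covers 90% of patients
      print(f"margin from pooled sigma:      {m_constant:.1f} mm")
      print(f"margin covering 90% patients:  {m_90:.1f} mm  "
            f"({100*(m_90/m_constant - 1):+.0f}%)")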

  2. The use of deep convective clouds to uniformly calibrate the next generation of geostationary reflective solar imagers

    NASA Astrophysics Data System (ADS)

    Doelling, David R.; Bhatt, Rajendra; Haney, Conor O.; Gopalan, Arun; Scarino, Benjamin R.

    2017-09-01

    The new 3rd generation geostationary (GEO) imagers will have many of the same NPP-VIIRS imager spectral bands, thereby offering the opportunity to apply the VIIRS cloud, aerosol, and land use retrieval algorithms on the new GEO imager measurements. Climate quality retrievals require multi-channel calibrated radiances that are stable over time. The deep convective cloud calibration technique (DCCT) is a large ensemble statistical technique that assumes that the DCC reflectance is stable over time. Because DCC are found in sufficient numbers across all GEO domains, they provide a uniform calibration stability evaluation across the GEO constellation. The baseline DCCT has been successful in calibrating visible and near-infrared channels. However, for shortwave infrared (SWIR) channels the DCCT is not as effective for monitoring radiometric stability. The DCCT was optimized as a function of wavelength in this paper. For SWIR bands, the greatest reduction of the DCC response trend standard error was achieved through deseasonalization. This is effective because the DCC reflectance exhibits small regional seasonal cycles that can be characterized on a monthly basis. On the other hand, the inter-annual variability in DCC response was found to be extremely small. The Met-9 0.65-μm channel DCC response was found to have a 3% seasonal cycle. Deseasonalization reduced the trend standard error from 1% to 0.4%. For the NPP-VIIRS SWIR bands, deseasonalization reduced the trend standard error by more than half. All VIIRS SWIR band trend standard errors were less than 1%. The DCCT should be able to monitor the stability of all GEO imager solar reflective bands across the tropical domain with the same uniform accuracy.
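
    The effect of deseasonalization on the trend standard error can be illustrated with a synthetic monthly series: subtract each calendar month's mean and refit the trend. The magnitudes below are made up and are not the Met-9 or VIIRS statistics.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(10)

      # Synthetic monthly DCC-response-like series: small drift, a seasonal
      # cycle, and noise (all magnitudes illustrative).
      months = np.arange(120)
      series = (100.0 - 0.005 * months
                + 1.5 * np.sin(2 * np.pi * months / 12)
                + rng.normal(0, 0.4, months.size))

      def trend_stderr(y):
          # Standard error of the fitted linear trend slope.
          return stats.linregress(months, y).stderr

      # Deseasonalize by subtracting the mean of each calendar month.
      monthly_mean = np.array([series[months % 12 == m].mean() for m in range(12)])
      deseason = series - monthly_mean[months % 12]

      print(f"trend standard error, raw:            {trend_stderr(series):.4f}")
      print(f"trend standard error, deseasonalized: {trend_stderr(deseason):.4f}")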

  3. Linearised and non-linearised isotherm models optimization analysis by error functions and statistical means

    PubMed Central

    2014-01-01

    In adsorption studies, describing the sorption process and evaluating the best-fitting isotherm model are key analyses for investigating the theoretical hypothesis. Hence, numerous statistical analyses have been used extensively to assess how well the predicted equilibrium values match the experimental equilibrium adsorption values. Several statistical error analyses were carried out. In the present study, the following statistical analyses were used to evaluate the fitness of the adsorption isotherm models: the Pearson correlation, the coefficient of determination and the Chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was evaluated for the linearised and non-linearised models. The adsorption of phenol onto natural soil (locally named Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To estimate the isotherm parameters and obtain a holistic view of the analysis, the linear and non-linear forms of the isotherm models were compared. The results revealed which of the above-mentioned error functions and statistical functions best determined the best-fitting isotherm. PMID:25018878

  4. Determination of Type I Error Rates and Power of Answer Copying Indices under Various Conditions

    ERIC Educational Resources Information Center

    Yormaz, Seha; Sünbül, Önder

    2017-01-01

    This study aims to determine the Type I error rates and power of S[subscript 1] , S[subscript 2] indices and kappa statistic at detecting copying on multiple-choice tests under various conditions. It also aims to determine how copying groups are created in order to calculate how kappa statistics affect Type I error rates and power. In this study,…

  5. An analytic technique for statistically modeling random atomic clock errors in estimation

    NASA Technical Reports Server (NTRS)

    Fell, P. J.

    1981-01-01

    Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System and in baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first-order Markov processes is used to approximate the power spectral density obtained from the Allan variance.

  6. An optical/NIR survey of globular clusters in early-type galaxies. III. On the colour bimodality of globular cluster systems

    NASA Astrophysics Data System (ADS)

    Chies-Santos, A. L.; Larsen, S. S.; Cantiello, M.; Strader, J.; Kuntschner, H.; Wehner, E. M.; Brodie, J. P.

    2012-03-01

    Context. The interpretation that bimodal colour distributions of globular clusters (GCs) reflect bimodal metallicity distributions has been challenged. Non-linearities in the colour to metallicity conversions caused for example by the horizontal branch (HB) stars may be responsible for transforming a unimodal metallicity distribution into a bimodal (optical) colour distribution. Aims: We study optical/near-infrared (NIR) colour distributions of the GC systems in 14 E/S0 galaxies. Methods: We test whether the bimodal feature, generally present in optical colour distributions, remains in the optical/NIR ones. The latter colour combination is a better metallicity proxy than the former. We use KMM and GMM tests to quantify the probability that different colour distributions are better described by a bimodal, as opposed to a unimodal distribution. Results: We find that double-peaked colour distributions are more commonly seen in optical than in optical/NIR colours. For some of the galaxies where the optical (g - z) distribution is clearly bimodal, a bimodal distribution is not preferred over a unimodal one at a statistically significant level for the (g - K) and (z - K) distributions. The two most cluster-rich galaxies in our sample, NGC 4486 and NGC 4649, show some interesting differences. The (g - K) distribution of NGC 4649 is better described by a bimodal distribution, while this is true for the (g - K) distribution of NGC 4486 GCs only if restricted to a brighter sub-sample with small K-band errors (<0.05 mag). Formally, the K-band photometric errors cannot be responsible for blurring bimodal metallicity distributions to unimodal (g - K) colour distributions. However, simulations including the extra scatter in the colour-colour diagrams (not fully accounted for in the photometric errors) show that such scatter may contribute to the disappearance of bimodality in (g - K) for the full NGC 4486 sample. For the less cluster-rich galaxies results are inconclusive due to poorer statistics. Conclusions: A bimodal optical colour distribution is not necessarily an indication of an underlying bimodal metallicity distribution. Horizontal branch morphology may play an important role in shaping some of the optical GC colour distributions. However, we find tentative evidence that the (g - K) colour distributions remain bimodal in the two cluster-rich galaxies in our sample (NGC 4486 and NGC 4649) when restricted to clusters with small K-band photometric errors. This bimodality becomes less pronounced when including objects with larger errors, or for the (z - K) colour distributions. Deeper observations of large numbers of GCs will be required to reach more secure conclusions.

  7. Evaluation of assumptions in soil moisture triple collocation analysis

    USDA-ARS?s Scientific Manuscript database

    Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and orthogonality) ...
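
    Under the usual assumptions (zero error cross-correlation and errors orthogonal to the truth), the covariance-notation form of triple collocation gives closed-form error variances; the sketch below applies it to simulated soil-moisture-like products, not to the datasets of this study.

      import numpy as np

      rng = np.random.default_rng(11)

      # Synthetic truth and three "products" with mutually independent errors.
      truth = rng.normal(0.25, 0.05, 5000)
      x = truth + rng.normal(0, 0.02, truth.size)
      y = truth + rng.normal(0, 0.03, truth.size)
      z = truth + rng.normal(0, 0.04, truth.size)

      C = np.cov(np.vstack([x, y, z]))

      # Covariance-notation triple collocation estimates of the error variances,
      # valid under the zero cross-correlation and error-orthogonality assumptions.
      ev_x = C[0, 0] - C[0, 1] * C[0, 2] / C[1, 2]
      ev_y = C[1, 1] - C[0, 1] * C[1, 2] / C[0, 2]
      ev_z = C[2, 2] - C[0, 2] * C[1, 2] / C[0, 1]
      print("estimated error SDs:", np.sqrt([ev_x, ev_y, ev_z]).round(3))
      print("true error SDs:     ", [0.02, 0.03, 0.04])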

  8. Issues with data and analyses: Errors, underlying themes, and potential solutions

    PubMed Central

    Allison, David B.

    2018-01-01

    Some aspects of science, taken at the broadest level, are universal in empirical research. These include collecting, analyzing, and reporting data. In each of these aspects, errors can and do occur. In this work, we first discuss the importance of focusing on statistical and data errors to continually improve the practice of science. We then describe underlying themes of the types of errors and postulate contributing factors. To do so, we describe a case series of relatively severe data and statistical errors coupled with surveys of some types of errors to better characterize the magnitude, frequency, and trends. Having examined these errors, we then discuss the consequences of specific errors or classes of errors. Finally, given the extracted themes, we discuss methodological, cultural, and system-level approaches to reducing the frequency of commonly observed errors. These approaches will plausibly contribute to the self-critical, self-correcting, ever-evolving practice of science, and ultimately to furthering knowledge. PMID:29531079

  9. Analysis of the Los Angeles Basin ground subsidence with InSAR data by independent component analysis approach

    NASA Astrophysics Data System (ADS)

    Xu, B.

    2017-12-01

    Interferometric Synthetic Aperture Radar (InSAR) has the advantages of high spatial resolution, which enables measuring line-of-sight (LOS) surface displacements with nearly complete spatial continuity, and a satellite's perspective that permits viewing large areas of Earth's surface quickly and efficiently. However, using InSAR to observe long-wavelength and small-magnitude deformation signals is still significantly limited by various unmodeled error sources, i.e., atmospheric delays, orbit-induced errors, and Digital Elevation Model (DEM) errors. Independent component analysis (ICA) is a probabilistic method for separating linearly mixed signals generated by different underlying physical processes. The signal sources that form the interferograms are statistically independent both in space and in time and can therefore be separated by the ICA approach. Seismic behavior in the Los Angeles Basin is active, and the basin has experienced numerous moderate to large earthquakes since the early Pliocene. Hence, understanding the seismotectonic deformation in the Los Angeles Basin is important for analyzing seismic behavior. Compared with tectonic deformation, nontectonic deformation due to groundwater and oil extraction may be mainly responsible for the surface deformation in the Los Angeles basin. Using the small baseline subset (SBAS) InSAR method, we extracted the surface deformation time series in the Los Angeles basin over a time span of 7 years (September 27, 2003-September 25, 2010). We then successfully separate the atmospheric noise from the InSAR time series and detect different processes caused by different mechanisms.
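
    A minimal illustration of the separation step with FastICA on simulated mixtures of a slow trend, an annual cycle, and heavy-tailed noise; the mixing matrix, signal forms, and use of scikit-learn's FastICA are assumptions for illustration, not the processing chain applied to the SBAS time series.

      import numpy as np
      from sklearn.decomposition import FastICA

      rng = np.random.default_rng(12)
      t = np.linspace(0, 7, 60)                     # ~7 years of acquisitions

      # Synthetic, statistically independent source signals standing in for
      # tectonic creep, a seasonal hydrological cycle, and atmospheric-like noise.
      sources = np.c_[0.8 * t,                      # slow linear deformation
                      2.0 * np.sin(2 * np.pi * t),  # annual cycle
                      rng.laplace(0, 0.5, t.size)]  # heavy-tailed noise
      mixing = rng.normal(size=(5, 3))              # 5 "pixels", 3 sources
      observations = sources @ mixing.T

      ica = FastICA(n_components=3, random_state=0)
      recovered = ica.fit_transform(observations)   # shape: (epochs, components)
      print("recovered component array shape:", recovered.shape)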

  10. Microscopic saw mark analysis: an empirical approach.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2015-01-01

    Microscopic saw mark analysis is a well-published and generally accepted qualitative analytical method. However, little research has focused on identifying and mitigating potential sources of error associated with the method. The presented study proposes the use of classification trees and random forest classifiers as an optimal, statistically sound approach to mitigate the potential for variability and outcome error in microscopic saw mark analysis. The statistical model was applied to 58 experimental saw marks created with four types of saws. The saw marks were made in fresh human femurs obtained through anatomical gift and were analyzed using a Keyence digital microscope. The statistical approach weighted the variables by discriminatory value and produced decision trees with an associated outcome error rate of 8.62-17.82%. © 2014 American Academy of Forensic Sciences.
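
    A hedged sketch of the classification step using scikit-learn's random forest with out-of-bag and cross-validated error rates; the mark features and class structure below are simulated, not the microscope measurements from the study.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(13)

      # Hypothetical quantitative mark features (e.g., kerf width, striation
      # spacing) for 58 marks made by 4 saw classes; values are simulated.
      n_marks, n_features, n_classes = 58, 6, 4
      saw_class = rng.integers(0, n_classes, n_marks)
      features = rng.normal(0, 1, (n_marks, n_features)) + saw_class[:, None] * 0.8

      clf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
      clf.fit(features, saw_class)
      cv_acc = cross_val_score(clf, features, saw_class, cv=3).mean()
      print(f"out-of-bag error rate: {1 - clf.oob_score_:.2%}")
      print(f"3-fold CV error rate:  {1 - cv_acc:.2%}")
      print("feature importances:", clf.feature_importances_.round(2))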

  11. Interpreting SBUV Smoothing Errors: an Example Using the Quasi-biennial Oscillation

    NASA Technical Reports Server (NTRS)

    Kramarova, N. A.; Bhartia, Pawan K.; Frith, S. M.; McPeters, R. D.; Stolarski, R. S.

    2013-01-01

    The Solar Backscattered Ultraviolet (SBUV) observing system consists of a series of instruments that have been measuring both total ozone and the ozone profile since 1970. SBUV measures the profile in the upper stratosphere with a resolution that is adequate to resolve most of the important features of that region. In the lower stratosphere the limited vertical resolution of the SBUV system means that there are components of the profile variability that SBUV cannot measure. The smoothing error, as defined in the optimal estimation retrieval method, describes the components of the profile variability that the SBUV observing system cannot measure. In this paper we provide a simple visual interpretation of the SBUV smoothing error by comparing SBUV ozone anomalies in the lower tropical stratosphere associated with the quasi-biennial oscillation (QBO) to anomalies obtained from the Aura Microwave Limb Sounder (MLS). We describe a methodology for estimating the SBUV smoothing error for monthly zonal mean (mzm) profiles. We construct covariance matrices that describe the statistics of the inter-annual ozone variability using a 6 yr record of Aura MLS and ozonesonde data. We find that the smoothing error is of the order of 1 percent between 10 and 1 hPa, increasing up to 15-20 percent in the troposphere and up to 5 percent in the mesosphere. The smoothing error for total ozone columns is small, mostly less than 0.5 percent. We demonstrate that by merging the partial ozone columns from several layers in the lower stratosphere/troposphere into one thick layer, we can minimize the smoothing error. We recommend using the following layer combinations to reduce the smoothing error to about 1 percent: surface to 25 hPa (16 hPa) outside (inside) of the narrow equatorial zone 20° S-20° N.

  12. Bayes Error Rate Estimation Using Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2003-01-01

    The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes, unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Problem benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
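
    The first, averaging-based estimator can be sketched by training an ensemble on bootstrap resamples, averaging the posterior estimates, and taking the expected value of min(p, 1-p); the 1-D Gaussian problem below has a known Bayes error (about 0.159), but the classifier choice and ensemble size are illustrative assumptions, not the article's setup.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(14)

      # Two Gaussian classes with known overlap: the true Bayes error is
      # Phi(-delta/2) ≈ 0.1587 for a mean separation of delta = 2 (1-D example).
      n = 4000
      X = np.r_[rng.normal(-1, 1, n), rng.normal(1, 1, n)].reshape(-1, 1)
      y = np.r_[np.zeros(n), np.ones(n)]

      # Ensemble of classifiers trained on bootstrap resamples; average their
      # posterior estimates and take E[min(p, 1-p)] as a Bayes-error estimate.
      posteriors = []
      for _ in range(25):
          idx = rng.integers(0, len(y), len(y))
          model = LogisticRegression().fit(X[idx], y[idx])
          posteriors.append(model.predict_proba(X)[:, 1])
      p_avg = np.mean(posteriors, axis=0)
      bayes_est = np.minimum(p_avg, 1 - p_avg).mean()
      print(f"ensemble-based Bayes error estimate: {bayes_est:.3f} (true ≈ 0.159)")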

  13. Analytical study of the effects of soft tissue artefacts on functional techniques to define axes of rotation.

    PubMed

    De Rosario, Helios; Page, Álvaro; Besa, Antonio

    2017-09-06

    The accurate location of the main axes of rotation (AoR) is a crucial step in many applications of human movement analysis. There are different formal methods to determine the direction and position of the AoR, whose performance varies across studies, depending on the pose and the source of errors. Most methods are based on minimizing squared differences between observed and modelled marker positions or rigid motion parameters, implicitly assuming independent and uncorrelated errors, but the largest error usually results from soft tissue artefacts (STA), which do not have such statistical properties and are not effectively cancelled out by such methods. However, with adequate methods it is possible to assume that STA only account for a small fraction of the observed motion and to obtain explicit formulas through differential analysis that relate STA components to the resulting errors in AoR parameters. In this paper such formulas are derived for three different functional calibration techniques (Geometric Fitting, mean Finite Helical Axis, and SARA), to explain why each technique behaves differently from the others, and to propose strategies to compensate for those errors. These techniques were tested with published data from a sit-to-stand activity, where the true axis was defined using bi-planar fluoroscopy. All the methods were able to estimate the direction of the AoR with an error of less than 5°, whereas there were errors in the location of the axis of 30-40 mm. Such location errors could be reduced to less than 17 mm by the methods based on equations that use rigid motion parameters (mean Finite Helical Axis, SARA) when the translation component was calculated using the three markers nearest to the axis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Computing the binding affinity of a ligand buried deep inside a protein with the hybrid steered molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villarreal, Oscar D.; Yu, Lili; Department of Laboratory Medicine, Yancheng Vocational Institute of Health Sciences, Yancheng, Jiangsu 224006

    Computing the ligand-protein binding affinity (or the Gibbs free energy) with chemical accuracy has long been a challenge for which many methods/approaches have been developed and refined with various successful applications. False positives and, even more harmful, false negatives have been and still are a common occurrence in practical applications. Inevitable in all approaches are the errors in the force field parameters we obtain from quantum mechanical computation and/or empirical fittings for the intra- and inter-molecular interactions. These errors propagate to the final results of the computed binding affinities even if we were able to perfectly implement the statistical mechanics of all the processes relevant to a given problem. And they are actually amplified to various degrees even in the mature, sophisticated computational approaches. In particular, the free energy perturbation (alchemical) approaches amplify the errors in the force field parameters because they rely on extracting the small differences between similarly large numbers. In this paper, we develop a hybrid steered molecular dynamics (hSMD) approach to the difficult binding problems of a ligand buried deep inside a protein. Sampling the transition along a physical (not alchemical) dissociation path of opening up the binding cavity, pulling out the ligand, and closing back the cavity, we can avoid the problem of error amplification by not relying on small differences between similar numbers. We tested this new form of hSMD on retinol inside cellular retinol-binding protein 1 and three cases of a ligand (a benzylacetate, a 2-nitrothiophene, and a benzene) inside a T4 lysozyme L99A/M102Q(H) double mutant. In all cases, we obtained binding free energies in close agreement with the experimentally measured values. This indicates that the force field parameters we employed are accurate and that hSMD (a brute force, unsophisticated approach) is free from the problem of error amplification suffered by many sophisticated approaches in the literature.

  15. Investigating the role of background and observation error correlations in improving a model forecast of forest carbon balance using four dimensional variational data assimilation.

    NASA Astrophysics Data System (ADS)

    Pinnington, Ewan; Casella, Eric; Dance, Sarah; Lawless, Amos; Morison, James; Nichols, Nancy; Wilkinson, Matthew; Quaife, Tristan

    2016-04-01

    Forest ecosystems play an important role in sequestering human-emitted carbon dioxide from the atmosphere and therefore greatly reduce the effect of anthropogenically induced climate change. For that reason, understanding their response to climate change is of great importance. Efforts to implement variational data assimilation routines with functional ecology models and land surface models have been limited, with sequential and Markov chain Monte Carlo data assimilation methods being prevalent. When data assimilation has been used with models of carbon balance, background "prior" errors and observation errors have largely been treated as independent and uncorrelated. Correlations between background errors have long been known to be a key aspect of data assimilation in numerical weather prediction. More recently, it has been shown that accounting for correlated observation errors in the assimilation algorithm can considerably improve data assimilation results and forecasts. In this paper we implement a 4D-Var scheme with a simple model of forest carbon balance, for joint parameter and state estimation, and assimilate daily observations of Net Ecosystem CO2 Exchange (NEE) taken at the Alice Holt forest CO2 flux site in Hampshire, UK. We then investigate the effect of specifying correlations between parameter and state variables in background error statistics and the effect of specifying correlations in time between observation error statistics. The idea of including these correlations in time is new and has not been previously explored in carbon balance model data assimilation. In data assimilation, background and observation error statistics are often described by the background error covariance matrix and the observation error covariance matrix. We outline novel methods for creating correlated versions of these matrices, using a set of previously postulated dynamical constraints to include correlations in the background error statistics and a Gaussian correlation function to include time correlations in the observation error statistics. The methods used in this paper will allow the inclusion of time correlations between many different observation types in the assimilation algorithm, meaning that previously neglected information can be accounted for. In our experiments we compared the results using our new correlated background and observation error covariance matrices and those using diagonal covariance matrices. We found that using the new correlated matrices reduced the root mean square error in the 14 year forecast of daily NEE by 44%, decreasing from 4.22 g C m-2 day-1 to 2.38 g C m-2 day-1.
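
    A minimal sketch of the kind of time-correlated observation error covariance matrix described above, assuming a Gaussian correlation function of the separation in days between daily NEE observations; the error standard deviation and correlation length used here are placeholders, not the paper's values.

      import numpy as np

      def gaussian_time_correlated_R(n_obs, sigma_o, length_scale_days):
          """Observation-error covariance with Gaussian correlations in time.

          R_ij = sigma_o^2 * exp(-0.5 * (dt_ij / L)^2), with dt_ij the separation
          in days between daily observations i and j.
          """
          t = np.arange(n_obs, dtype=float)            # daily observation times
          dt = t[:, None] - t[None, :]
          return sigma_o**2 * np.exp(-0.5 * (dt / length_scale_days)**2)

      R_uncorr = gaussian_time_correlated_R(365, sigma_o=0.5, length_scale_days=1e-6)
      R_corr = gaussian_time_correlated_R(365, sigma_o=0.5, length_scale_days=4.0)
      print("near-diagonal R, lag-1 covariance:", R_uncorr[0, 1])
      print("correlated R, lag-1 covariance:  ", R_corr[0, 1])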

  16. Modeling the subfilter scalar variance for large eddy simulation in forced isotropic turbulence

    NASA Astrophysics Data System (ADS)

    Cheminet, Adam; Blanquart, Guillaume

    2011-11-01

    Static and dynamic models for the subfilter scalar variance in homogeneous isotropic turbulence are investigated using direct numerical simulations (DNS) of a linearly forced passive scalar field. First, we introduce a new scalar forcing technique conditioned only on the scalar field which allows the fluctuating scalar field to reach a statistically stationary state. Statistical properties, including 2nd and 3rd statistical moments, spectra, and probability density functions of the scalar field have been analyzed. Using this technique, we performed constant density and variable density DNS of scalar mixing in isotropic turbulence. The results are used in an a-priori study of scalar variance models. Emphasis is placed on further studying the dynamic model introduced by G. Balarac, H. Pitsch and V. Raman [Phys. Fluids 20, (2008)]. Scalar variance models based on Bedford and Yeo's expansion are accurate for small filter width but errors arise in the inertial subrange. Results suggest that a constant coefficient computed from an assumed Kolmogorov spectrum is often sufficient to predict the subfilter scalar variance.

  17. Statistical rice yield modeling using blended MODIS-Landsat based crop phenology metrics in Taiwan

    NASA Astrophysics Data System (ADS)

    Chen, C. R.; Chen, C. F.; Nguyen, S. T.; Lau, K. V.

    2015-12-01

    Taiwan is a populated island with a majority of residents settled in the western plains where soils are suitable for rice cultivation. Rice is not only the most important commodity, but also plays a critical role for agricultural and food marketing. Information on rice production is thus important for policymakers to devise timely plans for ensuring sustainable socioeconomic development. Because rice fields in Taiwan are generally small and yet crop monitoring requires information on crop phenology associated with the spatiotemporal resolution of satellite data, this study used Landsat-MODIS fusion data for rice yield modeling in Taiwan. We processed the data for the first crop (Feb-Mar to Jun-Jul) and the second (Aug-Sep to Nov-Dec) in 2014 through five main steps: (1) data pre-processing to account for geometric and radiometric errors of Landsat data, (2) Landsat-MODIS data fusion using the spatial-temporal adaptive reflectance fusion model, (3) construction of the smooth time-series enhanced vegetation index 2 (EVI2), (4) rice yield modeling using EVI2-based crop phenology metrics, and (5) error verification. The fusion results, assessed by a comparison between EVI2 derived from the fusion image and that from the reference Landsat image, indicated close agreement between the two datasets (R2 > 0.8). We analysed smooth EVI2 curves to extract phenology metrics or phenological variables for establishment of rice yield models. The results indicated that the established yield models significantly explained more than 70% of the variability in the data (p-value < 0.001). The comparison results between the estimated yields and the government's yield statistics for the first and second crops indicated a close significant relationship between the two datasets (R2 > 0.8), in both cases. The root mean square error (RMSE) and mean absolute error (MAE) used to measure the model accuracy revealed the consistency between the estimated yields and the government's yield statistics. This study demonstrates advantages of using EVI2-based phenology metrics (derived from Landsat-MODIS fusion data) for rice yield estimation in Taiwan prior to the harvest period.
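
    The error-verification step (5) reduces to computing RMSE and MAE between estimated yields and the government's yield statistics. A minimal sketch, with purely illustrative yield values:

      import numpy as np

      def rmse(est, obs):
          est, obs = np.asarray(est, float), np.asarray(obs, float)
          return float(np.sqrt(np.mean((est - obs) ** 2)))

      def mae(est, obs):
          est, obs = np.asarray(est, float), np.asarray(obs, float)
          return float(np.mean(np.abs(est - obs)))

      # Illustrative yields (t/ha): model estimates vs. government statistics.
      estimated = [6.1, 5.8, 6.4, 5.9, 6.2]
      reported  = [6.0, 5.7, 6.6, 6.1, 6.0]
      print("RMSE:", round(rmse(estimated, reported), 3), "t/ha")
      print("MAE :", round(mae(estimated, reported), 3), "t/ha")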

  18. Measurement-device-independent quantum key distribution with source state errors and statistical fluctuation

    NASA Astrophysics Data System (ADS)

    Jiang, Cong; Yu, Zong-Wen; Wang, Xiang-Bin

    2017-03-01

    We show how to calculate the secure final key rate in the four-intensity decoy-state measurement-device-independent quantum key distribution protocol with both source errors and statistical fluctuations with a certain failure probability. Our results rely only on the range of a few parameters in the source state. All imperfections in this protocol have been taken into consideration without assuming any specific error patterns of the source.

  19. Error Analysis for RADAR Neighbor Matching Localization in Linear Logarithmic Strength Varying Wi-Fi Environment

    PubMed Central

    Tian, Zengshan; Xu, Kunjie; Yu, Xiang

    2014-01-01

    This paper studies the statistical errors for the fingerprint-based RADAR neighbor matching localization with the linearly calibrated reference points (RPs) in logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve the efficient and reliable location-based services (LBSs) as well as the ubiquitous context-awareness in Wi-Fi environment, much attention has to be paid to the highly accurate and cost-efficient localization systems. To this end, the statistical errors by the widely used neighbor matching localization are significantly discussed in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs by using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors by RADAR neighbor matching localization can be an effective tool to explore alternative deployment of fingerprint-based neighbor matching localization systems in the future. PMID:24683349

  20. Error analysis for RADAR neighbor matching localization in linear logarithmic strength varying Wi-Fi environment.

    PubMed

    Zhou, Mu; Tian, Zengshan; Xu, Kunjie; Yu, Xiang; Wu, Haibo

    2014-01-01

    This paper studies the statistical errors for the fingerprint-based RADAR neighbor matching localization with the linearly calibrated reference points (RPs) in logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis work has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, in order to achieve the efficient and reliable location-based services (LBSs) as well as the ubiquitous context-awareness in Wi-Fi environment, much attention has to be paid to the highly accurate and cost-efficient localization systems. To this end, the statistical errors by the widely used neighbor matching localization are significantly discussed in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs by using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors by RADAR neighbor matching localization can be an effective tool to explore alternative deployment of fingerprint-based neighbor matching localization systems in the future.

  1. Local indicators of geocoding accuracy (LIGA): theory and application

    PubMed Central

    Jacquez, Geoffrey M; Rommel, Robert

    2009-01-01

    Background: Although sources of positional error in geographic locations (e.g. geocoding error) used for describing and modeling spatial patterns are widely acknowledged, research on how such error impacts the statistical results has been limited. In this paper we explore techniques for quantifying the perturbability of spatial weights to different specifications of positional error. Results: We find that a family of curves describes the relationship between perturbability and positional error, and use these curves to evaluate sensitivity of alternative spatial weight specifications to positional error both globally (when all locations are considered simultaneously) and locally (to identify those locations that would benefit most from increased geocoding accuracy). We evaluate the approach in simulation studies, and demonstrate it using a case-control study of bladder cancer in south-eastern Michigan. Conclusion: Three results are significant. First, the shape of the probability distributions of positional error (e.g. circular, elliptical, cross) has little impact on the perturbability of spatial weights, which instead depends on the mean positional error. Second, our methodology allows researchers to evaluate the sensitivity of spatial statistics to positional accuracy for specific geographies. This has substantial practical implications since it makes possible routine sensitivity analysis of spatial statistics to positional error arising in geocoded street addresses, global positioning systems, LIDAR and other geographic data. Third, those locations with high perturbability (most sensitive to positional error) and high leverage (that contribute the most to the spatial weight being considered) will benefit the most from increased positional accuracy. These are rapidly identified using a new visualization tool we call the LIGA scatterplot. Herein lies a paradox for spatial analysis: for a given level of positional error, increasing sample density to more accurately follow the underlying population distribution increases perturbability and introduces error into the spatial weights matrix. In some studies positional error may not impact the statistical results, and in others it might invalidate the results. We therefore must understand the relationships between positional accuracy and the perturbability of the spatial weights in order to have confidence in a study's results. PMID:19863795

  2. Scout trajectory error propagation computer program

    NASA Technical Reports Server (NTRS)

    Myler, T. R.

    1982-01-01

    Since 1969, flight experience has been used as the basis for predicting Scout orbital accuracy. The data used for calculating the accuracy consists of errors in the trajectory parameters (altitude, velocity, etc.) at stage burnout as observed on Scout flights. Approximately 50 sets of errors are used in Monte Carlo analysis to generate error statistics in the trajectory parameters. A covariance matrix is formed which may be propagated in time. The mechanization of this process resulted in computer program Scout Trajectory Error Propagation (STEP) and is described herein. Computer program STEP may be used in conjunction with the Statistical Orbital Analysis Routine to generate accuracy in the orbit parameters (apogee, perigee, inclination, etc.) based upon flight experience.
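
    The core Monte Carlo step described above, forming a covariance matrix of trajectory-parameter errors from a set of observed error samples, can be sketched as follows; the error sets and units here are invented for illustration and are not Scout flight data.

      import numpy as np

      # Illustrative stand-in for ~50 sets of burnout errors in
      # (altitude [m], velocity [m/s], flight-path angle [deg]).
      rng = np.random.default_rng(42)
      errors = rng.multivariate_normal(
          mean=[0.0, 0.0, 0.0],
          cov=[[900.0, 10.0, 0.1],
               [10.0,  4.0,  0.02],
               [0.1,   0.02, 0.01]],
          size=50,
      )

      # Sample covariance of the trajectory-parameter errors, as would be formed
      # before propagating it forward in time.
      P0 = np.cov(errors, rowvar=False)
      print(np.round(P0, 3))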

  3. How to Create Automatically Graded Spreadsheets for Statistics Courses

    ERIC Educational Resources Information Center

    LoSchiavo, Frank M.

    2016-01-01

    Instructors often use spreadsheet software (e.g., Microsoft Excel) in their statistics courses so that students can gain experience conducting computerized analyses. Unfortunately, students tend to make several predictable errors when programming spreadsheets. Without immediate feedback, programming errors are likely to go undetected, and as a…

  4. Phase error statistics of a phase-locked loop synchronized direct detection optical PPM communication system

    NASA Technical Reports Server (NTRS)

    Natarajan, Suresh; Gardner, C. S.

    1987-01-01

    Receiver timing synchronization of an optical Pulse-Position Modulation (PPM) communication system can be achieved using a phase-locked loop (PLL), provided the photodetector output is suitably processed. The magnitude of the PLL phase error is a good indicator of the timing error at the receiver decoder. The statistics of the phase error are investigated while varying several key system parameters such as PPM order, signal and background strengths, and PLL bandwidth. A practical optical communication system utilizing a laser diode transmitter and an avalanche photodiode in the receiver is described, and the sampled phase error data are presented. A linear regression analysis is applied to the data to obtain estimates of the relational constants involving the phase error variance and incident signal power.

  5. Meta-analysis inside and outside particle physics: two traditions that should converge?

    PubMed

    Baker, Rose D; Jackson, Dan

    2013-06-01

    The use of meta-analysis in medicine and epidemiology really took off in the 1970s. However, in high-energy physics, the Particle Data Group has been carrying out meta-analyses of measurements of particle masses and other properties since 1957. Curiously, there has been virtually no interaction between those working inside and outside particle physics. In this paper, we use statistical models to study two major differences in practice. The first is the usefulness of systematic errors, which physicists are now beginning to quote in addition to statistical errors. The second is whether it is better to treat heterogeneity by scaling up errors as do the Particle Data Group or by adding a random effect as does the rest of the community. Besides fitting models, we derive and use an exact test of the error-scaling hypothesis. We also discuss the other methodological differences between the two streams of meta-analysis. Our conclusion is that systematic errors are not currently very useful and that the conventional random effects model, as routinely used in meta-analysis, has a useful role to play in particle physics. The moral we draw for statisticians is that we should be more willing to explore 'grassroots' areas of statistical application, so that good statistical practice can flow both from and back to the statistical mainstream. Copyright © 2012 John Wiley & Sons, Ltd.
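
    As a hedged numerical illustration of the two conventions being compared, the sketch below applies the Particle Data Group style error scaling and a DerSimonian-Laird random-effects estimate to the same made-up set of measurements; it is not the exact-test machinery developed in the paper.

      import numpy as np

      def weighted_mean(y, se):
          w = 1.0 / se**2
          mean = np.sum(w * y) / np.sum(w)
          return mean, np.sqrt(1.0 / np.sum(w)), w

      y  = np.array([10.2, 10.9, 9.7, 11.4])   # illustrative measurements
      se = np.array([0.3, 0.4, 0.5, 0.4])      # their quoted (statistical) errors

      mean, se_mean, w = weighted_mean(y, se)
      chi2 = np.sum(w * (y - mean) ** 2)
      k = len(y)

      # PDG-style: scale the error of the mean by S = sqrt(chi2 / (k - 1)) if S > 1.
      S = max(1.0, np.sqrt(chi2 / (k - 1)))
      print("PDG-scaled error of the mean:", round(se_mean * S, 3))

      # Random-effects: DerSimonian-Laird estimate of between-study variance tau^2.
      tau2 = max(0.0, (chi2 - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
      w_re = 1.0 / (se**2 + tau2)
      print("random-effects error of the mean:", round(np.sqrt(1.0 / np.sum(w_re)), 3))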

  6. Assessing colour-dependent occupation statistics inferred from galaxy group catalogues

    NASA Astrophysics Data System (ADS)

    Campbell, Duncan; van den Bosch, Frank C.; Hearin, Andrew; Padmanabhan, Nikhil; Berlind, Andreas; Mo, H. J.; Tinker, Jeremy; Yang, Xiaohu

    2015-09-01

    We investigate the ability of current implementations of galaxy group finders to recover colour-dependent halo occupation statistics. To test the fidelity of group catalogue inferred statistics, we run three different group finders used in the literature over a mock that includes galaxy colours in a realistic manner. Overall, the resulting mock group catalogues are remarkably similar, and most colour-dependent statistics are recovered with reasonable accuracy. However, it is also clear that certain systematic errors arise as a consequence of correlated errors in group membership determination, central/satellite designation, and halo mass assignment. We introduce a new statistic, the halo transition probability (HTP), which captures the combined impact of all these errors. As a rule of thumb, errors tend to equalize the properties of distinct galaxy populations (i.e. red versus blue galaxies or centrals versus satellites), and to result in inferred occupation statistics that are more accurate for red galaxies than for blue galaxies. A statistic that is particularly poorly recovered from the group catalogues is the red fraction of central galaxies as a function of halo mass. Group finders do a good job in recovering galactic conformity, but also have a tendency to introduce weak conformity when none is present. We conclude that proper inference of colour-dependent statistics from group catalogues is best achieved using forward modelling (i.e. running group finders over mock data) or by implementing a correction scheme based on the HTP, as long as the latter is not too strongly model dependent.

  7. Real-Time Identification of Wheel Terrain Interaction Models for Enhanced Autonomous Vehicle Mobility

    DTIC Science & Technology

    2014-04-24

    [Only figure residue survives in this record: axis labels for position-estimation-error charts (in cm) comparing color-statistics-based and slip-error-based estimates, including global pose, and a note that data were collected at the Taylor, Gascola, Somerset, and Fort Bliss test sites.]

  8. Linear and Order Statistics Combiners for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

    2001-01-01

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that to a first order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
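
    The factor-of-N variance reduction for uncorrelated, unbiased classifiers can be checked with a quick simulation; the sketch below illustrates that single claim and is not the chapter's full analytical framework.

      import numpy as np

      # Each classifier's decision boundary deviates from the Bayes boundary by an
      # unbiased, uncorrelated offset; averaging N such classifiers should reduce
      # the variance of the combined boundary offset by roughly a factor of N.
      rng = np.random.default_rng(7)
      N, trials = 10, 100_000
      single = rng.normal(0.0, 1.0, size=trials)                  # one classifier
      combined = rng.normal(0.0, 1.0, size=(trials, N)).mean(axis=1)

      print("variance of single-classifier boundary offset:", round(single.var(), 3))
      print("variance of N-averaged boundary offset       :", round(combined.var(), 3))
      print("reduction factor (should be close to N = 10) :",
            round(single.var() / combined.var(), 2))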

  9. The GEOS Ozone Data Assimilation System: Specification of Error Statistics

    NASA Technical Reports Server (NTRS)

    Stajner, Ivanka; Riishojgaard, Lars Peter; Rood, Richard B.

    2000-01-01

    A global three-dimensional ozone data assimilation system has been developed at the Data Assimilation Office of the NASA/Goddard Space Flight Center. The Total Ozone Mapping Spectrometer (TOMS) total ozone and the Solar Backscatter Ultraviolet (SBUV) or (SBUV/2) partial ozone profile observations are assimilated. The assimilation, into an off-line ozone transport model, is done using the global Physical-space Statistical Analysis Scheme (PSAS). This system became operational in December 1999. A detailed description of the statistical analysis scheme, and in particular, the forecast and observation error covariance models is given. A new global anisotropic horizontal forecast error correlation model accounts for a varying distribution of observations with latitude. Correlations are largest in the zonal direction in the tropics where data is sparse. The forecast error variance model is proportional to the ozone field. The forecast error covariance parameters were determined by maximum likelihood estimation. The error covariance models are validated using chi-squared statistics. The analyzed ozone fields in the winter of 1992 are validated against independent observations from ozone sondes and HALOE. There is better than 10% agreement between mean Halogen Occultation Experiment (HALOE) and analysis fields between 70 and 0.2 hPa. The global root-mean-square (RMS) difference between TOMS observed and forecast values is less than 4%. The global RMS difference between SBUV observed and analyzed ozone between 50 and 3 hPa is less than 15%.

  10. On-line estimation of error covariance parameters for atmospheric data assimilation

    NASA Technical Reports Server (NTRS)

    Dee, Dick P.

    1995-01-01

    A simple scheme is presented for on-line estimation of covariance parameters in statistical data assimilation systems. The scheme is based on a maximum-likelihood approach in which estimates are produced on the basis of a single batch of simultaneous observations. Single-sample covariance estimation is reasonable as long as the number of available observations exceeds the number of tunable parameters by two or three orders of magnitude. Not much is known at present about model error associated with actual forecast systems. Our scheme can be used to estimate some important statistical model error parameters such as regionally averaged variances or characteristic correlation length scales. The advantage of the single-sample approach is that it does not rely on any assumptions about the temporal behavior of the covariance parameters: time-dependent parameter estimates can be continuously adjusted on the basis of current observations. This is of practical importance since it is likely to be the case that both model error and observation error strongly depend on the actual state of the atmosphere. The single-sample estimation scheme can be incorporated into any four-dimensional statistical data assimilation system that involves explicit calculation of forecast error covariances, including optimal interpolation (OI) and the simplified Kalman filter (SKF). The computational cost of the scheme is high but not prohibitive; on-line estimation of one or two covariance parameters in each analysis box of an operational boxed-OI system is currently feasible. A number of numerical experiments performed with an adaptive SKF and an adaptive version of OI, using a linear two-dimensional shallow-water model and artificially generated model error, are described. The performance of the nonadaptive versions of these methods turns out to depend rather strongly on correct specification of model error parameters. These parameters are estimated under a variety of conditions, including uniformly distributed model error and time-dependent model error statistics.

  11. Systematic review of statistical approaches to quantify, or correct for, measurement error in a continuous exposure in nutritional epidemiology.

    PubMed

    Bennett, Derrick A; Landry, Denise; Little, Julian; Minelli, Cosetta

    2017-09-19

    Several statistical approaches have been proposed to assess and correct for exposure measurement error. We aimed to provide a critical overview of the most common approaches used in nutritional epidemiology. MEDLINE, EMBASE, BIOSIS and CINAHL were searched for reports published in English up to May 2016 in order to ascertain studies that described methods aimed to quantify and/or correct for measurement error for a continuous exposure in nutritional epidemiology using a calibration study. We identified 126 studies, 43 of which described statistical methods and 83 that applied any of these methods to a real dataset. The statistical approaches in the eligible studies were grouped into: a) approaches to quantify the relationship between different dietary assessment instruments and "true intake", which were mostly based on correlation analysis and the method of triads; b) approaches to adjust point and interval estimates of diet-disease associations for measurement error, mostly based on regression calibration analysis and its extensions. Two approaches (multiple imputation and moment reconstruction) were identified that can deal with differential measurement error. For regression calibration, the most common approach to correct for measurement error used in nutritional epidemiology, it is crucial to ensure that its assumptions and requirements are fully met. Analyses that investigate the impact of departures from the classical measurement error model on regression calibration estimates can be helpful to researchers in interpreting their findings. With regard to the possible use of alternative methods when regression calibration is not appropriate, the choice of method should depend on the measurement error model assumed, the availability of suitable calibration study data and the potential for bias due to violation of the classical measurement error model assumptions. On the basis of this review, we provide some practical advice for the use of methods to assess and adjust for measurement error in nutritional epidemiology.

  12. Covariance Analysis Tool (G-CAT) for Computing Ascent, Descent, and Landing Errors

    NASA Technical Reports Server (NTRS)

    Boussalis, Dhemetrios; Bayard, David S.

    2013-01-01

    G-CAT is a covariance analysis tool that enables fast and accurate computation of error ellipses for descent, landing, ascent, and rendezvous scenarios, and quantifies knowledge error contributions needed for error budgeting purposes. Because G-CAT supports hardware/system trade studies in spacecraft and mission design, it is useful in both early and late mission/proposal phases where Monte Carlo simulation capability is not mature, Monte Carlo simulation takes too long to run, and/or there is a need to perform multiple parametric system design trades that would require an unwieldy number of Monte Carlo runs. G-CAT is formulated as a variable-order square-root linearized Kalman filter (LKF), typically using over 120 filter states. An important property of G-CAT is that it is based on a 6-DOF (degrees of freedom) formulation that completely captures the combined effects of both attitude and translation errors on the propagated trajectories. This ensures its accuracy for guidance, navigation, and control (GN&C) analysis. G-CAT provides the desired fast turnaround analysis needed for error budgeting in support of mission concept formulations, design trade studies, and proposal development efforts. The main usefulness of a covariance analysis tool such as G-CAT is its ability to calculate the performance envelope directly from a single run. This is in sharp contrast to running thousands of simulations to obtain similar information using Monte Carlo methods. It does this by propagating the "statistics" of the overall design, rather than simulating individual trajectories. G-CAT supports applications to lunar, planetary, and small body missions. It characterizes onboard knowledge propagation errors associated with inertial measurement unit (IMU) errors (gyro and accelerometer), gravity errors/dispersions (spherical harmonics, masscons), and radar errors (multiple altimeter beams, multiple Doppler velocimeter beams). G-CAT is a standalone MATLAB-based tool intended to run on any engineer's desktop computer.
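
    G-CAT itself is a MATLAB tool with over 120 filter states; the sketch below only illustrates, in a hypothetical two-state example, what one step of linearized covariance propagation looks like (propagating statistics rather than individual trajectories).

      import numpy as np

      def propagate_covariance(P, F, Q):
          """One discrete-time step of linearized covariance propagation:
          P_{k+1} = F P F^T + Q, where Q adds un-modeled error growth."""
          return F @ P @ F.T + Q

      # Tiny 1-D position/velocity example (not the actual G-CAT filter).
      dt = 1.0
      F = np.array([[1.0, dt],
                    [0.0, 1.0]])               # constant-velocity dynamics
      Q = np.diag([1e-4, 1e-3])                # process noise
      P = np.diag([25.0, 0.04])                # initial knowledge covariance

      for _ in range(60):                      # propagate one minute ahead
          P = propagate_covariance(P, F, Q)

      print("1-sigma position error after 60 s:", round(np.sqrt(P[0, 0]), 2))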

  13. How allele frequency and study design affect association test statistics with misrepresentation errors.

    PubMed

    Escott-Price, Valentina; Ghodsi, Mansoureh; Schmidt, Karl Michael

    2014-04-01

    We evaluate the effect of genotyping errors on the type-I error of a general association test based on genotypes, showing that, in the presence of errors in the case and control samples, the test statistic asymptotically follows a scaled non-central $\chi^2$ distribution. We give explicit formulae for the scaling factor and non-centrality parameter for the symmetric allele-based genotyping error model and for additive and recessive disease models. They show how genotyping errors can lead to a significantly higher false-positive rate, growing with sample size, compared with the nominal significance levels. The strength of this effect depends very strongly on the population distribution of the genotype, with a pronounced effect in the case of rare alleles, and a great robustness against error in the case of large minor allele frequency. We also show how these results can be used to correct $p$-values.

  14. WASP (Write a Scientific Paper) using Excel - 6: Standard error and confidence interval.

    PubMed

    Grech, Victor

    2018-03-01

    The calculation of descriptive statistics includes the calculation of standard error and confidence interval, an inevitable component of data analysis in inferential statistics. This paper provides pointers as to how to do this in Microsoft Excel™. Copyright © 2018 Elsevier B.V. All rights reserved.
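
    A hedged Python equivalent of the computation the paper walks through in Excel, assuming a small normally distributed sample so that the t distribution is appropriate for the 95% confidence interval:

      import numpy as np
      from scipy import stats

      data = np.array([4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.2, 4.4])   # illustrative values

      n = data.size
      mean = data.mean()
      sd = data.std(ddof=1)                  # sample standard deviation
      se = sd / np.sqrt(n)                   # standard error of the mean

      # 95% confidence interval using the t distribution (small sample).
      t_crit = stats.t.ppf(0.975, df=n - 1)
      ci = (mean - t_crit * se, mean + t_crit * se)
      print(f"mean = {mean:.3f}, SE = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")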

  15. Statistical Analysis Experiment for Freshman Chemistry Lab.

    ERIC Educational Resources Information Center

    Salzsieder, John C.

    1995-01-01

    Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…

  16. Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.

    PubMed

    Monica, Stefania; Ferrari, Gianluigi

    2018-05-17

    Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.
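
    As a rough sketch of the idea (not the authors' measurement campaign or model coefficients), the snippet below fits a linear model of the range-estimation error by least squares on made-up calibration data and uses it to correct raw UWB distance estimates.

      import numpy as np

      # Illustrative calibration data: true distances vs. UWB-estimated distances (m).
      true_d = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
      meas_d = np.array([1.12, 2.18, 3.25, 4.31, 5.36, 6.45, 7.49, 8.58])

      # Fit a linear error model  error = a * d_meas + b  by least squares.
      err = meas_d - true_d
      A = np.column_stack([meas_d, np.ones_like(meas_d)])
      (a, b), *_ = np.linalg.lstsq(A, err, rcond=None)

      def corrected(d_meas):
          """Subtract the modeled bias from a raw UWB range estimate."""
          return d_meas - (a * d_meas + b)

      print("fitted error model: err =", round(a, 4), "* d +", round(b, 4))
      print("raw 5.36 m reading corrected to", round(corrected(5.36), 3), "m")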

  17. Testing for independence in J×K contingency tables with complex sample survey data.

    PubMed

    Lipsitz, Stuart R; Fitzmaurice, Garrett M; Sinha, Debajyoti; Hevelone, Nathanael; Giovannucci, Edward; Hu, Jim C

    2015-09-01

    The test of independence of row and column variables in a (J×K) contingency table is a widely used statistical test in many areas of application. For complex survey samples, use of the standard Pearson chi-squared test is inappropriate due to correlation among units within the same cluster. Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) proposed an approach in which the standard Pearson chi-squared statistic is multiplied by a design effect to adjust for the complex survey design. Unfortunately, this test fails to exist when one of the observed cell counts equals zero. Even with the large samples typical of many complex surveys, zero cell counts can occur for rare events, small domains, or contingency tables with a large number of cells. Here, we propose Wald and score test statistics for independence based on weighted least squares estimating equations. In contrast to the Rao-Scott test statistic, the proposed Wald and score test statistics always exist. In simulations, the score test is found to perform best with respect to type I error. The proposed method is motivated by, and applied to, post surgical complications data from the United States' Nationwide Inpatient Sample (NIS) complex survey of hospitals in 2008. © 2015, The International Biometric Society.

  18. Quantifying and Generalizing Hydrologic Responses to Dam Regulation using a Statistical Modeling Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McManamay, Ryan A

    2014-01-01

    Despite the ubiquitous existence of dams within riverscapes, much of our knowledge about dams and their environmental effects remains context-specific. Hydrology, more than any other environmental variable, has been studied in great detail with regard to dam regulation. While much progress has been made in generalizing the hydrologic effects of regulation by large dams, many aspects of hydrology show site-specific fidelity to dam operations, small dams (including diversions), and regional hydrologic regimes. A statistical modeling framework is presented to quantify and generalize hydrologic responses to varying degrees of dam regulation. Specifically, the objectives were to 1) compare the effects of local versus cumulative dam regulation, 2) determine the importance of different regional hydrologic regimes in influencing hydrologic responses to dams, and 3) evaluate how different regulation contexts lead to error in predicting hydrologic responses to dams. Overall, model performance was poor in quantifying the magnitude of hydrologic responses, but performance was sufficient in classifying hydrologic responses as negative or positive. Responses of some hydrologic indices to dam regulation were highly dependent upon hydrologic class membership and the purpose of the dam. The opposing coefficients between local and cumulative-dam predictors suggested that hydrologic responses to cumulative dam regulation are complex, and predicting the hydrology downstream of individual dams, as opposed to multiple dams, may be more easily accomplished using statistical approaches. Results also suggested that particular contexts, including multipurpose dams, high cumulative regulation by multiple dams, diversions, close proximity to dams, and certain hydrologic classes are all sources of increased error when predicting hydrologic responses to dams. Statistical models, such as the ones presented herein, show promise in their ability to model the effects of dam regulation at large spatial scales so as to generalize the directionality of hydrologic responses.

  19. Testing physical models for dipolar asymmetry with CMB polarization

    NASA Astrophysics Data System (ADS)

    Contreras, D.; Zibin, J. P.; Scott, D.; Banday, A. J.; Górski, K. M.

    2017-12-01

    The cosmic microwave background (CMB) temperature anisotropies exhibit a large-scale dipolar power asymmetry. To determine whether this is due to a real, physical modulation or is simply a large statistical fluctuation requires the measurement of new modes. Here we forecast how well CMB polarization data from Planck and future experiments will be able to confirm or constrain physical models for modulation. Fitting several such models to the Planck temperature data allows us to provide predictions for polarization asymmetry. While for some models and parameters Planck polarization will decrease error bars on the modulation amplitude by only a small percentage, we show, importantly, that cosmic-variance-limited (and in some cases even Planck) polarization data can decrease the errors by considerably better than the expectation of √2 based on simple ℓ-space arguments. We project that if the primordial fluctuations are truly modulated (with parameters as indicated by Planck temperature data) then Planck will be able to make a 2σ detection of the modulation model with 20%-75% probability, increasing to 45%-99% when cosmic-variance-limited polarization is considered. We stress that these results are quite model dependent. Cosmic variance in temperature is important: combining statistically isotropic polarization with temperature data will spuriously increase the significance of the temperature signal with 30% probability for Planck.

  20. A constrained regularization method for inverting data represented by linear algebraic or integral equations

    NASA Astrophysics Data System (ADS)

    Provencher, Stephen W.

    1982-09-01

    CONTIN is a portable Fortran IV package for inverting noisy linear operator equations. These problems occur in the analysis of data from a wide variety of experiments. They are generally ill-posed problems, which means that errors in an unregularized inversion are unbounded. Instead, CONTIN seeks the optimal solution by incorporating parsimony and any statistical prior knowledge into the regularizor and absolute prior knowledge into equality and inequality constraints. This can greatly increase the resolution and accuracy of the solution. CONTIN is very flexible, consisting of a core of about 50 subprograms plus 13 small "USER" subprograms, which the user can easily modify to specify special-purpose constraints, regularizors, operator equations, simulations, statistical weighting, etc. Special collections of USER subprograms are available for photon correlation spectroscopy, multicomponent spectra, and Fourier-Bessel, Fourier and Laplace transforms. Numerically stable algorithms are used throughout CONTIN. A fairly precise definition of information content in terms of degrees of freedom is given. The regularization parameter can be automatically chosen on the basis of an F-test and confidence region. The interpretation of the latter and of error estimates based on the covariance matrix of the constrained regularized solution are discussed. The strategies, methods and options in CONTIN are outlined. The program itself is described in the following paper.
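
    CONTIN itself is a Fortran IV package; the sketch below only illustrates the class of problem it addresses, a Tikhonov-regularized inversion of a noisy linear operator equation with a non-negativity constraint, using SciPy's generic NNLS solver on a synthetic Laplace-like kernel rather than CONTIN's own algorithms.

      import numpy as np
      from scipy.optimize import nnls

      # Noisy problem with an exponential kernel: y = K s + noise, recover s >= 0.
      t = np.linspace(0.05, 3.0, 60)                  # "data" grid
      g = np.linspace(0.1, 10.0, 40)                  # grid of decay rates
      K = np.exp(-np.outer(t, g))                     # kernel matrix

      s_true = np.exp(-0.5 * ((g - 3.0) / 0.5) ** 2)  # a single broad peak
      rng = np.random.default_rng(3)
      y = K @ s_true + rng.normal(0, 0.01, size=t.size)

      # Tikhonov regularization with non-negativity: solve
      #   min || [K; sqrt(alpha) I] s - [y; 0] ||^2   subject to s >= 0.
      alpha = 0.01
      K_aug = np.vstack([K, np.sqrt(alpha) * np.eye(g.size)])
      y_aug = np.concatenate([y, np.zeros(g.size)])
      s_reg, _ = nnls(K_aug, y_aug)

      print("peak of true solution near g =", round(g[np.argmax(s_true)], 2))
      print("peak of regularized solution near g =", round(g[np.argmax(s_reg)], 2))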

  1. Verification of forecast ensembles in complex terrain including observation uncertainty

    NASA Astrophysics Data System (ADS)

    Dorninger, Manfred; Kloiber, Simon

    2017-04-01

    Traditionally, verification means to verify a forecast (ensemble) against the truth represented by observations. The observation errors are quite often neglected, with the argument that they are small compared to the forecast error. In this study, carried out as part of the MesoVICT (Mesoscale Verification Inter-comparison over Complex Terrain) project, it will be shown that observation errors have to be taken into account for verification purposes. The observation uncertainty is estimated from the VERA (Vienna Enhanced Resolution Analysis) and represented via two analysis ensembles which are compared to the forecast ensemble. For the whole study, results from COSMO-LEPS provided by Arpae-SIMC Emilia-Romagna are used as the forecast ensemble. The time period covers the MesoVICT core case from 20-22 June 2007. In a first step, all ensembles are investigated concerning their distribution. Several tests have been executed (Kolmogorov-Smirnov test, Finkelstein-Schafer test, chi-square test, etc.), none of which identified an exact mathematical distribution. So the main focus is on non-parametric statistics (e.g. kernel density estimation, boxplots, etc.) and also on the deviation between "forced" normally distributed data and the kernel density estimates. In a next step, the observational deviations due to the analysis ensembles are analysed. In a first approach, scores are calculated multiple times, with each member of the analysis ensemble in turn regarded as the "true" observation. The results are presented as boxplots for the different scores and parameters. Additionally, the bootstrapping method is also applied to the ensembles. These possible approaches to incorporating observational uncertainty into the computation of statistics will be discussed in the talk.

  2. Semiclassical Dynamics with Exponentially Small Error Estimates

    NASA Astrophysics Data System (ADS)

    Hagedorn, George A.; Joye, Alain

    We construct approximate solutions to the time-dependent Schrödinger equation for small values of ħ. If V satisfies appropriate analyticity and growth hypotheses and |t| ≤ T, these solutions agree with exact solutions up to errors whose norms are bounded by C exp(-γ/ħ) for some C and γ>0. Under more restrictive hypotheses, we prove that for sufficiently small T', |t| ≤ T'|log(ħ)| implies the norms of the errors are bounded by C' exp(-γ'/ħ^σ) for some C', γ'>0, and σ > 0.

  3. Counteracting structural errors in ensemble forecast of influenza outbreaks.

    PubMed

    Pei, Sen; Shaman, Jeffrey

    2017-10-13

    For influenza forecasts generated using dynamical models, forecast inaccuracy is partly attributable to the nonlinear growth of error. As a consequence, quantification of the nonlinear error structure in current forecast models is needed so that this growth can be corrected and forecast skill improved. Here, we inspect the error growth of a compartmental influenza model and find that a robust error structure arises naturally from the nonlinear model dynamics. By counteracting these structural errors, diagnosed using error breeding, we develop a new forecast approach that combines dynamical error correction and statistical filtering techniques. In retrospective forecasts of historical influenza outbreaks for 95 US cities from 2003 to 2014, overall forecast accuracy for outbreak peak timing, peak intensity and attack rate is substantially improved for predicted lead times up to 10 weeks. This error growth correction method can be generalized to improve the forecast accuracy of other infectious disease dynamical models. Inaccuracy of influenza forecasts based on dynamical models is partly due to nonlinear error growth. Here the authors address the error structure of a compartmental influenza model, and develop a new improved forecast approach combining dynamical error correction and statistical filtering techniques.

  4. A simulation of GPS and differential GPS sensors

    NASA Technical Reports Server (NTRS)

    Rankin, James M.

    1993-01-01

    The Global Positioning System (GPS) is a revolutionary advance in navigation. Users can determine latitude, longitude, and altitude by receiving range information from at least four satellites. The statistical accuracy of the user's position is directly proportional to the statistical accuracy of the range measurement. Range errors are caused by clock errors, ephemeris errors, atmospheric delays, multipath errors, and receiver noise. Selective Availability, which the military uses to intentionally degrade accuracy for non-authorized users, is a major error source. The proportionality constant relating position errors to range errors is the Dilution of Precision (DOP) which is a function of the satellite geometry. Receivers separated by relatively short distances have the same satellite and atmospheric errors. Differential GPS (DGPS) removes these errors by transmitting pseudorange corrections from a fixed receiver to a mobile receiver. The corrected pseudorange at the moving receiver is now corrupted only by errors from the receiver clock, multipath, and measurement noise. This paper describes a software package that models position errors for various GPS and DGPS systems. The error model is used in the Real-Time Simulator and Cockpit Technology workstation simulations at NASA-LaRC. The GPS/DGPS sensor can simulate enroute navigation, instrument approaches, or on-airport navigation.

  5. The impact of 3D volume of interest definition on accuracy and precision of activity estimation in quantitative SPECT and planar processing methods

    NASA Astrophysics Data System (ADS)

    He, Bin; Frey, Eric C.

    2010-06-01

    Accurate and precise estimation of organ activities is essential for treatment planning in targeted radionuclide therapy. We have previously evaluated the impact of processing methodology, statistical noise and variability in activity distribution and anatomy on the accuracy and precision of organ activity estimates obtained with quantitative SPECT (QSPECT) and planar (QPlanar) processing. Another important factor impacting the accuracy and precision of organ activity estimates is the accuracy of and variability in the definition of organ regions of interest (ROI) or volumes of interest (VOI). The goal of this work was thus to systematically study the effects of VOI definition on the reliability of activity estimates. To this end, we performed Monte Carlo simulation studies using randomly perturbed and shifted VOIs to assess the impact on organ activity estimates. The 3D NCAT phantom was used with activities that modeled clinically observed 111In ibritumomab tiuxetan distributions. In order to study the errors resulting from misdefinitions due to manual segmentation errors, VOIs of the liver and left kidney were first manually defined. Each control point was then randomly perturbed to one of the nearest or next-nearest voxels in three ways: with no, inward or outward directional bias, resulting in random perturbation, erosion or dilation, respectively, of the VOIs. In order to study the errors resulting from the misregistration of VOIs, as would happen, e.g. in the case where the VOIs were defined using a misregistered anatomical image, the reconstructed SPECT images or projections were shifted by amounts ranging from -1 to 1 voxels in increments of 0.1 voxels in both the transaxial and axial directions. The activity estimates from the shifted reconstructions or projections were compared to those from the originals, and average errors were computed for the QSPECT and QPlanar methods, respectively. For misregistration, errors in organ activity estimations were linear in the shift for both the QSPECT and QPlanar methods. QPlanar was less sensitive to object definition perturbations than QSPECT, especially for dilation and erosion cases. Up to 1 voxel misregistration or misdefinition resulted in up to 8% error in organ activity estimates, with the largest errors for small or low uptake organs. Both types of VOI definition errors produced larger errors in activity estimates for a small, low-uptake organ (i.e. -7.5% to 5.3% for the left kidney) than for a large, high-uptake organ (i.e. -2.9% to 2.1% for the liver). We observed that misregistration generally had larger effects than misdefinition, with errors ranging from -7.2% to 8.4%. The different imaging methods evaluated responded differently to the errors from misregistration and misdefinition. We found that QSPECT was more sensitive to misdefinition errors, but less sensitive to misregistration errors, as compared to the QPlanar method. Thus, sensitivity to VOI definition errors should be an important criterion in evaluating quantitative imaging methods.

  6. Syndromic surveillance for health information system failures: a feasibility study

    PubMed Central

    Ong, Mei-Sing; Magrabi, Farah; Coiera, Enrico

    2013-01-01

    Objective: To explore the applicability of a syndromic surveillance method to the early detection of health information technology (HIT) system failures. Methods: A syndromic surveillance system was developed to monitor a laboratory information system at a tertiary hospital. Four indices were monitored: (1) total laboratory records being created; (2) total records with missing results; (3) average serum potassium results; and (4) total duplicated tests on a patient. The goal was to detect HIT system failures causing: data loss at the record level; data loss at the field level; erroneous data; and unintended duplication of data. Time-series models of the indices were constructed, and statistical process control charts were used to detect unexpected behaviors. The ability of the models to detect HIT system failures was evaluated using simulated failures, each lasting for 24 h, with error rates ranging from 1% to 35%. Results: In detecting data loss at the record level, the model achieved a sensitivity of 0.26 when the simulated error rate was 1%, while maintaining a specificity of 0.98. Detection performance improved with increasing error rates, achieving a perfect sensitivity when the error rate was 35%. In the detection of missing results, erroneous serum potassium results and unintended repetition of tests, perfect sensitivity was attained when the error rate was as small as 5%. Decreasing the error rate to 1% resulted in a drop in sensitivity to 0.65–0.85. Conclusions: Syndromic surveillance methods can potentially be applied to monitor HIT systems, to facilitate the early detection of failures. PMID:23184193

  7. Previous Estimates of Mitochondrial DNA Mutation Level Variance Did Not Account for Sampling Error: Comparing the mtDNA Genetic Bottleneck in Mice and Humans

    PubMed Central

    Wonnapinij, Passorn; Chinnery, Patrick F.; Samuels, David C.

    2010-01-01

    In cases of inherited pathogenic mitochondrial DNA (mtDNA) mutations, a mother and her offspring generally have large and seemingly random differences in the amount of mutated mtDNA that they carry. Comparisons of measured mtDNA mutation level variance values have become an important issue in determining the mechanisms that cause these large random shifts in mutation level. These variance measurements have been made with samples of quite modest size, which should be a source of concern because higher-order statistics, such as variance, are poorly estimated from small sample sizes. We have developed an analysis of the standard error of variance from a sample of size n, and we have defined error bars for variance measurements based on this standard error. We calculate variance error bars for several published sets of measurements of mtDNA mutation level variance and show how the addition of the error bars alters the interpretation of these experimental results. We compare variance measurements from human clinical data and from mouse models and show that the mutation level variance is clearly higher in the human data than it is in the mouse models at both the primary oocyte and offspring stages of inheritance. We discuss how the standard error of variance can be used in the design of experiments measuring mtDNA mutation level variance. Our results show that variance measurements based on fewer than 20 measurements are generally unreliable and ideally more than 50 measurements are required to reliably compare variances with less than a 2-fold difference. PMID:20362273
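
    A minimal sketch of the error-bar idea, using the normal-theory approximation SE(s^2) = s^2 * sqrt(2/(n-1)) as a stand-in for the paper's analysis; the data are simulated and the mutation-level values are illustrative only.

      import numpy as np

      def variance_with_error_bar(x):
          """Sample variance and an approximate standard error.

          Under a normality assumption, SE(s^2) = s^2 * sqrt(2 / (n - 1)); with
          small n this error bar is large relative to s^2 itself.
          """
          x = np.asarray(x, float)
          n = x.size
          s2 = x.var(ddof=1)
          return s2, s2 * np.sqrt(2.0 / (n - 1))

      rng = np.random.default_rng(5)
      for n in (10, 20, 50, 200):
          s2, se = variance_with_error_bar(rng.normal(0.0, 0.2, size=n))
          print(f"n = {n:4d}: variance = {s2:.4f} +/- {se:.4f}")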

  8. Does the sensorimotor system minimize prediction error or select the most likely prediction during object lifting?

    PubMed Central

    McGregor, Heather R.; Pun, Henry C. H.; Buckingham, Gavin; Gribble, Paul L.

    2016-01-01

    The human sensorimotor system is routinely capable of making accurate predictions about an object's weight, which allows for energetically efficient lifts and prevents objects from being dropped. Often, however, poor predictions arise when the weight of an object can vary and sensory cues about object weight are sparse (e.g., picking up an opaque water bottle). The question arises, what strategies does the sensorimotor system use to make weight predictions when one is dealing with an object whose weight may vary? For example, does the sensorimotor system use a strategy that minimizes prediction error (minimal squared error) or one that selects the weight that is most likely to be correct (maximum a posteriori)? In this study we dissociated the predictions of these two strategies by having participants lift an object whose weight varied according to a skewed probability distribution. We found, using a small range of weight uncertainty, that four indexes of sensorimotor prediction (grip force rate, grip force, load force rate, and load force) were consistent with a feedforward strategy that minimizes the square of prediction errors. These findings match research in the visuomotor system, suggesting parallels in underlying processes. We interpret our findings within a Bayesian framework and discuss the potential benefits of using a minimal squared error strategy. NEW & NOTEWORTHY Using a novel experimental model of object lifting, we tested whether the sensorimotor system models the weight of objects by minimizing lifting errors or by selecting the statistically most likely weight. We found that the sensorimotor system minimizes the square of prediction errors for object lifting. This parallels the results of studies that investigated visually guided reaching, suggesting an overlap in the underlying mechanisms between tasks that involve different sensory systems. PMID:27760821
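
    As a toy illustration of the dissociation the authors exploit (the weights and probabilities below are made up): for a skewed distribution over possible object weights, the prediction that minimizes expected squared error is the distribution mean, whereas a maximum a posteriori strategy picks the single most probable weight.

    ```python
    import numpy as np

    # Hypothetical skewed distribution over possible object weights (grams).
    weights = np.array([200., 300., 400., 600.])
    probs = np.array([0.55, 0.20, 0.15, 0.10])        # mode at 200 g

    map_prediction = weights[np.argmax(probs)]        # most likely weight
    mse_prediction = np.sum(weights * probs)          # mean minimizes E[(w - pred)^2]

    # Verify numerically that the mean minimizes expected squared error.
    candidates = np.linspace(150, 650, 1001)
    expected_sq_err = [(probs * (weights - c) ** 2).sum() for c in candidates]
    best = candidates[int(np.argmin(expected_sq_err))]

    print("MAP prediction:", map_prediction)           # 200 g
    print("MSE-optimal prediction:", mse_prediction)   # 290 g
    print("numerical minimizer:", best)                # ~290 g
    ```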

  9. Improving the nowcasting of precipitation in an Alpine region with an enhanced radar echo tracking algorithm

    NASA Astrophysics Data System (ADS)

    Mecklenburg, S.; Joss, J.; Schmid, W.

    2000-12-01

    Nowcasting for hydrological applications is discussed. The tracking algorithm extrapolates radar images in space and time. It originates from the pattern recognition techniques TREC (Tracking Radar Echoes by Correlation, Rinehart and Garvey, Nature, 273 (1978) 287) and COTREC (Continuity of TREC vectors, Li et al., J. Appl. Meteor., 34 (1995) 1286). To evaluate the quality of the extrapolation, a parameter scheme is introduced, able to distinguish between errors in the position and the intensity of the predicted precipitation. The parameters for the position are the absolute error, the relative error and the error of the forecasted direction. The parameters for the intensity are the ratio of the medians and the variations of the rain rate (ratio of two quantiles) between the actual and the forecasted image. To judge the overall quality of the forecast, the correlation coefficient between the forecasted and the actual radar image has been used. To improve the forecast, three aspects have been investigated: (a) Common meteorological attributes of convective cells, derived from hail statistics, have been determined to optimize the parameters of the tracking algorithm. Using (a), the forecast procedure modifications (b) and (c) have been applied. (b) Small-scale features have been removed by using larger tracking areas and by applying a spatial and temporal smoothing, since problems with the tracking algorithm are mainly caused by small-scale/short-term variations of the echo pattern or because of limitations caused by the radar technique itself (erroneous vectors caused by clutter or shielding). (c) The searching area and the number of searched boxes have been restricted. This limits false detections, which is especially useful in stratiform precipitation and for stationary echoes. Whereas a larger scale and the removal of small-scale features improve the forecasted position for the convective precipitation, the forecast of the stratiform event is not influenced, but limiting the search area leads to a slightly better forecast. The forecast of the intensity is successful for both precipitation events. Forecasting the variation of the rain rate calls for further investigation. Applying COTREC improves the forecast of the convective precipitation, especially for extrapolation times exceeding 30 min.
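
    The TREC step is essentially block matching: each box of one radar image is compared, via correlation, with displaced boxes in the next image, and the displacement with the highest correlation defines the echo-motion vector. A minimal sketch on synthetic data (single box, exhaustive search; not the authors' implementation):

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Synthetic "radar" field and a copy shifted by a known displacement (pixels).
    field = rng.gamma(2.0, 1.0, size=(120, 120))
    true_shift = (3, -2)                          # rows, cols
    field2 = np.roll(field, true_shift, axis=(0, 1))

    def trec_vector(img1, img2, box_origin, box_size=32, max_disp=6):
        """Return the displacement of one box that maximizes the correlation
        coefficient between img1 and img2 (exhaustive TREC-style search)."""
        r0, c0 = box_origin
        box1 = img1[r0:r0 + box_size, c0:c0 + box_size].ravel()
        best, best_corr = (0, 0), -np.inf
        for dr in range(-max_disp, max_disp + 1):
            for dc in range(-max_disp, max_disp + 1):
                box2 = img2[r0 + dr:r0 + dr + box_size,
                            c0 + dc:c0 + dc + box_size].ravel()
                corr = np.corrcoef(box1, box2)[0, 1]
                if corr > best_corr:
                    best, best_corr = (dr, dc), corr
        return best, best_corr

    print(trec_vector(field, field2, box_origin=(40, 40)))   # expect (3, -2)
    ```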

  10. The deficit of joint position sense in the chronic unstable ankle as measured by inversion angle replication error.

    PubMed

    Nakasa, Tomoyuki; Fukuhara, Kohei; Adachi, Nobuo; Ochi, Mitsuo

    2008-05-01

    Functional instability is defined as repeated ankle inversion sprains and a giving-way sensation. Previous studies have described damage to sensorimotor control after ankle sprain as a possible cause of functional instability. The aim of this study was to evaluate the inversion angle replication errors in patients with functional instability after ankle sprain. The difference between the index angle and replication angle was measured in 12 subjects with functional instability, with the aim of evaluating the replication error. As a control group, the replication errors of 17 healthy volunteers were investigated. The side-to-side differences of the replication errors were compared between the two groups, and the relationship between the side-to-side differences of the replication errors and the mechanical instability was statistically analyzed in the unstable group. The side-to-side difference of the replication errors was 1.0 +/- 0.7 degrees in the unstable group and 0.2 +/- 0.7 degrees in the control group. There was a statistically significant difference between the two groups. The side-to-side differences of the replication errors in the unstable group did not statistically correlate with the anterior talar translation and talar tilt. The patients with functional instability had a deficit of joint position sense in comparison with healthy volunteers. The replication error did not correlate with the mechanical instability. The patients with functional instability should be treated appropriately in spite of having less mechanical instability.

  11. Verification of unfold error estimates in the unfold operator code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fehl, D.L.; Biggs, F.

    Spectral unfolding is an inverse mathematical operation that attempts to obtain spectral source information from a set of response functions and data measurements. Several unfold algorithms have appeared over the past 30 years; among them is the unfold operator (UFO) code written at Sandia National Laboratories. In addition to an unfolded spectrum, the UFO code also estimates the unfold uncertainty (error) induced by estimated random uncertainties in the data. In UFO the unfold uncertainty is obtained from the error matrix. This built-in estimate has now been compared to error estimates obtained by running the code in a Monte Carlo fashion with prescribed data distributions (Gaussian deviates). In the test problem studied, data were simulated from an arbitrarily chosen blackbody spectrum (10 keV) and a set of overlapping response functions. The data were assumed to have an imprecision of 5% (standard deviation). One hundred random data sets were generated. The built-in estimate of unfold uncertainty agreed with the Monte Carlo estimate to within the statistical resolution of this relatively small sample size (95% confidence level). A possible 10% bias between the two methods was unresolved. The Monte Carlo technique is also useful in underdetermined problems, for which the error matrix method does not apply. UFO has been applied to the diagnosis of low energy x rays emitted by Z-pinch and ion-beam driven hohlraums. © 1997 American Institute of Physics.
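
    The comparison described, between a built-in error-matrix estimate and a Monte Carlo estimate, can be sketched for a toy linear unfold. The response functions, spectrum, and weighted least-squares unfold below are hypothetical stand-ins, not UFO's actual algorithm:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Toy problem: d = R @ s, with overlapping Gaussian response functions.
    n_chan, n_bins = 8, 5
    E = np.linspace(1.0, 20.0, n_bins)
    centers = np.linspace(2.0, 18.0, n_chan)
    R = np.exp(-0.5 * ((E[None, :] - centers[:, None]) / 4.0) ** 2)

    s_true = E ** 3 * np.exp(-E / 2.0)        # stand-in "blackbody-like" spectrum
    d_true = R @ s_true
    sigma_d = 0.05 * d_true                   # 5% data imprecision

    # Error-matrix style estimate: covariance of the weighted least-squares unfold.
    W = np.diag(1.0 / sigma_d ** 2)
    cov_builtin = np.linalg.inv(R.T @ W @ R)
    err_builtin = np.sqrt(np.diag(cov_builtin))

    # Monte Carlo estimate: unfold 100 noisy data sets and take the spread.
    unfolds = []
    for _ in range(100):
        d = d_true + rng.normal(0.0, sigma_d)
        s_hat = np.linalg.solve(R.T @ W @ R, R.T @ W @ d)
        unfolds.append(s_hat)
    err_mc = np.std(unfolds, axis=0, ddof=1)

    print("error-matrix estimate:", np.round(err_builtin, 3))
    print("Monte Carlo estimate: ", np.round(err_mc, 3))
    ```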

  12. New Approaches to Quantifying Transport Model Error in Atmospheric CO2 Simulations

    NASA Technical Reports Server (NTRS)

    Ott, L.; Pawson, S.; Zhu, Z.; Nielsen, J. E.; Collatz, G. J.; Gregg, W. W.

    2012-01-01

    In recent years, much progress has been made in observing CO2 distributions from space. However, the use of these observations to infer source/sink distributions in inversion studies continues to be complicated by difficulty in quantifying atmospheric transport model errors. We will present results from several different experiments designed to quantify different aspects of transport error using the Goddard Earth Observing System, Version 5 (GEOS-5) Atmospheric General Circulation Model (AGCM). In the first set of experiments, an ensemble of simulations is constructed using perturbations to parameters in the model's moist physics and turbulence parameterizations that control sub-grid scale transport of trace gases. Analysis of the ensemble spread and scales of temporal and spatial variability among the simulations allows insight into how parameterized, small-scale transport processes influence simulated CO2 distributions. In the second set of experiments, atmospheric tracers representing model error are constructed using observation minus analysis statistics from NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA). The goal of these simulations is to understand how errors in large scale dynamics are distributed, and how they propagate in space and time, affecting trace gas distributions. These simulations will also be compared to results from NASA's Carbon Monitoring System Flux Pilot Project that quantified the impact of uncertainty in satellite constrained CO2 flux estimates on atmospheric mixing ratios to assess the major factors governing uncertainty in global and regional trace gas distributions.

  13. The Effects of Measurement Error on Statistical Models for Analyzing Change. Final Report.

    ERIC Educational Resources Information Center

    Dunivant, Noel

    The results of six major projects are discussed including a comprehensive mathematical and statistical analysis of the problems caused by errors of measurement in linear models for assessing change. In a general matrix representation of the problem, several new analytic results are proved concerning the parameters which affect bias in…

  14. Student Distractor Choices on the Mathematics Virginia Standards of Learning Middle School Assessments

    ERIC Educational Resources Information Center

    Lewis, Virginia Vimpeny

    2011-01-01

    Number Concepts; Measurement; Geometry; Probability; Statistics; and Patterns, Functions and Algebra. Procedural Errors were further categorized into the following content categories: Computation; Measurement; Statistics; and Patterns, Functions, and Algebra. The results of the analysis showed the main sources of error for 6th, 7th, and 8th…

  15. Sampling procedures for throughfall monitoring: A simulation study

    NASA Astrophysics Data System (ADS)

    Zimmermann, Beate; Zimmermann, Alexander; Lark, Richard Murray; Elsenbeer, Helmut

    2010-01-01

    What is the most appropriate sampling scheme to estimate event-based average throughfall? A satisfactory answer to this seemingly simple question has yet to be found, a failure which we attribute to previous efforts' dependence on empirical studies. Here we try to answer this question by simulating stochastic throughfall fields based on parameters for statistical models of large monitoring data sets. We subsequently sampled these fields with different sampling designs and variable sample supports. We evaluated the performance of a particular sampling scheme with respect to the uncertainty of possible estimated means of throughfall volumes. Even for a relative error limit of 20%, an impractically large number of small, funnel-type collectors would be required to estimate mean throughfall, particularly for small events. While stratification of the target area is not superior to simple random sampling, cluster random sampling involves the risk of being less efficient. A larger sample support, e.g., the use of trough-type collectors, considerably reduces the necessary sample sizes and eliminates the sensitivity of the mean to outliers. Since the gain in time associated with the manual handling of troughs versus funnels depends on the local precipitation regime, the employment of automatically recording clusters of long troughs emerges as the most promising sampling scheme. Even so, a relative error of less than 5% appears out of reach for throughfall under heterogeneous canopies. We therefore suspect a considerable uncertainty of input parameters for interception models derived from measured throughfall, in particular, for those requiring data of small throughfall events.
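
    A stripped-down version of such a simulation experiment (ignoring the spatial autocorrelation and event stratification used in the paper, so purely illustrative): sample a skewed throughfall field with n funnel-type collectors under simple random sampling and see how the relative error of the estimated event mean shrinks with n. The lognormal model and coefficient of variation are assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    def relative_error_of_mean(n_collectors, n_events=2000, cv=0.6):
        """Monte Carlo 95th percentile of the relative error of the estimated event mean
        for simple random sampling of a lognormal throughfall field (assumed model)."""
        mu, sigma = 0.0, np.sqrt(np.log(1.0 + cv ** 2))
        true_mean = np.exp(mu + sigma ** 2 / 2.0)
        errors = []
        for _ in range(n_events):
            sample = rng.lognormal(mu, sigma, size=n_collectors)
            errors.append(abs(sample.mean() - true_mean) / true_mean)
        return np.quantile(errors, 0.95)

    for n in (10, 25, 50, 100, 200):
        print(f"{n:4d} collectors -> 95% of estimates within "
              f"{100 * relative_error_of_mean(n):.1f}% of the true mean")
    ```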

  16. Identification of Intensity Ratio Break Points from Photon Arrival Trajectories in Ratiometric Single Molecule Spectroscopy

    PubMed Central

    Bingemann, Dieter; Allen, Rachel M.

    2012-01-01

    We describe a model-free statistical method to analyze dual-channel photon arrival trajectories from single molecule spectroscopy and identify break points in the intensity ratio. Photons are binned with a short bin size to calculate the logarithm of the intensity ratio for each bin. Stochastic photon counting noise leads to a near-normal distribution of this logarithm, and the standard Student's t-test is used to find statistically significant changes in this quantity. In stochastic simulations we determine the significance threshold for the t-test's p-value at a given level of confidence. We test the method's sensitivity and accuracy, showing that the analysis reliably locates break points with significant changes in the intensity ratio with little or no error in realistic trajectories with large numbers of small change points, while still identifying a large fraction of the frequent break points with small intensity changes. Based on these results we present an approach to estimate confidence intervals for the identified break point locations and recommend a bin size to choose for the analysis. The method proves powerful and reliable in the analysis of simulated and actual data of single molecule reorientation in a glassy matrix. PMID:22837704
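
    The core of the procedure, as described, is: bin the two photon channels, take the logarithm of the per-bin intensity ratio, and scan a two-sample t-test across candidate break points. A hypothetical minimal version on simulated counts (the rates, bin counts, and single break point are made up):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    # Simulated per-bin photon counts in two channels with a ratio change at bin 150.
    n_bins, change = 300, 150
    rate_a = np.r_[np.full(change, 40.0), np.full(n_bins - change, 60.0)]
    rate_b = np.full(n_bins, 50.0)
    log_ratio = np.log(rng.poisson(rate_a) + 0.5) - np.log(rng.poisson(rate_b) + 0.5)

    # Scan candidate break points; keep the most significant two-sample t-test.
    best_k, best_p = None, 1.0
    for k in range(20, n_bins - 20):              # require >= 20 bins on each side
        _, p = stats.ttest_ind(log_ratio[:k], log_ratio[k:], equal_var=False)
        if p < best_p:
            best_k, best_p = k, p

    # The p-value threshold for "significant" would come from stochastic simulations
    # of break-point-free trajectories, as the paper describes.
    print(f"most likely break point: bin {best_k}, p = {best_p:.2e}")
    ```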

  17. Verification of unfold error estimates in the UFO code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fehl, D.L.; Biggs, F.

    Spectral unfolding is an inverse mathematical operation which attempts to obtain spectral source information from a set of tabulated response functions and data measurements. Several unfold algorithms have appeared over the past 30 years; among them is the UFO (UnFold Operator) code. In addition to an unfolded spectrum, UFO also estimates the unfold uncertainty (error) induced by random uncertainties in the data; this built-in estimate has been compared with estimates obtained by running the code in a Monte Carlo fashion with prescribed data distributions (Gaussian deviates). In the problem studied, data were simulated from an arbitrarily chosen blackbody spectrum (10 keV) and a set of overlapping response functions. The data were assumed to have an imprecision of 5% (standard deviation). One hundred random data sets were generated. The built-in estimate of unfold uncertainty agreed with the Monte Carlo estimate to within the statistical resolution of this relatively small sample size (95% confidence level). A possible 10% bias between the two methods was unresolved. The Monte Carlo technique is also useful in underdetermined problems, for which the error matrix method does not apply. UFO has been applied to the diagnosis of low energy x rays emitted by Z-Pinch and ion-beam driven hohlraums.

  18. Optimization of Control Points Number at Coordinate Measurements based on the Monte-Carlo Method

    NASA Astrophysics Data System (ADS)

    Korolev, A. A.; Kochetkov, A. V.; Zakharov, O. V.

    2018-01-01

    Improving the quality of products causes an increase in the requirements for the accuracy of the dimensions and shape of the surfaces of the workpieces. This, in turn, raises the requirements for accuracy and productivity of measuring of the workpieces. The use of coordinate measuring machines is currently the most effective measuring tool for solving similar problems. The article proposes a method for optimizing the number of control points using Monte Carlo simulation. Based on the measurement of a small sample from batches of workpieces, statistical modeling is performed, which allows one to obtain interval estimates of the measurement error. This approach is demonstrated by examples of applications for flatness, cylindricity and sphericity. Four options of uniform and uneven arrangement of control points are considered and their comparison is given. It is revealed that when the number of control points decreases, the arithmetic mean decreases, the standard deviation of the measurement error increases and the probability of the measurement α-error increases. In general, it has been established that it is possible to repeatedly reduce the number of control points while maintaining the required measurement accuracy.
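
    A hedged sketch of the kind of experiment the article describes, for flatness only: simulate a measured surface (form error plus probe noise, both with made-up magnitudes), repeatedly sample it with different numbers of control points, and report interval estimates of the resulting flatness-measurement error. This is not the authors' procedure, only an illustration of the Monte Carlo idea.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    def flatness(z):
        """Flatness deviation = peak-to-valley of heights; the plane fit is omitted
        here because the simulated surface is already level."""
        return z.max() - z.min()

    # "True" part: a level 100 x 100 mm plate with sinusoidal form error (mm).
    def true_height(x, y):
        return 0.005 * np.sin(2 * np.pi * x / 50.0) * np.cos(2 * np.pi * y / 40.0)

    dense_x, dense_y = np.meshgrid(np.linspace(0, 100, 200), np.linspace(0, 100, 200))
    true_flatness = flatness(true_height(dense_x, dense_y))

    probe_sigma = 0.001                       # assumed probe noise, mm
    for n_points in (9, 25, 64, 200):
        errors = []
        for _ in range(1000):                 # statistical modeling loop
            x, y = rng.uniform(0, 100, n_points), rng.uniform(0, 100, n_points)
            z = true_height(x, y) + rng.normal(0.0, probe_sigma, n_points)
            errors.append(flatness(z) - true_flatness)
        lo, hi = np.quantile(errors, [0.025, 0.975])
        print(f"{n_points:4d} points: mean error {np.mean(errors):+.4f} mm, "
              f"95% interval [{lo:+.4f}, {hi:+.4f}] mm")
    ```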

  19. A method to estimate the effect of deformable image registration uncertainties on daily dose mapping

    PubMed Central

    Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin

    2012-01-01

    Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766
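
    The sampling procedure, as described, amounts to: run PCA on a stack of observed DVF error maps, draw each mode coefficient independently, and reconstruct synthetic error maps with the same spatial correlation. A compact sketch on made-up data (the "error maps" below are random stand-ins, not measured DVFs):

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    # Hypothetical stand-in for measured DVF error maps: n_maps observations of
    # n_voxels spatially-correlated displacement errors (mm).
    n_maps, n_voxels = 40, 500
    modes_true = rng.normal(size=(3, n_voxels))       # unknown underlying error patterns
    observed = (rng.normal(size=(n_maps, 3)) @ modes_true
                + 0.1 * rng.normal(size=(n_maps, n_voxels)))

    # PCA via SVD of the mean-centered observations.
    mean_map = observed.mean(axis=0)
    U, S, Vt = np.linalg.svd(observed - mean_map, full_matrices=False)
    coeff_std = S / np.sqrt(n_maps - 1)               # std of each mode coefficient

    # Sample mode coefficients independently and reconstruct synthetic error maps.
    n_synth, n_keep = 100, 3
    c = rng.normal(0.0, coeff_std[:n_keep], size=(n_synth, n_keep))
    synthetic = mean_map + c @ Vt[:n_keep]

    # Spatial covariance of synthetic maps should resemble that of the observed maps.
    cov_obs = np.cov(observed, rowvar=False)
    cov_syn = np.cov(synthetic, rowvar=False)
    print("covariance agreement (correlation of entries):",
          np.corrcoef(cov_obs.ravel(), cov_syn.ravel())[0, 1].round(3))
    ```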

  20. The intercrater plains of Mercury and the Moon: Their nature, origin and role in terrestrial planet evolution. Measurement and errors of crater statistics. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Leake, M. A.

    1982-01-01

    Planetary imagery techniques, errors in measurement or degradation assignment, and statistical formulas are presented with respect to cratering data. Base map photograph preparation, measurement of crater diameters and sampled area, and instruments used are discussed. Possible uncertainties, such as Sun angle, scale factors, degradation classification, and biases in crater recognition are discussed. The mathematical formulas used in crater statistics are presented.

  1. Visual Survey of Infantry Troops. Part 1. Visual Acuity, Refractive Status, Interpupillary Distance and Visual Skills

    DTIC Science & Technology

    1989-06-01

    letters on one line and several letters on the next line, there is no accurate way to credit these extra letters for statistical analysis. The decimal and...contains the descriptive statistics of the objective refractive error components of infantrymen. Figures 8-11 show the frequency distributions for sphere...equivalents. Nonspectacle wearers: Table 12 contains the descriptive statistics for non-spectacle wearers. Based on these refractive error data, about 30

  2. Sample Size and Statistical Conclusions from Tests of Fit to the Rasch Model According to the Rasch Unidimensional Measurement Model (Rumm) Program in Health Outcome Measurement.

    PubMed

    Hagell, Peter; Westergren, Albert

    Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Simulated Rasch-model-fitting data for 25-item dichotomous scales, with sample sizes ranging from N = 50 to N = 2500, were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N less than or equal to 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).

  3. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained by these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.

  4. Distinguishing models of reionization using future radio observations of 21-cm 1-point statistics

    NASA Astrophysics Data System (ADS)

    Watkinson, C. A.; Pritchard, J. R.

    2014-10-01

    We explore the impact of reionization topology on 21-cm statistics. Four reionization models are presented which emulate large ionized bubbles around overdense regions (21CMFAST/global-inside-out), small ionized bubbles in overdense regions (local-inside-out), large ionized bubbles around underdense regions (global-outside-in) and small ionized bubbles around underdense regions (local-outside-in). We show that first generation instruments might struggle to distinguish global models using the shape of the power spectrum alone. All instruments considered are capable of breaking this degeneracy with the variance, which is higher in outside-in models. Global models can also be distinguished at small scales from a boost in the power spectrum from a positive correlation between the density and neutral-fraction fields in outside-in models. Negative skewness is found to be unique to inside-out models and we find that pre-Square Kilometre Array (SKA) instruments could detect this feature in maps smoothed to reduce noise errors. The early, mid- and late phases of reionization imprint signatures in the brightness-temperature moments, we examine their model dependence and find pre-SKA instruments capable of exploiting these timing constraints in smoothed maps. The dimensional skewness is introduced and is shown to have stronger signatures of the early and mid-phase timing if the inside-out scenario is correct.

  5. Statistical analysis of the limitation of half integer resonances on the available momentum acceptance of the High Energy Photon Source

    NASA Astrophysics Data System (ADS)

    Jiao, Yi; Duan, Zhe

    2017-01-01

    In a diffraction-limited storage ring, half integer resonances can have strong effects on the beam dynamics, associated with the large detuning terms from the strong focusing and strong sextupoles as required for an ultralow emittance. In this study, the limitation of half integer resonances on the available momentum acceptance (MA) was statistically analyzed based on one design of the High Energy Photon Source (HEPS). It was found that the probability of MA reduction due to crossing of half integer resonances is closely correlated with the level of beta beats at the nominal tunes, but independent of the error sources. The analysis indicated that for the presented HEPS lattice design, the rms amplitude of beta beats should be kept below 1.5% horizontally and 2.5% vertically to reach a small MA reduction probability of about 1%.

  6. Close Companions to Nearby Young Stars from Adaptive Optics Imaging on VLT and Keck

    NASA Astrophysics Data System (ADS)

    Haisch, Karl E.; Jayawardhana, Ray; Brandeker, Alexis; Mardones, Diego

    We report the results of VLT and Keck adaptive optics surveys of known members of the η Chamaeleontis, MBM 12, and TW Hydrae (TWA) associations to search for close companions. The multiplicity statistics of η Cha, MBM 12, and TWA are quite high compared with other clusters and associations, although our errors are large due to small number statistics. We have resolved S18 in MBM 12 and RECX 9 in η Cha into triples for the first time. The tight binary TWA 5Aab in the TWA offers the prospect of measuring the dynamical masses of both components as well as an independent distance to the system within a few years. The AO detection of the close companion to the nearby young star χ1 Orionis, previously inferred from radial velocity and astrometric observations, has already made it possible to derive the dynamical masses of that system without any astrophysical assumption.

  7. Usefulness of Dismissing and Changing the Coach in Professional Soccer

    PubMed Central

    Heuer, Andreas; Müller, Christian; Rubner, Oliver; Hagemann, Norbert; Strauss, Bernd

    2011-01-01

    Whether a coach dismissal during the mid-season has an impact on the subsequent team performance has long been a subject of controversial scientific discussion. Here we find a clear-cut answer to this question by using a recently developed statistical framework for the team fitness and by analyzing the first two moments of the effect of a coach dismissal. We can show with an unprecedented small statistical error for the German soccer league that dismissing the coach within the season has basically no effect on the subsequent performance of a team. Changing the coach between two seasons has no effect either. Furthermore, an upper bound for the actual influence of the coach on the team fitness can be estimated. Beyond the immediate relevance of this result, this study may lead the way to analogous studies for exploring the effect of managerial changes, e.g., in economic terms. PMID:21445335

  8. Commissioning of the NPDGamma Detector Array: Counting Statistics in Current Mode Operation and Parity Violation in the Capture of Cold Neutrons on B4C and 27Al.

    PubMed

    Gericke, M T; Bowman, J D; Carlini, R D; Chupp, T E; Coulter, K P; Dabaghyan, M; Desai, D; Freedman, S J; Gentile, T R; Gillis, R C; Greene, G L; Hersman, F W; Ino, T; Ishimoto, S; Jones, G L; Lauss, B; Leuschner, M B; Losowski, B; Mahurin, R; Masuda, Y; Mitchell, G S; Muto, S; Nann, H; Page, S A; Penttila, S I; Ramsay, W D; Santra, S; Seo, P-N; Sharapov, E I; Smith, T B; Snow, W M; Wilburn, W S; Yuan, V; Zhu, H

    2005-01-01

    The NPDGamma γ-ray detector has been built to measure, with high accuracy, the size of the small parity-violating asymmetry in the angular distribution of gamma rays from the capture of polarized cold neutrons by protons. The high cold neutron flux at the Los Alamos Neutron Scattering Center (LANSCE) spallation neutron source and control of systematic errors require the use of current mode detection with vacuum photodiodes and low-noise solid-state preamplifiers. We show that the detector array operates at counting statistics and that the asymmetries due to B4C and 27Al are zero to within 2 × 10^-6 and 7 × 10^-7, respectively. Boron and aluminum are used throughout the experiment. The results presented here are preliminary.

  9. NASA GPM GV Science Requirements

    NASA Technical Reports Server (NTRS)

    Smith, E.

    2003-01-01

    An important scientific objective of the NASA portion of the GPM Mission is to generate quantitatively-based error characterization information along with the rainrate retrievals emanating from the GPM constellation of satellites. These data must serve four main purposes: (1) they must be of sufficient quality, uniformity, and timeliness to govern the observation weighting schemes used in the data assimilation modules of numerical weather prediction models; (2) they must extend over that portion of the globe accessible by the GPM core satellite to which the NASA GV program is focused - (approx. 65 degree inclination); (3) they must have sufficient specificity to enable detection of physically-formulated microphysical and meteorological weaknesses in the standard physical level 2 rainrate algorithms to be used in the GPM Precipitation Processing System (PPS), i.e., algorithms which will have evolved from the TRMM standard physical level 2 algorithms; and (4) they must support the use of physical error modeling as a primary validation tool and as the eventual replacement of the conventional GV approach of statistically intercomparing surface rainrates from ground and satellite measurements. This approach to ground validation research represents a paradigm shift vis-à-vis the program developed for the TRMM mission, which conducted ground validation largely as a statistical intercomparison process between raingauge-derived or radar-derived rainrates and the TRMM satellite rainrate retrievals -- long after the original satellite retrievals were archived. This approach has been able to quantify averaged rainrate differences between the satellite algorithms and the ground instruments, but has not been able to explain causes of algorithm failures or produce error information directly compatible with the cost functions of data assimilation schemes. These schemes require periodic and near-realtime bias uncertainty (i.e., global space-time distributed conditional accuracy of the retrieved rainrates) and local error covariance structure (i.e., global space-time distributed error correlation information for the local 4-dimensional space-time domain -- or in simpler terms, the matrix form of precision error). This can only be accomplished by establishing a network of high quality-heavily instrumented supersites selectively distributed at a few oceanic, continental, and coastal sites. Economics and pragmatics dictate that the network must be made up of a relatively small number of sites (6-8) created through international cooperation. This presentation will address some of the details of the methodology behind the error characterization approach, some proposed solutions for expanding site-developed error properties to regional scales, a data processing and communications concept that would enable rapid implementation of algorithm improvement by the algorithm developers, and the likely available options for developing the supersite network.

  10. Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results

    PubMed Central

    Wicherts, Jelte M.; Bakker, Marjan; Molenaar, Dylan

    2011-01-01

    Background The widespread reluctance to share published research data is often hypothesized to be due to the authors' fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically. Methods and Findings We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance. Conclusions Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies. PMID:22073203

  11. Simulation of an automatically-controlled STOL aircraft in a microwave landing system multipath environment

    NASA Technical Reports Server (NTRS)

    Toda, M.; Brown, S. C.; Burrous, C. N.

    1976-01-01

    The simulated response is described of a STOL aircraft to Microwave Landing System (MLS) multipath errors during final approach and touchdown. The MLS azimuth, elevation, and DME multipath errors were computed for a relatively severe multipath environment at Crissy Field California, utilizing an MLS multipath simulation at MIT Lincoln Laboratory. A NASA/Ames six-degree-of-freedom simulation of an automatically-controlled deHavilland C-8A STOL aircraft was used to determine the response to these errors. The results show that the aircraft response to all of the Crissy Field MLS multipath errors was small. The small MLS azimuth and elevation multipath errors did not result in any discernible aircraft motion, and the aircraft response to the relatively large (200-ft (61-m) peak) DME multipath was noticeable but small.

  12. Trends in statistical methods in articles published in Archives of Plastic Surgery between 2012 and 2017.

    PubMed

    Han, Kyunghwa; Jung, Inkyung

    2018-05-01

    This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.

  13. First Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Data Processing Methods and Systematic Error Limits

    NASA Technical Reports Server (NTRS)

    Hinshaw, G.; Barnes, C.; Bennett, C. L.; Greason, M. R.; Halpern, M.; Hill, R. S.; Jarosik, N.; Kogut, A.; Limon, M.; Meyer, S. S.

    2003-01-01

    We describe the calibration and data processing methods used to generate full-sky maps of the cosmic microwave background (CMB) from the first year of Wilkinson Microwave Anisotropy Probe (WMAP) observations. Detailed limits on residual systematic errors are assigned based largely on analyses of the flight data supplemented, where necessary, with results from ground tests. The data are calibrated in flight using the dipole modulation of the CMB due to the observatory's motion around the Sun. This constitutes a full-beam calibration source. An iterative algorithm simultaneously fits the time-ordered data to obtain calibration parameters and pixelized sky map temperatures. The noise properties are determined by analyzing the time-ordered data with this sky signal estimate subtracted. Based on this, we apply a pre-whitening filter to the time-ordered data to remove a low level of 1/f noise. We infer and correct for a small (approx. 1%) transmission imbalance between the two sky inputs to each differential radiometer, and we subtract a small sidelobe correction from the 23 GHz (K band) map prior to further analysis. No other systematic error corrections are applied to the data. Calibration and baseline artifacts, including the response to environmental perturbations, are negligible. Systematic uncertainties are comparable to statistical uncertainties in the characterization of the beam response. Both are accounted for in the covariance matrix of the window function and are propagated to uncertainties in the final power spectrum. We characterize the combined upper limits to residual systematic uncertainties through the pixel covariance matrix.

  14. A Comparison of Forecast Error Generators for Modeling Wind and Load Uncertainty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Ning; Diao, Ruisheng; Hafen, Ryan P.

    2013-07-25

    This paper presents four algorithms to generate random forecast error time series. The performance of the four algorithms is compared. The error time series are used to create real-time (RT), hour-ahead (HA), and day-ahead (DA) wind and load forecast time series that statistically match historically observed forecasting data sets used in power grid operation to study the net load balancing need in variable generation integration studies. The four algorithms are truncated-normal distribution models, state-space based Markov models, seasonal autoregressive moving average (ARMA) models, and a stochastic-optimization based approach. The comparison is made using historical DA load forecast and actual load values to generate new sets of DA forecasts with similar statistical forecast error characteristics (i.e., mean, standard deviation, autocorrelation, and cross-correlation). The results show that all methods generate satisfactory results. One method may preserve one or two required statistical characteristics better than the other methods, but may not preserve other statistical characteristics as well as the other methods. Because the wind and load forecast error generators are used in wind integration studies to produce wind and load forecast time series for stochastic planning processes, it is sometimes critical to use multiple methods to generate the error time series to obtain a statistically robust result. Therefore, this paper discusses and compares the capabilities of each algorithm to preserve the characteristics of the historical forecast data sets.
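
    Of the four generator families compared, the simplest to sketch is an autoregressive one: the snippet below (an illustrative AR(1), not the paper's ARMA, Markov, or optimization-based implementations) generates a synthetic forecast-error series matching the mean, standard deviation, and lag-1 autocorrelation estimated from a historical error series.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)

    def lag1_autocorr(x):
        x = np.asarray(x, dtype=float) - np.mean(x)
        return np.dot(x[:-1], x[1:]) / np.dot(x, x)

    def synthetic_forecast_errors(historical_errors, n_steps):
        """Generate an error series whose mean, std, and lag-1 autocorrelation match
        the historical forecast errors (AR(1) sketch; no truncation or cross-correlation)."""
        mu, sigma = np.mean(historical_errors), np.std(historical_errors, ddof=1)
        rho = lag1_autocorr(historical_errors)
        e = np.empty(n_steps)
        e[0] = rng.normal(0.0, sigma)
        for t in range(1, n_steps):
            e[t] = rho * e[t - 1] + rng.normal(0.0, sigma * np.sqrt(1.0 - rho ** 2))
        return mu + e

    # Hypothetical historical day-ahead load-forecast errors (MW).
    hist = 50.0 * np.convolve(rng.normal(size=2000), np.ones(4) / 4.0, mode="valid")
    synth = synthetic_forecast_errors(hist, n_steps=len(hist))
    for name, x in (("historical", hist), ("synthetic", synth)):
        print(f"{name}: mean={np.mean(x):6.2f}  std={np.std(x):6.2f}  "
              f"lag-1 autocorr={lag1_autocorr(x):.2f}")
    ```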

  15. A comparison of different statistical methods analyzing hypoglycemia data using bootstrap simulations.

    PubMed

    Jiang, Honghua; Ni, Xiao; Huster, William; Heilmann, Cory

    2015-01-01

    Hypoglycemia has long been recognized as a major barrier to achieving normoglycemia with intensive diabetic therapies. It is a common safety concern for diabetes patients. Therefore, it is important to apply appropriate statistical methods when analyzing hypoglycemia data. Here, we carried out bootstrap simulations to investigate the performance of four commonly used statistical models (Poisson, negative binomial, analysis of covariance [ANCOVA], and rank ANCOVA) based on the data from a diabetes clinical trial. The zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models were also evaluated. Simulation results showed that the Poisson model inflated the type I error, while the negative binomial model was overly conservative. However, after adjusting for dispersion, both the Poisson and negative binomial models yielded slightly inflated type I errors that were close to the nominal level, with reasonable power. Reasonable control of the type I error was associated with the ANCOVA model. The rank ANCOVA model was associated with the greatest power and with reasonable control of the type I error. Inflated type I error was observed with the ZIP and ZINB models.
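
    The bootstrap idea can be sketched as follows: resample two arms from the same pooled hypoglycemia-count data (so the null hypothesis holds by construction), apply each analysis method, and estimate its Type I error as the proportion of rejections at α = 0.05. The simulated counts, the Poisson-GLM Wald test, and the rank test below are illustrative choices, not the paper's exact models.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(9)

    # Hypothetical pooled hypoglycemia event counts (overdispersed, many zeros).
    pooled = rng.negative_binomial(n=0.8, p=0.25, size=400)

    n_boot, n_per_arm, alpha = 500, 100, 0.05
    rej_poisson, rej_rank = 0, 0
    for _ in range(n_boot):
        a = rng.choice(pooled, n_per_arm, replace=True)
        b = rng.choice(pooled, n_per_arm, replace=True)
        # Poisson GLM with a treatment indicator; Wald test on the group effect.
        y = np.concatenate([a, b])
        X = sm.add_constant(np.r_[np.zeros(n_per_arm), np.ones(n_per_arm)])
        fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
        rej_poisson += fit.pvalues[1] < alpha
        # Rank-based comparison (Mann-Whitney) as a stand-in for rank ANCOVA.
        rej_rank += stats.mannwhitneyu(a, b).pvalue < alpha

    print(f"Poisson GLM type I error:  {rej_poisson / n_boot:.3f}")
    print(f"Rank test type I error:    {rej_rank / n_boot:.3f}")
    ```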

  16. Water quality management using statistical analysis and time-series prediction model

    NASA Astrophysics Data System (ADS)

    Parmar, Kulwinder Singh; Bhardwaj, Rashmi

    2014-12-01

    This paper deals with water quality management using statistical analysis and a time-series prediction model. The monthly variation of water quality standards has been used to compare the statistical mean, median, mode, standard deviation, kurtosis, skewness, and coefficient of variation for the Yamuna River. The model was validated using R-squared, root mean square error, mean absolute percentage error, maximum absolute percentage error, mean absolute error, maximum absolute error, normalized Bayesian information criterion, Ljung-Box analysis, predicted values and confidence limits. Using an autoregressive integrated moving average (ARIMA) model, future values of the water quality parameters have been estimated. It is observed that the predictive model is useful at the 95% confidence limits, and the distribution is platykurtic for potential of hydrogen (pH), free ammonia, total Kjeldahl nitrogen, dissolved oxygen and water temperature (WT), and leptokurtic for chemical oxygen demand and biochemical oxygen demand. Also, the predicted series is close to the original series, indicating a good fit. All parameters except pH and WT cross the prescribed limits of the World Health Organization/United States Environmental Protection Agency, and thus the water is not fit for drinking, agriculture and industrial use.
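
    A minimal version of the modeling step, using a generic seasonal ARIMA fit on made-up monthly data via statsmodels (an assumed tool; the series, model order, and 95% limits below are placeholders, not the paper's fitted model):

    ```python
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(10)

    # Hypothetical monthly dissolved-oxygen series (mg/L): trend + seasonality + noise.
    t = np.arange(120)
    do_series = 7.5 - 0.005 * t + 0.8 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, t.size)

    train, test = do_series[:108], do_series[108:]
    model = ARIMA(train, order=(1, 0, 1), seasonal_order=(1, 0, 0, 12)).fit()

    forecast = model.get_forecast(steps=12)
    pred = np.asarray(forecast.predicted_mean)
    ci = np.asarray(forecast.conf_int(alpha=0.05))    # 95% confidence limits
    lower, upper = ci[:, 0], ci[:, 1]

    rmse = np.sqrt(np.mean((pred - test) ** 2))
    mape = 100 * np.mean(np.abs((pred - test) / test))
    print(f"RMSE = {rmse:.3f} mg/L, MAPE = {mape:.2f}%")
    print("coverage of 95% limits:", np.mean((test >= lower) & (test <= upper)))
    ```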

  17. Use of Whatman-41 filters in air quality sampling networks (with applications to elemental analysis)

    NASA Technical Reports Server (NTRS)

    Neustadter, H. E.; Sidik, S. M.; King, R. B.; Fordyce, J. S.; Burr, J. C.

    1974-01-01

    The operation of a 16-site parallel high volume air sampling network with glass fiber filters on one unit and Whatman-41 filters on the other is reported. The network data and data from several other experiments indicate that (1) Sampler-to-sampler and filter-to-filter variabilities are small; (2) hygroscopic affinity of Whatman-41 filters need not introduce errors; and (3) suspended particulate samples from glass fiber filters averaged slightly, but not statistically significantly, higher than from Whatman-41-filters. The results obtained demonstrate the practicability of Whatman-41 filters for air quality monitoring and elemental analysis.

  18. The Chiral Separation Effect in quenched finite-density QCD

    NASA Astrophysics Data System (ADS)

    Puhr, Matthias; Buividovich, Pavel

    2018-03-01

    We present results of a study of the Chiral Separation Effect (CSE) in quenched finite-density QCD. Using a recently developed numerical method we calculate the conserved axial current for exactly chiral overlap fermions at finite density for the first time. We compute the anomalous transport coefficient for the CSE in the confining and deconfining phase and investigate possible deviations from the universal value. In both phases we find that non-perturbative corrections to the CSE are absent and we reproduce the universal value for the transport coefficient within small statistical errors. Our results suggest that the CSE can be used to determine the renormalisation factor of the axial current.

  19. Topological charge and cooling scales in pure SU(2) lattice gauge theory

    NASA Astrophysics Data System (ADS)

    Berg, Bernd A.; Clarke, David A.

    2018-03-01

    Using Monte Carlo simulations with overrelaxation, we have equilibrated lattices up to β = 2.928, size 60^4, for pure SU(2) lattice gauge theory with the Wilson action. We calculate topological charges with the standard cooling method and find that they become more reliable with increasing β values and lattice sizes. Continuum limit estimates of the topological susceptibility χ are obtained, of which we favor χ^{1/4}/T_c = 0.643(12), where T_c is the SU(2) deconfinement temperature. Differences between cooling length scales in different topological sectors turn out to be too small to be detectable within our statistical errors.

  20. Alignment issues, correlation techniques and their assessment for a visible light imaging-based 3D printer quality control system

    NASA Astrophysics Data System (ADS)

    Straub, Jeremy

    2016-05-01

    Quality control is critical to manufacturing. Frequently, techniques are used to define object conformity bounds, based on historical quality data. This paper considers techniques for bespoke and small batch jobs that are not statistical model based. These techniques also serve jobs where 100% validation is needed due to the mission or safety critical nature of particular parts. One issue with this type of system is alignment discrepancies between the generated model and the physical part. This paper discusses and evaluates techniques for characterizing and correcting alignment issues between the projected and perceived data sets to prevent errors attributable to misalignment.

  1. η and η' mesons from lattice QCD.

    PubMed

    Christ, N H; Dawson, C; Izubuchi, T; Jung, C; Liu, Q; Mawhinney, R D; Sachrajda, C T; Soni, A; Zhou, R

    2010-12-10

    The large mass of the ninth pseudoscalar meson, the η', is believed to arise from the combined effects of the axial anomaly and the gauge field topology present in QCD. We report a realistic, 2+1-flavor, lattice QCD calculation of the η and η' masses and mixing which confirms this picture. The physical eigenstates show small octet-singlet mixing with a mixing angle of θ=-14.1(2.8)°. Extrapolation to the physical light quark mass gives, with statistical errors only, mη=573(6) MeV and mη'=947(142) MeV, consistent with the experimental values of 548 and 958 MeV.

  2. The Distribution of the Orbits in the Geminid Meteoroid Stream Based on the Dispersion of their Periods

    NASA Technical Reports Server (NTRS)

    Hajdukova, M., Jr.

    2011-01-01

    Geminid meteoroids, selected from a large set of precisely-reduced meteor orbits from the photographic and radar catalogues of the IAU Meteor Data Center (Lindblad et al. 2003), and from the Japanese TV meteor shower catalogue (SonotaCo 2010), have been analyzed with the aim of determining the distribution of orbits in the stream, based on the dispersion of their periods P. The values of the reciprocal semi-major axis 1/a in the stream showed small errors in the velocity measurements. Thus, it was statistically possible to also determine the relation between the observed and the real dispersion of the Geminids.

  3. Statistical model for speckle pattern optimization.

    PubMed

    Su, Yong; Zhang, Qingchuan; Gao, Zeren

    2017-11-27

    Image registration is the key technique of optical metrologies such as digital image correlation (DIC), particle image velocimetry (PIV), and speckle metrology. Its performance depends critically on the quality of image pattern, and thus pattern optimization attracts extensive attention. In this article, a statistical model is built to optimize speckle patterns that are composed of randomly positioned speckles. It is found that the process of speckle pattern generation is essentially a filtered Poisson process. The dependence of measurement errors (including systematic errors, random errors, and overall errors) upon speckle pattern generation parameters is characterized analytically. By minimizing the errors, formulas of the optimal speckle radius are presented. Although the primary motivation is from the field of DIC, we believed that scholars in other optical measurement communities, such as PIV and speckle metrology, will benefit from these discussions.

  4. Hypothesis-Testing Demands Trustworthy Data—A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy

    PubMed Central

    Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363

  5. Hypothesis-Testing Demands Trustworthy Data-A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy.

    PubMed

    Krefeld-Schwalb, Antonia; Witte, Erich H; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H 0 -hypothesis to a statistical H 1 -verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a "pure" Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis.

  6. Multiple statistical tests: Lessons from a d20.

    PubMed

    Madan, Christopher R

    2016-01-01

    Statistical analyses are often conducted with α = .05. When multiple statistical tests are conducted, this procedure needs to be adjusted to compensate for the otherwise inflated Type I error. In some instances in tabletop gaming, it is desired to roll a 20-sided die (or 'd20') twice and take the greater outcome. Here I draw from probability theory and the case of a d20, where the probability of obtaining any specific outcome is 1/20, to determine the probability of obtaining a specific outcome (Type-I error) at least once across repeated, independent statistical tests.
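
    The underlying arithmetic is the family-wise error rate 1 − (1 − α)^k; the d20 analogy replaces α with 1/20. A tiny numerical check (the Bonferroni line at the end is an added illustration, not a result from the paper):

    ```python
    # Probability of at least one "success" (a given face, or a Type-I error)
    # across k independent trials: 1 - (1 - p)^k.
    def at_least_once(p, k):
        return 1.0 - (1.0 - p) ** k

    # d20: chance of seeing a specific face at least once in two rolls.
    print(at_least_once(1 / 20, 2))            # 0.0975, not 2/20 = 0.10

    # Uncorrected multiple testing at alpha = .05.
    for k in (1, 2, 5, 10, 20):
        print(k, "tests ->", round(at_least_once(0.05, k), 3))

    # Bonferroni-style correction restores the family-wise rate to ~alpha.
    print(at_least_once(0.05 / 20, 20))        # ~0.049
    ```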

  7. State estimation for autopilot control of small unmanned aerial vehicles in windy conditions

    NASA Astrophysics Data System (ADS)

    Poorman, David Paul

    The use of small unmanned aerial vehicles (UAVs) both in the military and civil realms is growing. This is largely due to the proliferation of inexpensive sensors and the increase in capability of small computers that has stemmed from the personal electronic device market. Methods for performing accurate state estimation for large scale aircraft have been well known and understood for decades, which usually involve a complex array of expensive high accuracy sensors. Performing accurate state estimation for small unmanned aircraft is a newer area of study and often involves adapting known state estimation methods to small UAVs. State estimation for small UAVs can be more difficult than state estimation for larger UAVs due to small UAVs employing limited sensor suites due to cost, and the fact that small UAVs are more susceptible to wind than large aircraft. The purpose of this research is to evaluate the ability of existing methods of state estimation for small UAVs to accurately capture the states of the aircraft that are necessary for autopilot control of the aircraft in a Dryden wind field. The research begins by showing which aircraft states are necessary for autopilot control in Dryden wind. Then two state estimation methods that employ only accelerometer, gyro, and GPS measurements are introduced. The first method uses assumptions on aircraft motion to directly solve for attitude information and smooth GPS data, while the second method integrates sensor data to propagate estimates between GPS measurements and then corrects those estimates with GPS information. The performance of both methods is analyzed with and without Dryden wind, in straight and level flight, in a coordinated turn, and in a wings level ascent. It is shown that in zero wind, the first method produces significant steady state attitude errors in both a coordinated turn and in a wings level ascent. In Dryden wind, it produces large noise on the estimates for its attitude states, and has a non-zero mean error that increases when gyro bias is increased. The second method is shown to not exhibit any steady state error in the tested scenarios that is inherent to its design. The second method can correct for attitude errors that arise from both integration error and gyro bias states, but it suffers from lack of attitude error observability. The attitude errors are shown to be more observable in wind, but increased integration error in wind outweighs the increase in attitude corrections that such increased observability brings, resulting in larger attitude errors in wind. Overall, this work highlights many technical deficiencies of both of these methods of state estimation that could be improved upon in the future to enhance state estimation for small UAVs in windy conditions.

  8. Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores generated from the MMPI-2 and MMPI-2-RF test booklets: internal structure comparability in a sample of criminal defendants.

    PubMed

    Tarescavage, Anthony M; Alosco, Michael L; Ben-Porath, Yossef S; Wood, Arcangela; Luna-Jones, Lynn

    2015-04-01

    We investigated the internal structure comparability of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores derived from the MMPI-2 and MMPI-2-RF booklets in a sample of 320 criminal defendants (229 males and 54 females). After exclusion of invalid protocols, the final sample consisted of 96 defendants who were administered the MMPI-2-RF booklet and 83 who completed the MMPI-2. No statistically significant differences in MMPI-2-RF invalidity rates were observed between the two forms. Individuals in the final sample who completed the MMPI-2-RF did not statistically differ on demographics or referral question from those who were administered the MMPI-2 booklet. Independent t tests showed no statistically significant differences between MMPI-2-RF scores generated with the MMPI-2 and MMPI-2-RF booklets on the test's substantive scales. Statistically significant small differences were observed on the revised Variable Response Inconsistency (VRIN-r) and True Response Inconsistency (TRIN-r) scales. Cronbach's alpha and standard errors of measurement were approximately equal between the booklets for all MMPI-2-RF scales. Finally, MMPI-2-RF intercorrelations produced from the two forms yielded mostly small and a few medium differences, indicating that discriminant validity and test structure are maintained. Overall, our findings reflect the internal structure comparability of MMPI-2-RF scale scores generated from MMPI-2 and MMPI-2-RF booklets. Implications of these results and limitations of these findings are discussed. © The Author(s) 2014.

  9. Fermi-Pasta-Ulam-Tsingou problems: Passage from Boltzmann to q-statistics

    NASA Astrophysics Data System (ADS)

    Bagchi, Debarshee; Tsallis, Constantino

    2018-02-01

    The Fermi-Pasta-Ulam (FPU) one-dimensional Hamiltonian includes a quartic term which guarantees ergodicity of the system in the thermodynamic limit. Consistently, the Boltzmann factor P(ε) ∼ e^{-βε} describes its equilibrium distribution of one-body energies, and its velocity distribution is Maxwellian, i.e., P(v) ∼ e^{-βv²/2}. We consider here a generalized system where the quartic coupling constant between sites decays as 1/d_{ij}^α (α ≥ 0; d_{ij} = 1, 2, …). Through first-principle molecular dynamics we demonstrate that, for large α (above α ≃ 1), i.e., short-range interactions, Boltzmann statistics (based on the additive entropic functional S_B[P(z)] = -k ∫ dz P(z) ln P(z)) is verified. However, for small values of α (below α ≃ 1), i.e., long-range interactions, Boltzmann statistics dramatically fails and is replaced by q-statistics (based on the nonadditive entropic functional S_q[P(z)] = k(1 - ∫ dz [P(z)]^q)/(q - 1), with S_1 = S_B). Indeed, the one-body energy distribution is q-exponential, P(ε) ∼ e_{q_ε}^{-β_ε ε} ≡ [1 + (q_ε - 1) β_ε ε]^{-1/(q_ε - 1)} with q_ε > 1, and its velocity distribution is given by P(v) ∼ e_{q_v}^{-β_v v²/2} with q_v > 1. Moreover, within small error bars, we verify q_ε = q_v = q, which decreases from an extrapolated value q ≃ 5/3 to q = 1 when α increases from zero to α ≃ 1, and remains q = 1 thereafter.
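
    As a purely numerical illustration of the distributions named above (not of the molecular-dynamics calculation itself), the q-exponential can be evaluated directly and checked against the ordinary Boltzmann exponential in the q → 1 limit; the parameter values and grid below are arbitrary.

```python
"""q-exponential of Tsallis statistics vs. the ordinary exponential (q -> 1 limit)."""
import numpy as np

def q_exp(x, q):
    """e_q^x = [1 + (1 - q) x]_+^{1/(1-q)}, which reduces to exp(x) as q -> 1."""
    if np.isclose(q, 1.0):
        return np.exp(x)
    base = 1.0 + (1.0 - q) * x
    return np.where(base > 0.0, base ** (1.0 / (1.0 - q)), 0.0)

beta = 1.0
eps = np.linspace(0.0, 5.0, 6)
for q in (1.0, 1.2, 5.0 / 3.0):            # q = 5/3 is the extrapolated alpha -> 0 value
    print(f"q = {q:.3f}:", np.round(q_exp(-beta * eps, q), 4))

# Sanity check: q close to 1 is numerically close to the Boltzmann factor.
assert np.allclose(q_exp(-beta * eps, 1.001), np.exp(-beta * eps), atol=1e-3)
```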

  10. Statistical inference for template aging

    NASA Astrophysics Data System (ADS)

    Schuckers, Michael E.

    2006-04-01

    A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimation, not on the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
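
    Neither the NIST score-set data nor the exact model specification is reproduced here; the sketch below only shows the generic form of the first approach, a binomial generalized linear model in which the significance of the time coefficient serves as the linear-trend test, applied to synthetic monthly error counts.

```python
"""Sketch: testing for a linear time trend in classification error rates
with a binomial GLM (synthetic counts, not the NIST score-set data)."""
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
months = np.arange(12)
trials = np.full(12, 500)                        # verification attempts per month (assumed)
true_rate = 0.02 + 0.001 * months                # assumed slowly increasing error rate
errors = rng.binomial(trials, true_rate)

# Endogenous variable as (errors, non-errors) counts; exogenous is intercept plus time.
endog = np.column_stack([errors, trials - errors])
exog = sm.add_constant(months.astype(float))
fit = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()

print(fit.summary())
print("p-value for the time coefficient:", fit.pvalues[1])
```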

  11. Ordinary least squares regression is indicated for studies of allometry.

    PubMed

    Kilmer, J T; Rodríguez, R L

    2017-01-01

    When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
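
    The attenuation correction recommended above is essentially a division of the OLS slope by the reliability ratio of the predictor. The sketch below demonstrates it on simulated data with an invented measurement-error variance; in practice the reliability would be estimated from repeated measurements rather than from the unobservable true values used here for illustration.

```python
"""Attenuation-corrected OLS slope: beta_true ~= beta_OLS / reliability(x)."""
import numpy as np

rng = np.random.default_rng(2)
n, true_slope = 200, 0.75
x_true = rng.normal(0.0, 1.0, n)                 # true (log-scale) trait measurement
y = true_slope * x_true + rng.normal(0.0, 0.3, n)

sigma_me = 0.4                                   # assumed measurement-error SD on x
x_obs = x_true + rng.normal(0.0, sigma_me, n)

beta_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)
# Reliability = var(true) / var(observed); taken from the simulation for illustration only.
reliability = np.var(x_true, ddof=1) / np.var(x_obs, ddof=1)
beta_corrected = beta_ols / reliability

print(f"OLS slope (attenuated):  {beta_ols:.3f}")
print(f"Attenuation-corrected:   {beta_corrected:.3f}   (true value {true_slope})")
```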

  12. Estimates of fetch-induced errors in Bowen-ratio energy-budget measurements of evapotranspiration from a prairie wetland, Cottonwood Lake Area, North Dakota, USA

    USGS Publications Warehouse

    Stannard, David L.; Rosenberry, Donald O.; Winter, Thomas C.; Parkhurst, Renee S.

    2004-01-01

    Micrometeorological measurements of evapotranspiration (ET) often are affected to some degree by errors arising from limited fetch. A recently developed model was used to estimate fetch-induced errors in Bowen-ratio energy-budget measurements of ET made at a small wetland with fetch-to-height ratios ranging from 34 to 49. Estimated errors were small, averaging −1.90%±0.59%. The small errors are attributed primarily to the near-zero lower sensor height, and the negative bias reflects the greater Bowen ratios of the drier surrounding upland. Some of the variables and parameters affecting the error were not measured, but instead are estimated. A sensitivity analysis indicates that the uncertainty arising from these estimates is small. In general, fetch-induced error in measured wetland ET increases with decreasing fetch-to-height ratio, with increasing aridity and with increasing atmospheric stability over the wetland. Occurrence of standing water at a site is likely to increase the appropriate time step of data integration, for a given level of accuracy. Occurrence of extensive open water can increase accuracy or decrease the required fetch by allowing the lower sensor to be placed at the water surface. If fetch is highly variable and fetch-induced errors are significant, the variables affecting fetch (e.g., wind direction, water level) need to be measured. Fetch-induced error during the non-growing season may be greater or smaller than during the growing season, depending on how seasonal changes affect both the wetland and upland at a site.

  13. SU-C-BRD-03: Analysis of Accelerator Generated Text Logs for Preemptive Maintenance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Able, CM; Baydush, AH; Nguyen, C

    2014-06-15

    Purpose: To develop a model to analyze medical accelerator generated parameter and performance data that will provide an early warning of performance degradation and impending component failure. Methods: A robust 6 MV VMAT quality assurance treatment delivery was used to test the constancy of accelerator performance. The generated text log files were decoded and analyzed using statistical process control (SPC) methodology. The text file data is a single snapshot of energy specific and overall systems parameters. A total of 36 system parameters were monitored which include RF generation, electron gun control, energy control, beam uniformity control, DC voltage generation, and cooling systems. The parameters were analyzed using Individual and Moving Range (I/MR) charts. The chart limits were calculated using a hybrid technique that included the use of the standard 3σ limits and the parameter/system specification. Synthetic errors/changes were introduced to determine the initial effectiveness of I/MR charts in detecting relevant changes in operating parameters. The magnitude of the synthetic errors/changes was based on: the value of 1 standard deviation from the mean operating parameter of 483 TB systems, a small fraction (≤ 5%) of the operating range, or a fraction of the minor fault deviation. Results: There were 34 parameters in which synthetic errors were introduced. There were 2 parameters (radial position steering coil, and positive 24V DC) in which the errors did not exceed the limit of the I/MR chart. The I chart limit was exceeded for all of the remaining parameters (94.2%). The MR chart limit was exceeded in 29 of the 32 parameters (85.3%) in which the I chart limit was exceeded. Conclusion: Statistical process control I/MR evaluation of text log file parameters may be effective in providing an early warning of performance degradation or component failure for digital medical accelerator systems. Research is Supported by Varian Medical Systems, Inc.
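
    The I/MR limits referred to above can be illustrated with the standard individuals-chart constants for a moving range of span 2 (3/d2 ≈ 2.66 for the I chart, D4 ≈ 3.267 for the MR chart); the hybrid, specification-based widening used in the study is not reproduced, and the parameter stream below is synthetic.

```python
"""Individuals / moving-range (I/MR) chart limits for one logged accelerator parameter."""
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(100.0, 0.5, 60)            # synthetic daily snapshots of one text-log parameter
x[45:] += 1.5                              # inject a synthetic step change late in the record

baseline = x[:40]                          # limits estimated from an assumed in-control period
mr_baseline = np.abs(np.diff(baseline))
x_bar, mr_bar = baseline.mean(), mr_baseline.mean()

# Standard constants for a moving range of span 2: 3/d2 = 2.66 (I chart), D4 = 3.267 (MR chart).
i_ucl, i_lcl = x_bar + 2.66 * mr_bar, x_bar - 2.66 * mr_bar
mr_ucl = 3.267 * mr_bar

mr = np.abs(np.diff(x))
i_alarms = np.flatnonzero((x > i_ucl) | (x < i_lcl))
mr_alarms = np.flatnonzero(mr > mr_ucl) + 1    # +1: flag the later of the two differenced points
print(f"I chart limits [{i_lcl:.2f}, {i_ucl:.2f}]; out-of-limit records: {i_alarms}")
print(f"MR chart UCL {mr_ucl:.2f}; out-of-limit records: {mr_alarms}")
```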

  14. Development of an errorable car-following driver model

    NASA Astrophysics Data System (ADS)

    Yang, H.-H.; Peng, H.

    2010-06-01

    An errorable car-following driver model is presented in this paper. An errorable driver model is one that emulates a human driver's functions and can generate both nominal (error-free) and devious (with-error) behaviours. This model was developed for evaluation and design of active safety systems. The car-following data used for developing and validating the model were obtained from a large-scale naturalistic driving database. The stochastic car-following behaviour was first analysed and modelled as a random process. Three error-inducing behaviours were then introduced. First, human perceptual limitation was studied and implemented. Distraction due to non-driving tasks was then identified based on the statistical analysis of the driving data. Finally, time delay of human drivers was estimated through a recursive least-square identification process. By including these three error-inducing behaviours, rear-end collisions with the lead vehicle could occur. The simulated crash rate was found to be similar to but somewhat higher than that reported in traffic statistics.

  15. Effects of measurement errors on psychometric measurements in ergonomics studies: Implications for correlations, ANOVA, linear regression, factor analysis, and linear discriminant analysis.

    PubMed

    Liu, Yan; Salvendy, Gavriel

    2009-05-01

    This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools have been discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis, and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experimental results, which the authors believe is critical to research progress in theory development and cumulative knowledge in the ergonomics field.
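
    The attenuation of correlations mentioned first follows the classical relation r_observed ≈ r_true·√(reliability_X · reliability_Y); a quick simulation with assumed reliabilities reproduces it.

```python
"""Measurement error attenuates correlations: r_obs ~= r_true * sqrt(rel_x * rel_y)."""
import numpy as np

rng = np.random.default_rng(4)
n, r_true = 100_000, 0.6
x = rng.normal(size=n)
y = r_true * x + np.sqrt(1 - r_true**2) * rng.normal(size=n)   # corr(x, y) = r_true

rel_x, rel_y = 0.8, 0.7                        # assumed reliabilities of the two scales
x_obs = x + rng.normal(0, np.sqrt(1 / rel_x - 1), n)           # error sized so reliability = rel_x
y_obs = y + rng.normal(0, np.sqrt(1 / rel_y - 1), n)

r_obs = np.corrcoef(x_obs, y_obs)[0, 1]
print(f"observed r = {r_obs:.3f}, predicted {r_true * np.sqrt(rel_x * rel_y):.3f}")
```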

  16. On using summary statistics from an external calibration sample to correct for covariate measurement error.

    PubMed

    Guo, Ying; Little, Roderick J; McConnell, Daniel S

    2012-01-01

    Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
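
    Of the existing methods named above, regression calibration is the simplest to sketch: the external calibration sample provides an estimate of E[X | W], which then replaces W in the outcome regression. The code below shows that baseline method on synthetic data; it is not the authors' multiple-imputation procedure, and the standard errors of the resulting coefficients would still need further adjustment.

```python
"""Regression calibration with an external calibration sample (sketch, synthetic data)."""
import numpy as np

rng = np.random.default_rng(5)

# External calibration sample: true X and error-prone W are recorded, but not Y or Z.
n_cal = 300
x_cal = rng.normal(0, 1, n_cal)
w_cal = x_cal + rng.normal(0, 0.7, n_cal)
a1, a0 = np.polyfit(w_cal, x_cal, 1)           # E[X | W] ~= a0 + a1 * W

# Main study: Y, Z and W observed, X unobserved.
n = 1000
x = rng.normal(0, 1, n)
z = rng.normal(0, 1, n)
w = x + rng.normal(0, 0.7, n)
y = 1.0 + 0.5 * x + 0.3 * z + rng.normal(0, 1, n)

x_hat = a0 + a1 * w                            # calibrated surrogate for the unobserved X
design_naive = np.column_stack([np.ones(n), w, z])
design_rc = np.column_stack([np.ones(n), x_hat, z])
beta_naive, *_ = np.linalg.lstsq(design_naive, y, rcond=None)
beta_rc, *_ = np.linalg.lstsq(design_rc, y, rcond=None)

print("naive (W in place of X):", np.round(beta_naive, 3))
print("regression calibration: ", np.round(beta_rc, 3), " true: [1.0, 0.5, 0.3]")
```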

  17. Error Distribution Evaluation of the Third Vanishing Point Based on Random Statistical Simulation

    NASA Astrophysics Data System (ADS)

    Li, C.

    2012-07-01

    POS, which integrates GPS and INS (Inertial Navigation Systems), has allowed rapid and accurate determination of the position and attitude of remote sensing equipment for MMS (Mobile Mapping Systems). However, INS not only has systematic error, it is also very expensive. Therefore, in this paper the error distributions of vanishing points are studied and tested as a substitute for INS in MMS for some special land-based scenes, such as ground façades where usually only two vanishing points can be detected. Thus, the traditional calibration approach based on three orthogonal vanishing points is being challenged. In this article, firstly, the line clusters, which are parallel to each other in object space and correspond to the vanishing points, are detected based on RANSAC (Random Sample Consensus) and a parallelism geometric constraint. Secondly, condition adjustment with parameters is utilized to estimate the nonlinear error equations of the two vanishing points (VX, VY), and the setting of initial weights for the adjustment solution of single-image vanishing points is presented. The vanishing points are solved and their error distributions estimated using an iterative method with variable weights, the co-factor matrix, and error ellipse theory. Thirdly, under the condition of known error ellipses of the two vanishing points (VX, VY) and on the basis of the triangle geometric relationship of the three vanishing points, the error distribution of the third vanishing point (VZ) is calculated and evaluated by random statistical simulation, ignoring camera distortion. Moreover, the Monte Carlo methods utilized for the random statistical estimation are presented. Finally, experimental results for the vanishing point coordinates and their error distributions are shown and analyzed.
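
    The propagation step can be imitated with a bare-bones Monte Carlo: perturb the two measured vanishing points according to their (here circular) error ellipses, back-project them with an assumed camera matrix, take the cross product of the two directions to get the third direction, and collect the scatter of the resulting third vanishing point. The intrinsics, vanishing-point coordinates, and noise levels below are placeholders, and camera distortion is ignored as in the paper.

```python
"""Monte Carlo propagation of vanishing-point uncertainty to the third vanishing point."""
import numpy as np

rng = np.random.default_rng(6)
K = np.array([[1000.0, 0.0, 640.0],      # assumed intrinsics: focal length and principal point
              [0.0, 1000.0, 480.0],
              [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)

vx = np.array([2200.0, 600.0])           # measured vanishing points (pixels), placeholders
vy = np.array([-900.0, 250.0])
cov_vx = np.diag([15.0**2, 15.0**2])     # assumed error ellipses (here circular)
cov_vy = np.diag([20.0**2, 20.0**2])

def third_vp(v1, v2):
    """Back-project v1, v2 with K^-1; the third direction is their cross product."""
    d1 = K_inv @ np.append(v1, 1.0)
    d2 = K_inv @ np.append(v2, 1.0)
    v3 = K @ np.cross(d1, d2)
    return v3[:2] / v3[2]

samples = np.array([
    third_vp(rng.multivariate_normal(vx, cov_vx),
             rng.multivariate_normal(vy, cov_vy))
    for _ in range(5000)
])
print("median of V_Z:          ", np.round(np.median(samples, axis=0), 1))
print("central 90% interval, x:", np.round(np.percentile(samples[:, 0], [5, 95]), 1))
print("central 90% interval, y:", np.round(np.percentile(samples[:, 1], [5, 95]), 1))
```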

  18. A Modeling Framework for Optimal Computational Resource Allocation Estimation: Considering the Trade-offs between Physical Resolutions, Uncertainty and Computational Costs

    NASA Astrophysics Data System (ADS)

    Moslehi, M.; de Barros, F.; Rajagopal, R.

    2014-12-01

    Hydrogeological models that represent flow and transport in subsurface domains are usually large-scale with excessive computational complexity and uncertain characteristics. Uncertainty quantification for predicting flow and transport in heterogeneous formations often entails utilizing a numerical Monte Carlo framework, which repeatedly simulates the model according to a random field representing hydrogeological characteristics of the field. The physical resolution (e.g. grid resolution associated with the physical space) for the simulation is customarily chosen based on recommendations in the literature, independent of the number of Monte Carlo realizations. This practice may lead to either excessive computational burden or inaccurate solutions. We propose an optimization-based methodology that considers the trade-off between the following conflicting objectives: time associated with computational costs, statistical convergence of the model predictions and physical errors corresponding to numerical grid resolution. In this research, we optimally allocate computational resources by developing a modeling framework for the overall error based on a joint statistical and numerical analysis and optimizing the error model subject to a given computational constraint. The derived expression for the overall error explicitly takes into account the joint dependence between the discretization error of the physical space and the statistical error associated with Monte Carlo realizations. The accuracy of the proposed framework is verified in this study by applying it to several computationally intensive examples. Having this framework at hand allows hydrogeologists to choose the optimum physical and statistical resolutions that minimize the error for a given computational budget. Moreover, the influence of the available computational resources and the geometric properties of the contaminant source zone on the optimum resolutions is investigated. We conclude that the computational cost associated with optimal allocation can be substantially reduced compared with prevalent recommendations in the literature.
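
    The trade-off described above can be made concrete with a toy overall-error model: a discretization term that scales like C1·h^p plus a Monte Carlo statistical term that scales like C2/√N, minimized subject to a budget in which each realization costs roughly the number of grid cells. The constants, exponents, and budget below are illustrative stand-ins, not the paper's calibrated error model.

```python
"""Toy optimal-allocation problem: grid spacing h vs. number of MC realizations N."""
import numpy as np

C1, p = 5.0, 2.0              # discretization-error constant and order (assumed)
C2 = 3.0                      # statistical-error constant (assumed)
budget = 1e7                  # total CPU budget (arbitrary units)

def cost_per_run(h):
    """Cost of one realization, taken proportional to the number of grid cells in 3-D."""
    return (1.0 / h) ** 3

h_grid = np.linspace(0.02, 0.5, 200)
N = np.maximum(1, np.floor(budget / cost_per_run(h_grid)))   # realizations affordable at each h
total_err = C1 * h_grid ** p + C2 / np.sqrt(N)

k = int(np.argmin(total_err))
print(f"optimal h ~ {h_grid[k]:.3f}, N ~ {int(N[k])}, total error ~ {total_err[k]:.4f}")
```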

  19. Uncertainty Analysis of Seebeck Coefficient and Electrical Resistivity Characterization

    NASA Technical Reports Server (NTRS)

    Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred

    2014-01-01

    In order to provide a complete description of a material's thermoelectric power factor, in addition to the measured nominal value, an uncertainty interval is required. The uncertainty may contain sources of measurement error including systematic bias error and precision error of a statistical nature. The work focuses specifically on the popular ZEM-3 (Ulvac Technologies) measurement system, but the methods apply to any measurement system. The analysis accounts for sources of systematic error including sample preparation tolerance, measurement probe placement, thermocouple cold-finger effect, and measurement parameters, in addition to uncertainty of a statistical nature. Complete uncertainty analysis of a measurement system allows for more reliable comparison of measurement data between laboratories.
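
    A generic illustration of the final combination step: systematic and statistical contributions are expressed as relative uncertainties and combined in quadrature, and the power factor S²σ picks up twice the Seebeck term. All numerical values below are invented placeholders, not ZEM-3 specifications.

```python
"""Root-sum-square combination of systematic and statistical uncertainties
for a thermoelectric power factor PF = S^2 * sigma (illustrative numbers)."""
import numpy as np

S, sigma = 180e-6, 9.0e4           # Seebeck coefficient (V/K) and conductivity (S/m), assumed
pf = S**2 * sigma

# Relative systematic contributions (assumed): probe placement, geometry, cold-finger effect.
sys_rel_S = np.array([0.02, 0.01, 0.015])
sys_rel_sigma = np.array([0.03, 0.01])
# Relative statistical (precision) contributions from repeated measurements (assumed).
stat_rel_S, stat_rel_sigma = 0.005, 0.008

u_rel_S = np.sqrt(np.sum(sys_rel_S**2) + stat_rel_S**2)
u_rel_sigma = np.sqrt(np.sum(sys_rel_sigma**2) + stat_rel_sigma**2)

# PF = S^2 * sigma  =>  the Seebeck term enters the quadrature sum with a factor of 2.
u_rel_pf = np.sqrt((2 * u_rel_S) ** 2 + u_rel_sigma ** 2)
print(f"PF = {pf * 1e3:.2f} mW m^-1 K^-2  +/- {100 * u_rel_pf:.1f} %")
```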

  20. Theory of injection locking and rapid start-up of magnetrons, and effects of manufacturing errors in terahertz traveling wave tubes

    NASA Astrophysics Data System (ADS)

    Pengvanich, Phongphaeth

    In this thesis, several contemporary issues on coherent radiation sources are examined. They include the fast startup and the injection locking of microwave magnetrons, and the effects of random manufacturing errors on phase and small signal gain of terahertz traveling wave amplifiers. In response to the rapid startup and low noise magnetron experiments performed at the University of Michigan that employed periodic azimuthal perturbations in the axial magnetic field, a systematic study of single particle orbits is performed for a crossed electric and periodic magnetic field. A parametric instability in the orbits, which brings a fraction of the electrons from the cathode toward the anode, is discovered. This offers an explanation of the rapid startup observed in the experiments. A phase-locking model has been constructed from circuit theory to qualitatively explain various regimes observed in kilowatt magnetron injection-locking experiments, which were performed at the University of Michigan. These experiments utilize two continuous-wave magnetrons; one functions as an oscillator and the other as a driver. Time and frequency domain solutions are developed from the model, allowing investigations into growth, saturation, and frequency response of the output. The model qualitatively recovers many of the phase-locking frequency characteristics observed in the experiments. Effects of frequency chirp and frequency perturbation on the phase and lockability have also been quantified. Development of traveling wave amplifiers operating at terahertz frequencies is a subject of current interest. The small circuit size has prompted a statistical analysis of the effects of random fabrication errors on the phase and small signal gain of these amplifiers. The small signal theory is treated with a continuum model in which the electron beam is monoenergetic. Circuit perturbations that vary randomly along the beam axis are introduced through the dimensionless Pierce parameters describing the beam-wave velocity mismatch (b), the gain parameter (C), and the cold tube circuit loss (d). Our study shows that perturbation in b dominates the other two in terms of power gain and phase shift. Extensive data show that the standard deviation of the output phase is linearly proportional to the standard deviation of the individual perturbations in b, C and d.

  1. Techniques for detecting effects of urban and rural land-use practices on stream-water chemistry in selected watersheds in Texas, Minnesota,and Illinois

    USGS Publications Warehouse

    Walker, J.F.

    1993-01-01

    Selected statistical techniques were applied to three urban watersheds in Texas and Minnesota and three rural watersheds in Illinois. For the urban watersheds, single- and paired-site data-collection strategies were considered. The paired-site strategy was much more effective than the single-site strategy for detecting changes. Analysis of storm load regression residuals demonstrated the potential utility of regressions for variability reduction. For the rural watersheds, none of the selected techniques were effective at identifying changes, primarily due to a small degree of management-practice implementation, potential errors introduced through the estimation of storm load, and small sample sizes. A Monte Carlo sensitivity analysis was used to determine the percent change in water chemistry that could be detected for each watershed. In most instances, the use of regressions improved the ability to detect changes.

  2. Communication: Calculation of interatomic forces and optimization of molecular geometry with auxiliary-field quantum Monte Carlo

    NASA Astrophysics Data System (ADS)

    Motta, Mario; Zhang, Shiwei

    2018-05-01

    We propose an algorithm for accurate, systematic, and scalable computation of interatomic forces within the auxiliary-field quantum Monte Carlo (AFQMC) method. The algorithm relies on the Hellmann-Feynman theorem and incorporates Pulay corrections in the presence of atomic orbital basis sets. We benchmark the method for small molecules by comparing the computed forces with the derivatives of the AFQMC potential energy surface and by direct comparison with other quantum chemistry methods. We then perform geometry optimizations using the steepest descent algorithm in larger molecules. With realistic basis sets, we obtain equilibrium geometries in agreement, within statistical error bars, with experimental values. The increase in computational cost for computing forces in this approach is only a small prefactor over that of calculating the total energy. This paves the way for a general and efficient approach for geometry optimization and molecular dynamics within AFQMC.

  3. Adjusting for radiotelemetry error to improve estimates of habitat use.

    Treesearch

    Scott L. Findholt; Bruce K. Johnson; Lyman L. McDonald; John W. Kern; Alan Ager; Rosemary J. Stussy; Larry D. Bryant

    2002-01-01

    Animal locations estimated from radiotelemetry have traditionally been treated as error-free when analyzed in relation to habitat variables. Location error lowers the power of statistical tests of habitat selection. We describe a method that incorporates the error surrounding point estimates into measures of environmental variables determined from a geographic...

  4. Trans-dimensional matched-field geoacoustic inversion with hierarchical error models and interacting Markov chains.

    PubMed

    Dettmer, Jan; Dosso, Stan E

    2012-10-01

    This paper develops a trans-dimensional approach to matched-field geoacoustic inversion, including interacting Markov chains to improve efficiency and an autoregressive model to account for correlated errors. The trans-dimensional approach and hierarchical seabed model allow inversion without assuming any particular parametrization by relaxing model specification to a range of plausible seabed models (e.g., in this case, the number of sediment layers is an unknown parameter). Data errors are addressed by sampling statistical error-distribution parameters, including correlated errors (covariance), by applying a hierarchical autoregressive error model. The well-known difficulty of low acceptance rates for trans-dimensional jumps is addressed with interacting Markov chains, resulting in a substantial increase in efficiency. The trans-dimensional seabed model and the hierarchical error model relax the degree of prior assumptions required in the inversion, resulting in substantially improved (more realistic) uncertainty estimates and a more automated algorithm. In particular, the approach gives seabed parameter uncertainty estimates that account for uncertainty due to prior model choice (layering and data error statistics). The approach is applied to data measured on a vertical array in the Mediterranean Sea.

  5. Analysis and meta-analysis of single-case designs: an introduction.

    PubMed

    Shadish, William R

    2014-04-01

    The last 10 years have seen great progress in the analysis and meta-analysis of single-case designs (SCDs). This special issue includes five articles that provide an overview of current work on that topic, including standardized mean difference statistics, multilevel models, Bayesian statistics, and generalized additive models. Each article analyzes a common example across articles and presents syntax or macros for how to do them. These articles are followed by commentaries from single-case design researchers and journal editors. This introduction briefly describes each article and then discusses several issues that must be addressed before we can know what analyses will eventually be best to use in SCD research. These issues include modeling trend, modeling error covariances, computing standardized effect size estimates, assessing statistical power, incorporating more accurate models of outcome distributions, exploring whether Bayesian statistics can improve estimation given the small samples common in SCDs, and the need for annotated syntax and graphical user interfaces that make complex statistics accessible to SCD researchers. The article then discusses reasons why SCD researchers are likely to incorporate statistical analyses into their research more often in the future, including changing expectations and contingencies regarding SCD research from outside SCD communities, changes and diversity within SCD communities, corrections of erroneous beliefs about the relationship between SCD research and statistics, and demonstrations of how statistics can help SCD researchers better meet their goals. Copyright © 2013 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  6. The role of ensemble-based statistics in variational assimilation of cloud-affected observations from infrared imagers

    NASA Astrophysics Data System (ADS)

    Hacker, Joshua; Vandenberghe, Francois; Jung, Byoung-Jo; Snyder, Chris

    2017-04-01

    Effective assimilation of cloud-affected radiance observations from space-borne imagers, with the aim of improving cloud analysis and forecasting, has proven to be difficult. Large observation biases, nonlinear observation operators, and non-Gaussian innovation statistics present many challenges. Ensemble-variational data assimilation (EnVar) systems offer the benefits of flow-dependent background error statistics from an ensemble, and the ability of variational minimization to handle nonlinearity. The specific benefits of ensemble statistics, relative to static background errors more commonly used in variational systems, have not been quantified for the problem of assimilating cloudy radiances. A simple experiment framework is constructed with a regional NWP model and operational variational data assimilation system, to provide a basis for understanding the importance of ensemble statistics in cloudy radiance assimilation. Restricting the observations to those corresponding to clouds in the background forecast leads to innovations that are more Gaussian. The number of large innovations is reduced compared to the more general case of all observations, but not eliminated. The Huber norm is investigated to handle the fat tails of the distributions, and to allow more observations to be assimilated without the need for strict background checks that eliminate them. Comparing assimilation using only ensemble background error statistics with assimilation using only static background error statistics elucidates the importance of the ensemble statistics. Although the cost functions in both experiments converge to similar values after sufficient outer-loop iterations, the resulting cloud water, ice, and snow content are greater in the ensemble-based analysis. The subsequent forecasts from the ensemble-based analysis also retain more condensed water species, indicating that the local environment is more supportive of clouds. In this presentation we provide details that explain the apparent benefit from using ensembles for cloudy radiance assimilation in an EnVar context.

  7. Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data

    PubMed Central

    Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders

    2017-01-01

    The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1–10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates. PMID:29196497
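
    For orientation, the classical single-base D-statistic that the authors improve upon is simple to compute from genome-wide ABBA/BABA counts, with a delete-one block jackknife supplying the standard error for the normal-approximation Z-test mentioned above; the per-block counts below are synthetic.

```python
"""Classical D-statistic (ABBA/BABA) with a block-jackknife Z-score (synthetic counts)."""
import numpy as np

rng = np.random.default_rng(7)
n_blocks = 100
# Per-genomic-block ABBA and BABA counts; a small excess of ABBA imitates admixture.
abba = rng.poisson(52, n_blocks).astype(float)
baba = rng.poisson(48, n_blocks).astype(float)

def d_stat(a, b):
    return (a.sum() - b.sum()) / (a.sum() + b.sum())

D = d_stat(abba, baba)

# Delete-one block jackknife for the standard error of D (equal block sizes assumed).
loo = np.array([d_stat(np.delete(abba, i), np.delete(baba, i)) for i in range(n_blocks)])
se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
print(f"D = {D:.4f},  Z = {D / se:.2f}   (|Z| > 3 is a common admixture threshold)")
```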

  8. On the Statistical Errors of RADAR Location Sensor Networks with Built-In Wi-Fi Gaussian Linear Fingerprints

    PubMed Central

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis. PMID:22737027

  9. On the statistical errors of RADAR location sensor networks with built-in Wi-Fi Gaussian linear fingerprints.

    PubMed

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis.

  10. Selecting a Classification Ensemble and Detecting Process Drift in an Evolving Data Stream

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heredia-Langner, Alejandro; Rodriguez, Luke R.; Lin, Andy

    2015-09-30

    We characterize the commercial behavior of a group of companies in a common line of business using a small ensemble of classifiers on a stream of records containing commercial activity information. This approach is able to effectively find a subset of classifiers that can be used to predict company labels with reasonable accuracy. Performance of the ensemble, its error rate under stable conditions, can be characterized using an exponentially weighted moving average (EWMA) statistic. The behavior of the EWMA statistic can be used to monitor a record stream from the commercial network and determine when significant changes have occurred. Results indicate that larger classification ensembles may not necessarily be optimal, pointing to the need to search the combinatorial classifier space in a systematic way. Results also show that current and past performance of an ensemble can be used to detect when statistically significant changes in the activity of the network have occurred. The dataset used in this work contains tens of thousands of high level commercial activity records with continuous and categorical variables and hundreds of labels, making classification challenging.
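
    The EWMA statistic referred to above is a one-line recursion with standard time-varying control limits; the smoothing constant, in-control error rate, and injected drift in the sketch below are assumptions, not values from the study.

```python
"""EWMA chart on a stream of classification error rates (synthetic drift)."""
import numpy as np

rng = np.random.default_rng(8)
mu0, sd0 = 0.10, 0.02                  # assumed in-control mean error rate and its SD
x = rng.normal(mu0, sd0, 300)
x[200:] += 0.03                         # process drift injected after record 200

lam, L = 0.2, 3.0                       # smoothing constant and control-limit width
z = np.empty_like(x)
z[0] = mu0
for t in range(1, x.size):
    z[t] = lam * x[t] + (1 - lam) * z[t - 1]

t_idx = np.arange(1, x.size + 1)
width = L * sd0 * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t_idx)))
alarms = np.flatnonzero(np.abs(z - mu0) > width)
print("first alarm at record:", alarms[0] if alarms.size else "none")
```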

  11. Valid statistical inference methods for a case-control study with missing data.

    PubMed

    Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun

    2018-04-01

    The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independency under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.

  12. Statistical learning from nonrecurrent experience with discrete input variables and recursive-error-minimization equations

    NASA Astrophysics Data System (ADS)

    Carter, Jeffrey R.; Simon, Wayne E.

    1990-08-01

    Neural networks are trained using Recursive Error Minimization (REM) equations to perform statistical classification. Using REM equations with continuous input variables reduces the required number of training experiences by factors of one to two orders of magnitude over standard back propagation. Replacing the continuous input variables with discrete binary representations reduces the number of connections by a factor proportional to the number of variables, reducing the required number of experiences by another order of magnitude. Undesirable effects of using recurrent experience to train neural networks for statistical classification problems are demonstrated, and nonrecurrent experience is used to avoid these undesirable effects. 1. THE 1-4I PROBLEM The statistical classification problem which we address is that of assigning points in d-dimensional space to one of two classes. The first class has a covariance matrix of I (the identity matrix); the covariance matrix of the second class is 4I. For this reason the problem is known as the 1-4I problem. Both classes have equal probability of occurrence and samples from both classes may appear anywhere throughout the d-dimensional space. Most samples near the origin of the coordinate system will be from the first class while most samples away from the origin will be from the second class. Since the two classes completely overlap it is impossible to have a classifier with zero error. The minimum possible error is known as the Bayes error and
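
    Because both class-conditional densities are known, the Bayes error of the 1-4I problem can be estimated directly by Monte Carlo: with equal priors, the optimal rule compares N(0, I) with N(0, 4I), which reduces to thresholding the squared norm of the sample at (8/3)·d·ln 2. The dimension and sample counts below are arbitrary choices for the sketch.

```python
"""Monte Carlo estimate of the Bayes error for the I vs. 4I Gaussian problem."""
import numpy as np

rng = np.random.default_rng(9)
d, n = 8, 200_000                       # dimension (arbitrary) and samples per class

x1 = rng.normal(0.0, 1.0, (n, d))       # class 1: covariance I
x2 = rng.normal(0.0, 2.0, (n, d))       # class 2: covariance 4I (standard deviation 2)

# Equal priors: choose class 2 when ||x||^2 > (8/3) d ln 2 (from the likelihood ratio).
threshold = (8.0 / 3.0) * d * np.log(2.0)
err1 = np.mean(np.sum(x1**2, axis=1) > threshold)    # class 1 misclassified as class 2
err2 = np.mean(np.sum(x2**2, axis=1) <= threshold)   # class 2 misclassified as class 1
print(f"estimated Bayes error (d={d}): {(err1 + err2) / 2:.4f}")
```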

  13. Estimating the Magnitude and Frequency of Floods in Small Urban Streams in South Carolina, 2001

    USGS Publications Warehouse

    Feaster, Toby D.; Guimaraes, Wladimir B.

    2004-01-01

    The magnitude and frequency of floods at 20 streamflow-gaging stations on small, unregulated urban streams in or near South Carolina were estimated by fitting the measured water-year peak flows to a log-Pearson Type-III distribution. The period of record (through September 30, 2001) for the measured water-year peak flows ranged from 11 to 25 years with a mean and median length of 16 years. The drainage areas of the streamflow-gaging stations ranged from 0.18 to 41 square miles. Based on the flood-frequency estimates from the 20 streamflow-gaging stations (13 in South Carolina; 4 in North Carolina; and 3 in Georgia), generalized least-squares regression was used to develop regional regression equations. These equations can be used to estimate the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence-interval flows for small urban streams in the Piedmont, upper Coastal Plain, and lower Coastal Plain physiographic provinces of South Carolina. The most significant explanatory variables from this analysis were main-channel length, percent impervious area, and basin development factor. Mean standard errors of prediction for the regression equations ranged from -25 to 33 percent for the 10-year recurrence-interval flows and from -35 to 54 percent for the 100-year recurrence-interval flows. The U.S. Geological Survey has developed a Geographic Information System application called StreamStats that makes the process of computing streamflow statistics at ungaged sites faster and more consistent than manual methods. This application was developed in the Massachusetts District and ongoing work is being done in other districts to develop a similar application using streamflow statistics relative to those respective States. Considering the future possibility of implementing StreamStats in South Carolina, an alternative set of regional regression equations was developed using only main-channel length and impervious area. This was done because no digital coverages are currently available for basin development factor and, therefore, it could not be included in the StreamStats application. The average mean standard error of prediction for the alternative equations was 2 to 5 percent larger than the standard errors for the equations that contained basin development factor. For the urban streamflow-gaging stations in South Carolina, measured water-year peak flows were compared with those from an earlier urban flood-frequency investigation. The peak flows from the earlier investigation were computed using a rainfall-runoff model. At many of the sites, graphical comparisons indicated that the variance of the measured data was much less than the variance of the simulated data. Several statistical tests were applied to compare the variances and the means of the measured and simulated data for each site. The results indicated that the variances were significantly different for 11 of the 13 South Carolina streamflow-gaging stations. For one streamflow-gaging station, the test for normality, which is one of the assumptions of the data when comparing variances, indicated that neither the measured data nor the simulated data were distributed normally; therefore, the test for differences in the variances was not used for that streamflow-gaging station. Another statistical test was used to test for statistically significant differences in the means of the measured and simulated data.
The results indicated that for 5 of the 13 urban streamflow-gaging stations in South Carolina there was a statistically significant difference in the means of the two data sets. For comparison purposes and to test the hypothesis that there may have been climatic differences between the period in which the measured peak-flow data were measured and the period for which historic rainfall data were used to compute the simulated peak flows, 16 rural streamflow-gaging stations with long-term records were reviewed using similar techniques as those used for the measured an
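
    The station-by-station fitting step described above can be sketched with scipy by fitting a Pearson Type III distribution to the base-10 logarithms of the annual peaks and reading off recurrence-interval quantiles; the peaks below are synthetic, and refinements such as regional skew weighting or low-outlier screening are omitted.

```python
"""Fit log-Pearson Type III to annual peak flows and read off recurrence-interval quantiles."""
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
peaks = rng.lognormal(mean=8.0, sigma=0.8, size=20)     # synthetic annual peaks (cfs), ~20 years

logq = np.log10(peaks)
skew, loc, scale = stats.pearson3.fit(logq)              # sample fit; no weighted/regional skew

for T in (2, 10, 25, 50, 100):
    p_nonexceed = 1.0 - 1.0 / T
    q_T = 10 ** stats.pearson3.ppf(p_nonexceed, skew, loc=loc, scale=scale)
    print(f"{T:>4}-year flow: {q_T:,.0f} cfs")
```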

  14. A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

    ERIC Educational Resources Information Center

    Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T.

    2010-01-01

    This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

  15. Impact of transport model errors on the global and regional methane emissions estimated by inverse modelling

    DOE PAGES

    Locatelli, R.; Bousquet, P.; Chevallier, F.; ...

    2013-10-08

    A modelling experiment has been conceived to assess the impact of transport model errors on methane emissions estimated in an atmospheric inversion system. Synthetic methane observations, obtained from 10 different model outputs from the international TransCom-CH4 model inter-comparison exercise, are combined with a prior scenario of methane emissions and sinks, and integrated into the three-component PYVAR-LMDZ-SACS (PYthon VARiational-Laboratoire de Météorologie Dynamique model with Zooming capability-Simplified Atmospheric Chemistry System) inversion system to produce 10 different methane emission estimates at the global scale for the year 2005. The same methane sinks, emissions and initial conditions have been applied to produce the 10 synthetic observation datasets. The same inversion set-up (statistical errors, prior emissions, inverse procedure) is then applied to derive flux estimates by inverse modelling. Consequently, only differences in the modelling of atmospheric transport may cause differences in the estimated fluxes. Here in our framework, we show that transport model errors lead to a discrepancy of 27 Tg yr^-1 at the global scale, representing 5% of total methane emissions. At continental and annual scales, transport model errors are proportionally larger than at the global scale, with errors ranging from 36 Tg yr^-1 in North America to 7 Tg yr^-1 in Boreal Eurasia (from 23 to 48%, respectively). At the model grid-scale, the spread of inverse estimates can reach 150% of the prior flux. Therefore, transport model errors contribute significantly to overall uncertainties in emission estimates by inverse modelling, especially when small spatial scales are examined. Sensitivity tests have been carried out to estimate the impact of the measurement network and the advantage of higher horizontal resolution in transport models. The large differences found between methane flux estimates inferred in these different configurations highly question the consistency of transport model errors in current inverse systems.

  16. Photospheric Magnetic Field Properties of Flaring versus Flare-quiet Active Regions. II. Discriminant Analysis

    NASA Astrophysics Data System (ADS)

    Leka, K. D.; Barnes, G.

    2003-10-01

    We apply statistical tests based on discriminant analysis to the wide range of photospheric magnetic parameters described in a companion paper by Leka & Barnes, with the goal of identifying those properties that are important for the production of energetic events such as solar flares. The photospheric vector magnetic field data from the University of Hawai'i Imaging Vector Magnetograph are well sampled both temporally and spatially, and we include here data covering 24 flare-event and flare-quiet epochs taken from seven active regions. The mean value and rate of change of each magnetic parameter are treated as separate variables, thus evaluating both the parameter's state and its evolution, to determine which properties are associated with flaring. Considering single variables first, Hotelling's T2-tests show small statistical differences between flare-producing and flare-quiet epochs. Even pairs of variables considered simultaneously, which do show a statistical difference for a number of properties, have high error rates, implying a large degree of overlap of the samples. To better distinguish between flare-producing and flare-quiet populations, larger numbers of variables are simultaneously considered; lower error rates result, but no unique combination of variables is clearly the best discriminator. The sample size is too small to directly compare the predictive power of large numbers of variables simultaneously. Instead, we rank all possible four-variable permutations based on Hotelling's T2-test and look for the most frequently appearing variables in the best permutations, with the interpretation that they are most likely to be associated with flaring. These variables include an increasing kurtosis of the twist parameter and a larger standard deviation of the twist parameter, but a smaller standard deviation of the distribution of the horizontal shear angle and a horizontal field that has a smaller standard deviation but a larger kurtosis. To support the "sorting all permutations" method of selecting the most frequently occurring variables, we show that the results of a single 10-variable discriminant analysis are consistent with the ranking. We demonstrate that individually, the variables considered here have little ability to differentiate between flaring and flare-quiet populations, but with multivariable combinations, the populations may be distinguished.
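
    The Hotelling's T2 statistic used above to rank variable combinations has a compact closed form: a pooled within-group covariance, a quadratic form in the mean difference, and an F reference distribution. The sketch below applies it to two small synthetic groups of four-parameter vectors standing in for the flare-event and flare-quiet epochs.

```python
"""Two-sample Hotelling's T^2 test (synthetic stand-in for flaring vs. flare-quiet epochs)."""
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
p = 4                                    # number of magnetic parameters considered together
flare = rng.multivariate_normal([0.3, 0.1, 0.0, 0.2], np.eye(p), size=12)
quiet = rng.multivariate_normal([0.0, 0.0, 0.0, 0.0], np.eye(p), size=12)

n1, n2 = len(flare), len(quiet)
diff = flare.mean(axis=0) - quiet.mean(axis=0)
S_pooled = ((n1 - 1) * np.cov(flare.T) + (n2 - 1) * np.cov(quiet.T)) / (n1 + n2 - 2)

t2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(S_pooled, diff)
f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2       # exact F transformation
p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)
print(f"T^2 = {t2:.2f}, F = {f_stat:.2f}, p = {p_value:.3f}")
```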

  17. High-resolution atmospheric inversion of urban CO2 emissions during the dormant season of the Indianapolis Flux Experiment (INFLUX)

    NASA Astrophysics Data System (ADS)

    Lauvaux, Thomas; Miles, Natasha L.; Deng, Aijun; Richardson, Scott J.; Cambaliza, Maria O.; Davis, Kenneth J.; Gaudet, Brian; Gurney, Kevin R.; Huang, Jianhua; O'Keefe, Darragh; Song, Yang; Karion, Anna; Oda, Tomohiro; Patarasuk, Risa; Razlivanov, Igor; Sarmiento, Daniel; Shepson, Paul; Sweeney, Colm; Turnbull, Jocelyn; Wu, Kai

    2016-05-01

    Based on a uniquely dense network of surface towers continuously measuring the atmospheric concentrations of greenhouse gases (GHGs), we developed the first comprehensive monitoring system of CO2 emissions at high resolution over the city of Indianapolis. The urban inversion evaluated over the 2012-2013 dormant season showed a statistically significant increase of about 20% (from 4.5 to 5.7 MtC ± 0.23 MtC) compared to the Hestia CO2 emission estimate, a state-of-the-art building-level emission product. Spatial structures in prior emission errors, mostly undetermined, appeared to affect the spatial pattern in the inverse solution and the total carbon budget over the entire area by up to 15%, while the inverse solution remained fairly insensitive to the CO2 boundary inflow and to the different prior emissions (i.e., ODIAC). Preceding the surface emission optimization, we improved the atmospheric simulations using a meteorological data assimilation system that also informed our Bayesian inversion system through updated observation error variances. Finally, we estimated the uncertainties associated with undetermined parameters using an ensemble of inversions. The total CO2 emissions based on the ensemble mean and quartiles (5.26-5.91 MtC) were statistically different from the prior total emissions (4.1 to 4.5 MtC). Considering the relatively small sensitivity to the different parameters, we conclude that atmospheric inversions are potentially able to constrain the carbon budget of the city, assuming sufficient data to measure the inflow of GHG over the city, but additional information on prior emission error structures is required to determine the spatial structures of urban emissions at high resolution.

  18. How large are the consequences of covariate imbalance in cluster randomized trials: a simulation study with a continuous outcome and a binary covariate at the cluster level.

    PubMed

    Moerbeek, Mirjam; van Schie, Sander

    2016-07-11

    The number of clusters in a cluster randomized trial is often low. It is therefore likely that random assignment of clusters to treatment conditions results in covariate imbalance. There are no studies that quantify the consequences of covariate imbalance in cluster randomized trials on parameter and standard error bias and on power to detect treatment effects. The consequences of covariate imbalance in unadjusted and adjusted linear mixed models are investigated by means of a simulation study. The factors in this study are the degree of imbalance, the covariate effect size, the cluster size and the intraclass correlation coefficient. The covariate is binary and measured at the cluster level; the outcome is continuous and measured at the individual level. The results show that covariate imbalance results in negligible parameter bias and small standard error bias in adjusted linear mixed models. Ignoring the possibility of covariate imbalance while calculating the sample size at the cluster level may result in a loss in power of at most 25% in the adjusted linear mixed model. The results are more severe for the unadjusted linear mixed model: parameter biases up to 100% and standard error biases up to 200% may be observed. Power levels based on the unadjusted linear mixed model are often too low. The consequences are most severe for large clusters and/or small intraclass correlation coefficients since then the required number of clusters to achieve a desired power level is smallest. The possibility of covariate imbalance should be taken into account while calculating the sample size of a cluster randomized trial. Otherwise more sophisticated methods to randomize clusters to treatments should be used, such as stratification or balance algorithms. All relevant covariates should be carefully identified, be actually measured and included in the statistical model to avoid severe levels of parameter and standard error bias and insufficient power levels.

  19. Uncertainty in biological monitoring: a framework for data collection and analysis to account for multiple sources of sampling bias

    USGS Publications Warehouse

    Ruiz-Gutierrez, Viviana; Hooten, Melvin B.; Campbell Grant, Evan H.

    2016-01-01

    Biological monitoring programmes are increasingly relying upon large volumes of citizen-science data to improve the scope and spatial coverage of information, challenging the scientific community to develop design and model-based approaches to improve inference.Recent statistical models in ecology have been developed to accommodate false-negative errors, although current work points to false-positive errors as equally important sources of bias. This is of particular concern for the success of any monitoring programme given that rates as small as 3% could lead to the overestimation of the occurrence of rare events by as much as 50%, and even small false-positive rates can severely bias estimates of occurrence dynamics.We present an integrated, computationally efficient Bayesian hierarchical model to correct for false-positive and false-negative errors in detection/non-detection data. Our model combines independent, auxiliary data sources with field observations to improve the estimation of false-positive rates, when a subset of field observations cannot be validated a posteriori or assumed as perfect. We evaluated the performance of the model across a range of occurrence rates, false-positive and false-negative errors, and quantity of auxiliary data.The model performed well under all simulated scenarios, and we were able to identify critical auxiliary data characteristics which resulted in improved inference. We applied our false-positive model to a large-scale, citizen-science monitoring programme for anurans in the north-eastern United States, using auxiliary data from an experiment designed to estimate false-positive error rates. Not correcting for false-positive rates resulted in biased estimates of occupancy in 4 of the 10 anuran species we analysed, leading to an overestimation of the average number of occupied survey routes by as much as 70%.The framework we present for data collection and analysis is able to efficiently provide reliable inference for occurrence patterns using data from a citizen-science monitoring programme. However, our approach is applicable to data generated by any type of research and monitoring programme, independent of skill level or scale, when effort is placed on obtaining auxiliary information on false-positive rates.

  20. Statistical model to perform error analysis of curve fits of wind tunnel test data using the techniques of analysis of variance and regression analysis

    NASA Technical Reports Server (NTRS)

    Alston, D. W.

    1981-01-01

    The objective of this research was to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the resulting statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to remove the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
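
    The step of increasing the curve-fit order until the quadratic effect disappears from the residuals amounts to a nested-model F test between successive polynomial fits; a compact version on synthetic wind-tunnel-style data (the lift-curve shape and noise level are assumptions) is shown below.

```python
"""Nested-model F test for choosing the polynomial order of a curve fit (synthetic data)."""
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
alpha_deg = np.linspace(-4, 12, 25)                        # angle-of-attack sweep
cl = 0.1 + 0.09 * alpha_deg - 0.002 * alpha_deg**2 + rng.normal(0, 0.01, alpha_deg.size)

def rss(order):
    """Residual sum of squares of a least-squares polynomial fit of the given order."""
    coeffs = np.polyfit(alpha_deg, cl, order)
    return np.sum((cl - np.polyval(coeffs, alpha_deg)) ** 2)

n = alpha_deg.size
for order in (1, 2, 3):
    rss_lo, rss_hi = rss(order), rss(order + 1)
    df_hi = n - (order + 2)                                # residual df of the larger model
    F = (rss_lo - rss_hi) / (rss_hi / df_hi)               # one extra coefficient tested
    p = stats.f.sf(F, 1, df_hi)
    print(f"order {order} -> {order + 1}: F = {F:6.2f}, p = {p:.4f}")
```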

  1. Parameter optimization in biased decoy-state quantum key distribution with both source errors and statistical fluctuations

    NASA Astrophysics Data System (ADS)

    Zhu, Jian-Rong; Li, Jian; Zhang, Chun-Mei; Wang, Qin

    2017-10-01

    The decoy-state method has been widely used in commercial quantum key distribution (QKD) systems. In view of the practical decoy-state QKD with both source errors and statistical fluctuations, we propose a universal model of full parameter optimization in biased decoy-state QKD with phase-randomized sources. In addition, we adopt this model to carry out simulations of two widely used sources: the weak coherent source (WCS) and the heralded single-photon source (HSPS). Results show that full parameter optimization can significantly improve not only the secure transmission distance but also the final key generation rate. Moreover, when source errors and statistical fluctuations are taken into account, the performance of decoy-state QKD using an HSPS suffers less than that of decoy-state QKD using a WCS.

  2. Improving the quality of mass produced maps

    USGS Publications Warehouse

    Simley, J.

    2001-01-01

    Quality is critical in cartography because key decisions are often made based on the information the map communicates. The mass production of digital cartographic information to support geographic information science has now added a new dimension to the problem of cartographic quality, as problems once limited to small volumes can now proliferate in mass production programs. These problems can also affect the economics of map production by diverting a sizeable portion of production cost to pay for rework on maps with poor quality. Such problems are common across industry; in response, the quality engineering profession has developed a number of successful methods to overcome these problems. Two important methods are the reduction of error through statistical analysis and addressing the quality environment in which people work. Once initial and obvious quality problems have been solved, outside influences periodically appear that cause adverse variations in quality and consequently increase production costs. Such errors can be difficult to detect before the customer is affected. However, a number of statistical techniques can be employed to detect variation so that the problem is eliminated before significant damage is caused. Additionally, the environment in which the workforce operates must be conducive to quality. Managers have a powerful responsibility to create this environment. Two sets of guidelines, known as Deming's Fourteen Points and ISO-9000, provide models for this environment.

  3. On the robustness of bucket brigade quantum RAM

    NASA Astrophysics Data System (ADS)

    Arunachalam, Srinivasan; Gheorghiu, Vlad; Jochym-O'Connor, Tomas; Mosca, Michele; Varshinee Srinivasan, Priyaa

    2015-12-01

    We study the robustness of the bucket brigade quantum random access memory model introduced by Giovannetti et al (2008 Phys. Rev. Lett. 100 160501). Due to a result of Regev and Schiff (ICALP ’08 733), we show that for a class of error models the error rate per gate in the bucket brigade quantum memory has to be of order o(2^(-n/2)) (where N = 2^n is the size of the memory) whenever the memory is used as an oracle for the quantum searching problem. We conjecture that this is the case for any realistic error model that will be encountered in practice, and that for algorithms with super-polynomially many oracle queries the error rate must be super-polynomially small, which further motivates the need for quantum error correction. By contrast, for algorithms such as matrix inversion Harrow et al (2009 Phys. Rev. Lett. 103 150502) or quantum machine learning Rebentrost et al (2014 Phys. Rev. Lett. 113 130503) that only require a polynomial number of queries, the error rate only needs to be polynomially small and quantum error correction may not be required. We introduce a circuit model for the quantum bucket brigade architecture and argue that quantum error correction for the circuit causes the quantum bucket brigade architecture to lose its primary advantage of a small number of ‘active’ gates, since all components have to be actively error corrected.

  4. Sampling errors for satellite-derived tropical rainfall - Monte Carlo study using a space-time stochastic model

    NASA Technical Reports Server (NTRS)

    Bell, Thomas L.; Abdullah, A.; Martin, Russell L.; North, Gerald R.

    1990-01-01

    Estimates of monthly average rainfall based on satellite observations from a low earth orbit will differ from the true monthly average because the satellite observes a given area only intermittently. This sampling error inherent in satellite monitoring of rainfall would occur even if the satellite instruments could measure rainfall perfectly. The size of this error is estimated for a satellite system being studied at NASA, the Tropical Rainfall Measuring Mission (TRMM). First, the statistical description of rainfall on scales from 1 to 1000 km is examined in detail, based on rainfall data from the Global Atmospheric Research Project Atlantic Tropical Experiment (GATE). A TRMM-like satellite is flown over a two-dimensional time-evolving simulation of rainfall using a stochastic model with statistics tuned to agree with GATE statistics. The distribution of sampling errors found from many months of simulated observations is found to be nearly normal, even though the distribution of area-averaged rainfall is far from normal. For a range of orbits likely to be employed in TRMM, sampling error is found to be less than 10 percent of the mean for rainfall averaged over a 500 x 500 sq km area.
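
    A much-simplified Python sketch conveys the Monte Carlo idea described above: sample a synthetic rain-rate series only at satellite-like overpass times and look at the spread of the resulting monthly means. It is not the TRMM study itself; the rain model (uncorrelated in time, which understates the true sampling error) and the 12-hour revisit interval are illustrative assumptions:

        # Toy Monte Carlo estimate of the sampling error from intermittent satellite overpasses.
        import numpy as np

        rng = np.random.default_rng(3)
        n_hours = 30 * 24                     # one "month" of hourly area-averaged rain rates
        revisit = 12                          # hours between overpasses (assumed)

        def simulate_rain(n):
            """Intermittent, positively skewed rain-rate series in mm/h (toy, no time correlation)."""
            raining = rng.random(n) < 0.1
            return np.where(raining, rng.gamma(shape=1.5, scale=2.0, size=n), 0.0)

        errors = []
        for _ in range(2000):
            rain = simulate_rain(n_hours)
            true_mean = rain.mean()
            sampled_mean = rain[::revisit].mean()
            errors.append((sampled_mean - true_mean) / true_mean)

        errors = np.array(errors)
        print(f"relative sampling error: std {errors.std():.1%}, mean bias {errors.mean():.1%}")
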

  5. Discovery of error-tolerant biclusters from noisy gene expression data.

    PubMed

    Gupta, Rohit; Rao, Navneet; Kumar, Vipin

    2011-11-24

    An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations, such as the inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; and the inability of some of the approaches to find overlapping biclusters, which is crucial as many genes participate in multiple biological processes. Association pattern mining also produces biclusters and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, which limits its applicability in real-life data sets where the biclusters may be fragmented due to random noise/errors. Moreover, as it only works with binary or Boolean attributes, its application to gene-expression data requires transforming real-valued attributes to binary attributes, which often results in loss of information. Many past approaches have tried to address the issue of noise and the handling of real-valued attributes independently, but there is no systematic approach that addresses both of these issues together. In this paper, we first propose a novel error-tolerant biclustering model, 'ET-bicluster', and then propose a bottom-up heuristic-based mining algorithm to sequentially discover error-tolerant biclusters directly from real-valued gene-expression data. The efficacy of our proposed approach is illustrated by comparing it with a recent approach, RAP, in the context of two biological problems: discovery of functional modules and discovery of biomarkers. For the first problem, two real-valued S. cerevisiae microarray gene-expression data sets are used to demonstrate that the biclusters obtained from the ET-bicluster approach not only recover a larger set of genes than those obtained from the RAP approach but also have higher functional coherence as evaluated using GO-based functional enrichment analysis. The statistical significance of the discovered error-tolerant biclusters, as estimated using two randomization tests, reveals that they are indeed biologically meaningful and statistically significant. For the second problem of biomarker discovery, we used four real-valued breast cancer microarray gene-expression data sets and evaluated the biomarkers obtained using MSigDB gene sets. The results obtained for both problems, functional module discovery and biomarker discovery, clearly signify the usefulness of the proposed ET-bicluster approach and illustrate the importance of explicitly incorporating noise/errors in discovering coherent groups of genes from gene-expression data.

  6. On the statistical assessment of classifiers using DNA microarray data

    PubMed Central

    Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

    2006-01-01

    Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover, the performances of the RLS and SVM classifiers do not change when 74% of the genes are used, and degrade only gradually, reaching e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis is discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171
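
    The combination of cross-validated error estimation with a permutation p-value can be sketched in a few lines of Python. This is not the paper's pipeline; the synthetic "expression" matrix, the number of informative genes, and the SVM settings are assumptions standing in for the real microarray data:

        # Cross-validated error rate with a permutation-test p-value on synthetic expression-like data.
        import numpy as np
        from sklearn.model_selection import StratifiedKFold, permutation_test_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(42)
        n_samples, n_genes, n_informative = 47, 500, 20
        y = np.array([0] * 22 + [1] * 25)                       # normal vs tumor labels (as in the study)
        X = rng.normal(size=(n_samples, n_genes))
        X[y == 1, :n_informative] += 0.8                        # shift a few informative "genes" (assumed)

        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
        score, perm_scores, p_value = permutation_test_score(
            SVC(kernel="linear"), X, y, cv=cv, n_permutations=200, random_state=0)
        print(f"error rate e = {1 - score:.2f}, permutation p = {p_value:.3f}")

    The permutation p-value answers the abstract's question about the statistical significance of a given error rate: it is the fraction of label-shuffled data sets on which the classifier does at least as well.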

  7. Backus Effect and Perpendicular Errors in Harmonic Models of Real vs. Synthetic Data

    NASA Technical Reports Server (NTRS)

    Voorhies, C. V.; Santana, J.; Sabaka, T.

    1999-01-01

    Measurements of geomagnetic scalar intensity on a thin spherical shell alone are not enough to separate internal from external source fields; moreover, such scalar data are not enough for accurate modeling of the vector field from internal sources because of unmodeled fields and small data errors. Spherical harmonic models of the geomagnetic potential fitted to scalar data alone therefore suffer from the well-understood Backus effect and from perpendicular errors. Curiously, errors in some models of simulated 'data' are very much less than those in models of real data. We analyze select Magsat vector and scalar measurements separately to illustrate the Backus effect and perpendicular errors in models of real scalar data. By using a model to synthesize 'data' at the observation points, and by adding various types of 'noise', we illustrate such errors in models of synthetic 'data'. Perpendicular errors prove quite sensitive to the maximum degree in the spherical harmonic expansion of the potential field model fitted to the scalar data. Small errors in models of synthetic 'data' are found to be an artifact of matched truncation levels. For example, consider scalar synthetic 'data' computed from a degree 14 model. A degree 14 model fitted to such synthetic 'data' yields negligible error, but amplifies 4 nT (rmss) added noise into a 60 nT error (rmss); however, a degree 12 model fitted to the noisy 'data' suffers a 492 nT error (rmss through degree 12). Geomagnetic measurements remain unaware of model truncation, so the small errors indicated by some simulations cannot be realized in practice. Errors in models fitted to scalar data alone approach 1000 nT (rmss) and several thousand nT (maximum).

  8. Comparison of base flows to selected streamflow statistics representative of 1930-2002 in West Virginia

    USGS Publications Warehouse

    Wiley, Jeffrey B.

    2012-01-01

    Base flows were compared with published streamflow statistics to assess climate variability and to determine the published statistics that can be substituted for annual and seasonal base flows of unregulated streams in West Virginia. The comparison study was done by the U.S. Geological Survey, in cooperation with the West Virginia Department of Environmental Protection, Division of Water and Waste Management. The seasons were defined as winter (January 1-March 31), spring (April 1-June 30), summer (July 1-September 30), and fall (October 1-December 31). Differences in mean annual base flows for five record sub-periods (1930-42, 1943-62, 1963-69, 1970-79, and 1980-2002) range from -14.9 to 14.6 percent when compared to the values for the period 1930-2002. Differences between mean seasonal base flows and values for the period 1930-2002 are less variable for winter and spring, -11.2 to 11.0 percent, than for summer and fall, -47.0 to 43.6 percent. Mean summer base flows (July-September) and mean monthly base flows for July, August, September, and October are approximately equal, within 7.4 percentage points of mean annual base flow. The means of the annual, spring, summer, fall, and winter base flows are approximately equal to the annual 50-percent (standard error of 10.3 percent), 45-percent (error of 14.6 percent), 75-percent (error of 11.8 percent), 55-percent (error of 11.2 percent), and 35-percent duration flows (error of 11.1 percent), respectively. The mean seasonal base flows for spring, summer, fall, and winter are approximately equal to the spring 50- to 55-percent (standard error of 6.8 percent), summer 45- to 50-percent (error of 6.7 percent), fall 45-percent (error of 15.2 percent), and winter 60-percent duration flows (error of 8.5 percent), respectively. Annual and seasonal base flows representative of the period 1930-2002 at unregulated streamflow-gaging stations and ungaged locations in West Virginia can be estimated using previously published values of statistics and procedures.

  9. Prediction of pilot reserve attention capacity during air-to-air target tracking

    NASA Technical Reports Server (NTRS)

    Onstott, E. D.; Faulkner, W. H.

    1977-01-01

    Reserve attention capacity of a pilot was calculated using a pilot model that allocates exclusive model attention according to the ranking of task urgency functions whose variables are tracking error and error rate. The modeled task consisted of tracking a maneuvering target aircraft both vertically and horizontally, and when possible, performing a diverting side task which was simulated by the precise positioning of an electrical stylus and modeled as a task of constant urgency in the attention allocation algorithm. The urgency of the single loop vertical task is simply the magnitude of the vertical tracking error, while the multiloop horizontal task requires a nonlinear urgency measure of error and error rate terms. Comparison of model results with flight simulation data verified the computed model statistics of tracking error of both axes, lateral and longitudinal stick amplitude and rate, and side task episodes. Full data for the simulation tracking statistics as well as the explicit equations and structure of the urgency function multiaxis pilot model are presented.

  10. Notes on power of normality tests of error terms in regression models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Střelec, Luboš

    2015-03-10

    Normality is one of the basic assumptions in applying statistical procedures. For example, in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as the t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to make inferences that are not misleading, which explains the necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
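
    A minimal Python sketch of the setting discussed above fits an ordinary least-squares model and applies two classical normality tests to its residuals (the robust RT-class tests of the paper are not reproduced here); the heavy-tailed data-generating process is an assumption for illustration:

        # Fit OLS, then test the residuals for normality with Shapiro-Wilk and Jarque-Bera.
        import numpy as np
        from scipy import stats
        import statsmodels.api as sm

        rng = np.random.default_rng(5)
        n = 200
        x = rng.uniform(0, 10, n)
        errors = rng.standard_t(df=3, size=n)        # heavy-tailed, i.e. non-normal, errors (assumed)
        y = 1.0 + 2.0 * x + errors

        X = sm.add_constant(x)
        resid = sm.OLS(y, X).fit().resid

        w, p_shapiro = stats.shapiro(resid)
        jb, p_jb = stats.jarque_bera(resid)
        print(f"Shapiro-Wilk p = {p_shapiro:.4f}, Jarque-Bera p = {p_jb:.4f}")
        # Small p-values flag non-normal error terms, warning that t- and F-based inference
        # on the regression coefficients may be unreliable.
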

  11. Refractive errors among students occupying rooms lighted with incandescent or fluorescent lamps.

    PubMed

    Czepita, Damian; Gosławski, Wojciech; Mojsa, Artur

    2004-01-01

    The purpose of the study was to determine whether the development of refractive errors could be associated with exposure to light emitted by incandescent or fluorescent lamps. 3636 students were examined (1638 boys and 1998 girls, aged 6-18 years, mean age 12.1, SD 3.4). The examination included retinoscopy with cycloplegia. Myopia was defined as refractive error ≤ -0.5 D, hyperopia as refractive error ≥ +1.5 D, astigmatism as refractive error > 0.5 DC. Anisometropia was diagnosed when the difference in the refraction of both eyes was > 1.0 D. The children and their parents completed a questionnaire on exposure to light at home. Data were analyzed statistically with the chi-square test. P values of less than 0.05 were considered statistically significant. It was found that the use of fluorescent lamps was associated with an increase in the occurrence of hyperopia (P < 0.01). There was no association between sleeping with the light turned on and prevalence of refractive errors.

  12. Design of association studies with pooled or un-pooled next-generation sequencing data.

    PubMed

    Kim, Su Yeon; Li, Yingrui; Guo, Yiran; Li, Ruiqiang; Holmkvist, Johan; Hansen, Torben; Pedersen, Oluf; Wang, Jun; Nielsen, Rasmus

    2010-07-01

    Most common hereditary diseases in humans are complex and multifactorial. Large-scale genome-wide association studies based on SNP genotyping have only identified a small fraction of the heritable variation of these diseases. One explanation may be that many rare variants (a minor allele frequency, MAF <5%), which are not included in the common genotyping platforms, may contribute substantially to the genetic variation of these diseases. Next-generation sequencing, which would allow the analysis of rare variants, is now becoming so cheap that it provides a viable alternative to SNP genotyping. In this paper, we present cost-effective protocols for using next-generation sequencing in association mapping studies based on pooled and un-pooled samples, and identify optimal designs with respect to total number of individuals, number of individuals per pool, and the sequencing coverage. We perform a small empirical study to evaluate the pooling variance in a realistic setting where pooling is combined with exon-capturing. To test for associations, we develop a likelihood ratio statistic that accounts for the high error rate of next-generation sequencing data. We also perform extensive simulations to determine the power and accuracy of this method. Overall, our findings suggest that with a fixed cost, sequencing many individuals at a more shallow depth with larger pool size achieves higher power than sequencing a small number of individuals in higher depth with smaller pool size, even in the presence of high error rates. Our results provide guidelines for researchers who are developing association mapping studies based on next-generation sequencing. (c) 2010 Wiley-Liss, Inc.
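
    The idea of a likelihood-ratio test that folds the sequencing error rate into the read-level probability can be sketched as follows. This is a simplified stand-in for illustration, not the authors' statistic; the read counts, the per-base error rate, and the closed-form frequency estimate are all assumptions:

        # Simplified pooled-sequencing association test: binomial LRT with an error-adjusted
        # alt-read probability q = f(1-e) + (1-f)e, comparing a case pool and a control pool.
        import numpy as np
        from scipy import stats

        def loglik(k, n, f, err):
            """Binomial log-likelihood of k alt reads out of n given pool alt frequency f."""
            q = f * (1 - err) + (1 - f) * err          # probability of observing an alt read
            return stats.binom.logpmf(k, n, q)

        def freq_mle(k, n, err):
            """Invert the error model at the raw read fraction, clipped to [0, 1]."""
            return np.clip((k / n - err) / (1 - 2 * err), 0.0, 1.0)

        def lrt_pools(k_case, n_case, k_ctrl, n_ctrl, err=0.01):
            f_case = freq_mle(k_case, n_case, err)
            f_ctrl = freq_mle(k_ctrl, n_ctrl, err)
            f_null = freq_mle(k_case + k_ctrl, n_case + n_ctrl, err)
            ll_alt = loglik(k_case, n_case, f_case, err) + loglik(k_ctrl, n_ctrl, f_ctrl, err)
            ll_null = loglik(k_case, n_case, f_null, err) + loglik(k_ctrl, n_ctrl, f_null, err)
            lr = 2 * (ll_alt - ll_null)
            return lr, stats.chi2.sf(lr, df=1)         # asymptotic chi-square with 1 df

        print(lrt_pools(k_case=60, n_case=800, k_ctrl=25, n_ctrl=800))
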

  13. Mesoscale Predictability and Error Growth in Short Range Ensemble Forecasts

    NASA Astrophysics Data System (ADS)

    Gingrich, Mark

    Although it was originally suggested that small-scale, unresolved errors corrupt forecasts at all scales through an inverse error cascade, some authors have proposed that those mesoscale circulations resulting from stationary forcing on the larger scale may inherit the predictability of the large-scale motions. Further, the relative contributions of large- and small-scale uncertainties in producing error growth in the mesoscales remain largely unknown. Here, 100 member ensemble forecasts are initialized from an ensemble Kalman filter (EnKF) to simulate two winter storms impacting the East Coast of the United States in 2010. Four verification metrics are considered: the local snow water equivalence, total liquid water, and 850 hPa temperatures representing mesoscale features; and the sea level pressure field representing a synoptic feature. It is found that while the predictability of the mesoscale features can be tied to the synoptic forecast, significant uncertainty existed on the synoptic scale at lead times as short as 18 hours. Therefore, mesoscale details remained uncertain in both storms due to uncertainties at the large scale. Additionally, the ensemble perturbation kinetic energy did not show an appreciable upscale propagation of error for either case. Instead, the initial condition perturbations from the cycling EnKF were maximized at large scales and immediately amplified at all scales without requiring initial upscale propagation. This suggests that relatively small errors in the synoptic-scale initialization may have more importance in limiting predictability than errors in the unresolved, small-scale initial conditions.

  14. Photogrammetric discharge monitoring of small tropical mountain rivers - A case study at Rivière des Pluies, Réunion island

    NASA Astrophysics Data System (ADS)

    Stumpf, André; Augereau, Emmanuel; Delacourt, Christophe; Bonnier, Julien

    2016-04-01

    Reliable discharge measurements are indispensable for an effective management of natural water resources and floods. Limitations of classical current meter profiling and stage-discharge ratings have stimulated the development of more accurate and efficient gauging techniques. While new discharge measurement technologies such as acoustic Doppler current profilers and large-scale image particle velocimetry (LSPIV) have been developed and tested in numerous studies, the continuous monitoring of small mountain rivers and discharge dynamics during strong meteorological events remains challenging. More specifically, LSPIV studies are often focused on short-term measurements during flood events, and there are still very few studies that address its use for long-term monitoring of small mountain rivers. To fill this gap, this study targets the development and testing of a largely autonomous photogrammetric discharge measurement system, with a special focus on application to small mountain rivers with high discharge variability and a mobile riverbed in the tropics. It proposes several enhancements over previous LSPIV methods regarding camera calibration, more efficient processing in image geometry, the automatic detection of the water level, as well as the statistical calibration and estimation of the discharge from multiple profiles. To account for changes in the bed topography, the riverbed is surveyed repeatedly during the dry seasons using multi-view photogrammetry or terrestrial laser scanners. The presented case study comprises the analysis of several thousand videos spanning two and a half years (2013-2015) to test the robustness and accuracy of the different processing steps. An analysis of the obtained results suggests that the quality of the camera calibration reaches a sub-pixel accuracy. The median accuracy of the watermask detections is F1=0.82, whereas the precision is systematically higher than the recall. The resulting underestimation of the water surface area and level leads to a systematic underestimation of the discharge and error rates of up to 25 %. However, the bias can be effectively removed using a least-squares cross-calibration, which reduces the error to an MAE of 6.39% and a maximum error of 16.18%. Those error rates are significantly lower than the uncertainties among multiple profiles (30%) and illustrate the importance of the spatial averaging from multiple measurements. The study suggests that LSPIV can already be considered as a valuable tool for the monitoring of torrential flows, whereas further research is still needed to fully integrate night-time observation and stereo-photogrammetric capabilities.

  15. Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.

    PubMed

    Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia

    2017-06-01

    Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effect of noise and bias error in using CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data with different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to quantitative analysis of methane gas spectra and methane/toluene gas mixture spectra, as measured using FT-IR spectrometry and analyzed with CLS, WLS, and SWLS. The standard error of prediction (SEP), bias of prediction (bias), and the residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In methane/toluene mixture gas analysis, a modification of the SWLS has been presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases, but not the lowest bias and RSS. The modified SWLS reduced the bias and showed a lower RSS than CLS, especially for small components.
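
    A hedged Python sketch of the selective weighting idea follows. It is a simplified reading of the abstract, not the authors' exact algorithm: wavenumbers with absorbance below a threshold keep uniform (CLS-style) weights, those above it get inverse-noise (WLS-style) weights, and the concentrations are recovered in one weighted least-squares step. The pure spectra, noise model, and threshold are assumptions:

        # Selective weighting for a toy Beer-Lambert mixture model A = K @ c with heteroscedastic noise.
        import numpy as np

        rng = np.random.default_rng(11)
        n_wavenumbers = 400
        wn = np.linspace(0, 1, n_wavenumbers)

        # Two hypothetical pure-component absorptivity spectra.
        K = np.column_stack([np.exp(-((wn - 0.3) / 0.05) ** 2),
                             np.exp(-((wn - 0.7) / 0.08) ** 2)])
        c_true = np.array([0.8, 0.3])
        A_clean = K @ c_true

        # Noise whose standard deviation grows with absorbance (assumed model).
        sigma = 0.002 + 0.05 * A_clean
        A_meas = A_clean + rng.normal(0, sigma)

        def weighted_ls(K, A, w):
            """Solve min ||diag(w) (A - K c)||^2 for the concentration vector c."""
            return np.linalg.lstsq(K * w[:, None], A * w, rcond=None)[0]

        threshold = 0.2                                             # assumed absorbance threshold
        w_cls = np.ones(n_wavenumbers)                              # CLS: uniform weights everywhere
        w_swls = np.where(A_meas > threshold, 1.0 / sigma, 1.0)     # selective: WLS weights above threshold

        print("CLS :", weighted_ls(K, A_meas, w_cls))
        print("SWLS:", weighted_ls(K, A_meas, w_swls))
        print("true:", c_true)
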

  16. Quantified Choice of Root-Mean-Square Errors of Approximation for Evaluation and Power Analysis of Small Differences between Structural Equation Models

    ERIC Educational Resources Information Center

    Li, Libo; Bentler, Peter M.

    2011-01-01

    MacCallum, Browne, and Cai (2006) proposed a new framework for evaluation and power analysis of small differences between nested structural equation models (SEMs). In their framework, the null and alternative hypotheses for testing a small difference in fit and its related power analyses were defined by some chosen root-mean-square error of…

  17. Comparison of Kalman filter and optimal smoother estimates of spacecraft attitude

    NASA Technical Reports Server (NTRS)

    Sedlak, J.

    1994-01-01

    Given a valid system model and adequate observability, a Kalman filter will converge toward the true system state with error statistics given by the estimated error covariance matrix. The errors generally do not continue to decrease. Rather, a balance is reached between the gain of information from new measurements and the loss of information during propagation. The errors can be further reduced, however, by a second pass through the data with an optimal smoother. This algorithm obtains the optimally weighted average of forward and backward propagating Kalman filters. It roughly halves the error covariance by including future as well as past measurements in each estimate. This paper investigates whether such benefits actually accrue in the application of an optimal smoother to spacecraft attitude determination. Tests are performed both with actual spacecraft data from the Extreme Ultraviolet Explorer (EUVE) and with simulated data for which the true state vector and noise statistics are exactly known.
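
    The filter-versus-smoother comparison can be illustrated with a scalar toy problem in Python; it is not the EUVE attitude system, and the random-walk state model and noise variances are assumptions. A forward Kalman filter is followed by a Rauch-Tung-Striebel backward pass, which uses future as well as past measurements:

        # Scalar random-walk state: forward Kalman filter, then RTS smoothing pass.
        import numpy as np

        rng = np.random.default_rng(2)
        n, q, r = 200, 0.01, 1.0                             # steps, process and measurement variances

        x = np.cumsum(rng.normal(0, np.sqrt(q), n))          # true state
        z = x + rng.normal(0, np.sqrt(r), n)                 # noisy measurements

        xf, Pf, xp, Pp = np.zeros(n), np.zeros(n), np.zeros(n), np.zeros(n)
        x_est, P = 0.0, 10.0
        for k in range(n):
            x_pred, P_pred = x_est, P + q                    # propagate
            xp[k], Pp[k] = x_pred, P_pred
            K = P_pred / (P_pred + r)                        # Kalman gain
            x_est = x_pred + K * (z[k] - x_pred)             # measurement update
            P = (1 - K) * P_pred
            xf[k], Pf[k] = x_est, P

        xs, Ps = xf.copy(), Pf.copy()                        # backward RTS smoother
        for k in range(n - 2, -1, -1):
            C = Pf[k] / Pp[k + 1]
            xs[k] = xf[k] + C * (xs[k + 1] - xp[k + 1])
            Ps[k] = Pf[k] + C**2 * (Ps[k + 1] - Pp[k + 1])

        print("filter   RMS error:", np.sqrt(np.mean((xf - x) ** 2)))
        print("smoother RMS error:", np.sqrt(np.mean((xs - x) ** 2)))

    In this toy setting the smoothed error covariance sits well below the filtered one, which is the roughly halved covariance the abstract refers to.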

  18. Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data.

    PubMed

    Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders

    2018-02-02

    The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1-10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates. Copyright © 2018 Soraggi et al.
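
    For orientation, the classical ABBA-BABA form of the D-statistic with a block-jackknife Z-score can be written in a few lines of Python. This is the traditional frequency-based version the paper improves upon, not the authors' read-based estimator; the simulated allele frequencies and block size are purely illustrative:

        # Classical D-statistic from per-site derived-allele frequencies in (H1, H2, H3, outgroup).
        import numpy as np

        rng = np.random.default_rng(9)
        n_sites, block_size = 50_000, 500
        p1 = rng.beta(0.5, 0.5, n_sites)                                   # toy frequencies
        p2 = np.clip(p1 + rng.normal(0, 0.05, n_sites), 0, 1)
        p3 = np.clip(0.9 * rng.beta(0.5, 0.5, n_sites) + 0.1 * p2, 0, 1)   # weak H2-H3 gene flow (assumed)
        p4 = np.zeros(n_sites)                                             # outgroup fixed ancestral

        abba = (1 - p1) * p2 * p3 * (1 - p4)
        baba = p1 * (1 - p2) * p3 * (1 - p4)
        D = (abba.sum() - baba.sum()) / (abba.sum() + baba.sum())

        # Delete-one block jackknife over contiguous blocks to get a standard error for D.
        blocks = n_sites // block_size
        d_jack = np.empty(blocks)
        for b in range(blocks):
            keep = np.ones(n_sites, bool)
            keep[b * block_size:(b + 1) * block_size] = False
            d_jack[b] = (abba[keep].sum() - baba[keep].sum()) / (abba[keep].sum() + baba[keep].sum())
        se = np.sqrt((blocks - 1) / blocks * ((d_jack - d_jack.mean()) ** 2).sum())
        print(f"D = {D:.4f}, Z = {D / se:.2f}")
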

  19. The R-factor gap in macromolecular crystallography: an untapped potential for insights on accurate structures.

    PubMed

    Holton, James M; Classen, Scott; Frankel, Kenneth A; Tainer, John A

    2014-09-01

    In macromolecular crystallography, the agreement between observed and predicted structure factors (Rcryst and Rfree ) is seldom better than 20%. This is much larger than the estimate of experimental error (Rmerge ). The difference between Rcryst and Rmerge is the R-factor gap. There is no such gap in small-molecule crystallography, for which calculated structure factors are generally considered more accurate than the experimental measurements. Perhaps the true noise level of macromolecular data is higher than expected? Or is the gap caused by inaccurate phases that trap refined models in local minima? By generating simulated diffraction patterns using the program MLFSOM, and including every conceivable source of experimental error, we show that neither is the case. Processing our simulated data yielded values that were indistinguishable from those of real data for all crystallographic statistics except the final Rcryst and Rfree . These values decreased to 3.8% and 5.5% for simulated data, suggesting that the reason for high R-factors in macromolecular crystallography is neither experimental error nor phase bias, but rather an underlying inadequacy in the models used to explain our observations. The present inability to accurately represent the entire macromolecule with both its flexibility and its protein-solvent interface may be improved by synergies between small-angle X-ray scattering, computational chemistry and crystallography. The exciting implication of our finding is that macromolecular data contain substantial hidden and untapped potential to resolve ambiguities in the true nature of the nanoscale, a task that the second century of crystallography promises to fulfill. Coordinates and structure factors for the real data have been submitted to the Protein Data Bank under accession 4tws. © 2014 The Authors. FEBS Journal published by John Wiley & Sons Ltd on behalf of FEBS.

  20. Detecting small-study effects and funnel plot asymmetry in meta-analysis of survival data: A comparison of new and existing tests.

    PubMed

    Debray, Thomas P A; Moons, Karel G M; Riley, Richard D

    2018-03-01

    Small-study effects are a common threat in systematic reviews and may indicate publication bias. Their existence is often verified by visual inspection of the funnel plot. Formal tests to assess the presence of funnel plot asymmetry typically estimate the association between the reported effect size and their standard error, the total sample size, or the inverse of the total sample size. In this paper, we demonstrate that the application of these tests may be less appropriate in meta-analysis of survival data, where censoring influences statistical significance of the hazard ratio. We subsequently propose 2 new tests that are based on the total number of observed events and adopt a multiplicative variance component. We compare the performance of the various funnel plot asymmetry tests in an extensive simulation study where we varied the true hazard ratio (0.5 to 1), the number of published trials (N=10 to 100), the degree of censoring within trials (0% to 90%), and the mechanism leading to participant dropout (noninformative versus informative). Results demonstrate that previous well-known tests for detecting funnel plot asymmetry suffer from low power or excessive type-I error rates in meta-analysis of survival data, particularly when trials are affected by participant dropout. Because our novel test (adopting estimates of the asymptotic precision as study weights) yields reasonable power and maintains appropriate type-I error rates, we recommend its use to evaluate funnel plot asymmetry in meta-analysis of survival data. The use of funnel plot asymmetry tests should, however, be avoided when there are few trials available for any meta-analysis. © 2017 The Authors. Research Synthesis Methods Published by John Wiley & Sons, Ltd.
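
    As context for the kind of test being evaluated, a classical Egger-type regression for funnel plot asymmetry on log hazard ratios is sketched below in Python. This is not the authors' proposed event-count-based test, and the toy meta-analysis data (true hazard ratio, standard errors) are assumptions:

        # Egger-type asymmetry test: regress the standardized effect on precision; a non-zero
        # intercept suggests small-study effects / funnel plot asymmetry.
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(4)
        n_trials = 30
        se = rng.uniform(0.08, 0.45, n_trials)                 # standard errors of log HR (assumed)
        log_hr = np.log(0.8) + rng.normal(0, se)               # true HR = 0.8, no publication bias here

        y = log_hr / se
        X = sm.add_constant(1.0 / se)
        fit = sm.OLS(y, X).fit()
        print(f"Egger intercept = {fit.params[0]:.3f}, p = {fit.pvalues[0]:.3f}")

    The paper's point is that with heavy censoring the standard error of a hazard ratio is driven by the number of events rather than the sample size, which is why tests built on study weights like these can misbehave for survival data.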

  1. A model for the statistical description of analytical errors occurring in clinical chemical laboratories with time.

    PubMed

    Hyvärinen, A

    1985-01-01

    The main purpose of the present study was to describe the statistical behaviour of daily analytical errors in the dimensions of place and time, providing a statistical basis for realistic estimates of the analytical error, and hence allowing the importance of the error and the relative contributions of its different sources to be re-evaluated. The observation material consists of creatinine and glucose results for control sera measured in daily routine quality control in five laboratories for a period of one year. The observation data were processed and computed by means of an automated data processing system. Graphic representations of time series of daily observations, as well as their means and dispersion limits when grouped over various time intervals, were investigated. For partition of the total variation several two-way analyses of variance were done with laboratory and various time classifications as factors. Pooled sets of observations were tested for normality of distribution and for consistency of variances, and the distribution characteristics of error variation in different categories of place and time were compared. Errors were found from the time series to vary typically between days. Due to irregular fluctuations in general and particular seasonal effects in creatinine, stable estimates of means or of dispersions for errors in individual laboratories could not be easily obtained over short periods of time but only from data sets pooled over long intervals (preferably at least one year). Pooled estimates of proportions of intralaboratory variation were relatively low (less than 33%) when the variation was pooled within days. However, when the variation was pooled over longer intervals this proportion increased considerably, even to a maximum of 89-98% (95-98% in each method category) when an outlying laboratory in glucose was omitted, with a concomitant decrease in the interaction component (representing laboratory-dependent variation with time). This indicates that a substantial part of the variation comes from intralaboratory variation with time rather than from constant interlaboratory differences. Normality and consistency of statistical distributions were best achieved in the long-term intralaboratory sets of the data, under which conditions the statistical estimates of error variability were also most characteristic of the individual laboratories rather than necessarily being similar to one another. Mixing of data from different laboratories may give heterogeneous and nonparametric distributions and hence is not advisable.(ABSTRACT TRUNCATED AT 400 WORDS)

  2. 78 FR 28597 - State Median Income Estimates for a Four-Person Household: Notice of the Federal Fiscal Year (FFY...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-15

    ....gov/acs/www/ or contact the Census Bureau's Social, Economic, and Housing Statistics Division at (301...); and (2) Sampling Error, which consists of the error that arises from the use of probability sampling to create the...

  3. Small Atomic Orbital Basis Set First‐Principles Quantum Chemical Methods for Large Molecular and Periodic Systems: A Critical Analysis of Error Sources

    PubMed Central

    Sure, Rebecca; Brandenburg, Jan Gerit

    2015-01-01

    Abstract In quantum chemical computations the combination of Hartree–Fock or a density functional theory (DFT) approximation with relatively small atomic orbital basis sets of double‐zeta quality is still widely used, for example, in the popular B3LYP/6‐31G* approach. In this Review, we critically analyze the two main sources of error in such computations, that is, the basis set superposition error on the one hand and the missing London dispersion interactions on the other. We review various strategies to correct those errors and present exemplary calculations on mainly noncovalently bound systems of widely varying size. Energies and geometries of small dimers, large supramolecular complexes, and molecular crystals are covered. We conclude that it is not justified to rely on fortunate error compensation, as the main inconsistencies can be cured by modern correction schemes which clearly outperform the plain mean‐field methods. PMID:27308221

  4. Assessment of Spectral Doppler in Preclinical Ultrasound Using a Small-Size Rotating Phantom

    PubMed Central

    Yang, Xin; Sun, Chao; Anderson, Tom; Moran, Carmel M.; Hadoke, Patrick W.F.; Gray, Gillian A.; Hoskins, Peter R.

    2013-01-01

    Preclinical ultrasound scanners are used to measure blood flow in small animals, but the potential errors in blood velocity measurements have not been quantified. This investigation rectifies this omission through the design and use of phantoms and evaluation of measurement errors for a preclinical ultrasound system (Vevo 770, Visualsonics, Toronto, ON, Canada). A ray model of geometric spectral broadening was used to predict velocity errors. A small-scale rotating phantom, made from tissue-mimicking material, was developed. True and Doppler-measured maximum velocities of the moving targets were compared over a range of angles from 10° to 80°. Results indicate that the maximum velocity was overestimated by up to 158% by spectral Doppler. There was good agreement (<10%) between theoretical velocity errors and measured errors for beam-target angles of 50°–80°. However, for angles of 10°–40°, the agreement was not as good (>50%). The phantom is capable of validating the performance of blood velocity measurement in preclinical ultrasound. PMID:23711503

  5. Knowledge of healthcare professionals about medication errors in hospitals

    PubMed Central

    Abdel-Latif, Mohamed M. M.

    2016-01-01

    Context: Medication errors are the most common types of medical errors in hospitals and a leading cause of morbidity and mortality among patients. Aims: The aim of the present study was to assess the knowledge of healthcare professionals about medication errors in hospitals. Settings and Design: A self-administered questionnaire was distributed to randomly selected healthcare professionals in eight hospitals in Madinah, Saudi Arabia. Subjects and Methods: An 18-item survey was designed and comprised questions on demographic data, knowledge of medication errors, availability of reporting systems in hospitals, attitudes toward error reporting, and causes of medication errors. Statistical Analysis Used: Data were analyzed with Statistical Package for the Social Sciences software Version 17. Results: A total of 323 healthcare professionals completed the questionnaire (a 64.6% response rate): 138 (42.72%) physicians, 34 (10.53%) pharmacists, and 151 (46.75%) nurses. A majority of the participants had good knowledge of the medication error concept and its dangers to patients. Only 68.7% of them were aware of reporting systems in hospitals. Healthcare professionals revealed that there was no clear mechanism available for reporting of errors in most hospitals. Prescribing (46.5%) and administration (29%) errors were the most common types of errors. The medications most frequently involved in errors were anti-hypertensives, antidiabetics, antibiotics, digoxin, and insulin. Conclusions: This study revealed differences in the awareness among healthcare professionals toward medication errors in hospitals. The poor knowledge about medication errors emphasized the urgent necessity to adopt appropriate measures to raise awareness about medication errors in Saudi hospitals.

  6. Discretization effects in the topological susceptibility in lattice QCD

    NASA Astrophysics Data System (ADS)

    Hart, A.

    2004-04-01

    We study the topological susceptibility χ in QCD with two quark flavors using lattice field configurations that have been produced with an O(a)-improved clover quark action. We find clear evidence for the expected suppression at a small quark mass mq and examine the variation of χ with this mass and the lattice spacing a. A joint continuum and chiral extrapolation yields good agreement with theoretical expectations as a,mq→0. A moderate increase in autocorrelation is observed on the more chiral ensembles, but within large statistical errors. Finite volume effects are negligible for the Leutwyler-Smilga parameter xLS≳10, and no evidence for a nearby phase transition is observed.

  7. Impact of documentation errors on accuracy of cause of death coding in an educational hospital in Southern Iran.

    PubMed

    Haghighi, Mohammad Hosein Hayavi; Dehghani, Mohammad; Teshnizi, Saeid Hoseini; Mahmoodi, Hamid

    2014-01-01

    Accurate cause of death coding leads to organised and usable death information but there are some factors that influence documentation on death certificates and therefore affect the coding. We reviewed the role of documentation errors on the accuracy of death coding at Shahid Mohammadi Hospital (SMH), Bandar Abbas, Iran. We studied the death certificates of all deceased patients in SMH from October 2010 to March 2011. Researchers determined and coded the underlying cause of death on the death certificates according to the guidelines issued by the World Health Organization in Volume 2 of the International Statistical Classification of Diseases and Health Related Problems-10th revision (ICD-10). Necessary ICD coding rules (such as the General Principle, Rules 1-3, the modification rules and other instructions about death coding) were applied to select the underlying cause of death on each certificate. Demographic details and documentation errors were then extracted. Data were analysed with descriptive statistics and chi square tests. The accuracy rate of causes of death coding was 51.7%, demonstrating a statistically significant relationship (p=.001) with major errors but not such a relationship with minor errors. Factors that result in poor quality of Cause of Death coding in SMH are lack of coder training, documentation errors and the undesirable structure of death certificates.

  8. Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping

    NASA Astrophysics Data System (ADS)

    Kubica, Aleksander; Beverland, Michael E.; Brandão, Fernando; Preskill, John; Svore, Krysta M.

    2018-05-01

    Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the threshold for 1D stringlike and 2D sheetlike logical operators to be p_3DCC^(1) ≃ 1.9% and p_3DCC^(2) ≃ 27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.

  9. Fast maximum likelihood estimation using continuous-time neural point process models.

    PubMed

    Lepage, Kyle Q; MacDonald, Christopher J

    2015-06-01

    A recent report estimates that the number of simultaneously recorded neurons is growing exponentially. A commonly employed statistical paradigm using discrete-time point process models of neural activity involves the computation of a maximum-likelihood estimate. The time to compute this estimate, per neuron, is proportional to the number of bins in a finely spaced discretization of time. By using continuous-time models of neural activity and the optimally efficient Gaussian quadrature, memory requirements and computation times are dramatically decreased in the commonly encountered situation where the number of parameters p is much less than the number of time-bins n. In this regime, with q equal to the quadrature order, memory requirements are decreased from O(np) to O(qp), and the number of floating-point operations is decreased from O(np^2) to O(qp^2). Accuracy of the proposed estimates is assessed based upon physiological considerations, error bounds, and mathematical results describing the relation between numerical integration error and numerical error affecting both parameter estimates and the observed Fisher information. A check is provided which is used to adapt the order of numerical integration. The procedure is verified in simulation and for hippocampal recordings. It is found that in 95 % of hippocampal recordings a q of 60 yields numerical error negligible with respect to parameter estimate standard error. Statistical inference using the proposed methodology is a fast and convenient alternative to statistical inference performed using a discrete-time point process model of neural activity. It enables the employment of the statistical methodology available with discrete-time inference, but is faster, uses less memory, and avoids any error due to discretization.
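
    The quadrature idea can be illustrated with a short Python sketch: the integral term of a continuous-time Poisson point-process log-likelihood, log L = sum_i log lambda(t_i) - int_0^T lambda(t) dt, is approximated with a q-node Gauss-Legendre rule instead of a fine time discretization. This is not the paper's implementation; the intensity function and the thinning-based spike generator are assumptions:

        # Gauss-Legendre quadrature for the integral term of a point-process log-likelihood.
        import numpy as np

        def log_likelihood(spike_times, log_rate, T, order=60):
            """log L = sum log lambda(t_i) - int_0^T lambda(t) dt, integral via quadrature."""
            nodes, weights = np.polynomial.legendre.leggauss(order)
            t = 0.5 * T * (nodes + 1.0)                              # map [-1, 1] -> [0, T]
            integral = 0.5 * T * np.sum(weights * np.exp(log_rate(t)))
            return np.sum(log_rate(np.asarray(spike_times))) - integral

        # Example: a slowly modulated intensity, with spikes drawn by thinning.
        rng = np.random.default_rng(8)
        T = 100.0

        def log_rate(t):
            return np.log(5.0) + 0.5 * np.sin(2 * np.pi * t / 20.0)

        lam_max = np.exp(np.log(5.0) + 0.5)
        cand = np.sort(rng.uniform(0, T, rng.poisson(lam_max * T)))
        spikes = cand[rng.random(cand.size) < np.exp(log_rate(cand)) / lam_max]

        print("quadrature log-likelihood:", log_likelihood(spikes, log_rate, T, order=60))

    With q = 60 nodes the integral costs 60 intensity evaluations per likelihood call regardless of how finely time would otherwise have been binned, which is where the O(qp) versus O(np) saving comes from.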

  10. Statistical and systematic errors in the measurement of weak-lensing Minkowski functionals: Application to the Canada-France-Hawaii Lensing Survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shirasaki, Masato; Yoshida, Naoki, E-mail: masato.shirasaki@utap.phys.s.u-tokyo.ac.jp

    2014-05-01

    The measurement of cosmic shear using weak gravitational lensing is a challenging task that involves a number of complicated procedures. We study in detail the systematic errors in the measurement of weak-lensing Minkowski Functionals (MFs). Specifically, we focus on systematics associated with galaxy shape measurements, photometric redshift errors, and shear calibration correction. We first generate mock weak-lensing catalogs that directly incorporate the actual observational characteristics of the Canada-France-Hawaii Lensing Survey (CFHTLenS). We then perform a Fisher analysis using the large set of mock catalogs for various cosmological models. We find that the statistical error associated with the observational effects degrades the cosmological parameter constraints by a factor of a few. The Subaru Hyper Suprime-Cam (HSC) survey with a sky coverage of ∼1400 deg^2 will constrain the dark energy equation-of-state parameter with an error of Δw_0 ∼ 0.25 by the lensing MFs alone, but biases induced by the systematics can be comparable to the 1σ error. We conclude that the lensing MFs are powerful statistics beyond the two-point statistics only if well-calibrated measurement of both the redshifts and the shapes of source galaxies is performed. Finally, we analyze the CFHTLenS data to explore the ability of the MFs to break degeneracies between a few cosmological parameters. Using a combined analysis of the MFs and the shear correlation function, we derive the matter density Ω_m0 = 0.256 (+0.054/-0.046).

  11. The statistical pitfalls of the partially randomized preference design in non-blinded trials of psychological interventions.

    PubMed

    Gemmell, Isla; Dunn, Graham

    2011-03-01

    In a partially randomized preference trial (PRPT), patients with no treatment preference are allocated to groups at random, but those who express a preference receive the treatment of their choice. It has been suggested that the design can improve the external and internal validity of trials. We used computer simulation to illustrate the impact that an unmeasured confounder could have on the results and conclusions drawn from a PRPT. We generated 4000 observations ("patients") that reflected the distribution of the Beck Depression Inventory (BDI) in trials of depression. Half were randomly assigned to a randomized controlled trial (RCT) design and half were assigned to a PRPT design. In the RCT, "patients" were evenly split between treatment and control groups, whereas in the preference arm, to reflect patient choice, 87.5% of patients were allocated to the experimental treatment and 12.5% to the control. Unadjusted analyses of the PRPT data consistently overestimated the treatment effect and its standard error. This led to Type I errors when the true treatment effect was small and Type II errors when the confounder effect was large. The PRPT design is not recommended as a method of establishing an unbiased estimate of treatment effect due to the potential influence of unmeasured confounders. Copyright © 2011 John Wiley & Sons, Ltd.
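
    A simplified Python sketch of this simulation design (not the authors' code; the effect size, confounder strength, and outcome model are assumptions) shows the confounding mechanism: the unmeasured confounder drives both the preference-based treatment choice and the outcome, so the naive PRPT estimate drifts away from the truth while the RCT estimate does not:

        # RCT arm vs preference (PRPT) arm with an unmeasured confounder driving treatment choice.
        import numpy as np

        rng = np.random.default_rng(6)
        n_arm = 2000
        true_effect = -2.0                      # assumed true change in depression score

        def outcome(treated, confounder):
            return 25 + true_effect * treated + 3.0 * confounder + rng.normal(0, 8, treated.size)

        # RCT arm: 50/50 random allocation, confounder independent of treatment.
        conf_rct = rng.normal(0, 1, n_arm)
        treat_rct = rng.binomial(1, 0.5, n_arm)
        y_rct = outcome(treat_rct, conf_rct)

        # PRPT arm: ~87.5% choose treatment, and the unmeasured confounder drives that choice.
        conf_prpt = rng.normal(0, 1, n_arm)
        treat_prpt = (rng.normal(conf_prpt, 1) > -1.62).astype(int)
        y_prpt = outcome(treat_prpt, conf_prpt)

        def naive_effect(y, t):
            return y[t == 1].mean() - y[t == 0].mean()

        print("true effect          :", true_effect)
        print("RCT estimate         :", round(naive_effect(y_rct, treat_rct), 2))
        print("PRPT (naive) estimate:", round(naive_effect(y_prpt, treat_prpt), 2))
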

  12. Fast and low-cost method for VBES bathymetry generation in coastal areas

    NASA Astrophysics Data System (ADS)

    Sánchez-Carnero, N.; Aceña, S.; Rodríguez-Pérez, D.; Couñago, E.; Fraile, P.; Freire, J.

    2012-12-01

    Sea floor topography is key information in coastal area management. Nowadays, LiDAR and multibeam technologies provide accurate bathymetries in those areas; however, these methodologies are still too expensive for small customers (fishermen associations, small research groups) wishing to maintain periodic surveillance of environmental resources. In this paper, we analyse a simple methodology for vertical beam echosounder (VBES) bathymetric data acquisition and postprocessing, using low-cost means and free customizable tools such as ECOSONS and gvSIG (which is compared with the industry-standard ArcGIS). Echosounder data were filtered, resampled, and interpolated (using kriging or radial basis functions). Moreover, the presented methodology includes two data correction processes: Monte Carlo simulation, used to reduce GPS errors, and manually applied bathymetric line transformations, both improving the obtained results. As an example, we present the bathymetry of the Ría de Cedeira (Galicia, NW Spain), a good testbed area for coastal bathymetry methodologies given its extent and rich topography. The statistical analysis, performed by direct ground-truthing, yielded an upper bound on the error of 1.7 m at the 95% confidence level and an r.m.s. error of 0.7 m (cross-validation provided 30 cm and 25 cm, respectively). The methodology presented is fast and easy to implement, accurate outside transects (accuracy can be estimated), and can be used as a low-cost periodic monitoring method.

  13. A Streamflow Statistics (StreamStats) Web Application for Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Kula, Stephanie P.; Puskas, Barry M.

    2006-01-01

    A StreamStats Web application was developed for Ohio that implements equations for estimating a variety of streamflow statistics including the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year peak streamflows, mean annual streamflow, mean monthly streamflows, harmonic mean streamflow, and 25th-, 50th-, and 75th-percentile streamflows. StreamStats is a Web-based geographic information system application designed to facilitate the estimation of streamflow statistics at ungaged locations on streams. StreamStats can also serve precomputed streamflow statistics determined from streamflow-gaging station data. The basic structure, use, and limitations of StreamStats are described in this report. To facilitate the level of automation required for Ohio's StreamStats application, the technique used by Koltun (2003) for computing main-channel slope was replaced with a new computationally robust technique. The new channel-slope characteristic, referred to as SL10-85, differed from the National Hydrography Data based channel slope values (SL) reported by Koltun (2003) by an average of -28.3 percent, with the median change being -13.2 percent. In spite of the differences, the two slope measures are strongly correlated. The change in channel slope values resulting from the change in computational method necessitated revision of the full-model equations for flood-peak discharges originally presented by Koltun (2003). Average standard errors of prediction for the revised full-model equations presented in this report increased by a small amount over those reported by Koltun (2003), with increases ranging from 0.7 to 0.9 percent. Mean percentage changes in the revised regression and weighted flood-frequency estimates relative to regression and weighted estimates reported by Koltun (2003) were small, ranging from -0.72 to -0.25 percent and -0.22 to 0.07 percent, respectively.

  14. Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

    PubMed Central

    Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

    2014-01-01

    A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941
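
    The classification comparison described above (is a probe ERP more similar to a target or to an irrelevant ERP?) can be sketched with a simple correlation-based decision. The published brain fingerprinting method uses a bootstrapped comparison of correlations; the single-correlation version below is only a schematic stand-in, and the waveforms are hypothetical.

```python
import numpy as np

def classify_probe(probe, target, irrelevant):
    """Classification CIT decision: is the probe ERP more similar to the target
    ERP or to the irrelevant ERP? Similarity here is Pearson correlation; the
    published method bootstraps this comparison to attach a statistical confidence."""
    r_target = np.corrcoef(probe, target)[0, 1]
    r_irrelevant = np.corrcoef(probe, irrelevant)[0, 1]
    decision = "information present" if r_target > r_irrelevant else "information absent"
    return decision, r_target, r_irrelevant

# Hypothetical averaged ERP waveforms (300 time samples each).
t = np.linspace(0, 1.2, 300)
target = np.exp(-((t - 0.4) / 0.08) ** 2)            # P300-like bump
irrelevant = 0.1 * np.sin(2 * np.pi * 3 * t)
probe = 0.8 * target + 0.05 * np.random.default_rng(0).normal(size=t.size)
print(classify_probe(probe, target, irrelevant)[0])  # -> "information present"
```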

  15. Comparison of Astigmatic Correction after Femtosecond Lenticule Extraction and Small-Incision Lenticule Extraction for Myopic Astigmatism

    PubMed Central

    Kobashi, Hidenaga; Kamiya, Kazutaka; Ali, Mohamed A.; Igarashi, Akihito; Elewa, Mohamed Ehab M.; Shimizu, Kimiya

    2015-01-01

    Purpose To compare postoperative astigmatic correction between femtosecond lenticule extraction (FLEx) and small-incision lenticule extraction (SMILE) in eyes with myopic astigmatism. Methods We examined 26 eyes of 26 patients undergoing FLEx and 26 eyes of 26 patients undergoing SMILE to correct myopic astigmatism (manifest astigmatism of 1 diopter (D) or more). Visual acuity, cylindrical refraction, the predictability of the astigmatic correction, and the astigmatic vector components using the Alpins method were compared between the two groups 3 months postoperatively. Results We found no statistically significant difference in manifest cylindrical refraction (p=0.74) or in the percentage of eyes within ±0.50 D of their refraction (p=0.47) after the two surgical procedures. Moreover, no statistically significant difference was detected between the groups in the astigmatic vector components, namely, surgically induced astigmatism (p=0.80), target induced astigmatism (p=0.87), astigmatic correction index (p=0.77), angle of error (p=0.24), difference vector (p=0.76), index of success (p=0.91), flattening effect (p=0.79), and flattening index (p=0.84). Conclusions The FLEx and SMILE procedures are essentially equivalent in correcting myopic astigmatism according to vector analysis, suggesting that lifting or not lifting the flap does not significantly affect astigmatic outcomes after these surgical procedures. PMID:25849381

  16. Dependence of the compensation error on the error of a sensor and corrector in an adaptive optics phase-conjugating system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kiyko, V V; Kislov, V I; Ofitserov, E N

    2015-08-31

    In the framework of a statistical model of an adaptive optics system (AOS) of phase conjugation, three algorithms based on an integrated mathematical approach are considered, each of them intended for minimisation of one of the following characteristics: the sensor error (in the case of an ideal corrector), the corrector error (in the case of ideal measurements) and the compensation error (with regard to discreteness and measurement noises and to incompleteness of the system of response functions of the corrector actuators). Functional and statistical relationships between the algorithms are studied, and a relation is derived that allows the mean-square compensation error to be calculated as a function of the sensor and corrector errors with an accuracy better than 10%. Because it is reasonable, when adjusting the AOS parameters, to proceed from the equality of the sensor and corrector errors, when a Hartmann sensor is used as the wavefront sensor the required number of actuators in the absence of a noise component in the sensor error turns out to be 1.5-2.5 times smaller than the number of measurement counts, and this difference grows with increasing measurement noise.

  17. Atmospheric microwave refractivity and refraction

    NASA Technical Reports Server (NTRS)

    Yu, E.; Hodge, D. B.

    1980-01-01

    The atmospheric refractivity can be expressed as a function of temperature, pressure, water vapor content, and operating frequency. Based on twenty years of meteorological data, statistics of the atmospheric refractivity were obtained. These statistics were used to estimate the variation of dispersion, attenuation, and refraction effects on microwave and millimeter-wave signals propagating along atmospheric paths. Bending angle, elevation-angle error, and range error were also developed for an exponentially tapered, spherical atmosphere.
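
    For reference, a widely used non-dispersive approximation of the radio refractivity (the Smith-Weintraub form) is sketched below; it ignores the frequency dependence discussed in the abstract and is given only to illustrate how refractivity follows from pressure, temperature, and water vapor.

```python
def refractivity_n_units(p_hpa: float, t_kelvin: float, e_hpa: float) -> float:
    """Radio refractivity N (in N-units) from total pressure p_hpa [hPa],
    temperature t_kelvin [K], and water-vapor partial pressure e_hpa [hPa],
    using the non-dispersive Smith-Weintraub approximation."""
    return 77.6 * p_hpa / t_kelvin + 3.73e5 * e_hpa / t_kelvin**2

# Example: typical surface conditions; the refractive index is n = 1 + N * 1e-6.
N = refractivity_n_units(1013.25, 288.15, 10.0)
n = 1.0 + N * 1e-6
print(N, n)
```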

  18. Assessing the Predictability of Convection using Ensemble Data Assimilation of Simulated Radar Observations in an LETKF system

    NASA Astrophysics Data System (ADS)

    Lange, Heiner; Craig, George

    2014-05-01

    This study uses the Local Ensemble Transform Kalman Filter (LETKF) to perform storm-scale data assimilation of simulated Doppler radar observations into the non-hydrostatic, convection-permitting COSMO model. In perfect-model experiments (OSSEs), we investigate how the limited predictability of convective storms affects precipitation forecasts. The study compares a fine analysis scheme with small RMS errors to a coarse scheme that allows for errors in the position, shape and occurrence of storms in the ensemble. The coarse scheme uses superobservations, a coarser grid for analysis weights, a larger localization radius and a larger observation error, which together allow a broadening of the Gaussian error statistics. Three-hour forecasts of convective systems (with typical lifetimes exceeding 6 hours) started from the detailed analyses of the fine scheme are found to be superior to those of the coarse scheme during the first 1-2 hours with respect to the predicted storm positions. After 3 hours in the convective regime used here, the forecast quality of the two schemes appears indiscernible, judging by RMSE and by verification methods for rain fields and objects. It is concluded that, for operational assimilation systems, the analysis scheme might not necessarily need to be detailed down to the grid scale of the model. Depending on the forecast lead time, and on the presence of orographic or synoptic forcing that enhances the predictability of storm occurrence, analyses from a coarser scheme might suffice.

  19. Pocket guide to transportation, 1999

    DOT National Transportation Integrated Search

    1998-12-01

    Statistics published in this Pocket Guide to Transportation come from many different sources. Some statistics are based on samples and are subject to sampling variability. Statistics may also be subject to omissions and errors in reporting, recording...

  20. Pocket guide to transportation, 2009

    DOT National Transportation Integrated Search

    2009-01-01

    Statistics published in this Pocket Guide to Transportation come from many different sources. Some statistics are based on samples and are subject to sampling variability. Statistics may also be subject to omissions and errors in reporting, recording...

  1. Pocket guide to transportation, 2013.

    DOT National Transportation Integrated Search

    2013-01-01

    Statistics published in this Pocket Guide to Transportation come from many different sources. Some statistics are based on samples and are subject to sampling variability. Statistics may also be subject to omissions and errors in reporting, ...

  2. Pocket guide to transportation, 2010

    DOT National Transportation Integrated Search

    2010-01-01

    Statistics published in this Pocket Guide to Transportation come from many different sources. Some statistics are based on samples and are subject to sampling variability. Statistics may also be subject to omissions and errors in reporting, recording...

  3. Assessment of the knowledge and attitudes of intern doctors to medication prescribing errors in a Nigeria tertiary hospital

    PubMed Central

    Ajemigbitse, Adetutu A.; Omole, Moses Kayode; Ezike, Nnamdi Chika; Erhun, Wilson O.

    2013-01-01

    Context: Junior doctors are reported to make most of the prescribing errors in the hospital setting. Aims: The aim of this study was to determine what knowledge intern doctors have about prescribing errors and the circumstances contributing to making them. Settings and Design: A structured questionnaire was distributed to intern doctors in National Hospital Abuja, Nigeria. Subjects and Methods: Respondents gave information about their experience with prescribing medicines, the extent to which they agreed with the definition of a clinically meaningful prescribing error, and the events that constituted such an error. Their experience with prescribing certain categories of medicines was also sought. Statistical Analysis Used: Data were analyzed with Statistical Package for the Social Sciences (SPSS) software version 17 (SPSS Inc., Chicago, Ill, USA). Chi-squared analysis contrasted differences in proportions; P < 0.05 was considered statistically significant. Results: The response rate was 90.9%, and 27 (90%) respondents had <1 year of prescribing experience. 17 (56.7%) respondents totally agreed with the definition of a clinically meaningful prescribing error. The most common reasons for prescribing mistakes were failure to check prescriptions with a reference source (14, 25.5%) and failure to check for adverse drug interactions (14, 25.5%). Omitting essential information such as duration of therapy (13, 20%) and patient age (14, 21.5%), together with dosage errors (14, 21.5%), were the most common types of prescribing errors made. Respondents considered workload (23, 76.7%), multitasking (19, 63.3%), rushing (18, 60.0%) and tiredness/stress (16, 53.3%) to be important factors contributing to prescribing errors. Interns were least confident prescribing antibiotics (12, 25.5%), opioid analgesics (12, 25.5%), cytotoxics (10, 21.3%) and antipsychotics (9, 19.1%) unsupervised. Conclusions: Respondents seemed to have a low awareness of making prescribing errors. Principles of rational prescribing and the events that constitute prescribing errors should be taught in the practice setting. PMID:24808682

  4. Constraining the noise-free distribution of halo spin parameters

    NASA Astrophysics Data System (ADS)

    Benson, Andrew J.

    2017-11-01

    Any measurement made using an N-body simulation is subject to noise due to the finite number of particles used to sample the dark matter distribution function, and the lack of structure below the simulation resolution. This noise can be particularly significant when attempting to measure intrinsically small quantities, such as halo spin. In this work, we develop a model to describe the effects of particle noise on halo spin parameters. This model is calibrated using N-body simulations in which the particle noise can be treated as a Poisson process on the underlying dark matter distribution function, and we demonstrate that this calibrated model reproduces measurements of halo spin parameter error distributions previously measured in N-body convergence studies. Utilizing this model, along with previous measurements of the distribution of halo spin parameters in N-body simulations, we place constraints on the noise-free distribution of halo spins. We find that the noise-free median spin is 3 per cent lower than that measured directly from the N-body simulation, corresponding to a shift of approximately 40 times the statistical uncertainty in this measurement arising purely from halo counting statistics. We also show that measurement of the spin of an individual halo to 10 per cent precision requires at least 4 × 10^4 particles in the halo - for haloes containing 200 particles, the fractional error on spins measured for individual haloes is of order unity. N-body simulations should be viewed as the results of a statistical experiment applied to a model of dark matter structure formation. When viewed in this way, it is clear that determination of any quantity from such a simulation should be made through forward modelling of the effects of particle noise.

  5. Non-precautionary aspects of toxicology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grandjean, Philippe

    2005-09-01

    Empirical studies in toxicology aim at deciphering complex causal relationships, especially in regard to human disease etiologies. Several scientific traditions limit the usefulness of documentation from current toxicological research, in regard to decision-making based on the precautionary principle. Among non-precautionary aspects of toxicology are the focus on simplified model systems and the effects of single hazards, one by one. Thus, less attention is paid to sources of variability and uncertainty, including individual susceptibility, impacts of mixed and variable exposures, susceptible life-stages, and vulnerable communities. In emphasizing the need for confirmatory evidence, toxicology tends to penalize false positives more than false negatives. An important source of uncertainty is measurement error that results in misclassification, especially in regard to exposure assessment. Standard statistical analysis assumes that the exposure is measured without error, and imprecisions will usually result in an underestimation of the dose-effect relationship. In testing whether an effect could be considered a possible result of natural variability, a 5% limit for 'statistical significance' is usually applied, even though it may rule out many findings of causal associations, simply because the study was too small (and thus lacked statistical power) or because some imprecision or limited sensitivity of the parameters precluded a more definitive observation. These limitations may be aggravated when toxicology is influenced by vested interests. Because current toxicology overlooks the important goal of achieving a better characterization of uncertainties and their implications, research approaches should be revised and strengthened to counteract the innate ideological biases, thereby supporting our confidence in using toxicology as a main source of documentation and in using the precautionary principle as a decision procedure in the public policy arena.
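
    The attenuation effect mentioned above (classical measurement error in the exposure biasing the dose-effect slope toward zero) is easy to demonstrate with a small simulation; the true slope, error variances, and sample size below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
true_exposure = rng.normal(0, 1, n)
outcome = 0.5 * true_exposure + rng.normal(0, 1, n)   # true dose-effect slope is 0.5
measured = true_exposure + rng.normal(0, 1, n)        # classical measurement error, variance 1

naive_slope = np.polyfit(measured, outcome, 1)[0]
# Expected attenuation factor: var(true) / (var(true) + var(error)) = 0.5
print(naive_slope, 0.5 * 0.5)   # the naive slope is roughly halved (about 0.25)
```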

  6. Identifying significant gene‐environment interactions using a combination of screening testing and hierarchical false discovery rate control

    PubMed Central

    Shen, Li; Saykin, Andrew J.; Williams, Scott M.; Moore, Jason H.

    2016-01-01

    Although gene‐environment (G×E) interactions play an important role in many biological systems, detecting these interactions within genome‐wide data can be challenging due to the loss in statistical power incurred by multiple hypothesis correction. To address the challenge of poor power and the limitations of existing multistage methods, we recently developed a screening‐testing approach for G×E interaction detection that combines elastic net penalized regression with joint estimation to support a single omnibus test for the presence of G×E interactions. In our original work on this technique, however, we did not assess type I error control or power and evaluated the method using just a single, small bladder cancer data set. In this paper, we extend the original method in two important directions and provide a more rigorous performance evaluation. First, we introduce a hierarchical false discovery rate approach to formally assess the significance of individual G×E interactions. Second, to support the analysis of truly genome‐wide data sets, we incorporate a score statistic‐based prescreening step to reduce the number of single nucleotide polymorphisms prior to fitting the first stage penalized regression model. To assess the statistical properties of our method, we compare the type I error rate and statistical power of our approach with competing techniques using both simple simulation designs as well as designs based on real disease architectures. Finally, we demonstrate the ability of our approach to identify biologically plausible SNP‐education interactions relative to Alzheimer's disease status using genome‐wide association study data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). PMID:27578615

  7. Analysis of basic clustering algorithms for numerical estimation of statistical averages in biomolecules.

    PubMed

    Anandakrishnan, Ramu; Onufriev, Alexey

    2008-03-01

    In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over the accessible microstates of the system. In general, these calculations are computationally intractable since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between the error bound and the root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms in practical applications. An example of error analysis for such an application (computation of the average charge of ionizable amino acids in proteins) is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
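
    The basic clustering idea described above (exact summation within clusters, inter-cluster interactions ignored) can be illustrated on a toy Ising-like chain rather than on biomolecular electrostatics; the coupling, field, temperature, and cluster size below are arbitrary.

```python
import numpy as np
from itertools import product

def avg_magnetization(J, h, beta, sites):
    """Exact Boltzmann average of the total magnetization over all 2^n microstates
    of the listed sites, with nearest-neighbour couplings J acting only between
    consecutive sites within the list."""
    Z, m_sum = 0.0, 0.0
    for spins in product([-1, 1], repeat=len(sites)):
        E = -h * sum(spins) - J * sum(s1 * s2 for s1, s2 in zip(spins, spins[1:]))
        w = np.exp(-beta * E)
        Z += w
        m_sum += w * sum(spins)
    return m_sum / Z

# Toy 12-site chain: exact answer vs the basic clustering approximation
# (treat 4-site clusters exactly, ignore couplings between clusters).
J, h, beta, n = 1.0, 0.3, 0.5, 12
exact = avg_magnetization(J, h, beta, list(range(n)))
clustered = sum(avg_magnetization(J, h, beta, list(range(i, i + 4))) for i in range(0, n, 4))
print(exact, clustered)
```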

  8. Is a shift from research on individual medical error to research on health information technology underway? A 40-year analysis of publication trends in medical journals.

    PubMed

    Erlewein, Daniel; Bruni, Tommaso; Gadebusch Bondio, Mariacarla

    2018-06-07

    In 1983, McIntyre and Popper underscored the need for more openness in dealing with errors in medicine. Since then, much has been written on individual medical errors. Furthermore, at the beginning of the 21st century, researchers and medical practitioners increasingly approached individual medical errors through health information technology. Hence, the question arises whether the attention of biomedical researchers shifted from individual medical errors to health information technology. We ran a study to determine publication trends concerning individual medical errors and health information technology in medical journals over the last 40 years. We used the Medical Subject Headings (MeSH) taxonomy in the database MEDLINE. Each year, we analyzed the percentage of relevant publications to the total number of publications in MEDLINE. The trends identified were tested for statistical significance. Our analysis showed that the percentage of publications dealing with individual medical errors increased from 1976 until the beginning of the 21st century but began to drop in 2003. Both the upward and the downward trends were statistically significant (P < 0.001). A breakdown by country revealed that it was the weight of the US and British publications that determined the overall downward trend after 2003. On the other hand, the percentage of publications dealing with health information technology doubled between 2003 and 2015. The upward trend was statistically significant (P < 0.001). The identified trends suggest that the attention of biomedical researchers partially shifted from individual medical errors to health information technology in the USA and the UK. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.

  9. Statistics of the residual refraction errors in laser ranging data

    NASA Technical Reports Server (NTRS)

    Gardner, C. S.

    1977-01-01

    A theoretical model for the range error covariance was derived by assuming that the residual refraction errors are due entirely to errors in the meteorological data which are used to calculate the atmospheric correction. The properties of the covariance function are illustrated by evaluating the theoretical model for the special case of a dense network of weather stations uniformly distributed within a circle.

  10. Demand Forecasting: An Evaluation of DOD's Accuracy Metric and Navy's Procedures

    DTIC Science & Technology

    2016-06-01

    Keywords: inventory management improvement plan; mean of absolute scaled error; lead time adjusted squared error; forecast accuracy; benchmarking; naïve method; ... Manager; JASA, Journal of the American Statistical Association; LASE, Lead-time Adjusted Squared Error; LCI, Life Cycle Indicator; MA, Moving Average; MAE, ... Mean Squared Error; NAVSUP, Naval Supply Systems Command; NDAA, National Defense Authorization Act; NIIN, National Individual Identification Number

  11. Power of tests for comparing trend curves with application to national immunization survey (NIS).

    PubMed

    Zhao, Zhen

    2011-02-28

    To develop statistical tests for comparing trend curves of study outcomes between two socio-demographic strata across consecutive time points, and to compare the statistical power of the proposed tests under different trend-curve data, three statistical tests were proposed. For large sample sizes, with independent normal assumptions among strata and across consecutive time points, Z and chi-square test statistics were developed; these are functions of the outcome estimates and their standard errors at each of the study time points for the two strata. For small sample sizes with an independent normal assumption, an F-test statistic was generated, which is a function of the sample sizes of the two strata and the parameters estimated across the study period. If the two trend curves are approximately parallel, the power of the Z-test is consistently higher than that of both the chi-square and F-tests. If the two trend curves cross with low interaction, the power of the Z-test is higher than or equal to that of both the chi-square and F-tests; at high interaction, however, the powers of the chi-square and F-tests are higher than that of the Z-test. A measure of the interaction of two trend curves was defined. These tests were applied to the comparison of trend curves of vaccination coverage estimates for standard vaccine series using National Immunization Survey (NIS) 2000-2007 data. Copyright © 2011 John Wiley & Sons, Ltd.
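
    A minimal sketch of Z-type and chi-square-type comparisons of two trend curves is given below; these are plausible forms built from the estimates and standard errors at each time point, assuming independence and approximate normality, and are not necessarily the exact statistics proposed in the paper.

```python
import numpy as np
from scipy.stats import norm, chi2

def trend_tests(est1, se1, est2, se2):
    """Compare two trend curves observed at the same time points, given point
    estimates and standard errors for each stratum. Assumes the estimates are
    independent and approximately normal."""
    d = np.asarray(est1) - np.asarray(est2)
    v = np.asarray(se1) ** 2 + np.asarray(se2) ** 2
    z = d.sum() / np.sqrt(v.sum())      # Z-type test of a common shift between the curves
    x2 = np.sum(d ** 2 / v)             # chi-square-type test of any pointwise difference
    return 2 * norm.sf(abs(z)), chi2.sf(x2, df=d.size)

# Hypothetical coverage estimates (%) over 8 consecutive years for two strata.
p_z, p_chi2 = trend_tests([70, 72, 74, 75, 77, 78, 80, 81], [1.2] * 8,
                          [68, 70, 71, 73, 74, 76, 77, 79], [1.4] * 8)
print(p_z, p_chi2)
```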

  12. Benefits of statistical molecular design, covariance analysis, and reference models in QSAR: a case study on acetylcholinesterase

    NASA Astrophysics Data System (ADS)

    Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna

    2015-03-01

    Scientific disciplines such as medicinal and environmental chemistry, pharmacology, and toxicology deal with questions related to the effects small organic compounds exert on biological targets and the physicochemical properties of the compounds responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate the benefits of performing a statistical molecular design (SMD) and a proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and in predictions by QSAR models, and leave a set of molecules less suitable for future decision-making. In our study, SAR and QSAR models could show which molecular substructures and physicochemical features were advantageous for AChE inhibition. Finally, the QSAR model was used to predict the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was assessed by statistical significance tests and by comparisons with simple but relevant reference models.

  13. Impact and quantification of the sources of error in DNA pooling designs.

    PubMed

    Jawaid, A; Sham, P

    2009-01-01

    The analysis of genome wide variation offers the possibility of unravelling the genes involved in the pathogenesis of disease. Genome wide association studies are also particularly useful for identifying and validating targets for therapeutic intervention as well as for detecting markers for drug efficacy and side effects. The cost of such large-scale genetic association studies may be reduced substantially by the analysis of pooled DNA from multiple individuals. However, experimental errors inherent in pooling studies lead to a potential increase in the false positive rate and a loss in power compared to individual genotyping. Here we quantify various sources of experimental error using empirical data from typical pooling experiments and corresponding individual genotyping counts using two statistical methods. We provide analytical formulas for calculating these different errors in the absence of complete information, such as replicate pool formation, and for adjusting for the errors in the statistical analysis. We demonstrate that DNA pooling has the potential of estimating allele frequencies accurately, and adjusting the pooled allele frequency estimates for differential allelic amplification considerably improves accuracy. Estimates of the components of error show that differential allelic amplification is the most important contributor to the error variance in absolute allele frequency estimation, followed by allele frequency measurement and pool formation errors. Our results emphasise the importance of minimising experimental errors and obtaining correct error estimates in genetic association studies.

  14. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology

    NASA Astrophysics Data System (ADS)

    Kang, Pilsang; Koo, Changhoi; Roh, Hokyu

    2017-11-01

    Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
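
    For context, the two existing approaches named in the abstract (classical regression and inverse regression) can be sketched as follows for a univariate linear calibration; the data are hypothetical, and the paper's proposed "reversed inverse regression" is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 30)                      # reference (standard) values
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, x.size)  # observed instrument responses

# Classical calibration: regress y on x, then invert the fitted line for a new response.
b1, b0 = np.polyfit(x, y, 1)
y_new = 12.3
x_classical = (y_new - b0) / b1

# Inverse regression: regress x on y and predict the unknown directly.
c1, c0 = np.polyfit(y, x, 1)
x_inverse = c0 + c1 * y_new

print(x_classical, x_inverse)
```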

  15. The Impact of Subsampling on MODIS Level-3 Statistics of Cloud Optical Thickness and Effective Radius

    NASA Technical Reports Server (NTRS)

    Oreopoulos, Lazaros

    2004-01-01

    The MODIS Level-3 optical thickness and effective radius cloud product is a gridded 1° x 1° dataset that is derived from aggregation and subsampling at 5 km of 1-km-resolution Level-2 orbital swath data (Level-2 granules). This study examines the impact of the 5 km subsampling on the mean, standard deviation and inhomogeneity parameter statistics of optical thickness and effective radius. The methodology is simple and consists of estimating mean errors for a large collection of Terra and Aqua Level-2 granules by taking the difference of the statistics at the original and subsampled resolutions. It is shown that the Level-3 sampling does not affect the various quantities investigated to the same degree, with second-order moments suffering greater subsampling errors, as expected. Mean errors drop dramatically when averages over a sufficient number of regions (e.g., monthly and/or latitudinal averages) are taken, pointing to a dominance of errors that are of a random nature. When histograms built from subsampled data with the same binning rules as in the Level-3 dataset are used to reconstruct the quantities of interest, the mean errors do not deteriorate significantly. The results in this paper provide guidance to users of MODIS Level-3 optical thickness and effective radius cloud products on the range of errors due to subsampling they should expect, and perhaps account for, in scientific work with this dataset. In general, subsampling errors should not be a serious concern when moderate temporal and/or spatial averaging is performed.

  16. Analysis of variance to assess statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes.

    PubMed

    Makeyev, Oleksandr; Joe, Cody; Lee, Colin; Besio, Walter G

    2017-07-01

    Concentric ring electrodes have shown promise in non-invasive electrophysiological measurement demonstrating their superiority to conventional disc electrodes, in particular, in accuracy of Laplacian estimation. Recently, we have proposed novel variable inter-ring distances concentric ring electrodes. Analytic and finite element method modeling results for linearly increasing distances electrode configurations suggested they may decrease the truncation error resulting in more accurate Laplacian estimates compared to currently used constant inter-ring distances configurations. This study assesses statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes. Full factorial design of analysis of variance was used with one categorical and two numerical factors: the inter-ring distances, the electrode diameter, and the number of concentric rings in the electrode. The response variables were the Relative Error and the Maximum Error of Laplacian estimation computed using a finite element method model for each of the combinations of levels of three factors. Effects of the main factors and their interactions on Relative Error and Maximum Error were assessed and the obtained results suggest that all three factors have statistically significant effects in the model confirming the potential of using inter-ring distances as a means of improving accuracy of Laplacian estimation.

  17. Elastic scattering and particle production in two-prong π⁻p interactions at 8 GeV/c

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kitagaki, T.; Tanaka, S.; Yuta, H.

    1982-10-01

    Results of a high-statistics study of elastic scattering and meson resonances produced in π⁻p interactions at 8 GeV/c are presented. The large statistics and small systematic errors permit examination of the complete kinematic region. Total differential cross sections are given for the ρ^{0,-}, f^{0}, g^{0,-}, Δ^{±}, Δ^{0}, and N* resonances. Spin-density matrix elements and Legendre-polynomial moments are given for the ρ, f, and Δ resonances. The results for the ρ^{0} and f^{0} resonances are compared with the predictions of a Regge-pole-exchange model. Properties of the above resonances are compared and discussed. In particular, we present evidence that the ρ^{0} and f^{0} production mechanisms are similar. The similarity of the g^{0} t distribution to that of the ρ^{0} and f^{0} suggests a common production mechanism for all three resonances.

  18. A statistical motion model based on biomechanical simulations for data fusion during image-guided prostate interventions.

    PubMed

    Hu, Yipeng; Morgan, Dominic; Ahmed, Hashim Uddin; Pendsé, Doug; Sahu, Mahua; Allen, Clare; Emberton, Mark; Hawkes, David; Barratt, Dean

    2008-01-01

    A method is described for generating a patient-specific, statistical motion model (SMM) of the prostate gland. Finite element analysis (FEA) is used to simulate the motion of the gland using an ultrasound-based 3D FE model over a range of plausible boundary conditions and soft-tissue properties. By applying principal component analysis to the displacements of the FE mesh node points inside the gland, the simulated deformations are used as training data to construct the SMM. The SMM is used both to predict the displacement field over the whole gland and to constrain a deformable surface registration algorithm, given only a small number of target points on the surface of the deformed gland. Using 3D transrectal ultrasound images of the prostates of five patients, acquired before and after imposing a physical deformation, the accuracy of the predicted landmark displacements was evaluated; the mean target registration error was found to be less than 1.9 mm.
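
    The core statistical step (principal component analysis of simulated node displacements to obtain a low-dimensional motion model) can be sketched as below; the number of simulations, nodes, and retained modes are hypothetical, and random numbers stand in for the FEA output.

```python
import numpy as np

# Hypothetical training data: each row is one FE simulation, flattened to a
# (n_nodes * 3,) vector of node displacements under one sampled boundary condition.
n_sims, n_nodes = 200, 1500
rng = np.random.default_rng(3)
U = rng.normal(0, 1.0, size=(n_sims, n_nodes * 3))   # stand-in for FEA output

mean_disp = U.mean(axis=0)
_, s, Vt = np.linalg.svd(U - mean_disp, full_matrices=False)

k = 5                                    # keep the first few principal deformation modes
modes = Vt[:k]                           # each row is one deformation mode
sigmas = s[:k] / np.sqrt(n_sims - 1)     # mode standard deviations
weights = rng.normal(0, 1, k) * sigmas
new_disp = mean_disp + weights @ modes   # one plausible deformation sampled from the SMM
```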

  19. Systematic study of error sources in supersonic skin-friction balance measurements

    NASA Technical Reports Server (NTRS)

    Allen, J. M.

    1976-01-01

    An experimental study was performed to investigate potential error sources in data obtained with a self-nulling, moment-measuring, skin-friction balance. The balance was installed in the sidewall of a supersonic wind tunnel, and independent measurements of the three forces contributing to the balance output (skin friction, lip force, and off-center normal force) were made for a range of gap size and element protrusion. The relatively good agreement between the balance data and the sum of these three independently measured forces validated the three-term model used. No advantage to a small gap size was found; in fact, the larger gaps were preferable. Perfect element alignment with the surrounding test surface resulted in very small balance errors. However, if small protrusion errors are unavoidable, no advantage was found in having the element slightly below the surrounding test surface rather than above it.

  20. Fish: A New Computer Program for Friendly Introductory Statistics Help

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Raffle, Holly

    2005-01-01

    All introductory statistics students must master certain basic descriptive statistics, including means, standard deviations and correlations. Students must also gain insight into such complex concepts as the central limit theorem and standard error. This article introduces and describes the Friendly Introductory Statistics Help (FISH) computer…

  1. Incorporating GIS building data and census housing statistics for sub-block-level population estimation

    USGS Publications Warehouse

    Wu, S.-S.; Wang, L.; Qiu, X.

    2008-01-01

    This article presents a deterministic model for sub-block-level population estimation based on the total building volumes derived from geographic information system (GIS) building data and three census block-level housing statistics. To assess the model, we generated artificial blocks by aggregating census block areas and calculating the respective housing statistics. We then applied the model to estimate populations for sub-artificial-block areas and assessed the estimates with census populations of the areas. Our analyses indicate that the average percent error of population estimation for sub-artificial-block areas is comparable to those for sub-census-block areas of the same size relative to associated blocks. The smaller the sub-block-level areas, the higher the population estimation errors. For example, the average percent error for residential areas is approximately 0.11 percent for 100 percent block areas and 35 percent for 5 percent block areas.

  2. Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping.

    PubMed

    Kubica, Aleksander; Beverland, Michael E; Brandão, Fernando; Preskill, John; Svore, Krysta M

    2018-05-04

    Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the threshold for 1D stringlike and 2D sheetlike logical operators to be p_{3DCC}^{(1)}≃1.9% and p_{3DCC}^{(2)}≃27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.

  3. Observation of non-classical correlations in sequential measurements of photon polarization

    NASA Astrophysics Data System (ADS)

    Suzuki, Yutaro; Iinuma, Masataka; Hofmann, Holger F.

    2016-10-01

    A sequential measurement of two non-commuting quantum observables results in a joint probability distribution for all output combinations that can be explained in terms of an initial joint quasi-probability of the non-commuting observables, modified by the resolution errors and back-action of the initial measurement. Here, we show that the error statistics of a sequential measurement of photon polarization performed at different measurement strengths can be described consistently by an imaginary correlation between the statistics of resolution and back-action. The experimental setup was designed to realize variable strength measurements with well-controlled imaginary correlation between the statistical errors caused by the initial measurement of diagonal polarizations, followed by a precise measurement of the horizontal/vertical polarization. We perform the experimental characterization of an elliptically polarized input state and show that the same complex joint probability distribution is obtained at any measurement strength.

  4. A fully redundant double difference algorithm for obtaining minimum variance estimates from GPS observations

    NASA Technical Reports Server (NTRS)

    Melbourne, William G.

    1986-01-01

    In double differencing a regression system obtained from concurrent Global Positioning System (GPS) observation sequences, one either undersamples the system to avoid introducing colored measurement statistics, or one fully samples the system incurring the resulting non-diagonal covariance matrix for the differenced measurement errors. A suboptimal estimation result will be obtained in the undersampling case and will also be obtained in the fully sampled case unless the color noise statistics are taken into account. The latter approach requires a least squares weighting matrix derived from inversion of a non-diagonal covariance matrix for the differenced measurement errors instead of inversion of the customary diagonal one associated with white noise processes. Presented is the so-called fully redundant double differencing algorithm for generating a weighted double differenced regression system that yields equivalent estimation results, but features for certain cases a diagonal weighting matrix even though the differenced measurement error statistics are highly colored.
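
    The weighting described above amounts to generalized least squares with the full (non-diagonal) covariance of the differenced measurement errors; a minimal sketch follows, with A, y, and C as hypothetical stand-ins for the double-differenced design matrix, observations, and error covariance.

```python
import numpy as np

def minimum_variance_estimate(A, y, C):
    """Solve y = A x + e with Cov(e) = C by generalized least squares.
    With white (uncorrelated) errors C is diagonal; fully sampled double
    differences give a non-diagonal C, and the same formula still applies."""
    W = np.linalg.inv(C)                   # weighting matrix from the error covariance
    N = A.T @ W @ A                        # normal matrix
    return np.linalg.solve(N, A.T @ W @ y)

# Tiny hypothetical example: 4 differenced observations, 2 unknown parameters.
A = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([2.1, 3.0, 4.2, 5.1])
C = 0.04 * np.eye(4) + 0.02                # the common term makes the errors correlated
print(minimum_variance_estimate(A, y, C))
```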

  5. Prediction of transmission distortion for wireless video communication: analysis.

    PubMed

    Chen, Zhifeng; Wu, Dapeng

    2012-03-01

    Transmitting video over wireless is a challenging problem since video may be seriously distorted due to packet errors caused by wireless channels. The capability of predicting transmission distortion (i.e., video distortion caused by packet errors) can assist in designing video encoding and transmission schemes that achieve maximum video quality or minimum end-to-end video distortion. This paper is aimed at deriving formulas for predicting transmission distortion. The contribution of this paper is twofold. First, we identify the governing law that describes how the transmission distortion process evolves over time and analytically derive the transmission distortion formula as a closed-form function of video frame statistics, channel error statistics, and system parameters. Second, we identify, for the first time, two important properties of transmission distortion. The first property is that the clipping noise, which is produced by nonlinear clipping, causes decay of propagated error. The second property is that the correlation between motion-vector concealment error and propagated error is negative and has dominant impact on transmission distortion, compared with other correlations. Due to these two properties and elegant error/distortion decomposition, our formula provides not only more accurate prediction but also lower complexity than the existing methods.

  6. Systematic Error Modeling and Bias Estimation

    PubMed Central

    Zhang, Feihu; Knoll, Alois

    2016-01-01

    This paper analyzes the statistical properties of the systematic error in terms of range and bearing during the transformation process. Furthermore, we rely on a weighted nonlinear least squares method to calculate the biases based on the proposed models. The results show the high performance of the proposed approach for error modeling and bias estimation. PMID:27213386

  7. Investigating the Relationship between Conceptual and Procedural Errors in the Domain of Probability Problem-Solving.

    ERIC Educational Resources Information Center

    O'Connell, Ann Aileen

    The relationships among types of errors observed during probability problem solving were studied. Subjects were 50 graduate students in an introductory probability and statistics course. Errors were classified as text comprehension, conceptual, procedural, and arithmetic. Canonical correlation analysis was conducted on the frequencies of specific…

  8. A Unified Approach to Measurement Error and Missing Data: Overview and Applications

    ERIC Educational Resources Information Center

    Blackwell, Matthew; Honaker, James; King, Gary

    2017-01-01

    Although social scientists devote considerable effort to mitigating measurement error during data collection, they often ignore the issue during data analysis. And although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model…

  9. Calibration of remotely sensed proportion or area estimates for misclassification error

    Treesearch

    Raymond L. Czaplewski; Glenn P. Catts

    1992-01-01

    Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the...

  10. Compact disk error measurements

    NASA Technical Reports Server (NTRS)

    Howe, D.; Harriman, K.; Tehranchi, B.

    1993-01-01

    The objectives of this project are as follows: provide hardware and software that will perform simple, real-time, high resolution (single-byte) measurement of the error burst and good data gap statistics seen by a photoCD player read channel when recorded CD write-once discs of variable quality (i.e., condition) are being read; extend the above system to enable measurement of the hard decision (i.e., 1-bit error flags) and soft decision (i.e., 2-bit error flags) decoding information that is produced/used by the Cross Interleaved - Reed - Solomon - Code (CIRC) block decoder employed in the photoCD player read channel; construct a model that uses data obtained via the systems described above to produce meaningful estimates of output error rates (due to both uncorrected ECC words and misdecoded ECC words) when a CD disc having specific (measured) error statistics is read (completion date to be determined); and check the hypothesis that current adaptive CIRC block decoders are optimized for pressed (DAD/ROM) CD discs. If warranted, do a conceptual design of an adaptive CIRC decoder that is optimized for write-once CD discs.

  11. The Relation Between Inflation in Type-I and Type-II Error Rate and Population Divergence in Genome-Wide Association Analysis of Multi-Ethnic Populations.

    PubMed

    Derks, E M; Zwinderman, A H; Gamazon, E R

    2017-05-01

    Population divergence impacts the degree of population stratification in Genome Wide Association Studies. We aim to: (i) investigate type-I error rate as a function of population divergence (F_ST) in multi-ethnic (admixed) populations; (ii) evaluate the statistical power and effect size estimates; and (iii) investigate the impact of population stratification on the results of gene-based analyses. Quantitative phenotypes were simulated. Type-I error rate was investigated for Single Nucleotide Polymorphisms (SNPs) with varying levels of F_ST between the ancestral European and African populations. Type-II error rate was investigated for a SNP characterized by a high value of F_ST. In all tests, genomic MDS components were included to correct for population stratification. Type-I and type-II error rate was adequately controlled in a population that included two distinct ethnic populations but not in admixed samples. Statistical power was reduced in the admixed samples. Gene-based tests showed no residual inflation in type-I error rate.

  12. Particle Filter with Novel Nonlinear Error Model for Miniature Gyroscope-Based Measurement While Drilling Navigation

    PubMed Central

    Li, Tao; Yuan, Gannan; Li, Wang

    2016-01-01

    The derivation of a conventional error model for the miniature gyroscope-based measurement while drilling (MGWD) system is based on the assumption that the errors of attitude are small enough so that the direction cosine matrix (DCM) can be approximated or simplified by the errors of small-angle attitude. However, the simplification of the DCM would introduce errors to the navigation solutions of the MGWD system if the initial alignment cannot provide precise attitude, especially for the low-cost microelectromechanical system (MEMS) sensors operated in harsh multilateral horizontal downhole drilling environments. This paper proposes a novel nonlinear error model (NNEM) by the introduction of the error of DCM, and the NNEM can reduce the propagated errors under large-angle attitude error conditions. The zero velocity and zero position are the reference points and the innovations in the states estimation of particle filter (PF) and Kalman filter (KF). The experimental results illustrate that the performance of PF is better than KF and the PF with NNEM can effectively restrain the errors of system states, especially for the azimuth, velocity, and height in the quasi-stationary condition. PMID:26999130

  13. Particle Filter with Novel Nonlinear Error Model for Miniature Gyroscope-Based Measurement While Drilling Navigation.

    PubMed

    Li, Tao; Yuan, Gannan; Li, Wang

    2016-03-15

    The derivation of a conventional error model for the miniature gyroscope-based measurement while drilling (MGWD) system is based on the assumption that the errors of attitude are small enough so that the direction cosine matrix (DCM) can be approximated or simplified by the errors of small-angle attitude. However, the simplification of the DCM would introduce errors to the navigation solutions of the MGWD system if the initial alignment cannot provide precise attitude, especially for the low-cost microelectromechanical system (MEMS) sensors operated in harsh multilateral horizontal downhole drilling environments. This paper proposes a novel nonlinear error model (NNEM) by the introduction of the error of DCM, and the NNEM can reduce the propagated errors under large-angle attitude error conditions. The zero velocity and zero position are the reference points and the innovations in the states estimation of particle filter (PF) and Kalman filter (KF). The experimental results illustrate that the performance of PF is better than KF and the PF with NNEM can effectively restrain the errors of system states, especially for the azimuth, velocity, and height in the quasi-stationary condition.

  14. Ion beam machining error control and correction for small scale optics.

    PubMed

    Xie, Xuhui; Zhou, Lin; Dai, Yifan; Li, Shengyi

    2011-09-20

    Ion beam figuring (IBF) technology for small scale optical components is discussed. Since a small removal function can be obtained in IBF, it becomes possible for computer-controlled optical surfacing technology to machine precision centimeter- or millimeter-scale optical components deterministically. When using a small ion beam to machine small optical components, there are some key problems, such as small ion beam positioning on the optical surface, the material removal rate, and ion beam scanning pitch control on the optical surface, that must be seriously considered. The main reason is that a small ion beam is more sensitive to these problems than a large one because of its small beam diameter and lower material removal rate. In this paper, we discuss these problems and their influence on machining small optical components in detail. Based on the identification-compensation principle, an iterative machining compensation method is derived for correcting the positioning error of the ion beam, with the material removal rate estimated using a selected optimal scanning pitch. Experiments on ϕ10 mm Zerodur planar and spherical samples were made, and the final surface errors are both smaller than λ/100 as measured by a Zygo GPI interferometer.

  15. Comparison of MLC error sensitivity of various commercial devices for VMAT pre-treatment quality assurance.

    PubMed

    Saito, Masahide; Sano, Naoki; Shibata, Yuki; Kuriyama, Kengo; Komiyama, Takafumi; Marino, Kan; Aoki, Shinichi; Ashizawa, Kazunari; Yoshizawa, Kazuya; Onishi, Hiroshi

    2018-05-01

    The purpose of this study was to compare the MLC error sensitivity of various measurement devices for VMAT pre-treatment quality assurance (QA). This study used four QA devices (ScandiDos Delta4, PTW 2D-array, iRT Systems IQM, and a PTW Farmer chamber). Nine retrospective VMAT plans were used, and nine MLC error plans were generated from the nine original VMAT plans. The IQM and Farmer chamber were evaluated using the cumulative signal difference between the baseline and error-induced measurements. In addition, to investigate the sensitivity of the Delta4 device and the 2D-array, global gamma analysis (1%/1 mm, 2%/2 mm, and 3%/3 mm) and dose-difference (DD; 1%, 2%, and 3%) criteria were used between the baseline and error-induced measurements. Some deviations of the MLC error sensitivity were observed across the evaluation metrics and MLC error ranges. For the two ionization devices, the sensitivity of the IQM was significantly better than that of the Farmer chamber (P < 0.01), while both devices showed a good linear correlation between the cumulative signal difference and the magnitude of the MLC errors. The pass rates decreased as the magnitude of the MLC error increased for both the Delta4 and the 2D-array. However, small MLC errors for small aperture sizes, such as in lung SBRT, could not be detected using the loosest gamma criteria (3%/3 mm). Our results indicate that DD could be more useful than gamma analysis for daily MLC QA, and that a large-area ionization chamber has a greater advantage for detecting systematic MLC errors because of its large sensitive volume, whereas the other devices could not detect this error in some cases with a small range of MLC error. © 2018 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
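
    For reference, a minimal 1D global gamma-index calculation of the kind of gamma analysis referred to above is sketched below (dose-difference criterion as a fraction of the maximum reference dose, distance-to-agreement in mm); the dose profiles are hypothetical and the study's evaluations were performed with the vendors' software.

```python
import numpy as np

def gamma_1d(dose_ref, dose_eval, x_mm, dose_crit=0.03, dta_mm=3.0):
    """Global 1D gamma index. dose_crit is a fraction of the maximum reference
    dose (e.g. 0.03 for 3%); dta_mm is the distance-to-agreement criterion."""
    d_norm = dose_crit * dose_ref.max()
    gamma = np.empty_like(dose_ref, dtype=float)
    for i, (xr, dr) in enumerate(zip(x_mm, dose_ref)):
        dd = (dose_eval - dr) / d_norm      # dose-difference term
        dx = (x_mm - xr) / dta_mm           # distance term
        gamma[i] = np.sqrt(dd ** 2 + dx ** 2).min()
    return gamma

# Hypothetical planned vs. measured profiles sampled every 1 mm.
x = np.arange(0, 100, 1.0)
planned = np.exp(-((x - 50) / 15) ** 2)
measured = np.exp(-((x - 51) / 15) ** 2) * 1.01   # 1 mm shift, 1% scaling error
g = gamma_1d(planned, measured, x)
print("pass rate:", np.mean(g <= 1.0))
```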

  16. Talar dome detection and its geometric approximation in CT: Sphere, cylinder or bi-truncated cone?

    PubMed

    Huang, Junbin; Liu, He; Wang, Defeng; Griffith, James F; Shi, Lin

    2017-04-01

    The purpose of our study is to give a relatively objective definition of the talar dome and of its shape approximations by a sphere (SPH), a cylinder (CLD) and a bi-truncated cone (BTC). The talar dome is delineated with an improved Dijkstra's algorithm that considers Euclidean distance and surface curvature. The geometric similarity between the talar dome and the ideal shapes, namely SPH, CLD and BTC, is then quantified. 50 unilateral CT datasets from 50 subjects with no pathological morphometry of the talus were included in the experiments, and statistical analyses were carried out based on the approximation error. The similarity between the talar dome and the BTC was more prominent, with a smaller mean, standard deviation, maximum and median of the approximation error (0.36±0.07 mm, 0.32±0.06 mm, 2.24±0.47 mm and 0.28±0.06 mm) compared with fitting to the SPH and CLD. In addition, there were significant differences between the fitting errors of each pair of models in terms of the 4 measurements (p-values<0.05). The linear regression analyses demonstrated a high correlation between the CLD and BTC approximations (R^2=0.55 for the median, R^2>0.7 for the others). Color maps representing the fitting error indicated that the fitting error mainly occurred on the marginal regions of the talar dome for the SPH and CLD fittings, whereas that of the BTC was small over the whole talar dome. The successful restoration of ankle functions in displacement surgery highly depends on a comprehensive understanding of the talus. The talar dome surface can be well defined in a computational way and, compared with the SPH and CLD, the talar dome shows outstanding similarity with the BTC. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Bias correction for selecting the minimal-error classifier from many machine learning models.

    PubMed

    Ding, Ying; Tang, Shaowu; Liao, Serena G; Jia, Jia; Oesterreich, Steffi; Lin, Yan; Tseng, George C

    2014-11-15

    Supervised machine learning is commonly applied in genomic research to construct a classifier from the training data that is generalizable to predict independent testing data. When test datasets are not available, cross-validation is commonly used to estimate the error rate. Many machine learning methods are available, and it is well known that no universally best method exists in general. It has been a common practice to apply many machine learning methods and report the method that produces the smallest cross-validation error rate. Theoretically, such a procedure produces a selection bias. Consequently, many clinical studies with moderate sample sizes (e.g. n = 30-60) risk reporting a falsely small cross-validation error rate that could not be validated later in independent cohorts. In this article, we illustrated the probabilistic framework of the problem and explored the statistical and asymptotic properties. We proposed a new bias correction method based on learning curve fitting by inverse power law (IPL) and compared it with three existing methods: nested cross-validation, weighted mean correction and Tibshirani-Tibshirani procedure. All methods were compared in simulation datasets, five moderate size real datasets and two large breast cancer datasets. The result showed that IPL outperforms the other methods in bias correction with smaller variance, and it has an additional advantage to extrapolate error estimates for larger sample sizes, a practical feature to recommend whether more samples should be recruited to improve the classifier and accuracy. An R package 'MLbias' and all source files are publicly available. tsenglab.biostat.pitt.edu/software.htm. ctseng@pitt.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
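
    The learning-curve element of the proposed correction (fitting cross-validation error as an inverse power law of the training size and extrapolating) can be sketched as follows; the sample sizes and error rates are hypothetical, and this is not the full bias-correction procedure from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def ipl(n, a, b, c):
    # Inverse power law learning curve: error(n) = a * n**(-b) + c
    return a * np.power(n, -b) + c

# Hypothetical cross-validation error rates measured at several training sizes.
sizes = np.array([20, 30, 40, 50, 60], dtype=float)
cv_err = np.array([0.38, 0.33, 0.30, 0.285, 0.27])

params, _ = curve_fit(ipl, sizes, cv_err, p0=[1.0, 0.5, 0.2], maxfev=10000)
print(ipl(120, *params))   # extrapolated error estimate at a larger sample size
```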

  18. Noise Estimation and Adaptive Encoding for Asymmetric Quantum Error Correcting Codes

    NASA Astrophysics Data System (ADS)

    Florjanczyk, Jan; Brun, Todd; Center for Quantum Information Science and Technology Team

    We present a technique that improves the performance of asymmetric quantum error correcting codes in the presence of biased qubit noise channels. Our study is motivated by considering what useful information can be learned from the statistics of syndrome measurements in stabilizer quantum error correcting codes (QECCs). We consider the case of a qubit dephasing channel where the dephasing axis is unknown and time-varying. We are able to estimate the dephasing angle from the statistics of the standard syndrome measurements used in stabilizer QECCs. We use this estimate to rotate the computational basis of the code in such a way that the most likely type of error is covered by the highest distance of the asymmetric code. In particular, we use the [[15,1,3]] shortened Reed-Muller code, which can correct one phase-flip error but up to three bit-flip errors. In our simulations, we tune the computational basis to match the estimated dephasing axis, which in turn leads to a decrease in the probability of a phase-flip error. With a sufficiently accurate estimate of the dephasing axis, our memory's effective error is dominated by the much lower probability of four bit-flips. ARO MURI Grant No. W911NF-11-1-0268.

  19. Refractive errors in children and adolescents in Bucaramanga (Colombia).

    PubMed

    Galvis, Virgilio; Tello, Alejandro; Otero, Johanna; Serrano, Andrés A; Gómez, Luz María; Castellanos, Yuly

    2017-01-01

    The aim of this study was to establish the frequency of refractive errors in children and adolescents aged between 8 and 17 years, living in the metropolitan area of Bucaramanga (Colombia). This study was a secondary analysis of two descriptive cross-sectional studies that applied sociodemographic surveys and assessed visual acuity and refraction. Ametropias were classified as myopic errors, hyperopic errors, and mixed astigmatism. Eyes were considered emmetropic if none of these classifications were made. The data were collated using free software and analyzed with STATA/IC 11.2. One thousand two hundred twenty-eight individuals were included in this study. Girls showed a higher rate of ametropia than boys. Hyperopic refractive errors were present in 23.1% of the subjects, and myopic errors in 11.2%. Only 0.2% of the eyes had high myopia (≤-6.00 D). Mixed astigmatism and anisometropia were uncommon, and myopia frequency increased with age. There were statistically significantly steeper keratometric readings in myopic compared to hyperopic eyes. The overall frequency of refractive errors we found (36.7%) is moderate compared with global data. The rates and parameters differed statistically by sex and age group. Our findings are useful for establishing refractive error rate benchmarks in low- and middle-income countries and as a baseline for following their variation by sociodemographic factors.

  20. Frogs Exploit Statistical Regularities in Noisy Acoustic Scenes to Solve Cocktail-Party-like Problems.

    PubMed

    Lee, Norman; Ward, Jessica L; Vélez, Alejandro; Micheyl, Christophe; Bee, Mark A

    2017-03-06

    Noise is a ubiquitous source of errors in all forms of communication [1]. Noise-induced errors in speech communication, for example, make it difficult for humans to converse in noisy social settings, a challenge aptly named the "cocktail party problem" [2]. Many nonhuman animals also communicate acoustically in noisy social groups and thus face biologically analogous problems [3]. However, we know little about how the perceptual systems of receivers are evolutionarily adapted to avoid the costs of noise-induced errors in communication. In this study of Cope's gray treefrog (Hyla chrysoscelis; Hylidae), we investigated whether receivers exploit a potential statistical regularity present in noisy acoustic scenes to reduce errors in signal recognition and discrimination. We developed an anatomical/physiological model of the peripheral auditory system to show that temporal correlation in amplitude fluctuations across the frequency spectrum ("comodulation") [4-6] is a feature of the noise generated by large breeding choruses of sexually advertising males. In four psychophysical experiments, we investigated whether females exploit comodulation in background noise to mitigate noise-induced errors in evolutionarily critical mate-choice decisions. Subjects experienced fewer errors in recognizing conspecific calls and in selecting the calls of high-quality mates in the presence of simulated chorus noise that was comodulated. These data show unequivocally, and for the first time, that exploiting statistical regularities present in noisy acoustic scenes is an important biological strategy for solving cocktail-party-like problems in nonhuman animal communication. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Applied statistics in ecology: common pitfalls and simple solutions

    Treesearch

    E. Ashley Steel; Maureen C. Kennedy; Patrick G. Cunningham; John S. Stanovick

    2013-01-01

    The most common statistical pitfalls in ecological research are those associated with data exploration, the logic of sampling and design, and the interpretation of statistical results. Although one can find published errors in calculations, the majority of statistical pitfalls result from incorrect logic or interpretation despite correct numerical calculations. There...

  2. Simultaneous correction of large low-order and high-order aberrations with a new deformable mirror technology

    NASA Astrophysics Data System (ADS)

    Rooms, F.; Camet, S.; Curis, J. F.

    2010-02-01

    A new deformable mirror technology will be presented. Based on magnetic actuators, these deformable mirrors feature record strokes (more than ±45 μm of astigmatism and focus correction) with an optimized temporal behavior. Furthermore, the development has been made in order to achieve a large density of actuators within a small clear aperture (typically 52 actuators within a diameter of 9.0 mm). We will present the key benefits of this technology for vision science: simultaneous correction of low- and high-order aberrations, AO-SLO images without artifacts due to membrane vibration, optimized control, etc. Using recent papers published by Doble, Thibos and Miller, we show the performance that can be achieved by various configurations using a statistical approach. The typical distributions of wavefront aberrations (both low-order aberrations (LOA) and high-order aberrations (HOA)) have been computed and the correction applied by the mirror. We compare two configurations of deformable mirrors (52 and 97 actuators) and highlight the influence of the number of actuators on the fitting error, the photon noise error and the effective bandwidth of correction.

  3. Automated SEM Modal Analysis Applied to the Diogenites

    NASA Technical Reports Server (NTRS)

    Bowman, L. E.; Spilde, M. N.; Papike, James J.

    1996-01-01

    Analysis of volume proportions of minerals, or modal analysis, is routinely accomplished by point counting on an optical microscope, but the process, particularly on brecciated samples such as the diogenite meteorites, is tedious and prone to error by misidentification of very small fragments, which may make up a significant volume of the sample. Precise volume percentage data can be gathered on a scanning electron microscope (SEM) utilizing digital imaging and an energy dispersive spectrometer (EDS). This form of automated phase analysis reduces error, and at the same time provides more information than could be gathered using simple point counting alone, such as particle morphology statistics and chemical analyses. We have previously studied major, minor, and trace-element chemistry of orthopyroxene from a suite of diogenites. This abstract describes the method applied to determine the modes on this same suite of meteorites and the results of that research. The modal abundances thus determined add additional information on the petrogenesis of the diogenites. In addition, low-abundance phases such as spinels were located for further analysis by this method.

  4. Radio structure effects on the optical and radio representations of the ICRF

    NASA Astrophysics Data System (ADS)

    Andrei, A. H.; da Silva Neto, D. N.; Assafin, M.; Vieira Martins, R.

    Silva Neto et al. (2002) showed that, when the standard radio positions of the ICRF Ext.1 sources (Ma et al. 1998) are compared against their optical counterpart positions (Zacharias et al. 1999; Monet et al. 1998), a systematic pattern appears that depends on the radio structure index (Fey and Charlot, 2000). The optical-to-radio offsets produce a distribution suggesting that the coincidence of the optical and radio centroids is worse for the radio-extended than for the radio-compact sources. On average, the offset between the optical and radio centroids is found to be 7.9±1.1 mas smaller for the compact than for the extended sources. Such an effect is reasonably large, and certainly much too large to be due to errors in the VLBI radio positions. On the other hand, it is too small to be attributed to the errors in the optical positions, which moreover should be independent of the radio structure. Thus, other than a true pattern of centroid non-coincidence, the only remaining explanation is a chance result. This paper summarizes the several statistical tests used to discard the chance explanation.

  5. Predictability of Circulation Transitions (Observed and Modeled): Non-diffusive Dynamics, Markov Chains and Error Growth.

    NASA Astrophysics Data System (ADS)

    Straus, D. M.

    2006-12-01

    The transitions between portions of the state space of the large-scale flow are studied from daily wintertime data over the Pacific North America region using the NCEP reanalysis data set (54 winters) and very large suites of hindcasts made with the COLA atmospheric GCM with observed SST (55 members for each of 18 winters). The partition of the large-scale state space is guided by cluster analysis, whose statistical significance and relationship to SST are reviewed (Straus and Molteni, 2004; Straus, Corti and Molteni, 2006). The global nature of the flow through state space is studied using Markov chains (Crommelin, 2004). In particular, the non-diffusive part of the flow is contrasted in nature (small data sample) and in the AGCM (large data sample). The intrinsic error growth associated with different portions of the state space is studied through sets of identical-twin AGCM simulations. The goal is to obtain realistic estimates of predictability times for large-scale transitions that should be useful in long-range forecasting.

  6. The Atacama Cosmology Telescope: temperature and gravitational lensing power spectrum measurements from three seasons of data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Das, Sudeep; Louis, Thibaut; Calabrese, Erminia

    2014-04-01

    We present the temperature power spectra of the cosmic microwave background (CMB) derived from the three seasons of data from the Atacama Cosmology Telescope (ACT) at 148 GHz and 218 GHz, as well as the cross-frequency spectrum between the two channels. We detect and correct for contamination due to the Galactic cirrus in our equatorial maps. We present the results of a number of tests for possible systematic error and conclude that any effects are not significant compared to the statistical errors we quote. Where they overlap, we cross-correlate the ACT and the South Pole Telescope (SPT) maps and show they are consistent. The measurements of higher-order peaks in the CMB power spectrum provide an additional test of the ΛCDM cosmological model, and help constrain extensions beyond the standard model. The small angular scale power spectrum also provides constraining power on the Sunyaev-Zel'dovich effects and extragalactic foregrounds. We also present a measurement of the CMB gravitational lensing convergence power spectrum at 4.6σ detection significance.

  7. The Atacama Cosmology Telescope: Temperature and Gravitational Lensing Power Spectrum Measurements from Three Seasons of Data

    NASA Technical Reports Server (NTRS)

    Das, Sudeep; Louis, Thibaut; Nolta, Michael R.; Addison, Graeme E.; Battistelli, Elia S.; Bond, J. Richard; Calabrese, Erminia; Crichton, Devin; Devlin, Mark J.; Dicker, Simon; et al.

    2014-01-01

    We present the temperature power spectra of the cosmic microwave background (CMB) derived from the three seasons of data from the Atacama Cosmology Telescope (ACT) at 148 GHz and 218 GHz, as well as the cross-frequency spectrum between the two channels. We detect and correct for contamination due to the Galactic cirrus in our equatorial maps. We present the results of a number of tests for possible systematic error and conclude that any effects are not significant compared to the statistical errors we quote. Where they overlap, we cross-correlate the ACT and the South Pole Telescope (SPT) maps and show they are consistent. The measurements of higher-order peaks in the CMB power spectrum provide an additional test of the ΛCDM cosmological model, and help constrain extensions beyond the standard model. The small angular scale power spectrum also provides constraining power on the Sunyaev-Zel'dovich effects and extragalactic foregrounds. We also present a measurement of the CMB gravitational lensing convergence power spectrum at 4.6σ detection significance.

  8. The velocity and vorticity fields of the turbulent near wake of a circular cylinder

    NASA Technical Reports Server (NTRS)

    Wallace, James; Ong, Lawrence; Moin, Parviz

    1995-01-01

    The purpose of this research is to provide a detailed experimental database of velocity and vorticity statistics in the very near wake (x/d less than 10) of a circular cylinder at a Reynolds number of 3900. This study has determined that estimations of the streamwise velocity component in flow fields with large nonzero cross-stream components are not accurate. Similarly, X-wire measurements of the u and v velocity components in flows containing large w are also subject to errors due to binormal cooling. Using the look-up table (LUT) technique, and by calibrating the X-wire probe used here to include the range of expected angles of attack (±40 deg), accurate X-wire measurements of the instantaneous u and v velocity components in the very near wake region of a circular cylinder have been accomplished. The approximate two-dimensionality of the present flow field was verified with four-wire probe measurements, and to some extent by the spanwise correlation measurements with the multisensor rake. Hence, binormal cooling errors in the present X-wire measurements are small.

  9. Ghost imaging based on Pearson correlation coefficients

    NASA Astrophysics Data System (ADS)

    Yu, Wen-Kai; Yao, Xu-Ri; Liu, Xue-Feng; Li, Long-Zhen; Zhai, Guang-Jie

    2015-05-01

    Correspondence imaging is a new modality of ghost imaging, which can retrieve a positive/negative image by simple conditional averaging of the reference frames that correspond to relatively large/small values of the total intensity measured at the bucket detector. Here we propose and experimentally demonstrate a more rigorous and general approach in which a ghost image is retrieved by calculating a Pearson correlation coefficient between the bucket detector intensity and the brightness at a given pixel of the reference frames, and at the next pixel, and so on. Furthermore, we theoretically provide a statistical interpretation of these two imaging phenomena, and explain how the error depends on the sample size and what kind of distribution the error obeys. According to our analysis, the image signal-to-noise ratio can be greatly improved and the sampling number reduced by means of our new method. Project supported by the National Key Scientific Instrument and Equipment Development Project of China (Grant No. 2013YQ030595) and the National High Technology Research and Development Program of China (Grant No. 2013AA122902).
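    The retrieval rule described above is easy to state in code: for each pixel, compute the Pearson correlation coefficient between the sequence of bucket-detector intensities and the sequence of reference-frame brightness values at that pixel. The following is a minimal numerical sketch with simulated speckle frames and a simulated binary object, not the authors' experimental setup.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n_frames, h, w = 5000, 32, 32

    # Simulated binary object and random speckle reference frames
    obj = np.zeros((h, w)); obj[8:24, 12:20] = 1.0
    frames = rng.random((n_frames, h, w))                      # reference-arm intensities
    bucket = np.tensordot(frames, obj, axes=([1, 2], [0, 1]))  # total transmitted intensity

    # Pearson correlation between the bucket signal and each pixel's brightness sequence
    f_centered = frames - frames.mean(axis=0)
    b_centered = bucket - bucket.mean()
    cov = np.tensordot(b_centered, f_centered, axes=(0, 0)) / (n_frames - 1)
    image = cov / (frames.std(axis=0, ddof=1) * bucket.std(ddof=1))

    print(image.max(), image.min())   # object pixels correlate, background stays near zero
    ```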

  10. Stochastic output error vibration-based damage detection and assessment in structures under earthquake excitation

    NASA Astrophysics Data System (ADS)

    Sakellariou, J. S.; Fassois, S. D.

    2006-11-01

    A stochastic output error (OE) vibration-based methodology for damage detection and assessment (localization and quantification) in structures under earthquake excitation is introduced. The methodology is intended for assessing the state of a structure following potential damage occurrence by exploiting vibration signal measurements produced by low-level earthquake excitations. It is based upon (a) stochastic OE model identification, (b) statistical hypothesis testing procedures for damage detection, and (c) a geometric method (GM) for damage assessment. The methodology's advantages include the effective use of the non-stationary and limited duration earthquake excitation, the handling of stochastic uncertainties, the tackling of the damage localization and quantification subproblems, the use of "small" size, simple and partial (in both the spatial and frequency bandwidth senses) identified OE-type models, and the use of a minimal number of measured vibration signals. Its feasibility and effectiveness are assessed via Monte Carlo experiments employing a simple simulation model of a 6 storey building. It is demonstrated that damage levels of 5% and 20% reduction in a storey's stiffness characteristics may be properly detected and assessed using noise-corrupted vibration signals.

  11. An accurate symplectic calculation of the inboard magnetic footprint from statistical topological noise and field errors in the DIII-D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Punjabi, Alkesh; Ali, Halima

    2011-02-15

    Any canonical transformation of Hamiltonian equations is symplectic, and any area-preserving transformation in 2D is a symplectomorphism. Based on these, a discrete symplectic map and its continuous symplectic analog are derived for forward magnetic field line trajectories in natural canonical coordinates. The unperturbed axisymmetric Hamiltonian for magnetic field lines is constructed from the experimental data in the DIII-D [J. L. Luxon and L. E. Davis, Fusion Technol. 8, 441 (1985)]. The equilibrium Hamiltonian is a highly accurate, analytic, and realistic representation of the magnetic geometry of the DIII-D. These symplectic mathematical maps are used to calculate the magnetic footprint on the inboard collector plate in the DIII-D. Internal statistical topological noise and field errors are irreducible and ubiquitous in magnetic confinement schemes for fusion. It is important to know the stochasticity and magnetic footprint from noise and error fields. The estimates of the spectrum and mode amplitudes of the spatial topological noise and magnetic errors in the DIII-D are used as the magnetic perturbation. The discrete and continuous symplectic maps are used to calculate the magnetic footprint on the inboard collector plate of the DIII-D by inverting the natural coordinates to physical coordinates. The combination of a highly accurate equilibrium generating function, natural canonical coordinates, symplecticity, and small step size together gives a very accurate calculation of the magnetic footprint. Radial variation of the magnetic perturbation and the response of the plasma to the perturbation are not included. The inboard footprint from noise and errors is dominated by the m=3, n=1 mode. The footprint is in the form of a toroidally winding helical strip. The width of the stochastic layer scales as the (1/2) power of the amplitude. The area of the footprint scales as the first power of the amplitude. The physical parameters such as the toroidal angle, length, and poloidal angle covered before striking, and the safety factor all have fractal structure. The average field diffusion near the X-point for lines that strike and that do not strike differs by about three to four orders of magnitude. The magnetic footprint gives the maximal bounds on the size and heat flux density on the collector plate.

  12. The Spatial Distribution of Adult Obesity Prevalence in Denver County, Colorado: An Empirical Bayes Approach to Adjust EHR-Derived Small Area Estimates.

    PubMed

    Tabano, David C; Bol, Kirk; Newcomer, Sophia R; Barrow, Jennifer C; Daley, Matthew F

    2017-12-06

    Measuring obesity prevalence across geographic areas should account for environmental and socioeconomic factors that contribute to spatial autocorrelation, the dependency of values in estimates across neighboring areas, to mitigate the bias in measures and the risk of type I errors in hypothesis testing. Dependency among observations across geographic areas violates statistical independence assumptions and may result in biased estimates. Empirical Bayes (EB) estimators reduce the variability of estimates with spatial autocorrelation, which limits the overall mean squared error and controls for sample bias. Using the Colorado Body Mass Index (BMI) Monitoring System, we modeled the spatial autocorrelation of adult (≥18 years old) obesity (BMI ≥ 30 kg/m²) measurements using patient-level electronic health record data from encounters between January 1, 2009, and December 31, 2011. Obesity prevalence was estimated for Denver County census tracts with ≥10 observations during the study period. We calculated the Moran's I statistic to test for spatial autocorrelation across census tracts, and mapped crude and EB obesity prevalence across geographic areas. In Denver County, there were 143 census tracts with 10 or more observations, representing a total of 97,710 adults with a valid BMI. The crude obesity prevalence for adults in Denver County was 29.8 percent (95% CI 28.4-31.1%) and ranged from 12.8 to 45.2 percent across individual census tracts. EB obesity prevalence was 30.2 percent (95% CI 28.9-31.5%) and ranged from 15.3 to 44.3 percent across census tracts. Statistical tests using the Moran's I statistic suggest adult obesity prevalence in Denver County was distributed in a non-random pattern. Clusters of EB obesity estimates were highly significant (α = 0.05) in neighboring census tracts. Concentrations of obesity estimates were primarily in the west and north of Denver County. Statistical tests reveal that adult obesity prevalence exhibits spatial autocorrelation in Denver County at the census tract level. EB estimates of obesity prevalence can be used to control for spatial autocorrelation between neighboring census tracts and may produce less biased estimates of obesity prevalence.
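    The two computations named above can be illustrated generically: a simple global empirical Bayes (method-of-moments) shrinkage of area-level prevalence toward the overall mean, and a Moran's I statistic for spatial autocorrelation given a binary neighbor matrix. The tiny example, the weight scheme and the function names are assumptions for illustration; the study's actual estimator and spatial weights may differ.

    ```python
    import numpy as np

    def eb_shrink(events, counts):
        """Global empirical Bayes (method-of-moments) shrinkage of area-level prevalence."""
        p = events / counts
        p_global = events.sum() / counts.sum()
        # Between-area variance component (moment estimate), floored at zero
        s2 = np.average((p - p_global) ** 2, weights=counts)
        a = max(s2 - p_global * (1 - p_global) / counts.mean(), 0.0)
        shrink = a / (a + p_global * (1 - p_global) / counts)   # weight on the local rate
        return shrink * p + (1 - shrink) * p_global

    def morans_i(x, w):
        """Moran's I for values x and a binary spatial weight matrix w."""
        z = x - x.mean()
        return len(x) / w.sum() * (z @ w @ z) / (z @ z)

    # Toy example: 4 areas on a line, each adjacent to its neighbors
    events = np.array([30, 40, 15, 12])
    counts = np.array([100, 120, 90, 80])
    w = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    p_eb = eb_shrink(events, counts)
    print(p_eb, morans_i(p_eb, w))
    ```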

  13. Impact of Communication Errors in Radiology on Patient Care, Customer Satisfaction, and Work-Flow Efficiency.

    PubMed

    Siewert, Bettina; Brook, Olga R; Hochman, Mary; Eisenberg, Ronald L

    2016-03-01

    The purpose of this study is to analyze the impact of communication errors on patient care, customer satisfaction, and work-flow efficiency and to identify opportunities for quality improvement. We performed a search of our quality assurance database for communication errors submitted from August 1, 2004, through December 31, 2014. Cases were analyzed regarding the step in the imaging process at which the error occurred (i.e., ordering, scheduling, performance of examination, study interpretation, or result communication). The impact on patient care was graded on a 5-point scale from none (0) to catastrophic (4). The severity of impact between errors in result communication and those that occurred at all other steps was compared. Error evaluation was performed independently by two board-certified radiologists. Statistical analysis was performed using the chi-square test and kappa statistics. Three hundred eighty of 422 cases were included in the study. One hundred ninety-nine of the 380 communication errors (52.4%) occurred at steps other than result communication, including ordering (13.9%; n = 53), scheduling (4.7%; n = 18), performance of examination (30.0%; n = 114), and study interpretation (3.7%; n = 14). Result communication was the single most common step, accounting for 47.6% (181/380) of errors. There was no statistically significant difference in impact severity between errors that occurred during result communication and those that occurred at other times (p = 0.29). In 37.9% of cases (144/380), there was an impact on patient care, including 21 minor impacts (5.5%; result communication, n = 13; all other steps, n = 8), 34 moderate impacts (8.9%; result communication, n = 12; all other steps, n = 22), and 89 major impacts (23.4%; result communication, n = 45; all other steps, n = 44). In 62.1% (236/380) of cases, no impact was noted, but 52.6% (200/380) of cases had the potential for an impact. Among 380 communication errors in a radiology department, 37.9% had a direct impact on patient care, with an additional 52.6% having a potential impact. Most communication errors (52.4%) occurred at steps other than result communication, with similar severity of impact.

  14. The impact of statistical adjustment on conditional standard errors of measurement in the assessment of physician communication skills.

    PubMed

    Raymond, Mark R; Clauser, Brian E; Furman, Gail E

    2010-10-01

    The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary least squares regression to adjust ratings, and then used generalizability theory to evaluate the impact of these adjustments on score reliability and the overall standard error of measurement. In addition, conditional standard errors of measurement were computed for both observed and adjusted scores to determine whether the improvements in measurement precision were uniform across the score distribution. Results indicated that measurement was generally less precise for communication ratings toward the lower end of the score distribution; and the improvement in measurement precision afforded by statistical modeling varied slightly across the score distribution such that the most improvement occurred in the upper-middle range of the score scale. Possible reasons for these patterns in measurement precision are discussed, as are the limitations of the statistical models used for adjusting performance ratings.

  15. Robust Covariate-Adjusted Log-Rank Statistics and Corresponding Sample Size Formula for Recurrent Events Data

    PubMed Central

    Song, Rui; Kosorok, Michael R.; Cai, Jianwen

    2009-01-01

    Summary: Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring, and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. They reduce to the Kong and Slud (1997, Biometrika 84, 847–862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499–503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study. PMID:18162107
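    For context, the single-event special case mentioned above is Schoenfeld's classical sample size result, in which the required number of events depends only on the allocation proportion, the log hazard ratio and the error rates. The sketch below implements that standard formula, not the recurrent-events formula derived in the article.

    ```python
    from math import ceil, log
    from scipy.stats import norm

    def schoenfeld_events(log_hr, alloc=0.5, alpha=0.05, power=0.8):
        """Required number of events for a two-sided log-rank test (single-event case).

        d = (z_{1-alpha/2} + z_{power})^2 / (alloc * (1 - alloc) * log_hr^2)
        """
        z_a = norm.ppf(1 - alpha / 2)
        z_b = norm.ppf(power)
        return ceil((z_a + z_b) ** 2 / (alloc * (1 - alloc) * log_hr ** 2))

    # e.g. hazard ratio 0.7, equal allocation, 80% power at two-sided alpha = 0.05
    print(schoenfeld_events(log(0.7)))   # about 247 events
    ```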

  16. Statistical downscaling of precipitation using long short-term memory recurrent neural networks

    NASA Astrophysics Data System (ADS)

    Misra, Saptarshi; Sarkar, Sudeshna; Mitra, Pabitra

    2017-11-01

    Hydrological impacts of global climate change on regional scale are generally assessed by downscaling large-scale climatic variables, simulated by General Circulation Models (GCMs), to regional, small-scale hydrometeorological variables like precipitation, temperature, etc. In this study, we propose a new statistical downscaling model based on Recurrent Neural Network with Long Short-Term Memory which captures the spatio-temporal dependencies in local rainfall. The previous studies have used several other methods such as linear regression, quantile regression, kernel regression, beta regression, and artificial neural networks. Deep neural networks and recurrent neural networks have been shown to be highly promising in modeling complex and highly non-linear relationships between input and output variables in different domains and hence we investigated their performance in the task of statistical downscaling. We have tested this model on two datasets—one on precipitation in Mahanadi basin in India and the second on precipitation in Campbell River basin in Canada. Our autoencoder coupled long short-term memory recurrent neural network model performs the best compared to other existing methods on both the datasets with respect to temporal cross-correlation, mean squared error, and capturing the extremes.
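    A bare-bones sketch of the kind of sequence model described above: an LSTM that maps a window of large-scale predictor values to local precipitation. Layer sizes, window length and the random training data are placeholders, and the autoencoder stage the authors couple to their network is omitted.

    ```python
    import torch
    import torch.nn as nn

    class DownscalingLSTM(nn.Module):
        def __init__(self, n_predictors, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(n_predictors, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)   # local precipitation estimate

        def forward(self, x):                  # x: (batch, time, n_predictors)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])    # predict from the last hidden state

    # Placeholder training loop on random data standing in for large-scale predictors
    n_predictors, window = 20, 30
    model = DownscalingLSTM(n_predictors)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    x = torch.randn(256, window, n_predictors)   # GCM/reanalysis fields (standardized)
    y = torch.rand(256, 1)                        # local rainfall (normalized)
    for _ in range(5):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(float(loss))
    ```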

  17. Examining Impulse-Variability Theory and the Speed-Accuracy Trade-Off in Children's Overarm Throwing Performance.

    PubMed

    Molina, Sergio L; Stodden, David F

    2018-04-01

    This study examined variability in throwing speed and spatial error to test the prediction of an inverted-U function (i.e., impulse-variability [IV] theory) and the speed-accuracy trade-off. Forty-five 9- to 11-year-old children were instructed to throw at a specified percentage of maximum speed (45%, 65%, 85%, and 100%) and hit the wall target. Results indicated no statistically significant differences in variable error across the target conditions (p = .72), failing to support the inverted-U hypothesis. Spatial accuracy results indicated no statistically significant differences with mean radial error (p = .18), centroid radial error (p = .13), and bivariate variable error (p = .08) also failing to support the speed-accuracy trade-off in overarm throwing. As neither throwing performance variability nor accuracy changed across percentages of maximum speed in this sample of children as well as in a previous adult sample, current policy and practices of practitioners may need to be reevaluated.

  18. Medication errors of nurses and factors in refusal to report medication errors among nurses in a teaching medical center of iran in 2012.

    PubMed

    Mostafaei, Davoud; Barati Marnani, Ahmad; Mosavi Esfahani, Haleh; Estebsari, Fatemeh; Shahzaidi, Shiva; Jamshidi, Ensiyeh; Aghamiri, Seyed Samad

    2014-10-01

    About one third of reported unwanted medication consequences are due to medication errors, resulting in one fifth of hospital injuries. The aim of this study was to determine the formal and informal medication errors of nurses and the level of importance of factors behind nurses' refusal to report medication errors. The cross-sectional study was done on the nursing staff of Shohada Tajrish Hospital, Tehran, Iran in 2012. The data were gathered through a questionnaire made by the researchers. The questionnaire's face and content validity was confirmed by experts, and test-retest was used to measure its reliability. The data were analyzed by descriptive statistics. We used SPSS for the related statistical analyses. The most important factors in refusal to report medication errors were, respectively: the lack of a medication error recording and reporting system in the hospital (3.3%), non-significant error reporting to hospital authorities and the lack of appropriate feedback (3.1%), and the lack of a clear definition for a medication error (3%). There was both formal and informal reporting of medication errors in this study. Factors pertaining to hospital management, as well as fear of the consequences of reporting, are the two broad categories of factors that keep nurses from reporting their medication errors. In this regard, providing enough education to nurses, boosting nurses' job security, management support, and revising related processes and definitions are some factors that can help decrease medication errors and increase their reporting when they occur.

  19. Across-cohort QC analyses of GWAS summary statistics from complex traits.

    PubMed

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2016-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics FST statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.
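    The paper defines its own overlap statistic from reported effect sizes and standard errors; as a loosely related, hedged illustration only, a common quick check with the same flavor is to correlate the z-scores of approximately independent, non-associated SNPs between two cohorts, since appreciable correlation at null SNPs suggests shared samples. The sketch below uses simulated summary statistics and is not the statistic proposed in the paper.

    ```python
    import numpy as np
    from scipy import stats

    def overlap_proxy(beta1, se1, beta2, se2):
        """Correlate z-scores of (approximately independent, null) SNPs between two cohorts.

        With no sample overlap and no shared true signal, the expected correlation is ~0;
        a clearly positive correlation suggests shared samples. Generic proxy only, not
        the statistic proposed in the paper.
        """
        z1, z2 = beta1 / se1, beta2 / se2
        return stats.pearsonr(z1, z2)

    # Hypothetical summary statistics for 10,000 pruned null SNPs from two cohorts
    rng = np.random.default_rng(7)
    shared = rng.normal(size=10000)            # component induced by overlapping samples
    z1 = 0.3 * shared + rng.normal(size=10000)
    z2 = 0.3 * shared + rng.normal(size=10000)
    print(overlap_proxy(z1, np.ones(10000), z2, np.ones(10000)))
    ```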

  20. Across-cohort QC analyses of GWAS summary statistics from complex traits

    PubMed Central

    Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M

    2017-01-01

    Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information, and summary statistics and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy. PMID:27552965

  1. A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging

    PubMed Central

    Logsdon, Benjamin A.; Carty, Cara L.; Reiner, Alexander P.; Dai, James Y.; Kooperberg, Charles

    2012-01-01

    Motivation: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. Results: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate, reaching a stringent cutoff of marginal association in a larger cohort. Availability: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html. Contact: blogsdon@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22563072

  2. Static Scene Statistical Non-Uniformity Correction

    DTIC Science & Technology

    2015-03-01

    ... Error; NUC, Non-Uniformity Correction; RMSE, Root Mean Squared Error; RSD, Relative Standard Deviation; S3NUC, Static Scene Statistical Non-Uniformity ... Deviation (RSD), which normalizes the standard deviation, σ, to the mean estimated value, µ, using the equation RSD = (σ/µ) × 100. The RSD plot of the gain ... estimates is shown in Figure 4.1(b). The RSD plot shows that after a sample size of approximately 10, the different photocount values and the inclusion

  3. Sidereal variations deep underground in Tasmania

    NASA Technical Reports Server (NTRS)

    Humble, J. E.; Fenton, A. G.; Fenton, K. B.

    1985-01-01

    Data from the deep underground vertically directed muon telescopes at Poatina, Tasmania, have been used since 1972 for a number of investigations, including the daily intensity variations, atmospheric influences, and checking for possible effects due to the interplanetary magnetic field. These telescopes have a total sensitive area of only 3 square meters, with the result that the counting rate is low (about 1680 events per hour) and the statistical errors on the results are rather large. Consequently, it was decided several years ago to construct larger detectors for this station. The first of these telescopes has been in operation for two complete years, and the results from it are presented. Results from the new, more stable equipment at Poatina appear to confirm the existence of a first harmonic in the daily variations in sidereal time reported earlier, and are consistent with small or non-existent first harmonics in solar and anti-sidereal time. All the second harmonics appear to be small, if not zero at these energies.

  4. An analysis of the AVE-SESAME I period using statistical structure and correlation functions. [Atmospheric Variability Experiment-Severe Environmental Storm and Mesoscale Experiment

    NASA Technical Reports Server (NTRS)

    Fuelberg, H. E.; Meyer, P. J.

    1984-01-01

    Structure and correlation functions are used to describe atmospheric variability during the 10-11 April day of AVE-SESAME 1979 that coincided with the Red River Valley tornado outbreak. The special mesoscale rawinsonde data are employed in calculations involving temperature, geopotential height, horizontal wind speed and mixing ratio. Functional analyses are performed in both the lower and upper troposphere for the composite 24 h experiment period and at individual 3 h observation times. Results show that mesoscale features are prominent during the composite period. Fields of mixing ratio and horizontal wind speed exhibit the greatest amounts of small-scale variance, whereas temperature and geopotential height contain the least. Results for the nine individual times show that small-scale variance is greatest during the convective outbreak. The functions also are used to estimate random errors in the rawinsonde data. Finally, sensitivity analyses are presented to quantify confidence limits of the structure functions.
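    For reference, a structure function of the kind used above can be computed directly from station pairs: bin the squared differences of an observed quantity by pair separation and average within each bin; extrapolating the binned values toward zero separation then estimates roughly twice the random (observational) error variance. The sketch below uses made-up station coordinates and a made-up mixing-ratio field, not the AVE-SESAME data.

    ```python
    import numpy as np

    def structure_function(coords, values, bins):
        """D(r) = mean over station pairs with separation in each bin of (f_i - f_j)^2."""
        n = len(values)
        i, j = np.triu_indices(n, k=1)
        sep = np.linalg.norm(coords[i] - coords[j], axis=1)
        sq_diff = (values[i] - values[j]) ** 2
        which = np.digitize(sep, bins)
        d = np.array([sq_diff[which == k].mean() if np.any(which == k) else np.nan
                      for k in range(1, len(bins))])
        centers = 0.5 * (bins[:-1] + bins[1:])
        return centers, d

    # Hypothetical rawinsonde stations: coordinates in km, mixing ratio in g/kg
    rng = np.random.default_rng(3)
    coords = rng.uniform(0, 500, size=(40, 2))
    field = 8 + 0.004 * coords[:, 0]                     # smooth large-scale signal
    obs = field + rng.normal(scale=0.5, size=40)         # 0.5 g/kg random error
    r, D = structure_function(coords, obs, np.linspace(0, 500, 11))
    # The intercept of D(r) as r -> 0 approximates 2 * (random error variance) = 0.5 g^2/kg^2
    print(r, D)
    ```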

  5. Precision, Reliability, and Effect Size of Slope Variance in Latent Growth Curve Models: Implications for Statistical Power Analysis

    PubMed Central

    Brandmaier, Andreas M.; von Oertzen, Timo; Ghisletta, Paolo; Lindenberger, Ulman; Hertzog, Christopher

    2018-01-01

    Latent Growth Curve Models (LGCM) have become a standard technique to model change over time. Prediction and explanation of inter-individual differences in change are major goals in lifespan research. The major determinants of statistical power to detect individual differences in change are the magnitude of true inter-individual differences in linear change (LGCM slope variance), design precision, alpha level, and sample size. Here, we show that design precision can be expressed as the inverse of effective error. Effective error is determined by instrument reliability and the temporal arrangement of measurement occasions. However, it also depends on another central LGCM component, the variance of the latent intercept and its covariance with the latent slope. We derive a new reliability index for LGCM slope variance—effective curve reliability (ECR)—by scaling slope variance against effective error. ECR is interpretable as a standardized effect size index. We demonstrate how effective error, ECR, and statistical power for a likelihood ratio test of zero slope variance formally relate to each other and how they function as indices of statistical power. We also provide a computational approach to derive ECR for arbitrary intercept-slope covariance. With practical use cases, we argue for the complementary utility of the proposed indices of a study's sensitivity to detect slope variance when making a priori longitudinal design decisions or communicating study designs. PMID:29755377

  6. Analysis of the Hessian for Inverse Scattering Problems. Part 3. Inverse Medium Scattering of Electromagnetic Waves in Three Dimensions

    DTIC Science & Technology

    2012-08-01

    ... An implication of the compactness of the Hessian is that, for small data noise and model error, the discrete Hessian can be approximated by a low-rank matrix. This in turn enables fast solution of an appropriately ... probability distribution is given by the inverse of the Hessian of the negative log likelihood function. For Gaussian data noise and model error, this ...

  7. Automated Hypothesis Tests and Standard Errors for Nonstandard Problems with Description of Computer Package: A Draft.

    ERIC Educational Resources Information Center

    Lord, Frederic M.; Stocking, Martha

    A general computer program is described that will compute asymptotic standard errors and carry out significance tests for an endless variety of (standard and) nonstandard large-sample statistical problems, without requiring the statistician to derive asymptotic standard error formulas. The program assumes that the observations have a multinormal…
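    The abstract does not spell out the algorithm, but the standard way to automate asymptotic standard errors for an arbitrary smooth statistic is the delta method with a numerically differentiated gradient, SE(g(theta_hat)) ≈ sqrt(grad' Σ grad). The sketch below illustrates that generic idea and is not a reconstruction of Lord and Stocking's program; the example statistic and covariance matrix are hypothetical.

    ```python
    import numpy as np

    def numerical_delta_se(g, theta_hat, cov_theta, eps=1e-5):
        """Asymptotic SE of g(theta_hat) via the delta method with a finite-difference gradient.

        g         : scalar function of the parameter vector (any smooth statistic)
        theta_hat : estimated parameter vector
        cov_theta : estimated sampling covariance matrix of theta_hat
        """
        k = len(theta_hat)
        grad = np.zeros(k)
        for i in range(k):
            step = np.zeros(k); step[i] = eps
            grad[i] = (g(theta_hat + step) - g(theta_hat - step)) / (2 * eps)
        return np.sqrt(grad @ cov_theta @ grad)

    # Example: SE of a correlation-like ratio g = cov / sqrt(var1 * var2)
    theta = np.array([1.2, 0.9, 0.5])                  # var1, var2, cov (hypothetical)
    covm = np.diag([0.02, 0.015, 0.01])                # hypothetical sampling covariance
    g = lambda t: t[2] / np.sqrt(t[0] * t[1])
    print(numerical_delta_se(g, theta, covm))
    ```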

  8. Integrating models that depend on variable data

    NASA Astrophysics Data System (ADS)

    Banks, A. T.; Hill, M. C.

    2016-12-01

    Models of human-Earth systems are often developed with the goal of predicting the behavior of one or more dependent variables from multiple independent variables, processes, and parameters. Often dependent variable values range over many orders of magnitude, which complicates evaluation of the fit of the dependent variable values to observations. Many metrics and optimization methods have been proposed to address dependent variable variability, with little consensus being achieved. In this work, we evaluate two such methods: log transformation (based on the dependent variable being log-normally distributed with a constant variance) and error-based weighting (based on a multi-normal distribution with variances that tend to increase as the dependent variable value increases). Error-based weighting has the advantage of encouraging model users to carefully consider data errors, such as measurement and epistemic errors, while log-transformations can be a black box for typical users. Placing the log-transformation into the statistical perspective of error-based weighting has not formerly been considered, to the best of our knowledge. To make the evaluation as clear and reproducible as possible, we use multiple linear regression (MLR). Simulations are conducted with MATLAB. The example represents stream transport of nitrogen with up to eight independent variables. The single dependent variable in our example has values that range over 4 orders of magnitude. Results are applicable to any problem for which individual or multiple data types produce a large range of dependent variable values. For this problem, the log transformation produced good model fit, while some formulations of error-based weighting worked poorly. Results support previous suggestions that error-based weighting derived from a constant coefficient of variation overemphasizes low values and degrades model fit to high values. Applying larger weights to the high values is inconsistent with the log-transformation. Greater consistency is obtained by imposing smaller (by up to a factor of 1/35) weights on the smaller dependent-variable values. From an error-based perspective, the small weights are consistent with large standard deviations. This work considers the consequences of these two common ways of addressing variable data.
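    To make the two options being compared concrete, the sketch below fits the same synthetic multiple linear regression two ways: ordinary least squares on the log-transformed response, and error-based weighted least squares with weights derived from an assumed constant coefficient of variation. The data, coefficient values and cv are illustrative assumptions, not the nitrogen-transport example.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    n = 300
    x1 = 10 ** rng.uniform(0, 4, n)            # predictor spanning four orders of magnitude
    x2 = rng.uniform(0, 1, n)
    X = np.column_stack([np.ones(n), x1, x2])
    beta_true = np.array([2.0, 0.9, 50.0])
    mu = X @ beta_true
    cv = 0.25                                  # assumed constant coefficient of variation
    y = mu * np.exp(cv * rng.standard_normal(n))   # multiplicative errors, sd(y) ~ cv * mu

    # Option 1: ordinary least squares on the log-transformed response
    # (treats errors as log-normal with constant log-scale variance)
    beta_log, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

    # Option 2: error-based weighting of the untransformed response,
    # weights w_i = 1 / sd_i^2 with sd_i = cv * y_i (constant-CV error model)
    w = 1.0 / (cv * y) ** 2
    sw = np.sqrt(w)
    beta_wls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

    print(beta_wls)   # approximately recovers beta_true; beta_log lives on the log scale
    ```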

  9. Center-to-Limb Variation of Deprojection Errors in SDO/HMI Vector Magnetograms

    NASA Astrophysics Data System (ADS)

    Falconer, David; Moore, Ronald; Barghouty, Nasser; Tiwari, Sanjiv K.; Khazanov, Igor

    2015-04-01

    For use in investigating the magnetic causes of coronal heating in active regions and for use in forecasting an active region's productivity of major CME/flare eruptions, we have evaluated various sunspot-active-region magnetic measures (e.g., total magnetic flux, free-magnetic-energy proxies, magnetic twist measures) from HMI Active Region Patches (HARPs) after the HARP has been deprojected to disk center. From a few tens of thousands of HARP vector magnetograms (of a few hundred sunspot active regions) that have been deprojected to disk center, we have determined that the errors in the whole-HARP magnetic measures from deprojection are negligibly small for HARPs deprojected from distances out to 45 heliocentric degrees. For some purposes the errors from deprojection are tolerable out to 60 degrees. We obtained this result by the following process. For each whole-HARP magnetic measure: 1) for each HARP disk passage, normalize the measured values by the measured value for that HARP at central meridian; 2) then for each 0.05 Rs annulus, average the values from all the HARPs in the annulus. This results in an average normalized value as a function of radius for each measure. Assuming no deprojection errors and that, among a large set of HARPs, the measure is as likely to decrease as to increase with HARP distance from disk center, the average of each annulus is expected to be unity, and, for a statistically large sample, the amount of deviation of the average from unity estimates the error from deprojection effects. The deprojection errors arise from 1) errors in the transverse field being deprojected into the vertical field for HARPs observed at large distances from disk center, 2) increasingly larger foreshortening at larger distances from disk center, and 3) possible errors in transverse-field-direction ambiguity resolution. From the compiled set of measured values of whole-HARP magnetic nonpotentiality parameters measured from deprojected HARPs, we have examined the relation between each nonpotentiality parameter and the speed of CMEs from the measured active regions. For several different nonpotentiality parameters we find there is an upper limit to the CME speed, the limit increasing as the value of the parameter increases.
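    The estimation procedure described above reduces to a few lines of table manipulation: normalize each HARP's measure by its central-meridian value, bin the normalized values into annuli of projected distance, and read the deviation of each annulus average from unity as the deprojection error at that distance. The sketch below assumes a flat table of (HARP id, projected distance, measure) records with hypothetical column names.

    ```python
    import numpy as np
    import pandas as pd

    def deprojection_error_profile(df, annulus_width=0.05):
        """df columns (hypothetical): 'harp', 'dist_rs' (projected distance from disk center),
        'measure' (e.g. total flux or a free-energy proxy) for each deprojected magnetogram."""
        out = df.copy()
        # 1) Normalize each HARP's measure by its value nearest central meridian
        cm_value = out.loc[out.groupby("harp")["dist_rs"].idxmin(), ["harp", "measure"]] \
                      .set_index("harp")["measure"]
        out["norm"] = out["measure"] / out["harp"].map(cm_value)
        # 2) Average normalized values in annuli of projected distance
        out["annulus"] = (out["dist_rs"] // annulus_width) * annulus_width
        profile = out.groupby("annulus")["norm"].mean()
        # 3) Deviation from unity estimates the deprojection error at that distance
        return profile - 1.0

    # Tiny synthetic usage
    df = pd.DataFrame({
        "harp": [1, 1, 1, 2, 2, 2],
        "dist_rs": [0.02, 0.30, 0.61, 0.05, 0.33, 0.58],
        "measure": [10.0, 9.8, 8.9, 5.0, 5.1, 4.6],
    })
    print(deprojection_error_profile(df, annulus_width=0.1))
    ```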

  10. Generalized Background Error covariance matrix model (GEN_BE v2.0)

    NASA Astrophysics Data System (ADS)

    Descombes, G.; Auligné, T.; Vandenberghe, F.; Barker, D. M.

    2014-07-01

    The specification of state background error statistics is a key component of data assimilation since it affects the impact observations will have on the analysis. In the variational data assimilation approach, applied in geophysical sciences, the dimensions of the background error covariance matrix (B) are usually too large for it to be explicitly determined, and B needs to be modeled. Recent efforts to include new variables in the analysis, such as cloud parameters and chemical species, have required the development of the code to GENerate the Background Errors (GEN_BE) version 2.0 for the Weather Research and Forecasting (WRF) community model to allow for a simpler, flexible, robust, and community-oriented framework that gathers methods used by meteorological operational centers and researchers. We present the advantages of this new design for the data assimilation community by performing benchmarks and showing some of the new features on data assimilation test cases. As data assimilation for clouds remains a challenge, we present a multivariate approach that includes hydrometeors in the control variables and new correlated errors. In addition, the GEN_BE v2.0 code is employed to diagnose error parameter statistics for chemical species, which shows that it is a tool flexible enough to incorporate new control variables. While the background error statistics generation code was first developed for atmospheric research, the new version (GEN_BE v2.0) can easily be extended to other domains of science and be chosen as a testbed for diagnostics and new modeling of B. Initially developed for variational data assimilation, the model of the B matrix may be useful for variational ensemble hybrid methods as well.

  11. Generalized background error covariance matrix model (GEN_BE v2.0)

    NASA Astrophysics Data System (ADS)

    Descombes, G.; Auligné, T.; Vandenberghe, F.; Barker, D. M.; Barré, J.

    2015-03-01

    The specification of state background error statistics is a key component of data assimilation since it affects the impact observations will have on the analysis. In the variational data assimilation approach, applied in geophysical sciences, the dimensions of the background error covariance matrix (B) are usually too large to be explicitly determined and B needs to be modeled. Recent efforts to include new variables in the analysis such as cloud parameters and chemical species have required the development of the code to GENerate the Background Errors (GEN_BE) version 2.0 for the Weather Research and Forecasting (WRF) community model. GEN_BE allows for a simpler, flexible, robust, and community-oriented framework that gathers methods used by some meteorological operational centers and researchers. We present the advantages of this new design for the data assimilation community by performing benchmarks of different modeling of B and showing some of the new features in data assimilation test cases. As data assimilation for clouds remains a challenge, we present a multivariate approach that includes hydrometeors in the control variables and new correlated errors. In addition, the GEN_BE v2.0 code is employed to diagnose error parameter statistics for chemical species, which shows that it is a tool flexible enough to implement new control variables. While the generation of the background errors statistics code was first developed for atmospheric research, the new version (GEN_BE v2.0) can be easily applied to other domains of science and chosen to diagnose and model B. Initially developed for variational data assimilation, the model of the B matrix may be useful for variational ensemble hybrid methods as well.

  12. Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood

    PubMed Central

    Bondell, Howard D.; Stefanski, Leonard A.

    2013-01-01

    Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805

  13. Hybrid Gibbs Sampling and MCMC for CMB Analysis at Small Angular Scales

    NASA Technical Reports Server (NTRS)

    Jewell, Jeffrey B.; Eriksen, H. K.; Wandelt, B. D.; Gorski, K. M.; Huey, G.; O'Dwyer, I. J.; Dickinson, C.; Banday, A. J.; Lawrence, C. R.

    2008-01-01

    A) Gibbs sampling has now been validated as an efficient, statistically exact, and practically useful method for "low-L" analysis (as demonstrated on WMAP temperature and polarization data). B) We are extending Gibbs sampling to directly propagate uncertainties in both foreground and instrument models to the total uncertainty in cosmological parameters for the entire range of angular scales relevant for Planck. C) This is made possible by the inclusion of foreground model parameters in Gibbs sampling and by hybrid MCMC and Gibbs sampling for the low signal-to-noise (high-L) regime. D) Future items to be included in the Bayesian framework include: 1) integration with the hybrid likelihood (or posterior) code for cosmological parameters; 2) inclusion of other uncertainties in instrumental systematics (e.g., beam uncertainties, noise estimation, calibration errors, others).

  14. 14 CFR 298.63 - Reporting of aircraft operating expenses and related statistics by small certificated air carriers.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... and related statistics by small certificated air carriers. 298.63 Section 298.63 Aeronautics and Space... aircraft operating expenses and related statistics by small certificated air carriers. (a) Each small... Related Statistics.” This schedule shall be filed quarterly as prescribed in § 298.60. Data reported on...

  15. 14 CFR 298.63 - Reporting of aircraft operating expenses and related statistics by small certificated air carriers.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... and related statistics by small certificated air carriers. 298.63 Section 298.63 Aeronautics and Space... aircraft operating expenses and related statistics by small certificated air carriers. (a) Each small... Related Statistics.” This schedule shall be filed quarterly as prescribed in § 298.60. Data reported on...

  16. An Error-Entropy Minimization Algorithm for Tracking Control of Nonlinear Stochastic Systems with Non-Gaussian Variables

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Yunlong; Wang, Aiping; Guo, Lei

    This paper presents an error-entropy minimization tracking control algorithm for a class of dynamic stochastic systems. The system is represented by a set of time-varying discrete nonlinear equations with non-Gaussian stochastic input, where the statistical properties of the input are unknown. Using Parzen windowing with a Gaussian kernel to estimate the probability density of the errors, recursive algorithms are proposed to design the controller such that the tracking error is minimized. The performance of the error-entropy minimization criterion is compared with that of mean-square-error minimization in the simulation results.
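
    The density-estimation step can be sketched in a few lines: the snippet below forms a Parzen (Gaussian-kernel) density estimate of a batch of tracking errors and a plug-in estimate of their entropy, the quantity an error-entropy controller would seek to reduce. The error samples and the use of Shannon entropy here are illustrative assumptions; the paper's recursive controller design is not reproduced.

        import numpy as np
        from scipy.stats import gaussian_kde

        # Illustrative tracking errors; in practice these come from the closed loop.
        rng = np.random.default_rng(2)
        errors = rng.standard_t(df=3, size=500)  # heavy-tailed, non-Gaussian sample

        # Parzen-window (Gaussian-kernel) estimate of the error density and a
        # plug-in estimate of its Shannon entropy, H = -E[log p(e)].
        kde = gaussian_kde(errors)
        entropy_hat = -np.mean(np.log(kde(errors)))

        mse = np.mean(errors ** 2)
        print(f"estimated error entropy: {entropy_hat:.3f}, mean-square error: {mse:.3f}")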

  17. Implementation of an experimental program to investigate the performance characteristics of OMEGA navigation

    NASA Technical Reports Server (NTRS)

    Baxa, E. G., Jr.

    1974-01-01

    A theoretical formulation of differential and composite OMEGA error is presented to establish hypotheses about the functional relationships between various parameters and OMEGA navigational errors. Computer software developed to provide for extensive statistical analysis of the phase data is described. Results from the regression analysis used to conduct parameter sensitivity studies on differential OMEGA error tend to validate the theoretically based hypothesis concerning the relationship between uncorrected differential OMEGA error and receiver separation range and azimuth. Limited results of measurement of receiver repeatability error and line of position measurement error are also presented.

  18. Bootstrap Methods: A Very Leisurely Look.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Winstead, Wayland H.

    The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
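
    The cited illustration uses a SAS routine; an equivalent minimal sketch of the resampling loop behind a bootstrap standard error, written here in Python with made-up data, is:

        import numpy as np

        def bootstrap_se(data, statistic, n_boot=2000, seed=0):
            """Bootstrap standard error of an arbitrary statistic."""
            rng = np.random.default_rng(seed)
            reps = np.empty(n_boot)
            for b in range(n_boot):
                reps[b] = statistic(rng.choice(data, size=len(data), replace=True))
            return reps.std(ddof=1)

        data = np.array([2.1, 3.4, 1.9, 4.2, 2.8, 3.1, 2.5, 3.9, 2.2, 3.6])
        print("bootstrap SE of the median:", bootstrap_se(data, np.median))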

  19. The crossing statistic: dealing with unknown errors in the dispersion of Type Ia supernovae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shafieloo, Arman; Clifton, Timothy; Ferreira, Pedro, E-mail: arman@ewha.ac.kr, E-mail: tclifton@astro.ox.ac.uk, E-mail: p.ferreira1@physics.ox.ac.uk

    2011-08-01

    We propose a new statistic that has been designed to be used in situations where the intrinsic dispersion of a data set is not well known: The Crossing Statistic. This statistic is in general less sensitive than χ² to the intrinsic dispersion of the data, and hence allows us to make progress in distinguishing between different models using goodness of fit to the data even when the errors involved are poorly understood. The proposed statistic makes use of the shape and trends of a model's predictions in a quantifiable manner. It is applicable to a variety of circumstances, although we consider it to be especially well suited to the task of distinguishing between different cosmological models using type Ia supernovae. We show that this statistic can easily distinguish between different models in cases where the χ² statistic fails. We also show that the last mode of the Crossing Statistic is identical to χ², so that it can be considered as a generalization of χ².

  20. Robust Statistical Fusion of Image Labels

    PubMed Central

    Landman, Bennett A.; Asman, Andrew J.; Scoggins, Andrew G.; Bogovic, John A.; Xing, Fangxu; Prince, Jerry L.

    2011-01-01

    Image labeling and parcellation (i.e. assigning structure to a collection of voxels) are critical tasks for the assessment of volumetric and morphometric features in medical imaging data. The process of image labeling is inherently error prone as images are corrupted by noise and artifacts. Even expert interpretations are subject to subjectivity and the precision of the individual raters. Hence, all labels must be considered imperfect with some degree of inherent variability. One may seek multiple independent assessments to both reduce this variability and quantify the degree of uncertainty. Existing techniques have exploited maximum a posteriori statistics to combine data from multiple raters and simultaneously estimate rater reliabilities. Although quite successful, wide-scale application has been hampered by unstable estimation with practical datasets, for example, with label sets with small or thin objects to be labeled or with partial or limited datasets. As well, these approaches have required each rater to generate a complete dataset, which is often impossible given both human foibles and the typical turnover rate of raters in a research or clinical environment. Herein, we propose a robust approach to improve estimation performance with small anatomical structures, allow for missing data, account for repeated label sets, and utilize training/catch trial data. With this approach, numerous raters can label small, overlapping portions of a large dataset, and rater heterogeneity can be robustly controlled while simultaneously estimating a single, reliable label set and characterizing uncertainty. The proposed approach enables many individuals to collaborate in the construction of large datasets for labeling tasks (e.g., human parallel processing) and reduces the otherwise detrimental impact of rater unavailability. PMID:22010145

  1. Patient safety in the clinical laboratory: a longitudinal analysis of specimen identification errors.

    PubMed

    Wagar, Elizabeth A; Tamashiro, Lorraine; Yasin, Bushra; Hilborne, Lee; Bruckner, David A

    2006-11-01

    Patient safety is an increasingly visible and important mission for clinical laboratories. Accreditation and regulatory organizations are paying close attention to improving processes related to patient identification and specimen labeling, because errors in these areas that jeopardize patient safety are common and avoidable through improvement of the total testing process. The objective was to assess improvement in patient identification and specimen labeling after multiple implementation projects, using longitudinal statistical tools. Specimen errors were categorized by a multidisciplinary health care team. Patient identification errors were grouped into 3 categories: (1) specimen/requisition mismatch, (2) unlabeled specimens, and (3) mislabeled specimens. Specimens with these types of identification errors were compared before and after implementation of 3 patient safety projects: (1) reorganization of phlebotomy (4 months); (2) introduction of an electronic event reporting system (10 months); and (3) activation of an automated processing system (14 months), over a 24-month period, using trend analysis and Student t test statistics. Of 16,632 total specimen errors, mislabeled specimens, requisition mismatches, and unlabeled specimens represented 1.0%, 6.3%, and 4.6% of errors, respectively. The Student t test showed a significant decrease in the most serious error type, mislabeled specimens (P < .001), compared with the period before implementation of the 3 patient safety projects. Trend analysis demonstrated decreases in all 3 error types over 26 months. Applying performance-improvement strategies that focus longitudinally on specimen labeling errors can significantly reduce errors and thereby improve patient safety. This is an important area in which laboratory professionals, working in interdisciplinary teams, can improve safety and outcomes of care.
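
    For before-and-after comparisons of error proportions of this kind, a simple two-proportion test is one natural formulation (the study itself applied Student t tests and trend analysis). The counts below are invented for illustration and do not reproduce the study's data:

        from statsmodels.stats.proportion import proportions_ztest

        # Hypothetical counts of mislabeled specimens out of all specimens received,
        # before and after the interventions (numbers are invented for illustration).
        mislabeled = [90, 40]              # before, after
        total_specimens = [400000, 420000]

        z_stat, p_value = proportions_ztest(count=mislabeled, nobs=total_specimens)
        print(f"z = {z_stat:.2f}, p = {p_value:.4f}")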

  2. Are volcanic seismic b-values high, and if so when?

    NASA Astrophysics Data System (ADS)

    Roberts, Nick S.; Bell, Andrew F.; Main, Ian G.

    2015-12-01

    The Gutenberg-Richter exponent b is a measure of the relative proportion of large and small earthquakes. It is commonly used to infer material properties such as heterogeneity, or mechanical properties such as the state of stress from earthquake populations. It is 'well known' that the b-value tends to be high or very high for volcanic earthquake populations relative to b = 1 for those of tectonic earthquakes, and that b varies significantly with time during periods of unrest. We first review the supporting evidence from 34 case studies, and identify weaknesses in this argument due predominantly to small sample size, the narrow bandwidth of magnitude scales available, variability in the methods used to assess the minimum or cutoff magnitude Mc, and to infer b. Informed by this, we use synthetic realisations to quantify the effect of choice of the cutoff magnitude on maximum likelihood estimates of b, and suggest a new work flow for this choice. We present the first quantitative estimate of the error in b introduced by uncertainties in estimating Mc, as a function of the number of events and the b-value itself. This error can significantly exceed the commonly-quoted statistical error in the estimated b-value, especially for the case that the underlying b-value is high. We apply the new methods to data sets from recent periods of unrest in El Hierro and Mount Etna. For El Hierro we confirm significantly high b-values of 1.5-2.5 prior to the 10 October 2011 eruption. For Mount Etna the b-values are indistinguishable from b = 1 within error, except during the flank eruptions at Mount Etna in 2001-2003, when 1.5 < b < 2.0. For the time period analysed, they are rarely lower than b = 1. Our results confirm that these volcano-tectonic earthquake populations can have systematically high b-values, especially when associated with eruptions. At other times they can be indistinguishable from those of tectonic earthquakes within the total error. The results have significant implications for operational forecasting informed by b-value variability, in particular in assessing the significance of b-value variations identified by sample sizes with fewer than 200 events above the completeness threshold.
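
    For context, the b-value above a completeness magnitude Mc is usually obtained by maximum likelihood; a compact sketch of the standard Aki (1965) estimator with the Shi and Bolt (1982) standard error is given below. This is the generic textbook form applied to a synthetic catalogue, not the modified workflow or error analysis proposed in the abstract:

        import numpy as np

        def b_value_ml(mags, mc, dm=0.0):
            """Aki (1965) maximum-likelihood b-value for magnitudes >= mc, with the
            Shi & Bolt (1982) standard error; dm is the magnitude binning width
            (0 for unbinned magnitudes)."""
            m = np.asarray(mags, dtype=float)
            m = m[m >= mc]
            n = m.size
            b = np.log10(np.e) / (m.mean() - (mc - dm / 2.0))
            se = 2.30 * b**2 * np.sqrt(np.sum((m - m.mean())**2) / (n * (n - 1)))
            return b, se, n

        # Synthetic, unbinned catalogue with a true b-value of 1.0.
        rng = np.random.default_rng(3)
        mags = 1.0 + rng.exponential(scale=np.log10(np.e), size=1000)
        print(b_value_ml(mags, mc=1.0))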

  3. Maximum Likelihood Time-of-Arrival Estimation of Optical Pulses via Photon-Counting Photodetectors

    NASA Technical Reports Server (NTRS)

    Erkmen, Baris I.; Moision, Bruce E.

    2010-01-01

    Many optical imaging, ranging, and communications systems rely on the estimation of the arrival time of an optical pulse. Recently, such systems have been increasingly employing photon-counting photodetector technology, which changes the statistics of the observed photocurrent. This requires time-of-arrival estimators to be developed and their performances characterized. The statistics of the output of an ideal photodetector, which are well modeled as a Poisson point process, were considered. An analytical model was developed for the mean-square error of the maximum likelihood (ML) estimator, demonstrating two phenomena that cause deviations from the minimum achievable error at low signal power. An approximation was derived to the threshold at which the ML estimator essentially fails to provide better than a random guess of the pulse arrival time. Comparing the analytic model performance predictions to those obtained via simulations, it was verified that the model accurately predicts the ML performance over all regimes considered. There is little prior art that attempts to understand the fundamental limitations to time-of-arrival estimation from Poisson statistics. This work establishes both a simple mathematical description of the error behavior, and the associated physical processes that yield this behavior. Previous work on mean-square error characterization for ML estimators has predominantly focused on additive Gaussian noise. This work demonstrates that the discrete nature of the Poisson noise process leads to a distinctly different error behavior.
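
    As a rough illustration of maximum-likelihood time-of-arrival estimation from Poisson photon counts, the sketch below simulates photon timestamps from a Gaussian pulse on a constant background and recovers the arrival time by a grid search over the Poisson log-likelihood. The pulse shape, rates, and grid search are assumptions made for illustration; the paper's analytical error model and threshold approximation are not reproduced:

        import numpy as np

        def ml_arrival_time(photon_times, pulse, lam_s, lam_b, tau_grid):
            """Grid-search ML estimate of the arrival time tau from photon timestamps,
            assuming Poisson arrivals with intensity lam_s * pulse(t - tau) + lam_b.
            The integral term of the log-likelihood is dropped because it is nearly
            constant in tau when the pulse lies well inside the observation window."""
            loglik = [np.sum(np.log(lam_s * pulse(photon_times - tau) + lam_b))
                      for tau in tau_grid]
            return tau_grid[int(np.argmax(loglik))]

        rng = np.random.default_rng(4)
        pulse = lambda t: np.exp(-0.5 * (t / 0.1) ** 2)   # unit-peak Gaussian pulse
        true_tau, lam_s, lam_b, window = 2.0, 50.0, 5.0, (0.0, 4.0)

        # Simulate the inhomogeneous Poisson process by thinning a homogeneous one.
        lam_max = lam_s + lam_b
        n_cand = rng.poisson(lam_max * (window[1] - window[0]))
        cand = rng.uniform(window[0], window[1], n_cand)
        accept = rng.uniform(0, lam_max, cand.size) < lam_s * pulse(cand - true_tau) + lam_b
        photons = cand[accept]

        tau_grid = np.linspace(window[0], window[1], 801)
        print("ML arrival-time estimate:",
              ml_arrival_time(photons, pulse, lam_s, lam_b, tau_grid))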

  4. Estimations of ABL fluxes and other turbulence parameters from Doppler lidar data

    NASA Technical Reports Server (NTRS)

    Gal-Chen, Tzvi; Xu, Mei; Eberhard, Wynn

    1989-01-01

    Techniques for extracting boundary layer parameters from measurements of a short-pulse CO2 Doppler lidar are described. The measurements are those collected during the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE). By continuously operating the lidar for about an hour, stable statistics of the radial velocities can be extracted. Assuming that the turbulence is horizontally homogeneous, the mean wind, its standard deviations, and the momentum fluxes were estimated. Spectral analysis of the radial velocities is also performed, from which, by examining the amplitude of the power spectrum in the inertial range, the kinetic energy dissipation was deduced. Finally, using the statistical form of the Navier-Stokes equations, the surface heat flux is derived as the residual balance between the vertical gradient of the third moment of the vertical velocity and the kinetic energy dissipation. Combining many measurements would normally reduce the error, provided that the errors are unbiased and uncorrelated. The nature of some of the algorithms, however, is such that biased and correlated errors may be generated even when the raw measurements are unbiased and uncorrelated. Data processing procedures were developed that eliminate bias and minimize error correlation. Once bias and error correlations are accounted for, the large sample size is shown to reduce the errors substantially. The principal features of the derived turbulence statistics for two case studies are presented.

  5. Model-based error diffusion for high fidelity lenticular screening.

    PubMed

    Lau, Daniel; Smith, Trebor

    2006-04-17

    Digital halftoning is the process of converting a continuous-tone image into an arrangement of black and white dots for binary display devices such as digital ink-jet and electrophotographic printers. As printers achieve print resolutions exceeding 1,200 dots per inch, it becomes increasingly important for halftoning algorithms to consider the variations and interactions in the size and shape of printed dots between neighboring pixels. In the case of lenticular screening, where statistically independent images are spatially multiplexed together, ignoring these variations and interactions, such as dot overlap, results in poor lenticular image quality. To this end, we describe our use of model-based error diffusion for the lenticular screening problem, in which statistical independence between component images is achieved by restricting the diffusion of error to pixels of the same component image; to avoid instabilities, the proposed approach also involves a novel error-clipping procedure.
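
    A minimal sketch of the restriction-plus-clipping idea (not the model-based algorithm of the paper) is shown below: two component images are interleaved column by column, and quantization error is diffused only to neighbours of the same component after being clipped to a fixed bound. The weights, image sizes, and clipping bound are illustrative assumptions:

        import numpy as np

        def lenticular_error_diffusion(img, n_comp=2, clip=0.5):
            """Binary halftone of a column-interleaved lenticular image. Quantization
            error is diffused only to neighbours belonging to the same component image
            (columns with the same index modulo n_comp) and is clipped to +/- clip to
            keep the restricted diffusion stable. Illustrative sketch only."""
            f = img.astype(float).copy()
            out = np.zeros_like(f)
            h, w = f.shape
            for y in range(h):
                for x in range(w):
                    out[y, x] = 1.0 if f[y, x] >= 0.5 else 0.0
                    err = np.clip(f[y, x] - out[y, x], -clip, clip)
                    if x + n_comp < w:
                        f[y, x + n_comp] += err * 7 / 16      # same row, same component
                    if y + 1 < h:
                        f[y + 1, x] += err * 5 / 16           # next row, same column
                        if x - n_comp >= 0:
                            f[y + 1, x - n_comp] += err * 2 / 16
                        if x + n_comp < w:
                            f[y + 1, x + n_comp] += err * 2 / 16
            return out

        # Two constant-gray component images interleaved column by column.
        a, b = np.full((64, 32), 0.3), np.full((64, 32), 0.7)
        interleaved = np.empty((64, 64))
        interleaved[:, 0::2], interleaved[:, 1::2] = a, b
        halftone = lenticular_error_diffusion(interleaved, n_comp=2)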

  6. Bayesian truncation errors in chiral effective field theory: model checking and accounting for correlations

    NASA Astrophysics Data System (ADS)

    Melendez, Jordan; Wesolowski, Sarah; Furnstahl, Dick

    2017-09-01

    Chiral effective field theory (EFT) predictions are necessarily truncated at some order in the EFT expansion, which induces an error that must be quantified for robust statistical comparisons to experiment. A Bayesian model yields posterior probability distribution functions for these errors based on expectations of naturalness encoded in Bayesian priors and the observed order-by-order convergence pattern of the EFT. As a general example of a statistical approach to truncation errors, the model was applied to chiral EFT for neutron-proton scattering using various semi-local potentials of Epelbaum, Krebs, and Meißner (EKM). Here we discuss how our model can learn correlation information from the data and how to perform Bayesian model checking to validate that the EFT is working as advertised. Supported in part by NSF PHY-1614460 and DOE NUCLEI SciDAC DE-SC0008533.

  7. Asteroid orbital error analysis: Theory and application

    NASA Technical Reports Server (NTRS)

    Muinonen, K.; Bowell, Edward

    1992-01-01

    We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation does give the same results as Fisherian estimation when no a priori information is assumed (Lehtinen 1988, and references therein).
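
    In the linearized Gaussian approximation described above, the law of error propagation reduces to a congruence transform of the covariance matrix, Cov(x) ≈ J Cov(e) J^T. The sketch below uses a placeholder Jacobian and made-up element covariances purely to show the mechanics:

        import numpy as np

        # Linearized propagation of orbital-element errors into position errors:
        # if x = f(e) with Jacobian J = dx/de, then Cov(x) ~= J Cov(e) J^T.
        # The numbers and the Jacobian below are placeholders, not a real orbit model.
        cov_elements = np.diag([1e-8, 4e-10, 2.5e-9, 1e-9, 1e-9, 9e-10])  # 6x6
        J = np.random.default_rng(5).normal(size=(3, 6))                  # d(position)/d(elements)
        cov_position = J @ cov_elements @ J.T

        # Semi-axes of the 1-sigma positional uncertainty ellipsoid.
        axes = np.sqrt(np.linalg.eigvalsh(cov_position))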

  8. Joint Seasonal ARMA Approach for Modeling of Load Forecast Errors in Planning Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hafen, Ryan P.; Samaan, Nader A.; Makarov, Yuri V.

    2014-04-14

    To make informed and robust decisions in the probabilistic power system operation and planning process, it is critical to conduct multiple simulations of the generated combinations of wind and load parameters and their forecast errors to handle the variability and uncertainty of these time series. In order for the simulation results to be trustworthy, the simulated series must preserve the salient statistical characteristics of the real series. In this paper, we analyze day-ahead load forecast error data from multiple balancing authority (BA) locations and characterize statistical properties such as mean, standard deviation, autocorrelation, correlation between series, time-of-day bias, and time-of-day autocorrelation. We then construct and validate a seasonal autoregressive moving average (ARMA) model to capture these characteristics, and use the model to jointly simulate day-ahead load forecast error series for all BAs.
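
    A seasonal ARMA fit of this kind can be sketched with statsmodels; the synthetic hourly error series, the daily seasonal period of 24, and the model orders below are illustrative assumptions, not the series or orders used in the paper:

        import numpy as np
        import statsmodels.api as sm

        # Synthetic hourly forecast-error series with a daily (24 h) seasonal pattern.
        rng = np.random.default_rng(6)
        n = 24 * 60
        e = np.zeros(n)
        for t in range(24, n):
            e[t] = 0.6 * e[t - 1] + 0.3 * e[t - 24] + rng.normal(scale=50.0)

        # Seasonal ARMA fit: ARMA(1, 1) plus a seasonal AR(1) term at lag 24.
        fit = sm.tsa.SARIMAX(e, order=(1, 0, 1), seasonal_order=(1, 0, 0, 24)).fit(disp=False)

        # Simulate a new week-long error series with the fitted characteristics.
        simulated = fit.simulate(nsimulations=24 * 7)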

  9. Analytic score distributions for a spatially continuous tridirectional Monte Carlo transport problem

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Booth, T.E.

    1996-01-01

    The interpretation of the statistical error estimates produced by Monte Carlo transport codes is still somewhat of an art. Empirically, there are variance reduction techniques whose error estimates are almost always reliable, and there are variance reduction techniques whose error estimates are often unreliable. Unreliable error estimates usually result from inadequate large-score sampling from the score distribution's tail. Statisticians believe that more accurate confidence interval statements are possible if the general nature of the score distribution can be characterized. Here, the analytic score distribution for the exponential transform applied to a simple, spatially continuous Monte Carlo transport problem is provided. Anisotropic scattering and implicit capture are included in the theory. In large part, the analytic score distributions that are derived provide the basis for the ten new statistical quality checks in MCNP.

  10. A multi-object statistical atlas adaptive for deformable registration errors in anomalous medical image segmentation

    NASA Astrophysics Data System (ADS)

    Botter Martins, Samuel; Vallin Spina, Thiago; Yasuda, Clarissa; Falcão, Alexandre X.

    2017-02-01

    Statistical atlases have played an important role in automated medical image segmentation. However, a challenge has been to make the atlas more adaptable to possible errors in deformable registration of anomalous images, given that the body structures of interest for segmentation might present significant differences in shape and texture. Recently, deformable registration errors have been accounted for by a method that locally translates the statistical atlas over the test image, after registration, and evaluates candidate objects from a delineation algorithm in order to choose the best one as the final segmentation. In this paper, we improve its delineation algorithm and extend the model to a multi-object statistical atlas, built from control images and adaptable to anomalous images, by incorporating a texture classifier. In order to provide a first proof of concept, we instantiate the new method for segmenting, object-by-object and all objects simultaneously, the left and right brain hemispheres and the cerebellum, without the brainstem, and evaluate it on MR-T1 images of epilepsy patients before and after brain surgery, which removed portions of the temporal lobe. The results show efficiency gains with statistically significant higher accuracy, using the mean Average Symmetric Surface Distance, with respect to the original approach.

  11. A model and variance reduction method for computing statistical outputs of stochastic elliptic partial differential equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vidal-Codina, F., E-mail: fvidal@mit.edu; Nguyen, N.C., E-mail: cuongng@mit.edu; Giles, M.B., E-mail: mike.giles@maths.ox.ac.uk

    We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.

  12. Binomial outcomes in dataset with some clusters of size two: can the dependence of twins be accounted for? A simulation study comparing the reliability of statistical methods based on a dataset of preterm infants.

    PubMed

    Sauzet, Odile; Peacock, Janet L

    2017-07-20

    The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins) and occasionally of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants, we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes, but very little is known about their reliability when only a limited number of small clusters are present. Using simulated data based on a dataset of preterm infants, we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models, and generalised estimating equations were compared. The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters, but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and then provides estimates similar to those of logistic regression. The method that seems to provide the best balance between estimation of the standard error and of the parameter for any percentage of twins is generalised estimating equations. This study has shown that the number of covariates or the level-two variance does not necessarily affect the performance of the various methods used to analyse datasets containing twins, but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
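
    A hedged sketch of the recommended approach, generalised estimating equations with an exchangeable working correlation fitted to simulated data containing a small fraction of twin clusters, is given below; all parameter values are invented for illustration:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        # Simulated data: mostly singletons plus ~10% twin clusters (size two),
        # binary outcome with a shared within-cluster effect. Values are invented.
        rng = np.random.default_rng(7)
        rows, cluster_id = [], 0
        for _ in range(300):
            size = 2 if rng.random() < 0.10 else 1
            u = rng.normal(scale=0.8)                 # shared cluster effect
            for _ in range(size):
                x = rng.normal()
                p = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x + u)))
                rows.append((cluster_id, x, rng.binomial(1, p)))
            cluster_id += 1
        df = pd.DataFrame(rows, columns=["cluster", "x", "y"])

        # GEE with an exchangeable working correlation accounts for the twin pairs.
        gee = sm.GEE.from_formula("y ~ x", groups="cluster", data=df,
                                  family=sm.families.Binomial(),
                                  cov_struct=sm.cov_struct.Exchangeable())
        print(gee.fit().summary())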

  13. A predictability study of Lorenz's 28-variable model as a dynamical system

    NASA Technical Reports Server (NTRS)

    Krishnamurthy, V.

    1993-01-01

    The dynamics of error growth in a two-layer nonlinear quasi-geostrophic model has been studied to gain an understanding of the mathematical theory of atmospheric predictability. The growth of random errors of varying initial magnitudes has been studied, and the relation between this classical approach and the concepts of the nonlinear dynamical systems theory has been explored. The local and global growths of random errors have been expressed partly in terms of the properties of an error ellipsoid and the Liapunov exponents determined by linear error dynamics. The local growth of small errors is initially governed by several modes of the evolving error ellipsoid but soon becomes dominated by the longest axis. The average global growth of small errors is exponential with a growth rate consistent with the largest Liapunov exponent. The duration of the exponential growth phase depends on the initial magnitude of the errors. The subsequent large errors undergo a nonlinear growth with a steadily decreasing growth rate and attain saturation that defines the limit of predictability. The degree of chaos and the largest Liapunov exponent show considerable variation with change in the forcing, which implies that the time variation in the external forcing can introduce variable character to the predictability.

  14. Usefulness of Guided Breathing for Dose Rate-Regulated Tracking

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Han-Oh, Sarah; Department of Radiation Oncology, University of Maryland Medical System, Baltimore, MD; Yi, Byong Yong

    2009-02-01

    Purpose: To evaluate the usefulness of guided breathing for dose rate-regulated tracking (DRRT), a new technique to compensate for intrafraction tumor motion. Methods and Materials: DRRT uses a preprogrammed multileaf collimator sequence that tracks the tumor motion derived from four-dimensional computed tomography and the corresponding breathing signals measured before treatment. Because the multileaf collimator speed can be controlled by adjusting the dose rate, the multileaf collimator positions are adjusted in real time during treatment by dose rate regulation, thereby maintaining synchrony with the tumor motion. DRRT treatment was simulated with free, audio-guided, and audiovisual-guided breathing signals acquired from 23 lung cancer patients. The tracking error and duty cycle for each patient were determined as a function of the system time delay (range, 0-1.0 s). Results: The tracking error and duty cycle averaged over all 23 patients were 1.9 ± 0.8 mm and 92% ± 5%, 1.9 ± 1.0 mm and 93% ± 6%, and 1.8 ± 0.7 mm and 92% ± 6% for the free, audio-guided, and audiovisual-guided breathing, respectively, for a time delay of 0.35 s. The small differences in both the tracking error and the duty cycle with guided breathing were not statistically significant. Conclusion: DRRT by its nature adapts well to variations in breathing frequency, which is also the motivation for guided-breathing techniques. Because of this redundancy, guided breathing does not result in significant improvements for either the tracking error or the duty cycle when DRRT is used for real-time tumor tracking.

  15. Validation of self-reported start year of mobile phone use in a Swedish case-control study on radiofrequency fields and acoustic neuroma risk.

    PubMed

    Pettersson, David; Bottai, Matteo; Mathiesen, Tiit; Prochazka, Michaela; Feychting, Maria

    2015-01-01

    The possible effect of radiofrequency exposure from mobile phones on tumor risk has been studied since the late 1990s. Yet, empirical information about recall of the start of mobile phone use among adult cases and controls has never been reported. Limited knowledge about recall errors hampers interpretations of the epidemiological evidence. We used network operator data to validate the self-reported start year of mobile phone use in a case-control study of mobile phone use and acoustic neuroma risk. The answers of 96 (29%) cases and 111 (22%) controls could be included in the validation. The larger proportion of cases reflects a more complete and detailed reporting of subscription history. Misclassification was substantial, with large random errors, small systematic errors, and no significant differences between cases and controls. The average difference between self-reported and operator start year was -0.62 (95% confidence interval: -1.42, 0.17) years for cases and -0.71 (-1.50, 0.07) years for controls, standard deviations were 3.92 and 4.17 years, respectively. Agreement between self-reported and operator-recorded data categorized into short, intermediate and long-term use was moderate (kappa statistic: 0.42). Should an association exist, dilution of risk estimates and distortion of exposure-response patterns for time since first mobile phone use could result from the large random errors in self-reported start year. Retrospective collection of operator data likely leads to a selection of "good reporters", with a higher proportion of cases. Thus, differential recall cannot be entirely excluded.

  16. Three-dimensional quantitative structure-activity relationship studies on novel series of benzotriazine based compounds acting as Src inhibitors using CoMFA and CoMSIA.

    PubMed

    Gueto, Carlos; Ruiz, José L; Torres, Juan E; Méndez, Jefferson; Vivas-Reyes, Ricardo

    2008-03-01

    Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on a series of benzotriazine derivatives acting as Src inhibitors. Ligand molecular superimposition on the template structure was performed by the database alignment method. A statistically significant model was established from 72 molecules and validated with a test set of six compounds. The CoMFA model yielded q² = 0.526, a non-cross-validated R² of 0.781, an F value of 88.132, a bootstrapped R² of 0.831, a standard error of prediction of 0.587, and a standard error of estimate of 0.351, while the CoMSIA model gave the best predictive model with q² = 0.647, a non-cross-validated R² of 0.895, an F value of 115.906, a bootstrapped R² of 0.953, a standard error of prediction of 0.519, and a standard error of estimate of 0.178. The contour maps obtained from the 3D-QSAR studies were appraised for activity trends across the molecules analyzed. The results indicate that small steric volumes in the hydrophobic region, electron-withdrawing groups next to the aryl linker region, and atoms close to the solvent-accessible region increase the Src inhibitory activity of the compounds. In fact, by adding substituents at positions 5, 6, and 8 of the benzotriazine nucleus, new compounds with higher predicted activity were generated. The data generated from the present study will further help in designing novel, potent, and selective Src inhibitors as anticancer therapeutic agents.

  17. Insight From the Statistics of Nothing: Estimating Limits of Change Detection Using Inferred No-Change Areas in DEM Difference Maps and Application to Landslide Hazard Studies

    NASA Astrophysics Data System (ADS)

    Haneberg, W. C.

    2017-12-01

    Remote characterization of new landslides or areas of ongoing movement using differences in high resolution digital elevation models (DEMs) created through time, for example before and after major rains or earthquakes, is an attractive proposition. In the case of large catastrophic landslides, changes may be apparent enough that simple subtraction suffices. In other cases, statistical noise can obscure landslide signatures and place practical limits on detection. In ideal cases on land, GPS surveys of representative areas at the time of DEM creation can quantify the inherent errors. In less-than-ideal terrestrial cases and virtually all submarine cases, it may be impractical or impossible to independently estimate the DEM errors. Examining DEM difference statistics for areas reasonably inferred to have no change, however, can provide insight into the limits of detectability. Data from inferred no-change areas of airborne LiDAR DEM difference maps of the 2014 Oso, Washington landslide and landslide-prone colluvium slopes along the Ohio River valley in northern Kentucky, show that DEM difference maps can have non-zero mean and slope dependent error components consistent with published studies of DEM errors. Statistical thresholds derived from DEM difference error and slope data can help to distinguish between DEM differences that are likely real—and which may indicate landsliding—from those that are likely spurious or irrelevant. This presentation describes and compares two different approaches, one based upon a heuristic assumption about the proportion of the study area likely covered by new landslides and another based upon the amount of change necessary to ensure difference at a specified level of probability.

  18. Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

    USGS Publications Warehouse

    Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

    2015-09-28

    Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.

  19. Bayesian statistics applied to the location of the source of explosions at Stromboli Volcano, Italy

    USGS Publications Warehouse

    Saccorotti, G.; Chouet, B.; Martini, M.; Scarpa, R.

    1998-01-01

    We present a method for determining the location and spatial extent of the source of explosions at Stromboli Volcano, Italy, based on a Bayesian inversion of the slowness vector derived from frequency-slowness analyses of array data. The method searches for source locations that minimize the error between the expected and observed slowness vectors. For a given set of model parameters, the conditional probability density function of slowness vectors is approximated by a Gaussian distribution of expected errors. The method is tested with synthetics using a five-layer velocity model derived for the north flank of Stromboli and a smoothed velocity model derived from a power-law approximation of the layered structure. Application to data from Stromboli allows for a detailed examination of uncertainties in source location due to experimental errors and incomplete knowledge of the Earth model. Although the solutions are not constrained in the radial direction, excellent resolution is achieved in both transverse and depth directions. Under the assumption that the horizontal extent of the source does not exceed the crater dimension, the 90% confidence region in the estimate of the explosive source location corresponds to a small volume extending from a depth of about 100 m to a maximum depth of about 300 m beneath the active vents, with a maximum likelihood source region located in the 120- to 180-m-depth interval.

  20. Quantitative Assessment of Blood Pressure Measurement Accuracy and Variability from Visual Auscultation Method by Observers without Receiving Medical Training

    PubMed Central

    Feng, Yong; Chen, Aiqing

    2017-01-01

    This study aimed to quantify blood pressure (BP) measurement accuracy and variability with different techniques. Thirty video clips of BP recordings from the BHS training database were converted to Korotkoff sound waveforms. Ten observers without receiving medical training were asked to determine BPs using (a) traditional manual auscultatory method and (b) visual auscultation method by visualizing the Korotkoff sound waveform, which was repeated three times on different days. The measurement error was calculated against the reference answers, and the measurement variability was calculated from the SD of the three repeats. Statistical analysis showed that, in comparison with the auscultatory method, visual method significantly reduced overall variability from 2.2 to 1.1 mmHg for SBP and from 1.9 to 0.9 mmHg for DBP (both p < 0.001). It also showed that BP measurement errors were significant for both techniques (all p < 0.01, except DBP from the traditional method). Although significant, the overall mean errors were small (−1.5 and −1.2 mmHg for SBP and −0.7 and 2.6 mmHg for DBP, resp., from the traditional auscultatory and visual auscultation methods). In conclusion, the visual auscultation method had the ability to achieve an acceptable degree of BP measurement accuracy, with smaller variability in comparison with the traditional auscultatory method. PMID:29423405

  1. Application of advanced shearing techniques to the calibration of autocollimators with small angle generators and investigation of error sources.

    PubMed

    Yandayan, T; Geckeler, R D; Aksulu, M; Akgoz, S A; Ozgur, B

    2016-05-01

    The application of advanced error-separating shearing techniques to the precise calibration of autocollimators with Small Angle Generators (SAGs) was carried out for the first time. The experimental realization was achieved using the High Precision Small Angle Generator (HPSAG) of TUBITAK UME under classical dimensional metrology laboratory environmental conditions. The standard uncertainty value of 5 mas (24.2 nrad) reached by the classical calibration method was improved to the level of 1.38 mas (6.7 nrad). Shearing techniques, which offer a unique opportunity to separate the errors of devices without recourse to any external standard, were first adapted by the Physikalisch-Technische Bundesanstalt (PTB) to the calibration of autocollimators with angle encoders. This had been demonstrated experimentally in a clean room environment using the primary angle standard of PTB (WMT 220). The application of the technique to a different type of angle measurement system extends the range of the shearing technique further and reveals other advantages. For example, the angular scales of the SAGs are based on linear measurement systems (e.g., capacitive nanosensors for the HPSAG). Therefore, SAGs show different systematic errors when compared to angle encoders. In addition to the error separation of the HPSAG and the autocollimator, detailed investigations of error sources were carried out. Apart from the determination of the systematic errors of the capacitive sensor used in the HPSAG, it was also demonstrated that the shearing method offers the unique opportunity to characterize other error sources, such as errors due to temperature drift in long-term measurements. This proves that the shearing technique is a very powerful method for investigating angle measuring systems, for their improvement, and for specifying precautions to be taken during the measurements.

  2. Relevant reduction effect with a modified thermoplastic mask of rotational error for glottic cancer in IMRT

    NASA Astrophysics Data System (ADS)

    Jung, Jae Hong; Jung, Joo-Young; Cho, Kwang Hwan; Ryu, Mi Ryeong; Bae, Sun Hyun; Moon, Seong Kwon; Kim, Yong Ho; Choe, Bo-Young; Suh, Tae Suk

    2017-02-01

    The purpose of this study was to analyze the glottis rotational error (GRE) associated with a thermoplastic mask for patients with glottic cancer undergoing intensity-modulated radiation therapy (IMRT). We selected 20 patients with glottic cancer who had received IMRT using tomotherapy. Both kilovoltage computed tomography (planning kVCT) and megavoltage CT (daily MVCT) images were used to evaluate the error. Six anatomical landmarks in the images were defined to evaluate the correlation between the absolute GRE (°) and the length of contact between the mask and the underlying skin of the patient (mask, mm). We also statistically analyzed the results using Pearson's correlation coefficient and a linear regression analysis (P < 0.05). The mask contact length and the absolute GRE were verified to be statistically correlated (P < 0.01). We found statistical significance for each parameter in the linear regression analysis (mask versus absolute roll: P = 0.004 [P < 0.05]; mask versus 3D error: P = 0.000 [P < 0.05]). The range of 3D errors with mask contact was 1.2%-39.7% between the maximum- and no-contact cases in this study. A thermoplastic mask with a tight, increased contact area may contribute to uncertainty in reproducibility through variation of the absolute GRE. Thus, we suggest that a modified mask, such as one that covers only the glottis area, can significantly reduce patient setup errors during treatment.

  3. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    NASA Astrophysics Data System (ADS)

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    2016-09-01

    This study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Second, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. This study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  4. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    DOE PAGES

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    2016-09-21

    Our study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Furthermore, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. Overall, this study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  5. Selecting Statistical Quality Control Procedures for Limiting the Impact of Increases in Analytical Random Error on Patient Safety.

    PubMed

    Yago, Martín

    2017-05-01

    QC planning based on risk management concepts can reduce the probability of harming patients due to an undetected out-of-control error condition. It does this by selecting appropriate QC procedures to decrease the number of erroneous results reported. The selection can be easily made by using published nomograms for simple QC rules when the out-of-control condition results in increased systematic error. However, increases in random error also occur frequently and are difficult to detect, which can result in erroneously reported patient results. A statistical model was used to construct charts for the 1ks and X̄/χ² rules. The charts relate the increase in the number of unacceptable patient results reported due to an increase in random error with the capability of the measurement procedure. They thus allow for QC planning based on the risk of patient harm due to the reporting of erroneous results. 1ks rules are simple, all-around rules. Their ability to deal with increases in within-run imprecision is minimally affected by the possible presence of significant, stable, between-run imprecision. X̄/χ² rules perform better when the number of controls analyzed during each QC event is increased to improve QC performance. Using nomograms simplifies the selection of statistical QC procedures to limit the number of erroneous patient results reported due to an increase in analytical random error. The selection largely depends on the presence or absence of stable between-run imprecision. © 2017 American Association for Clinical Chemistry.

  6. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    Our study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Furthermore, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. Overall, this study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  7. The Statistical Power of Planned Comparisons.

    ERIC Educational Resources Information Center

    Benton, Roberta L.

    Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…

  8. Application of Statistics in Engineering Technology Programs

    ERIC Educational Resources Information Center

    Zhan, Wei; Fink, Rainer; Fang, Alex

    2010-01-01

    Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. Traditionally, however, statistics is not extensively used in undergraduate engineering technology (ET) programs, resulting in a major disconnect from industry…

  9. Catastrophic photometric redshift errors: Weak-lensing survey requirements

    DOE PAGES

    Bernstein, Gary; Huterer, Dragan

    2010-01-11

    We study the sensitivity of weak lensing surveys to the effects of catastrophic redshift errors - cases where the true redshift is misestimated by a significant amount. To compute the biases in cosmological parameters, we adopt an efficient linearized analysis where the redshift errors are directly related to shifts in the weak lensing convergence power spectra. We estimate the number N_spec of unbiased spectroscopic redshifts needed to determine the catastrophic error rate well enough that biases in cosmological parameters are below statistical errors of weak lensing tomography. While the straightforward estimate of N_spec is ~10^6, we find that using only the photometric redshifts with z ≤ 2.5 leads to a drastic reduction in N_spec to ~30,000 while negligibly increasing statistical errors in dark energy parameters. Therefore, the size of the spectroscopic survey needed to control catastrophic errors is similar to that previously deemed necessary to constrain the core of the z_s - z_p distribution. We also study the efficacy of the recent proposal to measure redshift errors by cross-correlation between the photo-z and spectroscopic samples. We find that this method requires ~10% a priori knowledge of the bias and stochasticity of the outlier population, and is also easily confounded by lensing magnification bias. The cross-correlation method is therefore unlikely to supplant the need for a complete spectroscopic redshift survey of the source population.

  10. Self-calibration method without joint iteration for distributed small satellite SAR systems

    NASA Astrophysics Data System (ADS)

    Xu, Qing; Liao, Guisheng; Liu, Aifei; Zhang, Juan

    2013-12-01

    The performance of distributed small satellite synthetic aperture radar systems degrades significantly due to unavoidable array errors, including gain, phase, and position errors, in real operating scenarios. In the conventional method proposed in (IEEE T Aero. Elec. Sys. 42:436-451, 2006), the spectrum components within one Doppler bin are considered as calibration sources. However, it is found in this article that the gain error estimation and the position error estimation in the conventional method can interact with each other. The conventional method may converge to suboptimal solutions in the presence of large position errors, since it requires joint iteration between gain-phase error estimation and position error estimation. In addition, it is also found that phase errors can be estimated well regardless of position errors when the zero Doppler bin is chosen. In this article, we propose a method obtained by modifying the conventional one, based on these two observations. In this modified method, gain errors are first estimated and compensated, which eliminates the interaction between gain error estimation and position error estimation. Then, by using the zero Doppler bin data, the phase error estimation can be performed well independently of position errors. Finally, position errors are estimated based on a Taylor-series expansion. Meanwhile, the joint iteration between gain-phase error estimation and position error estimation is not required. Therefore, the problem of suboptimal convergence, which occurs in the conventional method, can be avoided at low computational cost. The modified method has the merits of faster convergence and lower estimation error compared to the conventional one. Theoretical analysis and computer simulation results verified the effectiveness of the modified method.

  11. 75 FR 26780 - State Median Income Estimate for a Four-Person Family: Notice of the Federal Fiscal Year (FFY...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-12

    ... Household Economic Statistics Division at (301) 763-3243. Under the advice of the Census Bureau, HHS..., which consists of the error that arises from the use of probability sampling to create the sample. For...) Sampling Error, which consists of the error that arises from the use of probability sampling to create the...

  12. Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

    ERIC Educational Resources Information Center

    Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

    2013-01-01

    In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of the Supplemented EM algorithm for…
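
    The abstract does not spell out the numerical scheme, but the general recipe behind such methods can be sketched: approximate the Hessian of the log-likelihood at the parameter estimates by finite differences and invert its negative to obtain the error covariance matrix. The sketch below uses central differences on an illustrative one-parameter Bernoulli log-likelihood; the function names and data are placeholders, not code from the paper.

```python
import numpy as np

def numerical_hessian(loglik, theta, h=1e-4):
    """Central-difference approximation of the Hessian of loglik at theta."""
    p = len(theta)
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            tpp = theta.copy(); tpp[i] += h; tpp[j] += h
            tpm = theta.copy(); tpm[i] += h; tpm[j] -= h
            tmp = theta.copy(); tmp[i] -= h; tmp[j] += h
            tmm = theta.copy(); tmm[i] -= h; tmm[j] -= h
            H[i, j] = (loglik(tpp) - loglik(tpm) - loglik(tmp) + loglik(tmm)) / (4 * h * h)
    return H

# Illustrative one-parameter example: Bernoulli log-likelihood on the logit scale.
data = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

def loglik(theta):
    p = 1.0 / (1.0 + np.exp(-theta[0]))
    return np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

theta_hat = np.array([np.log(data.mean() / (1 - data.mean()))])  # MLE of the logit
H = numerical_hessian(loglik, theta_hat)
error_cov = np.linalg.inv(-H)   # observed-information-based error covariance
print("standard error of the logit estimate:", np.sqrt(np.diag(error_cov)))
```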

  13. The Hurst Phenomenon in Error Estimates Related to Atmospheric Turbulence

    NASA Astrophysics Data System (ADS)

    Dias, Nelson Luís; Crivellaro, Bianca Luhm; Chamecki, Marcelo

    2018-05-01

    The Hurst phenomenon is a well-known feature of long-range persistence first observed in hydrological and geophysical time series by E. Hurst in the 1950s. It has also been found in several cases in turbulence time series measured in the wind tunnel, the atmosphere, and in rivers. Here, we conduct a systematic investigation of the value of the Hurst coefficient H in atmospheric surface-layer data, and its impact on the estimation of random errors. We show that usually H > 0.5, which implies the non-existence (in the statistical sense) of the integral time scale. Since the integral time scale appears in the Lumley-Panofsky equation for the estimation of random errors, this has important practical consequences. We estimated H in two principal ways: (1) with an extension of the recently proposed filtering method to estimate the random error (H_p), and (2) with the classical rescaled range introduced by Hurst (H_R). Other estimators were tried but were found less able to capture the statistical behaviour of the large scales of turbulence. Using data from three micrometeorological campaigns we found that both first- and second-order turbulence statistics display the Hurst phenomenon. Usually, H_R is larger than H_p for the same dataset, raising the possibility that one, or even both, of these estimators may be biased. For the relative error, we found that the errors estimated with our approach, which we call the relaxed filtering method and which takes the occurrence of the Hurst phenomenon into account, are larger than both the filtering-method and the classical Lumley-Panofsky estimates. Finally, we found no apparent relationship between H and the Obukhov stability parameter. The relative errors, however, do show stability dependence, particularly for the error of the kinematic momentum flux in unstable conditions and that of the kinematic sensible heat flux in stable conditions.
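
    The classical rescaled-range estimator H_R mentioned above is straightforward to implement. The sketch below is a generic textbook R/S implementation (not the authors' code); the white-noise test series is illustrative, and its estimate should land near the uncorrelated value H = 0.5 apart from the estimator's known small-sample bias.

```python
import numpy as np

def rescaled_range_hurst(x, min_window=8):
    """Estimate the Hurst coefficient by the classical rescaled-range (R/S) method."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    windows, rs_values = [], []
    w = min_window
    while w <= n // 2:
        rs = []
        for start in range(0, n - w + 1, w):
            seg = x[start:start + w]
            dev = np.cumsum(seg - seg.mean())   # mean-adjusted cumulative sum
            r = dev.max() - dev.min()           # range of cumulative deviations
            s = seg.std(ddof=1)
            if s > 0:
                rs.append(r / s)
        if rs:
            windows.append(w)
            rs_values.append(np.mean(rs))
        w *= 2
    # H is the slope of log(R/S) against log(window length)
    slope, _ = np.polyfit(np.log(windows), np.log(rs_values), 1)
    return slope

rng = np.random.default_rng(0)
white_noise = rng.standard_normal(2**14)
print("H_R for white noise:", rescaled_range_hurst(white_noise))  # near 0.5, slight upward bias
```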

  14. Statistical approaches to account for false-positive errors in environmental DNA samples.

    PubMed

    Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid

    2016-05-01

    Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies. © 2015 John Wiley & Sons Ltd.
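
    A minimal simulation in the spirit of the abstract's first claim, with purely illustrative parameter values: treating any detection as a true presence inflates the naive occupancy estimate even when the per-visit false-positive probability is small.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative values only (not taken from the paper)
n_sites, n_visits = 200, 4
psi = 0.3     # true occupancy probability
p_det = 0.6   # detection probability per visit at occupied sites
p_fp = 0.05   # false-positive probability per visit at any site

occupied = rng.random(n_sites) < psi
true_detect = (rng.random((n_sites, n_visits)) < p_det) & occupied[:, None]
false_detect = rng.random((n_sites, n_visits)) < p_fp
observed = true_detect | false_detect

# Naive estimator: call a site occupied if it was detected at least once
naive_occupancy = observed.any(axis=1).mean()
print(f"true occupancy:            {psi:.2f}")
print(f"realized occupancy:        {occupied.mean():.2f}")
print(f"naive estimate (with FPs): {naive_occupancy:.2f}")
```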

  15. Impact of large-scale tides on cosmological distortions via redshift-space power spectrum

    NASA Astrophysics Data System (ADS)

    Akitsu, Kazuyuki; Takada, Masahiro

    2018-03-01

    Although large-scale perturbations beyond a finite-volume survey region are not direct observables, they affect measurements of clustering statistics of small-scale (subsurvey) perturbations in large-scale structure, compared with the ensemble average, via the mode-coupling effect. In this paper we show that a large-scale tide induced by scalar perturbations causes apparent anisotropic distortions in the redshift-space power spectrum of galaxies in a way that depends on the alignment between the tide, the wave vector of the small-scale modes, and the line-of-sight direction. Using the perturbation theory of structure formation, we derive a response function of the redshift-space power spectrum to the large-scale tide. We then investigate, based on the Fisher matrix formalism, the impact of the large-scale tide on the estimation of cosmological distances and the redshift-space distortion parameter from the measured redshift-space power spectrum for a hypothetical large-volume survey. To do this, we treat the large-scale tide as a signal, rather than as an additional source of statistical error, and show that the resulting degradation in parameter accuracy is recovered if we employ the prior on the rms amplitude expected for the standard cold dark matter (CDM) model. We also discuss whether the large-scale tide can be constrained to an accuracy better than the CDM prediction if effects up to larger wave numbers in the nonlinear regime are included.

  16. Searching for collective behavior in a large network of sensory neurons.

    PubMed

    Tkačik, Gašper; Marre, Olivier; Amodei, Dario; Schneidman, Elad; Bialek, William; Berry, Michael J

    2014-01-01

    Maximum entropy models are the least structured probability distributions that exactly reproduce a chosen set of statistics measured in an interacting network. Here we use this principle to construct probabilistic models which describe the correlated spiking activity of populations of up to 120 neurons in the salamander retina as it responds to natural movies. Already in groups as small as 10 neurons, interactions between spikes can no longer be regarded as small perturbations in an otherwise independent system; for 40 or more neurons pairwise interactions need to be supplemented by a global interaction that controls the distribution of synchrony in the population. Here we show that such "K-pairwise" models--being systematic extensions of the previously used pairwise Ising models--provide an excellent account of the data. We explore the properties of the neural vocabulary by: 1) estimating its entropy, which constrains the population's capacity to represent visual information; 2) classifying activity patterns into a small set of metastable collective modes; 3) showing that the neural codeword ensembles are extremely inhomogeneous; 4) demonstrating that the state of individual neurons is highly predictable from the rest of the population, allowing the capacity for error correction.

  17. Effects of turbulence intensity and gravity on transport of inertial particles across a shearless turbulence interface

    NASA Astrophysics Data System (ADS)

    Good, Garrett; Gerashchenko, Sergiy; Warhaft, Zellman

    2010-11-01

    Water droplets of sub-Kolmogorov size are sprayed into the turbulent side of a shearless turbulent-non-turbulent interface (TNI) as well as a turbulent-turbulent interface (TTI). An active grid is used to form the mixing layer, and a splitter plate separates the droplet and non-droplet streams near the origin. Particle concentration, size, and velocity are determined by a Phase Doppler Particle Analyzer, the velocity field by hot wires, and the droplet accelerations by particle tracking. As for a passive scalar, the TTI concentration profiles are described by an error function. For the TNI, the concentration profiles fall off more rapidly than for the TTI due to the large-scale intermittency. The profile evolution and the effects of initial conditions are discussed, as is the relative importance of the large and small scales in the transport process. It is shown that the concentration statistics are better described in terms of the Stokes number based on the large scales than on the small, but some features of the mixing are determined by the small scales, and these will be discussed. Sponsored by the U.S. NSF.

  18. Searching for Collective Behavior in a Large Network of Sensory Neurons

    PubMed Central

    Tkačik, Gašper; Marre, Olivier; Amodei, Dario; Schneidman, Elad; Bialek, William; Berry, Michael J.

    2014-01-01

    Maximum entropy models are the least structured probability distributions that exactly reproduce a chosen set of statistics measured in an interacting network. Here we use this principle to construct probabilistic models which describe the correlated spiking activity of populations of up to 120 neurons in the salamander retina as it responds to natural movies. Already in groups as small as 10 neurons, interactions between spikes can no longer be regarded as small perturbations in an otherwise independent system; for 40 or more neurons pairwise interactions need to be supplemented by a global interaction that controls the distribution of synchrony in the population. Here we show that such “K-pairwise” models—being systematic extensions of the previously used pairwise Ising models—provide an excellent account of the data. We explore the properties of the neural vocabulary by: 1) estimating its entropy, which constrains the population's capacity to represent visual information; 2) classifying activity patterns into a small set of metastable collective modes; 3) showing that the neural codeword ensembles are extremely inhomogeneous; 4) demonstrating that the state of individual neurons is highly predictable from the rest of the population, allowing the capacity for error correction. PMID:24391485

  19. High-Precision Tests of Stochastic Thermodynamics in a Feedback Trap

    NASA Astrophysics Data System (ADS)

    Gavrilov, Momčilo; Jun, Yonggun; Bechhoefer, John

    2015-03-01

    Feedback traps can trap and manipulate small particles and molecules in solution. They have been applied to the measurement of physical and chemical properties of particles and to explore fundamental questions in the non-equilibrium statistical mechanics of small systems. Feedback traps allow one to choose an arbitrary virtual potential, do any time-dependent transformation of the potential, and measure various thermodynamic quantities such as stochastic work, heat, or entropy. In feedback-trap experiments, the dynamics of a trapped object is determined by the imposed potential but is also affected by drifts due to electrochemical reactions and by temperature variations in the electronic amplifier. Although such drifts are small for measurements on the order of seconds, they dominate on time scales of minutes or slower. In this talk, we present a recursive algorithm that allows real-time estimations of drifts and other particle properties. These estimates let us do a real-time calibration of the feedback trap. Having eliminated systematic errors, we were able to show that erasing a one-bit memory requires at least kT ln 2 of work, in accordance with Landauer's principle. This work was supported by NSERC (Canada).

  20. Internal pilots for a class of linear mixed models with Gaussian and compound symmetric data

    PubMed Central

    Gurka, Matthew J.; Coffey, Christopher S.; Muller, Keith E.

    2015-01-01

    SUMMARY An internal pilot design uses interim sample size analysis, without interim data analysis, to adjust the final number of observations. The approach helps to choose a sample size sufficiently large (to achieve the statistical power desired), but not too large (which would waste money and time). We report on recent research in cerebral vascular tortuosity (curvature in three dimensions) which would benefit greatly from internal pilots due to uncertainty in the parameters of the covariance matrix used for study planning. Unfortunately, observations correlated across the four regions of the brain and small sample sizes preclude using existing methods. However, as in a wide range of medical imaging studies, tortuosity data have no missing or mistimed data, a factorial within-subject design, the same between-subject design for all responses, and a Gaussian distribution with compound symmetry. For such restricted models, we extend exact, small sample univariate methods for internal pilots to linear mixed models with any between-subject design (not just two groups). Planning a new tortuosity study illustrates how the new methods help to avoid sample sizes that are too small or too large while still controlling the type I error rate. PMID:17318914

  1. A note on the kappa statistic for clustered dichotomous data.

    PubMed

    Zhou, Ming; Yang, Zhao

    2014-06-30

    The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method for calculating the variance of the kappa statistic for clustered physician-patient dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For clustered physician-patient dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾ 50). The new proposal and the sampling-based delta method provide convenient tools for efficient computation and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research dataset and two simulated clustered physician-patient dichotomous datasets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
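
    For readers who want a concrete baseline, the sketch below computes Cohen's kappa for dichotomous ratings and a cluster (physician-level) bootstrap standard error, one of the existing methods the proposal is compared against; the data-generation step is purely illustrative and the paper's semi-parametric delta-method estimator is not reproduced here.

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two dichotomous ratings coded 0/1."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                         # observed agreement
    p1, p2 = r1.mean(), r2.mean()
    pe = p1 * p2 + (1 - p1) * (1 - p2)             # chance agreement
    return (po - pe) / (1 - pe)

def cluster_bootstrap_se(r1, r2, clusters, n_boot=2000, seed=0):
    """Bootstrap SE of kappa obtained by resampling whole clusters (physicians)."""
    rng = np.random.default_rng(seed)
    ids = np.unique(clusters)
    stats = []
    for _ in range(n_boot):
        sampled = rng.choice(ids, size=len(ids), replace=True)
        idx = np.concatenate([np.where(clusters == c)[0] for c in sampled])
        stats.append(cohens_kappa(r1[idx], r2[idx]))
    return np.std(stats, ddof=1)

# Illustrative clustered data: 30 physicians with 10 patients each
rng = np.random.default_rng(1)
clusters = np.repeat(np.arange(30), 10)
latent = np.repeat(rng.random(30) < 0.5, 10)            # cluster-level tendency
phys = (latent ^ (rng.random(300) < 0.2)).astype(int)   # physician rating, 20% noise
pat = (latent ^ (rng.random(300) < 0.3)).astype(int)    # patient rating, 30% noise

kappa = cohens_kappa(phys, pat)
se = cluster_bootstrap_se(phys, pat, clusters)
print(f"kappa = {kappa:.3f}, cluster-bootstrap SE = {se:.3f}")
```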

  2. First measurements of error fields on W7-X using flux surface mapping

    DOE PAGES

    Lazerson, Samuel A.; Otte, Matthias; Bozhenkov, Sergey; ...

    2016-08-03

    Error fields have been detected and quantified using the flux surface mapping diagnostic system on Wendelstein 7-X (W7-X). A low-field $\bar{\iota} = 1/2$ magnetic configuration ($\bar{\iota} = \iota/2\pi$), sensitive to error fields, was developed in order to detect their presence using the flux surface mapping diagnostic. In this configuration, a vacuum flux surface with rotational transform of n/m = 1/2 is created at the mid-radius of the vacuum flux surfaces. If no error fields are present, a vanishingly small n/m = 5/10 island chain should be present. Modeling indicates that if an n = 1 perturbing field is applied by the trim coils, a large n/m = 1/2 island chain will be opened. This island chain is used to create a perturbation large enough to be imaged by the diagnostic. Phase and amplitude scans of the applied field allow the measurement of a small ~0.04 m intrinsic island chain with a 130° phase relative to the first module of the W7-X experiment. Lastly, these error fields are determined to be small and easily correctable by the trim coil system.

  3. Statistically Controlling for Confounding Constructs Is Harder than You Think

    PubMed Central

    Westfall, Jacob; Yarkoni, Tal

    2016-01-01

    Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity. PMID:27031707
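
    A minimal Monte Carlo sketch of the phenomenon described above, with illustrative parameter values: two constructs are correlated at the latent level, only the first drives the outcome, and both are measured with reliability 0.7. Regressing the outcome on both observed measures declares the irrelevant one 'significant' far more often than the nominal 5%, and the rate grows with sample size.

```python
import numpy as np
from scipy import stats

def incremental_validity_type1(n, rho=0.7, reliability=0.7, n_sims=2000, seed=0):
    """Fraction of simulations declaring the irrelevant observed predictor 'significant'."""
    rng = np.random.default_rng(seed)
    lam = np.sqrt(reliability)       # loading implied by the chosen reliability
    false_pos = 0
    for _ in range(n_sims):
        # Latent constructs T1, T2 correlated at rho; the outcome depends only on T1
        z = rng.standard_normal((n, 2))
        t1 = z[:, 0]
        t2 = rho * z[:, 0] + np.sqrt(1 - rho**2) * z[:, 1]
        y = 0.5 * t1 + rng.standard_normal(n)
        # Observed measures contaminated by measurement error
        x1 = lam * t1 + np.sqrt(1 - reliability) * rng.standard_normal(n)
        x2 = lam * t2 + np.sqrt(1 - reliability) * rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x1, x2])
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - 3)
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X).diagonal())
        t_x2 = beta[2] / se[2]
        if 2 * stats.t.sf(abs(t_x2), df=n - 3) < 0.05:
            false_pos += 1
    return false_pos / n_sims

for n in (100, 500, 2000):
    print(n, incremental_validity_type1(n))   # rates rise well above 0.05 as n grows
```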

  4. What to use to express the variability of data: Standard deviation or standard error of mean?

    PubMed

    Barde, Mohini P; Barde, Prajakt J

    2012-07-01

    Statistics plays a vital role in biomedical research. It helps present data precisely and draw meaningful conclusions. While presenting data, one should be aware of using adequate statistical measures. In biomedical journals, the Standard Error of the Mean (SEM) and the Standard Deviation (SD) are used interchangeably to express variability, though they measure different parameters. The SEM quantifies uncertainty in the estimate of the mean, whereas the SD indicates the dispersion of the data around the mean. As readers are generally interested in the variability within the sample, descriptive data should be summarized with the SD. Use of the SEM should be limited to computing confidence intervals, which measure the precision of the population estimate. Journals can avoid such errors by requiring authors to adhere to their guidelines.
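
    A short generic illustration of the distinction (not tied to any dataset in the article): the SD describes the spread of individual observations, while the SEM shrinks with sample size and belongs in a confidence interval for the mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=120, scale=15, size=50)   # e.g. 50 blood-pressure readings

sd = sample.std(ddof=1)                  # dispersion of individual observations
sem = sd / np.sqrt(len(sample))          # uncertainty of the estimated mean
ci = stats.t.interval(0.95, df=len(sample) - 1, loc=sample.mean(), scale=sem)

print(f"mean = {sample.mean():.1f}, SD = {sd:.1f}   (describes the data)")
print(f"SEM = {sem:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})   (describes the mean estimate)")
```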

  5. Kappa statistic for the clustered dichotomous responses from physicians and patients

    PubMed Central

    Kang, Chaeryon; Qaqish, Bahjat; Monaco, Jane; Sheridan, Stacey L.; Cai, Jianwen

    2013-01-01

    The bootstrap method for estimating the standard error of the kappa statistic in the presence of clustered data is evaluated. Such data arise, for example, in assessing agreement between physicians and their patients regarding their understanding of the physician-patient interaction and discussions. We propose a computationally efficient procedure for generating correlated dichotomous responses for physicians and assigned patients for simulation studies. The simulation results demonstrate that, given at least a moderately large number of clusters, the proposed bootstrap method produces a better estimate of the standard error and better coverage performance than the asymptotic standard error estimate that ignores dependence among patients within physicians. An example of an application to a coronary heart disease prevention study is presented. PMID:23533082

  6. Does bad inference drive out good?

    PubMed

    Marozzi, Marco

    2015-07-01

    The (mis)use of statistics in practice is widely debated, and a field where the debate is particularly active is medicine. Many scholars emphasize that a large proportion of published medical research contains statistical errors. It has been noted that top-class journals like Nature Medicine and The New England Journal of Medicine publish a considerable proportion of papers that contain statistical errors and poorly document the application of statistical methods. This paper joins the debate on the (mis)use of statistics in the medical literature. Even though the validation process of a statistical result may be quite elusive, a careful assessment of underlying assumptions is central in medicine as well as in other fields where a statistical method is applied. Unfortunately, such an assessment is missing in many papers, including those published in top-class journals. In this paper, it is shown that nonparametric methods are good alternatives to parametric methods when the assumptions of the latter are not satisfied. A key step toward solving the problem of the misuse of statistics in the medical literature is for all journals to have their own statisticians review the statistical methods/analysis section of each submitted paper. © 2015 Wiley Publishing Asia Pty Ltd.

  7. Medication administration error reporting and associated factors among nurses working at the University of Gondar referral hospital, Northwest Ethiopia, 2015.

    PubMed

    Bifftu, Berhanu Boru; Dachew, Berihun Assefa; Tiruneh, Bewket Tadesse; Beshah, Debrework Tesgera

    2016-01-01

    Medication administration is the final step/phase of the medication process, in which an error directly affects the patient's health. Owing to the central role of nurses in medication administration, whether they are the source of an error, a contributor, or an observer, they have the professional, legal, and ethical responsibility to recognize and report errors. The aim of this study was to assess the prevalence of medication administration error reporting and associated factors among nurses working at the University of Gondar Referral Hospital, Northwest Ethiopia. An institution-based quantitative cross-sectional study was conducted among 282 nurses. Data were collected using a semi-structured, self-administered questionnaire on Medication Administration Errors Reporting (MAERs). Binary logistic regression with 95 % confidence intervals was used to identify factors associated with medication administration error reporting. The estimated medication administration error reporting rate was found to be 29.1 %. The perceived rates of medication administration error reporting ranged from 16.8 to 28.6 % for non-intravenous-related medications and from 20.6 to 33.4 % for intravenous-related medications. Educational status (AOR = 1.38, 95 % CI: 4.009, 11.128), disagreement over the time-error definition (AOR = 0.44, 95 % CI: 0.468, 0.990), administrative reasons (AOR = 0.35, 95 % CI: 0.168, 0.710), and fear (AOR = 0.39, 95 % CI: 0.257, 0.838) were factors statistically significantly associated with the refusal to report medication administration errors at p-value < 0.05. In this study, less than one third of the study participants reported medication administration errors. Therefore, the results suggest strategies that enhance the culture of error reporting, such as providing a clear definition of reportable errors and strengthening the educational status of nurses within the health care organization.

  8. BaTMAn: Bayesian Technique for Multi-image Analysis

    NASA Astrophysics Data System (ADS)

    Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.

    2016-12-01

    Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.

  9. Increasing the statistical significance of entanglement detection in experiments.

    PubMed

    Jungnitsch, Bastian; Niekamp, Sönke; Kleinmann, Matthias; Gühne, Otfried; Lu, He; Gao, Wei-Bo; Chen, Yu-Ao; Chen, Zeng-Bing; Pan, Jian-Wei

    2010-05-28

    Entanglement is often verified by a violation of an inequality like a Bell inequality or an entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in order to obtain a high violation. We demonstrate theoretically and experimentally that such an optimization does not necessarily lead to a better entanglement test, if the statistical error is taken into account. Theoretically, we show for different error models that reducing the violation of an inequality can improve the significance. Experimentally, we observe this phenomenon in a four-photon experiment, testing the Mermin and Ardehali inequality for different levels of noise. Furthermore, we provide a way to develop entanglement tests with high statistical significance.

  10. Comparison of Peak-Flow Estimation Methods for Small Drainage Basins in Maine

    USGS Publications Warehouse

    Hodgkins, Glenn A.; Hebson, Charles; Lombard, Pamela J.; Mann, Alexander

    2007-01-01

    Understanding the accuracy of commonly used methods for estimating peak streamflows is important because the designs of bridges, culverts, and other river structures are based on these flows. Different methods for estimating peak streamflows were analyzed for small drainage basins in Maine. For the smallest basins, with drainage areas of 0.2 to 1.0 square mile, nine peak streamflows from actual rainfall events at four crest-stage gaging stations were modeled by the Rational Method and the Natural Resources Conservation Service TR-20 method and compared to observed peak flows. The Rational Method had a root mean square error (RMSE) of -69.7 to 230 percent (which means that approximately two thirds of the modeled flows were within -69.7 to 230 percent of the observed flows). The TR-20 method had an RMSE of -98.0 to 5,010 percent. Both the Rational Method and TR-20 underestimated the observed flows in most cases. For small basins, with drainage areas of 1.0 to 10 square miles, modeled peak flows were compared to observed statistical peak flows with return periods of 2, 50, and 100 years for 17 streams in Maine and adjoining parts of New Hampshire. Peak flows were modeled by the Rational Method, the Natural Resources Conservation Service TR-20 method, U.S. Geological Survey regression equations, and the Probabilistic Rational Method. The regression equations were the most accurate method of computing peak flows in Maine for streams with drainage areas of 1.0 to 10 square miles with an RMSE of -34.3 to 52.2 percent for 50-year peak flows. The Probabilistic Rational Method was the next most accurate method (-38.5 to 62.6 percent). The Rational Method (-56.1 to 128 percent) and particularly the TR-20 method (-76.4 to 323 percent) had much larger errors. Both the TR-20 and regression methods had similar numbers of underpredictions and overpredictions. The Rational Method overpredicted most peak flows and the Probabilistic Rational Method tended to overpredict peak flows from the smaller (less than 5 square miles) drainage basins and underpredict peak flows from larger drainage basins. The results of this study are consistent with the most comprehensive analysis of observed and modeled peak streamflows in the United States, which analyzed statistical peak flows from 70 drainage basins in the Midwest and the Northwest.

  11. Research Design and Statistical Methods in Indian Medical Journals: A Retrospective Survey

    PubMed Central

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S.; Mayya, Shreemathi S.

    2015-01-01

    Good quality medical research generally requires not only an expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles which have been published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were – study design types and their frequencies, error/defects proportion in study design, statistical analyses, and implementation of the CONSORT checklist in RCTs (randomized clinical trials). From 2003 to 2013: The proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418), 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, p<0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ2=16.783, Φ=0.12, p<0.0001), 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, p<0.0001), 82.2% (263/320) compared to 66.3% (325/490), and interpretation (χ2=25.616, Φ=0.173, p<0.0001), 32.5% (104/320) compared to 17.1% (84/490), though some serious ones were still present. Indian medical research seems to have made no major progress regarding the use of correct statistical analyses, but errors/defects in study designs have decreased significantly. Randomized clinical trials are quite rarely published and have a high proportion of methodological problems. PMID:25856194

  12. Research design and statistical methods in Indian medical journals: a retrospective survey.

    PubMed

    Hassan, Shabbeer; Yellur, Rajashree; Subramani, Pooventhan; Adiga, Poornima; Gokhale, Manoj; Iyer, Manasa S; Mayya, Shreemathi S

    2015-01-01

    Good quality medical research generally requires not only an expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles which have been published in Indian medical journals has increased quite substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten (10) leading Indian medical journals were selected based on impact factors and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. Main outcomes considered in the present study were - study design types and their frequencies, error/defects proportion in study design, statistical analyses, and implementation of the CONSORT checklist in RCTs (randomized clinical trials). From 2003 to 2013: The proportion of erroneous statistical analyses did not decrease (χ2=0.592, Φ=0.027, p=0.4418), 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013; the proportion of papers using statistical tests increased significantly (χ2=26.96, Φ=0.16, p<0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ2=16.783, Φ=0.12, p<0.0001), 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ2=24.477, Φ=0.17, p<0.0001), 82.2% (263/320) compared to 66.3% (325/490), and interpretation (χ2=25.616, Φ=0.173, p<0.0001), 32.5% (104/320) compared to 17.1% (84/490), though some serious ones were still present. Indian medical research seems to have made no major progress regarding the use of correct statistical analyses, but errors/defects in study designs have decreased significantly. Randomized clinical trials are quite rarely published and have a high proportion of methodological problems.

  13. Disclosure of Medical Errors: What Factors Influence How Patients Respond?

    PubMed Central

    Mazor, Kathleen M; Reed, George W; Yood, Robert A; Fischer, Melissa A; Baril, Joann; Gurwitz, Jerry H

    2006-01-01

    BACKGROUND Disclosure of medical errors is encouraged, but research on how patients respond to specific practices is limited. OBJECTIVE This study sought to determine whether full disclosure, an existing positive physician-patient relationship, an offer to waive associated costs, and the severity of the clinical outcome influenced patients' responses to medical errors. PARTICIPANTS Four hundred and seven health plan members participated in a randomized experiment in which they viewed video depictions of medical error and disclosure. DESIGN Subjects were randomly assigned to experimental condition. Conditions varied in type of medication error, level of disclosure, reference to a prior positive physician-patient relationship, an offer to waive costs, and clinical outcome. MEASURES Self-reported likelihood of changing physicians and of seeking legal advice; satisfaction, trust, and emotional response. RESULTS Nondisclosure increased the likelihood of changing physicians, and reduced satisfaction and trust in both error conditions. Nondisclosure increased the likelihood of seeking legal advice and was associated with a more negative emotional response in the missed allergy error condition, but did not have a statistically significant impact on seeking legal advice or emotional response in the monitoring error condition. Neither the existence of a positive relationship nor an offer to waive costs had a statistically significant impact. CONCLUSIONS This study provides evidence that full disclosure is likely to have a positive effect or no effect on how patients respond to medical errors. The clinical outcome also influences patients' responses. The impact of an existing positive physician-patient relationship, or of waiving costs associated with the error remains uncertain. PMID:16808770

  14. Statistics Using Just One Formula

    ERIC Educational Resources Information Center

    Rosenthal, Jeffrey S.

    2018-01-01

    This article advocates that introductory statistics be taught by basing all calculations on a single simple margin-of-error formula and deriving all of the standard introductory statistical concepts (confidence intervals, significance tests, comparisons of means and proportions, etc) from that one formula. It is argued that this approach will…

  15. Mars approach navigation using Doppler and range measurements to surface beacons and orbiting spacecraft

    NASA Technical Reports Server (NTRS)

    Thurman, Sam W.; Estefan, Jeffrey A.

    1991-01-01

    Approximate analytical models are developed and used to construct an error covariance analysis for investigating the range of orbit determination accuracies which might be achieved for typical Mars approach trajectories. The sensitivity of orbit determination accuracy to beacon/orbiter position errors and to small spacecraft force modeling errors is also investigated. The results indicate that the orbit determination performance obtained from both Doppler and range data is a strong function of the inclination of the approach trajectory: relative to the Martian equator for surface beacons, and relative to the orbital plane for orbiters. Large variations in performance were also observed for different approach velocity magnitudes; Doppler data in particular were found to perform poorly in determining the downtrack (along the direction of flight) component of spacecraft position. In addition, it was found that small spacecraft acceleration modeling errors can induce large errors in the Doppler-derived downtrack position estimate.

  16. Potential Mediators in Parenting and Family Intervention: Quality of Mediation Analyses

    PubMed Central

    Patel, Chandni C.; Fairchild, Amanda J.; Prinz, Ronald J.

    2017-01-01

    Parenting and family interventions have repeatedly shown effectiveness in preventing and treating a range of youth outcomes. Accordingly, investigators in this area have conducted a number of studies using statistical mediation to examine some of the potential mechanisms of action by which these interventions work. This review examined, from a methodological perspective, in what ways and how well family-based intervention studies tested statistical mediation. A systematic search identified 73 published outcome studies that tested mediation for family-based interventions across a wide range of child and adolescent outcomes (i.e., externalizing, internalizing, and substance-abuse problems; high-risk sexual activity; and academic achievement), for putative mediators pertaining to positive and negative parenting, family functioning, youth beliefs and coping skills, and peer relationships. Taken as a whole, the studies used designs that adequately addressed temporal precedence. The majority of studies used the product of coefficients approach to mediation, which is preferred and less limiting than the causal steps approach. Statistical significance testing did not always make use of the most recently developed approaches, which would better accommodate small sample sizes and more complex functions. Specific recommendations are offered for future mediation studies in this area with respect to full longitudinal design, mediation approach, significance testing method, documentation and reporting of statistics, testing of multiple mediators, and control for Type I error. PMID:28028654
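
    As a concrete reference point, here is a sketch of the product-of-coefficients approach paired with a percentile bootstrap for the indirect effect, one commonly used modern alternative to the causal steps approach; the simulated intervention data are illustrative only.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b of the effect of x on y transmitted through m."""
    a = np.polyfit(x, m, 1)[0]                         # slope of m ~ x
    X = np.column_stack([np.ones_like(x), x, m])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    b = coef[2]                                        # slope of y ~ m, adjusting for x
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, seed=0):
    """Percentile-bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    est = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                    # resample cases with replacement
        est[i] = indirect_effect(x[idx], m[idx], y[idx])
    return np.percentile(est, [2.5, 97.5])

# Illustrative data: intervention -> parenting (mediator) -> child outcome
rng = np.random.default_rng(3)
n = 200
x = rng.integers(0, 2, n).astype(float)                # treatment indicator
m = 0.5 * x + rng.standard_normal(n)                   # mediator, e.g. positive parenting
y = 0.4 * m + 0.1 * x + rng.standard_normal(n)         # outcome

ab = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect a*b = {ab:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```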

  17. Quantifying uncertainty in climate change science through empirical information theory.

    PubMed

    Majda, Andrew J; Gershgorin, Boris

    2010-08-24

    Quantifying the uncertainty for the present climate and the predictions of climate change in the suite of imperfect Atmosphere Ocean Science (AOS) computer models is a central issue in climate change science. Here, a systematic approach to these issues with firm mathematical underpinning is developed through empirical information theory. An information metric to quantify AOS model errors in the climate is proposed here which incorporates both coarse-grained mean model errors as well as covariance ratios in a transformation invariant fashion. The subtle behavior of model errors with this information metric is quantified in an instructive statistically exactly solvable test model with direct relevance to climate change science including the prototype behavior of tracer gases such as CO₂. Formulas for identifying the most sensitive climate change directions using statistics of the present climate or an AOS model approximation are developed here; these formulas just involve finding the eigenvector associated with the largest eigenvalue of a quadratic form computed through suitable unperturbed climate statistics. These climate change concepts are illustrated on a statistically exactly solvable one-dimensional stochastic model with relevance for low frequency variability of the atmosphere. Viable algorithms for implementation of these concepts are discussed throughout the paper.

  18. Statistical Analyses of Scatterplots to Identify Important Factors in Large-Scale Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kleijnen, J.P.C.; Helton, J.C.

    1999-04-01

    The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (1) linear relationships with correlation coefficients, (2) monotonic relationships with rank correlation coefficients, (3) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (4) trends in variability as defined by variances and interquartile ranges, and (5) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from analysis include: (1) Type I errors are unavoidable, (2) Type II errors can occur when inappropriate analysis procedures are used, (3) physical explanations should always be sought for why statistical procedures identify variables as being important, and (4) the identification of important variables tends to be stable for independent Latin hypercube samples.
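
    A compact sketch of this kind of scatterplot screening on synthetic data (the tests are the standard ones named in the abstract, not the authors' code): Pearson correlation for linear relationships, Spearman rank correlation for monotonic relationships, and Kruskal-Wallis across quantile bins of the input for trends in central tendency.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 300
x1 = rng.uniform(0, 1, n)     # influential input (monotonic, nonlinear effect on y)
x2 = rng.uniform(0, 1, n)     # non-influential input
y = np.exp(3 * x1) + rng.normal(scale=1.0, size=n)

for name, x in (("x1", x1), ("x2", x2)):
    r, p_r = stats.pearsonr(x, y)            # (1) linear relationship
    rho, p_rho = stats.spearmanr(x, y)       # (2) monotonic relationship
    # (3) trend in central tendency: Kruskal-Wallis across quantile bins of x
    edges = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
    groups = [y[np.digitize(x, edges) == k] for k in range(5)]
    _, p_kw = stats.kruskal(*groups)
    print(f"{name}: Pearson p = {p_r:.3g}, Spearman p = {p_rho:.3g}, Kruskal-Wallis p = {p_kw:.3g}")
```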

  19. Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

    PubMed

    Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

    2005-01-01

    Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
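
    A bare-bones version of the PACE idea as described here: compare the classifier's cross-validated error on the true labels with its error distribution under random label permutations. The classifier and synthetic data below are placeholders, not the authors' SELDI/MALDI pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def permutation_achieved_classification_error(X, y, model, n_perm=200, seed=0):
    """Compare the cross-validated error on true labels with a label-permutation null."""
    rng = np.random.default_rng(seed)
    ace = 1.0 - cross_val_score(model, X, y, cv=5).mean()       # achieved error, true labels
    null_errors = np.empty(n_perm)
    for i in range(n_perm):
        y_perm = rng.permutation(y)                             # break the label-feature link
        null_errors[i] = 1.0 - cross_val_score(model, X, y_perm, cv=5).mean()
    p_value = (np.sum(null_errors <= ace) + 1) / (n_perm + 1)   # how often chance does as well
    return ace, p_value

X, y = make_classification(n_samples=120, n_features=50, n_informative=5, random_state=0)
ace, p = permutation_achieved_classification_error(X, y, LogisticRegression(max_iter=1000))
print(f"achieved classification error = {ace:.3f}, permutation p-value = {p:.4f}")
```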

  20. Perception of Community Pharmacists towards Dispensing Errors in Community Pharmacy Setting in Gondar Town, Northwest Ethiopia

    PubMed Central

    2017-01-01

    Background Dispensing errors are inevitable occurrences in community pharmacies across the world. Objective This study aimed to identify community pharmacists' perceptions of dispensing errors in community pharmacies in Gondar town, Northwest Ethiopia. Methods A cross-sectional study was conducted among 47 community pharmacists selected through convenience sampling. Data were analyzed using SPSS version 20. Descriptive statistics, the Mann–Whitney U test, and Pearson's Chi-square test of independence were conducted, with P ≤ 0.05 considered statistically significant. Result The majority of respondents were in the 23–28-year age group (N = 26, 55.3%) and held at least a B.Pharm degree (N = 25, 53.2%). Poor prescription handwriting and similar/confusing names were perceived to be the main contributing factors, while all the strategies and types of dispensing errors were highly acknowledged by the respondents. Group differences (P < 0.05) in opinions were largely due to educational level and age. Conclusion Dispensing errors were associated with prescribing quality, the design of the dispensary, and dispensing procedures. Opinion differences related to the age and educational status of the respondents. PMID:28612023
