Sample records for random sampling error

  1. [Comparison study on sampling methods of Oncomelania hupensis snail survey in marshland schistosomiasis epidemic areas in China].

    PubMed

    An, Zhao; Wen-Xin, Zhang; Zhong, Yao; Yu-Kuan, Ma; Qing, Liu; Hou-Lang, Duan; Yi-di, Shang

    2016-06-29

    To optimize and simplify the survey method for Oncomelania hupensis snails in marshland endemic regions of schistosomiasis, and to increase the precision, efficiency and economy of the snail survey. A 50 m × 50 m quadrat experimental field was selected in Chayegang marshland near Henghu farm in the Poyang Lake region, and a whole-coverage method was adopted to survey the snails. The simple random sampling, systematic sampling and stratified random sampling methods were applied to calculate the minimum sample size, relative sampling error and absolute sampling error. The minimum sample sizes of the simple random sampling, systematic sampling and stratified random sampling methods were 300, 300 and 225, respectively. The relative sampling errors of the three methods were all less than 15%. The absolute sampling errors were 0.2217, 0.3024 and 0.0478, respectively. Spatially stratified sampling with altitude as the stratum variable is an efficient approach of lower cost and higher precision for the snail survey.
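
    The cost-precision comparison above is easy to reproduce in a small simulation. The sketch below is illustrative only: the grid, the altitude-driven density gradient, and the Poisson counts are assumptions, not the study's data. It shows why stratifying on a variable that explains the spatial structure shrinks the sampling error at a fixed sample size.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic 50x50 field: snail density increases with "altitude" (row index).
rows, cols = 50, 50
altitude = np.repeat(np.arange(rows), cols)
density = rng.poisson(lam=0.5 + 2.0 * altitude / rows, size=rows * cols)

def simple_random(n):
    idx = rng.choice(density.size, size=n, replace=False)
    return density[idx].mean()

def stratified(n, n_strata=5):
    # Equal allocation across contiguous altitude bands (equal-size strata).
    per = n // n_strata
    means = []
    for stratum in np.array_split(np.arange(density.size), n_strata):
        means.append(density[rng.choice(stratum, size=per, replace=False)].mean())
    return np.mean(means)

reps = 2000
srs = [simple_random(225) for _ in range(reps)]
strat = [stratified(225) for _ in range(reps)]
print(f"true mean     : {density.mean():.3f}")
print(f"SRS SE        : {np.std(srs):.4f}")
print(f"stratified SE : {np.std(strat):.4f}")  # smaller when strata explain variance
```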

  2. Errors in radial velocity variance from Doppler wind lidar

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, H.; Barthelmie, R. J.; Doubrawa, P.

    A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.

  3. Errors in radial velocity variance from Doppler wind lidar

    DOE PAGES

    Wang, H.; Barthelmie, R. J.; Doubrawa, P.; ...

    2016-08-29

    A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.

  4. Errors in causal inference: an organizational schema for systematic error and random error.

    PubMed

    Suzuki, Etsuji; Tsuda, Toshihide; Mitsuhashi, Toshiharu; Mansournia, Mohammad Ali; Yamamoto, Eiji

    2016-11-01

    To provide an organizational schema for systematic error and random error in estimating causal measures, aimed at clarifying the concept of errors from the perspective of causal inference. We propose to divide systematic error into structural error and analytic error. With regard to random error, our schema shows its four major sources: nondeterministic counterfactuals, sampling variability, a mechanism that generates exposure events, and measurement variability. Structural error is defined from the perspective of counterfactual reasoning and divided into nonexchangeability bias (which comprises confounding bias and selection bias) and measurement bias. Directed acyclic graphs are useful to illustrate this kind of error. Nonexchangeability bias implies a lack of "exchangeability" between the selected exposed and unexposed groups. A lack of exchangeability is not a primary concern of measurement bias, justifying its separation from confounding bias and selection bias. Many forms of analytic error result from the small-sample properties of the estimator used and vanish asymptotically. Analytic error also results from wrong (misspecified) statistical models and inappropriate statistical methods. Our organizational schema is helpful for understanding the relationship between systematic error and random error from a previously less investigated aspect, enabling us to better understand the relationship between accuracy, validity, and precision. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Health plan auditing: 100-percent-of-claims vs. random-sample audits.

    PubMed

    Sillup, George P; Klimberg, Ronald K

    2011-01-01

    The objective of this study was to examine the relative efficacy of two different methodologies for auditing self-funded medical claim expenses: 100-percent-of-claims auditing versus random-sampling auditing. Multiple data sets of claim errors or 'exceptions' from two Fortune-100 corporations were analysed and compared to 100 simulated audits of 300- and 400-claim random samples. Random-sample simulations failed to identify a significant number and amount of the errors that ranged from $200,000 to $750,000. These results suggest that health plan expenses of corporations could be significantly reduced if they audited 100% of claims and embraced a zero-defect approach.
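
    The core finding lends itself to a quick simulation. Below is a minimal sketch with an invented claim population (the error rate, dollar distribution, and population size are assumptions, not the study's data) showing how widely random-sample extrapolations scatter when errors are rare and heavy-tailed.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical claim population: 100,000 claims, 2% contain payment errors.
n_claims = 100_000
is_error = rng.random(n_claims) < 0.02
error_amount = np.where(is_error, rng.lognormal(mean=5.0, sigma=1.0), 0.0)
total_error = error_amount.sum()

def random_sample_audit(sample_size, reps=100):
    """Extrapolate total error dollars from a simple random sample of claims."""
    estimates = []
    for _ in range(reps):
        idx = rng.choice(n_claims, size=sample_size, replace=False)
        estimates.append(error_amount[idx].mean() * n_claims)
    return np.array(estimates)

for n in (300, 400):
    est = random_sample_audit(n)
    print(f"sample n={n}: estimates span "
          f"{est.min():,.0f} .. {est.max():,.0f} (true {total_error:,.0f})")
# A 100%-of-claims audit recovers exactly `total_error`; the random-sample
# estimates scatter widely because errors are rare and heavy-tailed.
```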

  6. [Exploration of the concept of genetic drift in genetics teaching of undergraduates].

    PubMed

    Wang, Chun-ming

    2016-01-01

    Genetic drift is one of the difficult concepts in genetics teaching, because its randomness and probabilistic nature easily lead to misunderstanding. The "sampling error" in its definition is often misread as a disturbance introduced by the research method of "sampling", rather than as the random change in allele frequency that it actually denotes. I analyzed and compared the definitions of genetic drift in domestic and international genetics textbooks, and found that definitions containing "sampling error" are widely adopted but correctly interpreted in only a few textbooks. Here, the history of research on genetic drift, i.e., the contributions of Wright, Fisher and Kimura, is introduced. Moreover, I describe two representative articles recently published on teaching genetic drift to undergraduates, which point out that misconceptions are inevitable for undergraduates during the learning process and which provide a preliminary solution. Combined with my own teaching practice, I suggest that the definition of genetic drift containing "sampling error" can be adopted with further interpretation: the "sampling error" is the random sampling among the gametes that form the next generation's alleles, equivalent to a random draw from the pool of all gametes participating in mating, and it has no relationship with the artificial sampling of general genetics studies. This article may provide some help in genetics teaching.
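
    The gamete-sampling interpretation suggested above is exactly a Wright-Fisher binomial draw, which can be shown in a few lines. The population sizes and starting allele frequency below are arbitrary teaching values.

```python
import numpy as np

rng = np.random.default_rng(1)

def drift(p0=0.5, n_individuals=50, generations=100, replicates=5):
    """Wright-Fisher drift: each generation's allele count is a binomial
    draw of 2N gametes from the current gamete pool -- the 'sampling error'
    of the definition, with no artificial sampling involved."""
    p = np.full(replicates, p0)
    for _ in range(generations):
        p = rng.binomial(2 * n_individuals, p) / (2 * n_individuals)
    return p

print("final allele frequencies, N=50 :", drift(n_individuals=50))
print("final allele frequencies, N=500:", drift(n_individuals=500))
# Small populations drift far from p0 (replicates often fix at 0 or 1);
# large populations stay close to p0.
```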

  7. A Practical Methodology for Quantifying Random and Systematic Components of Unexplained Variance in a Wind Tunnel

    NASA Technical Reports Server (NTRS)

    Deloach, Richard; Obara, Clifford J.; Goodman, Wesley L.

    2012-01-01

    This paper documents a check standard wind tunnel test conducted in the Langley 0.3-Meter Transonic Cryogenic Tunnel (0.3M TCT) that was designed and analyzed using the Modern Design of Experiments (MDOE). The test was designed to partition the unexplained variance of typical wind tunnel data samples into two constituent components, one attributable to ordinary random error, and one attributable to systematic error induced by covariate effects. Covariate effects in wind tunnel testing are discussed, with examples. The impact of systematic (non-random) unexplained variance on the statistical independence of sequential measurements is reviewed. The corresponding correlation among experimental errors is discussed, as is the impact of such correlation on experimental results generally. The specific experiment documented herein was organized as a formal test for the presence of unexplained variance in representative samples of wind tunnel data, in order to quantify the frequency with which such systematic error was detected, and its magnitude relative to ordinary random error. Levels of systematic and random error reported here are representative of those quantified in other facilities, as cited in the references.

  8. Quantifying errors without random sampling.

    PubMed

    Phillips, Carl V; LaPole, Luwanna M

    2003-06-12

    All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically overstate their precision. We argue that the precision implicit in typical reporting is problematic and sketch methods for quantifying the various sources of error, building up from simple examples that can be solved analytically to more complex cases. There are straightforward ways to partially quantify the uncertainty surrounding a parameter that is not characterized by random sampling, such as limiting reported significant figures. We present simple methods for doing such quantifications, and for incorporating them into calculations. More complicated methods become necessary when multiple sources of uncertainty must be combined. We demonstrate that Monte Carlo simulation, using available software, can estimate the uncertainty resulting from complicated calculations with many sources of uncertainty. We apply the method to the current estimate of the annual incidence of foodborne illness in the United States. Quantifying uncertainty from systematic errors is practical. Reporting this uncertainty would more honestly represent study results, help show the probability that estimated values fall within some critical range, and facilitate better targeting of further research.
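
    A minimal Monte Carlo propagation in the spirit of the method described; the input distributions are hypothetical placeholders (the surveillance count, underdiagnosis factor, and attributable fraction are invented, not the paper's foodborne-illness inputs).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical incidence model: observed case counts scaled by an uncertain
# underdiagnosis factor and an uncertain attributable fraction.
reported_cases = rng.normal(38_000, 2_000, n)    # surveillance count
underdiagnosis = rng.triangular(5, 20, 40, n)    # true cases per reported case
attributable   = rng.beta(8, 12, n)              # fraction attributable

incidence = reported_cases * underdiagnosis * attributable

lo, med, hi = np.percentile(incidence, [2.5, 50, 97.5])
print(f"median {med:,.0f}, 95% uncertainty interval {lo:,.0f} - {hi:,.0f}")
# Reporting the interval, rather than a single figure, avoids overstating
# precision when no random sampling is involved.
```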

  9. Estimation of population mean in the presence of measurement error and non response under stratified random sampling

    PubMed Central

    Shabbir, Javid

    2018-01-01

    In the present paper we propose an improved class of estimators in the presence of measurement error and non-response under stratified random sampling for estimating the finite population mean. The theoretical and numerical studies reveal that the proposed class of estimators performs better than other existing estimators. PMID:29401519

  10. Simulation of the Effects of Random Measurement Errors

    ERIC Educational Resources Information Center

    Kinsella, I. A.; Hannaidh, P. B. O.

    1978-01-01

    Describes a simulation method for studying random measurement errors that requires only calculators and tables of random digits. Each student simulates the random behaviour of the component variables in the function, and by combining the results of all students, the outline of the sampling distribution of the function can be obtained. (GA)

  11. Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation

    NASA Astrophysics Data System (ADS)

    Räisänen, Petri; Barker, W. Howard

    2004-07-01

    The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeatedly sampling, and averaging, those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of about 3.

  12. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST)

    PubMed Central

    Xu, Chonggang; Gertner, George

    2013-01-01

    Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, the FAST analysis is mainly confined to the estimation of partial variances contributed by the main effects of model parameters, but does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements. PMID:24143037

  13. Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST).

    PubMed

    Xu, Chonggang; Gertner, George

    2011-01-01

    Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, the FAST analysis is mainly confined to the estimation of partial variances contributed by the main effects of model parameters, but does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements.

  14. Precipitation and Latent Heating Distributions from Satellite Passive Microwave Radiometry. Part 1; Improved Method and Uncertainties

    NASA Technical Reports Server (NTRS)

    Olson, William S.; Kummerow, Christian D.; Yang, Song; Petty, Grant W.; Tao, Wei-Kuo; Bell, Thomas L.; Braun, Scott A.; Wang, Yansen; Lang, Stephen E.; Johnson, Daniel E.

    2006-01-01

    A revised Bayesian algorithm for estimating surface rain rate, convective rain proportion, and latent heating profiles from satellite-borne passive microwave radiometer observations over ocean backgrounds is described. The algorithm searches a large database of cloud-radiative model simulations to find cloud profiles that are radiatively consistent with a given set of microwave radiance measurements. The properties of these radiatively consistent profiles are then composited to obtain best estimates of the observed properties. The revised algorithm is supported by an expanded and more physically consistent database of cloud-radiative model simulations. The algorithm also features a better quantification of the convective and nonconvective contributions to total rainfall, a new geographic database, and an improved representation of background radiances in rain-free regions. Bias and random error estimates are derived from applications of the algorithm to synthetic radiance data, based upon a subset of cloud-resolving model simulations, and from the Bayesian formulation itself. Synthetic rain-rate and latent heating estimates exhibit a trend of high (low) bias for low (high) retrieved values. The Bayesian estimates of random error are propagated to represent errors at coarser time and space resolutions, based upon applications of the algorithm to TRMM Microwave Imager (TMI) data. Errors in TMI instantaneous rain-rate estimates at 0.5° resolution range from approximately 50% at 1 mm/h to 20% at 14 mm/h. Errors in collocated spaceborne radar rain-rate estimates are roughly 50%-80% of the TMI errors at this resolution. The estimated algorithm random error in TMI rain rates at monthly, 2.5° resolution is relatively small (less than 6% at 5 mm day⁻¹) in comparison with the random error resulting from infrequent satellite temporal sampling (8%-35% at the same rain rate). Percentage errors resulting from sampling decrease with increasing rain rate, and sampling errors in latent heating rates follow the same trend. Averaging over 3 months reduces sampling errors in rain rates to 6%-15% at 5 mm day⁻¹, with proportionate reductions in latent heating sampling errors.

  15. Evaluation of Bayesian Sequential Proportion Estimation Using Analyst Labels

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Abotteen, K. M. (Principal Investigator)

    1980-01-01

    The author has identified the following significant results. A total of ten Large Area Crop Inventory Experiment Phase 3 blind sites and analyst-interpreter labels were used in a study to compare proportional estimates obtained by the Bayes sequential procedure with estimates obtained from simple random sampling and from Procedure 1. The analyst error rate using the Bayes technique was shown to be no greater than that for the simple random sampling. Also, the segment proportion estimates produced using this technique had smaller bias and mean squared errors than the estimates produced using either simple random sampling or Procedure 1.

  16. Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

    PubMed

    Sztepanacz, Jacqueline L; Blows, Mark W

    2017-07-01

    The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices has been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not to be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error as genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine if the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.
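
    The overdispersion that motivates the paper is easy to demonstrate: with an identity true covariance, sample eigenvalues still spread around 1. The sketch below uses plain Gaussian data (the dimensions are arbitrary) and prints the Marchenko-Pastur bulk edges, around whose upper edge the Tracy-Widom law describes the largest eigenvalue's fluctuations.

```python
import numpy as np

rng = np.random.default_rng(3)

n, p = 100, 20                          # observations x traits
X = rng.standard_normal((n, p))         # true covariance = identity
S = np.cov(X, rowvar=False)
eig = np.sort(np.linalg.eigvalsh(S))[::-1]

# All true eigenvalues are 1; sample eigenvalues are overdispersed.
print("largest :", eig[0].round(2), " smallest:", eig[-1].round(2))

# Marchenko-Pastur bulk edges for gamma = p/n give the expected spread.
gamma = p / n
print("MP edges:", ((1 - gamma**0.5) ** 2, (1 + gamma**0.5) ** 2))
```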

  17. The decline and fall of Type II error rates

    Treesearch

    Steve Verrill; Mark Durst

    2005-01-01

    For general linear models with normally distributed random errors, the probability of a Type II error decreases exponentially as a function of sample size. This potentially rapid decline reemphasizes the importance of performing power calculations.
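
    For the simplest special case, a one-sided z-test with known variance, the exponential decline is visible directly from the Gaussian tail. A sketch (the effect size and alpha are arbitrary, and this is a z-test rather than the paper's general linear model):

```python
from scipy.stats import norm

# One-sided z-test, known sigma: beta(n) = Phi(z_alpha - delta*sqrt(n)/sigma),
# which decays exponentially in n through the Gaussian tail.
alpha, delta, sigma = 0.05, 0.5, 1.0
z_alpha = norm.ppf(1 - alpha)
for n in (5, 10, 20, 40, 80):
    beta = norm.cdf(z_alpha - delta * n**0.5 / sigma)
    print(f"n={n:3d}  Type II error = {beta:.2e}")
```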

  18. The Expected Sample Variance of Uncorrelated Random Variables with a Common Mean and Some Applications in Unbalanced Random Effects Models

    ERIC Educational Resources Information Center

    Vardeman, Stephen B.; Wendelberger, Joanne R.

    2005-01-01

    There is a little-known but very simple generalization of the standard result that for uncorrelated random variables with common mean μ and variance σ², the expected value of the sample variance is σ². The generalization justifies the use of the usual standard error of the sample mean in possibly…
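
    For reference, the standard result the abstract starts from can be verified in a couple of lines; a sketch for uncorrelated X1, ..., Xn with common mean μ and variance σ²:

```latex
E\left[S^2\right]
  = \frac{1}{n-1}\,E\!\left[\sum_{i=1}^n (X_i-\bar X)^2\right]
  = \frac{1}{n-1}\left(\sum_{i=1}^n E\!\left[X_i^2\right] - n\,E\!\left[\bar X^2\right]\right)
  = \frac{n(\sigma^2+\mu^2) - n\left(\sigma^2/n+\mu^2\right)}{n-1}
  = \sigma^2
```

    Uncorrelatedness enters only through E[X̄²] = σ²/n + μ²; the generalization referenced in the abstract relaxes the equal-variance assumption, in which case E[S²] becomes the average of the individual variances.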

  19. Maximum type I error rate inflation from sample size reassessment when investigators are blind to treatment labels.

    PubMed

    Żebrowska, Magdalena; Posch, Martin; Magirr, Dominic

    2016-05-30

    Consider a parallel group trial for the comparison of an experimental treatment to a control, where the second-stage sample size may depend on the blinded primary endpoint data as well as on additional blinded data from a secondary endpoint. For the setting of normally distributed endpoints, we demonstrate that this may lead to an inflation of the type I error rate if the null hypothesis holds for the primary but not the secondary endpoint. We derive upper bounds for the inflation of the type I error rate, both for trials that employ random allocation and for those that use block randomization. We illustrate the worst-case sample size reassessment rule in a case study. For both randomization strategies, the maximum type I error rate increases with the effect size in the secondary endpoint and the correlation between endpoints. The maximum inflation increases with smaller block sizes if information on the block size is used in the reassessment rule. Based on our findings, we do not question the well-established use of blinded sample size reassessment methods with nuisance parameter estimates computed from the blinded interim data of the primary endpoint. However, we demonstrate that the type I error rate control of these methods relies on the application of specific, binding, pre-planned and fully algorithmic sample size reassessment rules and does not extend to general or unplanned sample size adjustments based on blinded data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  20. Linear discriminant analysis with misallocation in training samples

    NASA Technical Reports Server (NTRS)

    Chhikara, R. (Principal Investigator); Mckeon, J.

    1982-01-01

    Linear discriminant analysis for a two-class case is studied in the presence of misallocation in training samples. A general approach to the modeling of misallocation is formulated, and the mean vectors and covariance matrices of the mixture distributions are derived. The asymptotic distribution of the discriminant boundary is obtained and the asymptotic first two moments of the two types of error rate are given. Certain numerical results for the error rates are presented by considering the random and two non-random misallocation models. It is shown that when the allocation procedure for training samples is objectively formulated, the effect of misallocation on the error rates of the Bayes linear discriminant rule can almost be eliminated. If, however, this is not possible, the use of the Fisher rule may be preferred over the Bayes rule.
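
    The claim that random (non-differential) misallocation barely moves the discriminant boundary can be checked with a toy experiment; a sketch using scikit-learn, where the Gaussian classes, flip rates, and sample sizes are all assumptions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(5)

def make_data(n):
    y = rng.integers(0, 2, n)
    X = rng.standard_normal((n, 2)) + 1.5 * y[:, None]   # two shifted Gaussians
    return X, y

X_tr, y_tr = make_data(2000)
X_te, y_te = make_data(10000)

for flip in (0.0, 0.1, 0.3):
    noisy = y_tr.copy()
    mask = rng.random(len(noisy)) < flip    # random (non-differential) misallocation
    noisy[mask] = 1 - noisy[mask]
    err = 1 - LinearDiscriminantAnalysis().fit(X_tr, noisy).score(X_te, y_te)
    print(f"flip rate {flip:.1f}: test error {err:.3f}")
# With symmetric random misallocation the class-conditional mixtures shift
# together, so the fitted boundary, and hence the error rate, barely moves.
```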

  1. A method to estimate the effect of deformable image registration uncertainties on daily dose mapping

    PubMed Central

    Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin

    2012-01-01

    Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766
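
    A toy version of the sampling procedure: build PCA modes from a set of spatially correlated "error maps", draw the decorrelated mode coefficients independently, and reconstruct synthetic maps with matching statistics. Everything below (map size, smooth basis, noise level) is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(11)

# Toy "observed DVF error maps": m maps of d voxels with spatial correlation.
m, d = 60, 400
t = np.linspace(0, 1, d)
basis = np.stack([np.sin(2 * np.pi * k * t) for k in (1, 2, 3)])   # smooth modes
weights = np.array([2.0, 1.0, 0.5])[:, None]
maps = rng.standard_normal((m, 3)) @ (basis * weights) \
       + 0.05 * rng.standard_normal((m, d))

# PCA via SVD of the mean-centred maps.
mu = maps.mean(axis=0)
U, s, Vt = np.linalg.svd(maps - mu, full_matrices=False)
var = s**2 / (m - 1)                      # mode variances (eigenvalues)

def synthetic(n):
    """Sample mode coefficients independently, then reconstruct error maps."""
    coeff = rng.standard_normal((n, len(var))) * np.sqrt(var)
    return mu + coeff @ Vt

synth = synthetic(1000)
print("observed  voxelwise std (mean):", (maps - mu).std(axis=0).mean().round(3))
print("synthetic voxelwise std (mean):", (synth - mu).std(axis=0).mean().round(3))
```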

  2. The (mis)reporting of statistical results in psychology journals.

    PubMed

    Bakker, Marjan; Wicherts, Jelte M

    2011-09-01

    In order to study the prevalence, nature (direction), and causes of reporting errors in psychology, we checked the consistency of reported test statistics, degrees of freedom, and p values in a random sample of high- and low-impact psychology journals. In a second study, we established the generality of reporting errors in a random sample of recent psychological articles. Our results, on the basis of 281 articles, indicate that around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers' expectations. We classified the most common errors and contacted authors to shed light on the origins of the errors.
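
    The consistency check at the heart of the study can be automated in a few lines: recompute each p value from the reported statistic and degrees of freedom. The entries below are fabricated examples, not data from the audited articles.

```python
from scipy import stats

# (test, statistic, df, reported p) -- made-up examples
reported = [
    ("t", 2.10, 28, 0.045),
    ("t", 1.90, 28, 0.04),        # inconsistent: recomputed p ~ 0.068
    ("F", 4.52, (1, 38), 0.040),
]

for test, stat, df, p_rep in reported:
    if test == "t":
        p = 2 * stats.t.sf(abs(stat), df)   # two-tailed
    else:
        p = stats.f.sf(stat, *df)
    flag = "consistent" if abs(p - p_rep) < 0.005 else "INCONSISTENT"
    print(f"{test}({df}) = {stat}: reported p={p_rep}, recomputed p={p:.3f}  {flag}")
```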

  3. Eddy-covariance data with low signal-to-noise ratio: time-lag determination, uncertainties and limit of detection

    NASA Astrophysics Data System (ADS)

    Langford, B.; Acton, W.; Ammann, C.; Valach, A.; Nemitz, E.

    2015-10-01

    All eddy-covariance flux measurements are associated with random uncertainties which are a combination of sampling error due to natural variability in turbulence and sensor noise. The former is the principal error for systems where the signal-to-noise ratio of the analyser is high, as is usually the case when measuring fluxes of heat, CO2 or H2O. Where signal is limited, which is often the case for measurements of other trace gases and aerosols, instrument uncertainties dominate. Here, we are applying a consistent approach based on auto- and cross-covariance functions to quantify the total random flux error and the random error due to instrument noise separately. As with previous approaches, the random error quantification assumes that the time lag between wind and concentration measurement is known. However, if combined with commonly used automated methods that identify the individual time lag by looking for the maximum in the cross-covariance function of the two entities, analyser noise additionally leads to a systematic bias in the fluxes. Combining data sets from several analysers and using simulations, we show that the method of time-lag determination becomes increasingly important as the magnitude of the instrument error approaches that of the sampling error. The flux bias can be particularly significant for disjunct data, whereas using a prescribed time lag eliminates these effects (provided the time lag does not fluctuate unduly over time). We also demonstrate that when sampling at higher elevations, where low frequency turbulence dominates and covariance peaks are broader, both the probability and magnitude of bias are magnified. We show that the statistical significance of noisy flux data can be increased (limit of detection can be decreased) by appropriate averaging of individual fluxes, but only if systematic biases are avoided by using a prescribed time lag. Finally, we make recommendations for the analysis and reporting of data with low signal-to-noise and their associated errors.
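
    The bias mechanism is easy to reproduce: searching for the maximum of a noisy cross-covariance function selects positive noise excursions, while evaluating at a prescribed lag does not. A sketch with a synthetic smoothed series standing in for the vertical wind; all parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)

def flux_estimates(noise_sd, true_lag=5, n=20000, window=50, reps=200):
    """Cross-covariance at the prescribed (true) lag vs. at the empirical
    maximum of the cross-covariance function, for a noisy scalar series."""
    prescribed, maximum = [], []
    for _ in range(reps):
        b = np.convolve(rng.standard_normal(n + window), np.ones(20) / 20, "same")
        w = b[true_lag:true_lag + n - window]                 # "vertical wind"
        c = b[:n - window] + noise_sd * rng.standard_normal(n - window)  # lagged scalar
        wm, cm = w - w.mean(), c - c.mean()
        xcov = [np.mean(wm[:len(wm) - L] * cm[L:]) for L in range(window + 1)]
        prescribed.append(xcov[true_lag])   # covariance at the known time lag
        maximum.append(max(xcov))           # covariance at the empirical maximum
    return np.mean(prescribed), np.mean(maximum)

for sd in (0.1, 1.0, 3.0):
    p, m = flux_estimates(sd)
    print(f"noise sd {sd}: prescribed lag {p:.4f} | max search {m:.4f} (biased high)")
```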

  4. Eddy-covariance data with low signal-to-noise ratio: time-lag determination, uncertainties and limit of detection

    NASA Astrophysics Data System (ADS)

    Langford, B.; Acton, W.; Ammann, C.; Valach, A.; Nemitz, E.

    2015-03-01

    All eddy-covariance flux measurements are associated with random uncertainties which are a combination of sampling error due to natural variability in turbulence and sensor noise. The former is the principal error for systems where the signal-to-noise ratio of the analyser is high, as is usually the case when measuring fluxes of heat, CO2 or H2O. Where signal is limited, which is often the case for measurements of other trace gases and aerosols, instrument uncertainties dominate. Here we apply a consistent approach based on auto- and cross-covariance functions to quantify the total random flux error and the random error due to instrument noise separately. As with previous approaches, the random error quantification assumes that the time-lag between wind and concentration measurement is known. However, if combined with commonly used automated methods that identify the individual time-lag by looking for the maximum in the cross-covariance function of the two entities, analyser noise additionally leads to a systematic bias in the fluxes. Combining datasets from several analysers and using simulations we show that the method of time-lag determination becomes increasingly important as the magnitude of the instrument error approaches that of the sampling error. The flux bias can be particularly significant for disjunct data, whereas using a prescribed time-lag eliminates these effects (provided the time-lag does not fluctuate unduly over time). We also demonstrate that when sampling at higher elevations, where low frequency turbulence dominates and covariance peaks are broader, both the probability and magnitude of bias are magnified. We show that the statistical significance of noisy flux data can be increased (limit of detection can be decreased) by appropriate averaging of individual fluxes, but only if systematic biases are avoided by using a prescribed time-lag. Finally, we make recommendations for the analysis and reporting of data with low signal-to-noise and their associated errors.

  5. Some practical problems in implementing randomization.

    PubMed

    Downs, Matt; Tucker, Kathryn; Christ-Schmidt, Heidi; Wittes, Janet

    2010-06-01

    While often theoretically simple, implementing randomization to treatment in a masked, but confirmable, fashion can prove difficult in practice. At least three categories of problems occur in randomization: (1) bad judgment in the choice of method, (2) design and programming errors in implementing the method, and (3) human error during the conduct of the trial. This article focuses on these latter two types of errors, dealing operationally with what can go wrong after trial designers have selected the allocation method. We offer several case studies and corresponding recommendations for lessening the frequency of problems in allocating treatment or for mitigating the consequences of errors. Recommendations include: (1) reviewing the randomization schedule before starting a trial, (2) being especially cautious of systems that use on-demand random number generators, (3) drafting unambiguous randomization specifications, (4) performing thorough testing before entering a randomization system into production, (5) maintaining a dataset that captures the values investigators used to randomize participants, thereby allowing the process of treatment allocation to be reproduced and verified, (6) resisting the urge to correct errors that occur in individual treatment assignments, (7) preventing inadvertent unmasking to treatment assignments in kit allocations, and (8) checking a sample of study drug kits to allow detection of errors in drug packaging and labeling. Although we performed a literature search of documented randomization errors, the examples that we provide and the resultant recommendations are based largely on our own experience in industry-sponsored clinical trials. We do not know how representative our experience is or how common errors of the type we have seen occur. Our experience underscores the importance of verifying the integrity of the treatment allocation process before and during a trial. Clinical Trials 2010; 7: 235-245. http://ctj.sagepub.com.
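
    Several of the recommendations (reviewing the schedule before the trial, reproducibility, caution with on-demand generators) come down to generating the allocation list from a fixed seed so it can be regenerated and verified. A minimal permuted-block sketch, with arm names, block size, and seed invented:

```python
import random

def block_schedule(n_blocks, block_size=4, arms=("A", "B"), seed=20100601):
    """Permuted-block schedule; a fixed seed makes the list reproducible, so
    it can be reviewed before the trial and re-verified afterwards."""
    rng = random.Random(seed)
    schedule = []
    per_arm = block_size // len(arms)
    for _ in range(n_blocks):
        block = list(arms) * per_arm
        rng.shuffle(block)
        schedule.extend(block)
    return schedule

s1 = block_schedule(25)
s2 = block_schedule(25)              # regeneration with the same seed
assert s1 == s2, "schedule must be reproducible for verification"
print("first two blocks:", s1[:8])
print("balance:", {arm: s1.count(arm) for arm in ("A", "B")})
```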

  6. Iterative random vs. Kennard-Stone sampling for IR spectrum-based classification task using PLS2-DA

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2018-04-01

    External testing (ET) is preferred over auto-prediction (AP) or k-fold cross-validation for estimating the realistic predictive ability of a statistical model. With IR spectra, the Kennard-Stone (KS) sampling algorithm is often used to split the data into training and test sets, i.e. respectively for model construction and for model testing. On the other hand, iterative random sampling (IRS) has not been the favored choice, though it is theoretically more likely to produce reliable estimation. The aim of this preliminary work is to compare the performances of KS and IRS in sampling a representative training set from an attenuated total reflectance - Fourier transform infrared spectral dataset (of four varieties of blue gel pen inks) for PLS2-DA modeling. The 'best' performance achievable from the dataset is estimated with AP on the full dataset (APF,error). Both IRS (n = 200) and KS were used to split the dataset in the ratio of 7:3. The classic decision rule (i.e. maximum value-based) is employed for new sample prediction via partial least squares - discriminant analysis (PLS2-DA). The error rate of each model was estimated repeatedly via: (a) AP on the full data (APF,error); (b) AP on the training set (APS,error); and (c) ET on the respective test set (ETS,error). A good PLS2-DA model is expected to produce APS,error and ETS,error similar to APF,error. Bearing that in mind, the similarities between (a) APS,error vs. APF,error; (b) ETS,error vs. APF,error; and (c) APS,error vs. ETS,error were evaluated using correlation tests (i.e. Pearson and Spearman's rank tests) on series of PLS2-DA models computed from the KS-set and IRS-set, respectively. Overall, models constructed from the IRS-set exhibit more similarity between the internal and external error rates than those from the respective KS-set, i.e. less risk of overfitting. In conclusion, IRS is more reliable than KS in sampling a representative training set.
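
    For concreteness, a minimal Kennard-Stone selector next to the IRS alternative; the data are random stand-ins for spectral features, and the 7:3 split mirrors the study.

```python
import numpy as np
from scipy.spatial.distance import cdist

def kennard_stone(X, n_train):
    """Classic Kennard-Stone selection: start from the two most distant
    samples, then repeatedly add the sample whose minimal distance to the
    already selected set is largest (max-min criterion)."""
    D = cdist(X, X)
    selected = list(np.unravel_index(D.argmax(), D.shape))
    while len(selected) < n_train:
        remaining = [i for i in range(len(X)) if i not in selected]
        dmin = D[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining[int(dmin.argmax())])
    return np.array(selected)

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 10))          # stand-in for IR spectral features
train_idx = kennard_stone(X, n_train=70)    # 7:3 split as in the study
test_idx = np.setdiff1d(np.arange(100), train_idx)
print(len(train_idx), "training /", len(test_idx), "test samples")
# An IRS split would instead draw the 70 training rows at random, repeated
# n = 200 times, and average the resulting error estimates.
```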

  7. Reference-free error estimation for multiple measurement methods.

    PubMed

    Madan, Hennadii; Pernuš, Franjo; Špiclin, Žiga

    2018-01-01

    We present a computational framework to select the most accurate and precise method of measurement of a certain quantity, when there is no access to the true value of the measurand. A typical use case is when several image analysis methods are applied to measure the value of a particular quantitative imaging biomarker from the same images. The accuracy of each measurement method is characterized by systematic error (bias), which is modeled as a polynomial in true values of the measurand, and the precision as random error modeled with a Gaussian random variable. In contrast to previous works, the random errors are modeled jointly across all methods, thereby enabling the framework to analyze measurement methods based on similar principles, which may have correlated random errors. Furthermore, the posterior distribution of the error model parameters is estimated from samples obtained by Markov chain Monte Carlo and analyzed to estimate the parameter values and the unknown true values of the measurand. The framework was validated on six synthetic and one clinical dataset containing measurements of total lesion load, a biomarker of neurodegenerative diseases, which was obtained with four automatic methods by analyzing brain magnetic resonance images. The estimates of bias and random error were in good agreement with the corresponding least squares regression estimates against a reference.

  8. Interval sampling methods and measurement error: a computer simulation.

    PubMed

    Wirth, Oliver; Slaven, James; Taylor, Matthew A

    2014-01-01

    A simulation study was conducted to provide a more thorough account of measurement error associated with interval sampling methods. A computer program simulated the application of momentary time sampling, partial-interval recording, and whole-interval recording methods on target events randomly distributed across an observation period. The simulation yielded measures of error for multiple combinations of observation period, interval duration, event duration, and cumulative event duration. The simulations were conducted up to 100 times to yield measures of error variability. Although the present simulation confirmed some previously reported characteristics of interval sampling methods, it also revealed many new findings that pertain to each method's inherent strengths and weaknesses. The analysis and resulting error tables can help guide the selection of the most appropriate sampling method for observation-based behavioral assessments. © Society for the Experimental Analysis of Behavior.
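
    A stripped-down version of the simulation: random event onsets of fixed duration, scored by momentary time sampling and by partial-interval recording, with the signed error against the true fraction of time occupied. All rates and durations are invented.

```python
import numpy as np

rng = np.random.default_rng(4)

def one_session(obs_period=600.0, interval=10.0, rate=0.02, dur=5.0):
    """One observation session: Poisson event onsets with fixed duration.
    Returns (true fraction, momentary estimate, partial-interval estimate)."""
    onsets = np.sort(rng.uniform(0, obs_period, rng.poisson(rate * obs_period)))
    ends = np.minimum(onsets + dur, obs_period)
    # True occupancy = measure of the union of the event intervals.
    total, cursor = 0.0, 0.0
    for s, e in zip(onsets, ends):
        total += max(0.0, e - max(s, cursor))
        cursor = max(cursor, e)
    edges = np.arange(interval, obs_period + interval, interval)
    momentary = np.mean([np.any((onsets <= t) & (t < onsets + dur)) for t in edges])
    partial = np.mean([np.any((onsets < hi) & (onsets + dur > lo))
                       for lo, hi in zip(edges - interval, edges)])
    return total / obs_period, momentary, partial

res = np.array([one_session() for _ in range(500)])
truth, mts, pir = res.mean(axis=0)
print(f"true occupancy          : {truth:.3f}")
print(f"momentary time sampling : {mts:.3f}   (nearly unbiased)")
print(f"partial-interval        : {pir:.3f}   (systematic overestimate)")
```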

  9. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.
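
    A sketch of the simulation idea: cluster-level prevalences drawn from a Beta distribution (inducing intracluster correlation), Bernoulli children within clusters, and a threshold decision rule. The decision value and Beta shape below are hypothetical, not the study's operating characteristics.

```python
import numpy as np

rng = np.random.default_rng(9)

def classification_error(true_prev, threshold=0.10, decision=15,
                         n_clusters=67, per_cluster=3, shape_a=2.0, reps=5000):
    """Simulate a 67x3 cluster LQAS survey and classify 'high prevalence'
    when the total case count exceeds the decision value."""
    # Beta(a, b) with mean = true_prev; smaller a -> stronger clustering.
    a = shape_a
    b = a * (1 - true_prev) / true_prev
    p_clust = rng.beta(a, b, size=(reps, n_clusters))
    cases = rng.binomial(per_cluster, p_clust).sum(axis=1)   # out of 201 children
    return np.mean((cases > decision) != (true_prev >= threshold))

print("classification error when true prev =  5%:", classification_error(0.05))
print("classification error when true prev = 15%:", classification_error(0.15))
```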

  10. A New Stratified Sampling Procedure which Decreases Error Estimation of Varroa Mite Number on Sticky Boards.

    PubMed

    Kretzschmar, A; Durand, E; Maisonnasse, A; Vallon, J; Le Conte, Y

    2015-06-01

    A new procedure of stratified sampling is proposed in order to establish an accurate estimation of Varroa destructor populations on sticky bottom boards of the hive. It is based on the spatial sampling theory that recommends using regular grid stratification in the case of spatially structured process. The distribution of varroa mites on sticky board being observed as spatially structured, we designed a sampling scheme based on a regular grid with circles centered on each grid element. This new procedure is then compared with a former method using partially random sampling. Relative error improvements are exposed on the basis of a large sample of simulated sticky boards (n=20,000) which provides a complete range of spatial structures, from a random structure to a highly frame driven structure. The improvement of varroa mite number estimation is then measured by the percentage of counts with an error greater than a given level. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. Value stream mapping of the Pap test processing procedure: a lean approach to improve quality and efficiency.

    PubMed

    Michael, Claire W; Naik, Kalyani; McVicker, Michael

    2013-05-01

    We developed a value stream map (VSM) of the Papanicolaou test procedure to identify opportunities to reduce waste and errors, created a new VSM, and implemented a new process emphasizing Lean tools. Preimplementation data revealed the following: (1) processing time (PT) for 1,140 samples averaged 54 hours; (2) 27 accessioning errors were detected on review of 357 random requisitions (7.6%); (3) 5 of the 20,060 tests had labeling errors that had gone undetected in the processing stage. Four were detected later during specimen processing but 1 reached the reporting stage. Postimplementation data were as follows: (1) PT for 1,355 samples averaged 31 hours; (2) 17 accessioning errors were detected on review of 385 random requisitions (4.4%); and (3) no labeling errors were undetected. Our results demonstrate that implementation of Lean methods, such as first-in first-out processes and minimizing batch size by staff actively participating in the improvement process, allows for higher quality, greater patient safety, and improved efficiency.

  12. Quantifying Adventitious Error in a Covariance Structure as a Random Effect

    PubMed Central

    Wu, Hao; Browne, Michael W.

    2017-01-01

    We present an approach to quantifying errors in covariance structures in which adventitious error, identified as the process underlying the discrepancy between the population and the structured model, is explicitly modeled as a random effect with a distribution, and the estimated dispersion parameter of this distribution gives a measure of misspecification. Analytical properties of the resultant procedure are investigated and the measure of misspecification is found to be related to the RMSEA. An algorithm is developed for numerical implementation of the procedure. The consistency and asymptotic sampling distributions of the estimators are established under a new asymptotic paradigm and an assumption weaker than the standard Pitman drift assumption. Simulations validate the asymptotic sampling distributions and demonstrate the importance of accounting for the variations in the parameter estimates due to adventitious error. Two examples are also given as illustrations. PMID:25813463

  13. The Effect of Random Error on Diagnostic Accuracy Illustrated with the Anthropometric Diagnosis of Malnutrition

    PubMed Central

    2016-01-01

    Background: It is often thought that random measurement error has a minor effect upon the results of an epidemiological survey. Theoretically, errors of measurement should always increase the spread of a distribution. Defining an illness by having a measurement outside an established healthy range will lead to an inflated prevalence of that condition if there are measurement errors. Methods and results: A Monte Carlo simulation was conducted of anthropometric assessment of children with malnutrition. Random errors of increasing magnitude were imposed upon the populations and showed an increase in the standard deviation with each error that grew exponentially with the magnitude of the error. The potential magnitude of the resulting errors in the reported prevalence of malnutrition was compared with published international data and found to be of sufficient magnitude to make a number of surveys, and the numerous reports and analyses that used these data, unreliable. Conclusions: The effect of random error in public health surveys and in the data upon which diagnostic cut-off points are derived to define "health" has been underestimated. Even quite modest random errors can more than double the reported prevalence of conditions such as malnutrition. Increasing sample size does not address this problem, and may even result in less accurate estimates. More attention needs to be paid to the selection, calibration and maintenance of instruments, measurer selection, training and supervision, routine estimation of the likely magnitude of errors using standardization tests, use of the statistical likelihood of error to exclude data from analysis, and full reporting of these procedures in order to judge the reliability of survey reports. PMID:28030627
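
    The inflation mechanism takes only a few lines to reproduce: adding zero-mean noise to z-scores widens the distribution and inflates the tail beyond the cut-off. The error standard deviations below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 1_000_000
z_true = rng.standard_normal(n)           # weight-for-height z-scores, healthy ref
true_prev = np.mean(z_true < -2)          # malnutrition defined as z < -2

for error_sd in (0.0, 0.3, 0.5, 0.8):
    z_obs = z_true + error_sd * rng.standard_normal(n)   # random measurement error
    print(f"error sd {error_sd}: observed prevalence "
          f"{np.mean(z_obs < -2):.3%} (true {true_prev:.3%})")
# Random error widens the observed distribution, so the tail beyond the
# cut-off -- and hence the reported prevalence -- inflates; a larger sample
# only makes the inflated estimate more precise, not more accurate.
```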

  14. A quarter of a century of the DBQ: some supplementary notes on its validity with regard to accidents.

    PubMed

    de Winter, Joost C F; Dodou, Dimitra; Stanton, Neville A

    2015-01-01

    This article synthesises the latest information on the relationship between the Driver Behaviour Questionnaire (DBQ) and accidents. We show by means of computer simulation that correlations with accidents are necessarily small because accidents are rare events. An updated meta-analysis on the zero-order correlations between the DBQ and self-reported accidents yielded an overall r of .13 (fixed-effect and random-effects models) for violations (57,480 participants; 67 samples) and .09 (fixed-effect and random-effects models) for errors (66,028 participants; 56 samples). An analysis of a previously published DBQ dataset (975 participants) showed that by aggregating across four measurement occasions, the correlation coefficient with self-reported accidents increased from .14 to .24 for violations and from .11 to .19 for errors. Our meta-analysis also showed that DBQ violations (r = .24; 6353 participants; 20 samples) but not DBQ errors (r = -.08; 1086 participants; 16 samples) correlated with recorded vehicle speed. Practitioner Summary: The DBQ is probably the most widely used self-report questionnaire in driver behaviour research. This study shows that DBQ violations and errors correlate moderately with self-reported traffic accidents.

  15. Radar error statistics for the space shuttle

    NASA Technical Reports Server (NTRS)

    Lear, W. M.

    1979-01-01

    Radar error statistics for C-band and S-band radars, recommended for use with the ground-tracking programs that process space shuttle tracking data, are presented. The statistics are divided into two parts: bias error statistics, using the subscript B, and high frequency error statistics, using the subscript q. Bias errors may be slowly varying to constant. High frequency random errors (noise) are rapidly varying and may or may not be correlated from sample to sample. Bias errors were mainly due to hardware defects and to errors in correction for atmospheric refraction effects. High frequency noise was mainly due to hardware and due to atmospheric scintillation. Three types of atmospheric scintillation were identified: horizontal, vertical, and line of sight. This was the first time that horizontal and line of sight scintillations were identified.

  16. Discrepancy-based error estimates for Quasi-Monte Carlo III. Error distributions and central limits

    NASA Astrophysics Data System (ADS)

    Hoogland, Jiri; Kleiss, Ronald

    1997-04-01

    In Quasi-Monte Carlo integration, the integration error is believed to be generally smaller than in classical Monte Carlo with the same number of integration points. Using an appropriate definition of an ensemble of quasi-random point sets, we derive various results on the probability distribution of the integration error, which can be compared to the standard Central Limit Theorem for normal stochastic sampling. In many cases, a Gaussian error distribution is obtained.
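
    The ensemble view can be imitated with scrambled (randomized) quasi-random point sets, comparing the spread of integration errors against plain Monte Carlo at the same n. A sketch with an arbitrary smooth test integrand, assuming the scipy.stats.qmc module (SciPy 1.7+):

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(10)
f = lambda u: np.prod(3 * u**2, axis=1)   # integral over [0,1]^d equals 1
d, n, reps = 4, 2**10, 200

mc_err, qmc_err = [], []
for i in range(reps):
    u_mc = rng.random((n, d))                              # pseudo-random points
    u_qmc = qmc.Sobol(d, scramble=True, seed=i).random(n)  # randomized QMC points
    mc_err.append(f(u_mc).mean() - 1.0)
    qmc_err.append(f(u_qmc).mean() - 1.0)

# The spread of errors across the ensemble plays the role of the error
# distribution studied in the paper; QMC's is markedly narrower at equal n.
print(f"MC  error sd: {np.std(mc_err):.2e}")
print(f"QMC error sd: {np.std(qmc_err):.2e}")
```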

  17. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037

  18. [Errors in Peruvian medical journals references].

    PubMed

    Huamaní, Charles; Pacheco-Romero, José

    2009-01-01

    References are fundamental in our studies; an adequate selection is asimportant as an adequate description. To determine the number of errors in a sample of references found in Peruvian medical journals. We reviewed 515 scientific papers references selected by systematic randomized sampling and corroborated reference information with the original document or its citation in Pubmed, LILACS or SciELO-Peru. We found errors in 47,6% (245) of the references, identifying 372 types of errors; the most frequent were errors in presentation style (120), authorship (100) and title (100), mainly due to spelling mistakes (91). References error percentage was high, varied and multiple. We suggest systematic revision of references in the editorial process as well as to extend the discussion on this theme. references, periodicals, research, bibliometrics.

  19. Random errors of oceanic monthly rainfall derived from SSM/I using probability distribution functions

    NASA Technical Reports Server (NTRS)

    Chang, Alfred T. C.; Chiu, Long S.; Wilheit, Thomas T.

    1993-01-01

    Global averages and random errors associated with the monthly oceanic rain rates derived from the Special Sensor Microwave/Imager (SSM/I) data using the technique developed by Wilheit et al. (1991) are computed. Accounting for the beam-filling bias, a global annual average rain rate of 1.26 m is computed. The error estimation scheme is based on the existence of independent (morning and afternoon) estimates of the monthly mean. Calculations show overall random errors of about 50-60 percent for each 5 deg x 5 deg box. The results are insensitive to different sampling strategy (odd and even days of the month). Comparison of the SSM/I estimates with raingage data collected at the Pacific atoll stations showed a low bias of about 8 percent, a correlation of 0.7, and an rms difference of 55 percent.

  20. Quantification of errors in ordinal outcome scales using shannon entropy: effect on sample size calculations.

    PubMed

    Mandava, Pitchaiah; Krumpelman, Chase S; Shah, Jharna N; White, Donna L; Kent, Thomas A

    2013-01-01

    Clinical trial outcomes often involve an ordinal scale of subjective functional assessments, but the optimal way to quantify results is not clear. In stroke, for the most commonly used scale, the modified Rankin Score (mRS), analysis over a range of scores ("shift") is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon's model, we quantified errors of the "shift" compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. We identified 35 randomized stroke trials that met inclusion criteria. Each trial's mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for "shift" and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by "shift" mRS while the larger follow-up SAINT II trial was negative, we recalculated the sample size required if classification uncertainty was taken into account. Considering the full mRS range, the error rate was 26.1%±5.31 (mean±SD). Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1; 6.8%±2.89; overall p<0.001). Taking errors into account, SAINT I would have required 24% more subjects than were randomized. We show that when uncertainty in assessments is considered, the lowest error rates are with dichotomization. While using the full range of mRS is conceptually appealing, a gain of information is counter-balanced by a decrease in reliability. The resultant errors need to be considered since sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment. We provide the user with programs to calculate and incorporate errors into sample size estimation.
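
    The convolution idea, multiplying an outcome distribution by a rater-noise model and reading off misclassification rates for "shift" versus dichotomized analyses, can be sketched directly. The mRS distribution and the +/-1 confusion model below are illustrative, not SAINT data or the published inter-rater tables.

```python
import numpy as np

# Illustrative mRS distribution (scores 0-6) and a symmetric inter-rater
# model: the assessed score is off by one step with probability `noise`.
p_true = np.array([0.10, 0.15, 0.15, 0.20, 0.20, 0.10, 0.10])
noise = 0.20

K = len(p_true)
conf = np.zeros((K, K))
for s in range(K):
    neighbors = [x for x in (s - 1, s + 1) if 0 <= x < K]
    conf[s, s] = 1 - noise
    for nb in neighbors:
        conf[s, nb] += noise / len(neighbors)

joint = p_true[:, None] * conf                 # P(true = s, rated = r)

shift_error = 1 - np.trace(joint)              # rated score differs from true
for cut in (1, 2, 3):                          # dichotomize: good outcome <= cut
    dich_error = joint[np.ix_(range(0, cut + 1), range(cut + 1, K))].sum() \
               + joint[np.ix_(range(cut + 1, K), range(0, cut + 1))].sum()
    print(f"cut at mRS <= {cut}: misclassification {dich_error:.1%}")
print(f"full-scale ('shift') misclassification: {shift_error:.1%}")
# Dichotomization only misclassifies subjects whose error crosses the single
# cut-point, so its error rate falls far below the full-scale rate.
```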

  1. On the predictivity of pore-scale simulations: Estimating uncertainties with multilevel Monte Carlo

    NASA Astrophysics Data System (ADS)

    Icardi, Matteo; Boccardo, Gianluca; Tempone, Raúl

    2016-09-01

    A fast method with tunable accuracy is proposed to estimate errors and uncertainties in pore-scale and Digital Rock Physics (DRP) problems. The overall predictivity of these studies can be, in fact, hindered by many factors, including sample heterogeneity, computational and imaging limitations, model inadequacy and imperfectly known physical parameters. The typical objective of pore-scale studies is the estimation of macroscopic effective parameters such as permeability, effective diffusivity and hydrodynamic dispersion. However, these are often non-deterministic quantities (i.e., results obtained for a specific pore-scale sample and setup are not totally reproducible by another "equivalent" sample and setup). The stochastic nature can arise from multi-scale heterogeneity, the computational and experimental limitations in considering large samples, and the complexity of the physical models. These approximations, in fact, introduce an error that, being dependent on a large number of complex factors, can be modeled as random. We propose a general simulation tool, based on multilevel Monte Carlo, that can drastically reduce the computational cost needed for computing accurate statistics of effective parameters and other quantities of interest under any of these random errors. This is, to our knowledge, the first attempt to include Uncertainty Quantification (UQ) in pore-scale physics and simulation. The method also provides estimates of the discretization error, and it is tested on three-dimensional transport problems in heterogeneous materials, where the sampling procedure is done by generation algorithms able to reproduce realistic consolidated and unconsolidated random sphere and ellipsoid packings and arrangements. A fully automatic workflow is developed in an open-source code [1] that includes rigid-body physics and random packing algorithms, unstructured mesh discretization, finite volume solvers, extrapolation and post-processing techniques. The proposed method can be efficiently used in many porous media applications for problems such as stochastic homogenization/upscaling, propagation of uncertainty from microscopic fluid and rock properties to macro-scale parameters, and robust estimation of Representative Elementary Volume size for arbitrary physics.
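
    The multilevel Monte Carlo estimator the abstract builds on telescopes the expectation across discretization levels, E[Q_L] = E[Q_0] + sum_l E[Q_l - Q_{l-1}], so that many cheap coarse samples carry most of the variance and only a few expensive fine solves are needed. A schematic sketch with a toy stand-in for the pore-scale solver (not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(2)

def solve(level, sample):
    """Stand-in for an expensive pore-scale solve at a given mesh level.
    Finer levels (larger `level`) are more accurate; here we fake a
    discretization error that shrinks like 2**(-level)."""
    return np.sin(sample) + 2.0 ** (-level) * rng.normal()

def mlmc(levels, n_samples):
    """Telescoping MLMC estimator of E[Q] at the finest level."""
    est = 0.0
    for level, n in zip(levels, n_samples):
        samples = rng.uniform(0, np.pi, size=n)  # random "geometries"
        if level == 0:
            diffs = [solve(0, s) for s in samples]
        else:
            # The same sample is solved on fine and coarse levels, so the
            # difference has small variance and needs few samples.
            diffs = [solve(level, s) - solve(level - 1, s) for s in samples]
        est += np.mean(diffs)
    return est

# Many cheap coarse solves, progressively fewer expensive fine solves.
print(mlmc(levels=[0, 1, 2, 3], n_samples=[4000, 1000, 250, 60]))
```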

  2. Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats

    USGS Publications Warehouse

    Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.

    2012-01-01

    This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCPs from moving boats on three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields, is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.
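
    The paper's central point — that the variance of a time-averaged quantity is governed by the integral time scale of the sampled series — reduces to replacing the sample count N in Var(mean) = sigma^2/N with an effective count N_eff = N/(2*T_int). A numerical check on an AR(1) surrogate series (all parameter values are hypothetical):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(3)

n, rho, sigma = 2000, 0.6, 0.2            # hypothetical sampling setup

# AR(1) surrogate for the sampled velocity series, 500 realizations;
# innovation variance chosen so the stationary variance is sigma**2.
noise = rng.normal(0.0, sigma * np.sqrt(1 - rho**2), size=(500, n))
v = lfilter([1.0], [1.0, -rho], noise, axis=1)

# Integral time scale of an AR(1) process, in sampling intervals.
t_int = (1 + rho) / (2 * (1 - rho))
n_eff = n / (2 * t_int)                   # effective independent samples

print(f"predicted Var(mean): {sigma**2 / n_eff:.2e}")
print(f"empirical Var(mean): {v.mean(axis=1).var():.2e}")
```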

  3. Sampling Errors in Monthly Rainfall Totals for TRMM and SSM/I, Based on Statistics of Retrieved Rain Rates and Simple Models

    NASA Technical Reports Server (NTRS)

    Bell, Thomas L.; Kundu, Prasun K.; Einaudi, Franco (Technical Monitor)

    2000-01-01

    Estimates from TRMM satellite data of monthly total rainfall over an area are subject to substantial sampling errors due to the limited number of visits to the area by the satellite during the month. Quantitative comparisons of TRMM averages with data collected by other satellites and by ground-based systems require some estimate of the size of this sampling error. A method of estimating this sampling error based on the actual statistics of the TRMM observations and on some modeling work has been developed. "Sampling error" in TRMM monthly averages is defined here relative to the monthly total that a hypothetical satellite permanently stationed above the area would have reported. "Sampling error" therefore includes contributions from the random and systematic errors introduced by the satellite remote sensing system. As part of our long-term goal of providing error estimates for each grid point accessible to the TRMM instruments, sampling error estimates for TRMM based on rain retrievals from TRMM microwave (TMI) data are compared for different times of the year and different oceanic areas (to minimize changes in the statistics due to algorithmic differences over land and ocean). Changes in sampling error estimates due to changes in rain statistics arising 1) from evolution of the official algorithms used to process the data, and 2) from differences from other remote sensing systems such as the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave/Imager (SSM/I) are analyzed.

  4. Analysis of Errors Committed by Physics Students in Secondary Schools in Ilorin Metropolis, Nigeria

    ERIC Educational Resources Information Center

    Omosewo, Esther Ore; Akanbi, Abdulrasaq Oladimeji

    2013-01-01

    The study attempts to identify the types of errors committed by senior secondary school physics students in Ilorin metropolis and the influence of gender on the types of errors committed. Six (6) schools were purposively chosen for the study. One hundred and fifty-five students' scripts were randomly sampled for the study. Joint Mock physics essay questions…

  5. Sampling error in timber surveys

    Treesearch

    Austin Hasel

    1938-01-01

    Various sampling strategies are evaluated for efficiency in an interior ponderosa pine forest. In a 5,760-acre tract, efficiency was gained by stratifying into quarter-acre blocks and sampling randomly within them. A systematic cruise was found to be superior for volume estimation.
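
    The efficiency gain from stratification reported here is easy to reproduce: when strata differ in mean, stratified sampling removes the between-strata component from the sampling variance of the estimated mean. A toy comparison with invented stand volumes, not Hasel's data:

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented tract: three equal-sized timber strata with different means.
strata = [rng.normal(mu, 5.0, size=600) for mu in (20.0, 40.0, 80.0)]
pop = np.concatenate(strata)
n = 90  # total sample size

def simple_random_mean():
    return rng.choice(pop, size=n, replace=False).mean()

def stratified_mean():
    # Proportional allocation: n/3 plots per equal-sized stratum.
    return np.mean([rng.choice(s, size=n // 3, replace=False).mean()
                    for s in strata])

srs = [simple_random_mean() for _ in range(3000)]
strat = [stratified_mean() for _ in range(3000)]

print(f"true mean: {pop.mean():.2f}")
print(f"SRS        standard error: {np.std(srs):.3f}")
print(f"stratified standard error: {np.std(strat):.3f}")
```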

  6. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  7. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  8. 10 CFR 74.45 - Measurements and measurement control.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... measurements, obtaining samples, and performing laboratory analyses for element concentration and isotope... of random error behavior. On a predetermined schedule, the program shall include, as appropriate: (i) Replicate analyses of individual samples; (ii) Analysis of replicate process samples; (iii) Replicate volume...

  9. Bias, Confounding, and Interaction: Lions and Tigers, and Bears, Oh My!

    PubMed

    Vetter, Thomas R; Mascha, Edward J

    2017-09-01

    Epidemiologists seek to make a valid inference about the causal effect between an exposure and a disease in a specific population, using representative sample data from that population. Clinical researchers likewise seek to make a valid inference about the association between an intervention and outcome(s) in a specific population, based upon their randomly collected, representative sample data. Both do so by using the available data about the sample variable to make a valid estimate of its corresponding, but unknown, underlying population parameter. Random error in an experiment can be due to the natural, periodic fluctuation or variation in the accuracy or precision of virtually any data sampling technique or health measurement tool or scale. In a clinical research study, random error can be due not only to innate human variability but also to pure chance. Systematic error in an experiment arises from an innate flaw in the data sampling technique or measurement instrument. In the clinical research setting, systematic error is more commonly referred to as systematic bias. The most commonly encountered types of bias in anesthesia, perioperative, critical care, and pain medicine research include recall bias, observational bias (Hawthorne effect), attrition bias, misclassification or informational bias, and selection bias. A confounding variable (confounding factor or confounder) is a variable that correlates (positively or negatively) with both the exposure of interest and the outcome of interest. Confounding is typically not an issue in a randomized trial because the randomized groups are sufficiently balanced on all potential confounding variables, both observed and nonobserved. However, confounding can be a major problem with any observational (nonrandomized) study. Ignoring confounding in an observational study will often result in a "distorted" or incorrect estimate of the association or treatment effect. Interaction among variables, also known as effect modification, exists when the effect of one explanatory variable on the outcome depends on the particular level or value of another explanatory variable. Bias and confounding are common potential explanations for statistically significant associations between exposure and outcome when the true relationship is noncausal. Understanding interactions is vital to proper interpretation of treatment effects. These complex concepts should be consistently and appropriately considered whenever one is not only designing but also analyzing and interpreting data from a randomized trial or observational study.

  10. Within-Tunnel Variations in Pressure Data for Three Transonic Wind Tunnels

    NASA Technical Reports Server (NTRS)

    DeLoach, Richard

    2014-01-01

    This paper compares the results of pressure measurements made on the same test article with the same test matrix in three transonic wind tunnels. A comparison is presented of the unexplained variance associated with polar replicates acquired in each tunnel. The impact of a significant component of systematic (not random) unexplained variance is reviewed, and the results of analyses of variance are presented to assess the degree of significant systematic error in these representative wind tunnel tests. Total uncertainty estimates are reported for 140 samples of pressure data, quantifying the effects of within-polar random errors and between-polar systematic bias errors.

  11. Previous Estimates of Mitochondrial DNA Mutation Level Variance Did Not Account for Sampling Error: Comparing the mtDNA Genetic Bottleneck in Mice and Humans

    PubMed Central

    Wonnapinij, Passorn; Chinnery, Patrick F.; Samuels, David C.

    2010-01-01

    In cases of inherited pathogenic mitochondrial DNA (mtDNA) mutations, a mother and her offspring generally have large and seemingly random differences in the amount of mutated mtDNA that they carry. Comparisons of measured mtDNA mutation level variance values have become an important issue in determining the mechanisms that cause these large random shifts in mutation level. These variance measurements have been made with samples of quite modest size, which should be a source of concern because higher-order statistics, such as variance, are poorly estimated from small sample sizes. We have developed an analysis of the standard error of variance from a sample of size n, and we have defined error bars for variance measurements based on this standard error. We calculate variance error bars for several published sets of measurements of mtDNA mutation level variance and show how the addition of the error bars alters the interpretation of these experimental results. We compare variance measurements from human clinical data and from mouse models and show that the mutation level variance is clearly higher in the human data than it is in the mouse models at both the primary oocyte and offspring stages of inheritance. We discuss how the standard error of variance can be used in the design of experiments measuring mtDNA mutation level variance. Our results show that variance measurements based on fewer than 20 measurements are generally unreliable and ideally more than 50 measurements are required to reliably compare variances with less than a 2-fold difference. PMID:20362273
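
    The error bars the authors propose rest on the standard error of a sample variance, which for normally distributed data has the closed form SE(s²) = s²·sqrt(2/(n−1)). A quick simulation, under that normality assumption, shows why variance estimates from fewer than about 20 measurements are unreliable:

```python
import numpy as np

rng = np.random.default_rng(5)

for n in (10, 20, 50, 200):
    # Repeatedly estimate the variance from samples of size n (true var = 1).
    s2 = rng.normal(0.0, 1.0, size=(20_000, n)).var(axis=1, ddof=1)
    theory = np.sqrt(2.0 / (n - 1))        # SE(s^2) for unit true variance
    print(f"n={n:4d}  empirical SE {s2.std():.3f}  theory {theory:.3f}")
```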

  12. Measurement variability error for estimates of volume change

    Treesearch

    James A. Westfall; Paul L. Patterson

    2007-01-01

    Using quality assurance data, measurement variability distributions were developed for attributes that affect tree volume prediction. Random deviations from the measurement variability distributions were applied to 19381 remeasured sample trees in Maine. The additional error due to measurement variation and measurement bias was estimated via a simulation study for...

  13. Error Distribution Evaluation of the Third Vanishing Point Based on Random Statistical Simulation

    NASA Astrophysics Data System (ADS)

    Li, C.

    2012-07-01

    POS (Position and Orientation System), integrating GPS and INS (Inertial Navigation Systems), has allowed rapid and accurate determination of the position and attitude of remote sensing equipment for MMS (Mobile Mapping Systems). However, not only does INS have systematic error, it is also very expensive. Therefore, in this paper the error distributions of vanishing points are studied and tested in order to substitute for INS in MMS in some special land-based scenes, such as ground façades, where usually only two vanishing points can be detected; thus, the traditional calibration approach based on three orthogonal vanishing points is being challenged. In this article, firstly, the line clusters, which are parallel to each other in object space and correspond to the vanishing points, are detected based on RANSAC (Random Sample Consensus) and a parallelism geometric constraint. Secondly, condition adjustment with parameters is utilized to estimate the nonlinear error equations of two vanishing points (VX, VY), and how to set initial weights for the adjustment solution of single-image vanishing points is presented; the vanishing points and their error distributions are estimated based on an iteration method with variable weights, the co-factor matrix and error ellipse theory. Thirdly, under the condition of known error ellipses of the two vanishing points (VX, VY), and on the basis of the triangular geometric relationship of the three vanishing points, the error distribution of the third vanishing point (VZ) is calculated and evaluated by random statistical simulation, ignoring camera distortion; the Monte Carlo methods utilized for this random statistical estimation are presented. Finally, experimental results for vanishing point coordinates and their error distributions are shown and analyzed.

  14. USGS Blind Sample Project: monitoring and evaluating laboratory analytical quality

    USGS Publications Warehouse

    Ludtke, Amy S.; Woodworth, Mark T.

    1997-01-01

    The U.S. Geological Survey (USGS) collects and disseminates information about the Nation's water resources. Surface- and ground-water samples are collected and sent to USGS laboratories for chemical analyses. The laboratories identify and quantify the constituents in the water samples. Random and systematic errors occur during sample handling, chemical analysis, and data processing. Although all errors cannot be eliminated from measurements, the magnitude of their uncertainty can be estimated and tracked over time. Since 1981, the USGS has operated an independent, external, quality-assurance project called the Blind Sample Project (BSP). The purpose of the BSP is to monitor and evaluate the quality of laboratory analytical results through the use of double-blind quality-control (QC) samples. The information provided by the BSP assists the laboratories in detecting and correcting problems in the analytical procedures. The information also can aid laboratory users in estimating the extent that laboratory errors contribute to the overall errors in their environmental data.

  15. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    PubMed

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types, but owing to the large number of uncharacterized protein sequences in databases, these methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced and large datasets are often handled well by decision tree classifiers. Since the datasets are imbalanced, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree and REP (Reduced Error Pruning) tree, and of ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time, with a good accuracy of 96.35%. Another finding is that the RUS boost decision tree classifier is able to classify one or two samples in classes with very few samples, while the other classifiers, such as DT, Adaboost, Rotation forest and Random forest, are not sensitive to classes with fewer samples. The performance of the decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.
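
    A random forest with class weighting is one common way to handle the imbalanced-class setting the abstract describes. A minimal scikit-learn sketch on synthetic data (the PseAAC features themselves are not reproduced here; the dataset is a stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic imbalanced stand-in for the membrane-protein feature matrix:
# three classes with 80/15/5 percent prevalence.
X, y = make_classification(n_samples=2000, n_features=50, n_informative=10,
                           n_classes=3, weights=[0.8, 0.15, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights classes inversely to their frequency,
# one common remedy for insensitivity to minority classes.
clf = RandomForestClassifier(n_estimators=300, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```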

  16. The effect of covariate mean differences on the standard error and confidence interval for the comparison of treatment means.

    PubMed

    Liu, Xiaofeng Steven

    2011-05-01

    The use of covariates is commonly believed to reduce the unexplained error variance and the standard error for the comparison of treatment means, but the reduction in the standard error is neither guaranteed nor uniform over different sample sizes. The covariate mean differences between the treatment conditions can inflate the standard error of the covariate-adjusted mean difference and can actually produce a larger standard error for the adjusted mean difference than that for the unadjusted mean difference. When the covariate observations are conceived of as randomly varying from one study to another, the covariate mean differences can be related to a Hotelling's T² statistic. Using this Hotelling's T² statistic, one can always find a minimum sample size to achieve a high probability of reducing the standard error and confidence interval width for the adjusted mean difference. ©2010 The British Psychological Society.

  17. Speeding up Coarse Point Cloud Registration by Threshold-Independent Baysac Match Selection

    NASA Astrophysics Data System (ADS)

    Kang, Z.; Lindenbergh, R.; Pu, S.

    2016-06-01

    This paper presents an algorithm for the automatic registration of terrestrial point clouds by match selection using an efficient conditional sampling method, threshold-independent BaySAC (BAYes SAmpling Consensus), and employs the error metric of the average point-to-surface residual to reduce the random measurement error and thereby approach the real registration error. BaySAC and other basic sampling algorithms usually need an artificially determined threshold by which inlier points are identified, which leads to a threshold-dependent verification process. Therefore, we applied the LMedS method to construct the cost function used to determine the optimum model, to reduce the influence of human factors and improve the robustness of the model estimate. Point-to-point and point-to-surface error metrics are the most commonly used. However, point-to-point error in general consists of at least two components: random measurement error and systematic error resulting from a remaining error in the estimated rigid-body transformation. Thus we employ the average point-to-surface residual to evaluate the registration accuracy. The proposed approaches, together with a traditional RANSAC approach, are tested on four data sets acquired by three different scanners in terms of their computational efficiency and the quality of the final registration. The registration results show that the standard deviation of the average point-to-surface residuals is reduced from 1.4 cm (plain RANSAC) to 0.5 cm (threshold-independent BaySAC). The results also show that, compared to the performance of RANSAC, our BaySAC strategies lead to fewer iterations and lower computational cost when the hypothesis set is contaminated with more outliers.
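
    For reference, the plain RANSAC baseline that BaySAC is compared against can be sketched compactly: repeatedly fit a model to a minimal random sample, count inliers within a threshold, keep the best consensus set, and refit on it. A line-fitting toy example (BaySAC's conditional sampling and the LMedS cost are not implemented here):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic 2-D data: inliers on a line plus ~30% gross outliers.
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 * x + 1.0 + rng.normal(0, 0.2, n)
out = rng.random(n) < 0.3
y[out] = rng.uniform(0, 30, out.sum())

def ransac_line(x, y, iters=500, tol=0.5):
    best_inliers, best_model = None, None
    for _ in range(iters):
        i, j = rng.choice(len(x), size=2, replace=False)  # minimal sample
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        inliers = np.abs(y - (a * x + b)) < tol           # consensus set
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (a, b)
    # Refit by least squares on the best consensus set.
    a, b = np.polyfit(x[best_inliers], y[best_inliers], 1)
    return a, b, best_inliers

a, b, inl = ransac_line(x, y)
print(f"recovered y = {a:.2f} x + {b:.2f} with {inl.sum()} inliers")
```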

  18. Bayesian estimation of Karhunen–Loève expansions: A random subspace approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chowdhary, Kenny; Najm, Habib N.

    One of the most widely used statistical procedures for dimensionality reduction of high-dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Loève expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L² sense, i.e., minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition) on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build probabilistic Karhunen-Loève expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite-dimensional stochastic process inspired by Brownian motion.

  19. Bayesian estimation of Karhunen–Loève expansions: A random subspace approach

    DOE PAGES

    Chowdhary, Kenny; Najm, Habib N.

    2016-04-13

    One of the most widely used statistical procedures for dimensionality reduction of high-dimensional random fields is Principal Component Analysis (PCA), which is based on the Karhunen-Loève expansion (KLE) of a stochastic process with finite variance. The KLE is analogous to a Fourier series expansion for a random process, where the goal is to find an orthogonal transformation for the data such that the projection of the data onto this orthogonal subspace is optimal in the L² sense, i.e., minimizes the mean square error. In practice, this orthogonal transformation is determined by performing an SVD (Singular Value Decomposition) on the sample covariance matrix or on the data matrix itself. Sampling error is typically ignored when quantifying the principal components, or, equivalently, basis functions of the KLE. Furthermore, it is exacerbated when the sample size is much smaller than the dimension of the random field. In this paper, we introduce a Bayesian KLE procedure, allowing one to obtain a probabilistic model on the principal components, which can account for inaccuracies due to limited sample size. The probabilistic model is built via Bayesian inference, from which the posterior becomes the matrix Bingham density over the space of orthonormal matrices. We use a modified Gibbs sampling procedure to sample on this space and then build probabilistic Karhunen-Loève expansions over random subspaces to obtain a set of low-dimensional surrogates of the stochastic process. We illustrate this probabilistic procedure with a finite-dimensional stochastic process inspired by Brownian motion.
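
    The non-Bayesian baseline both of these records start from — computing the empirical KLE/PCA basis by an SVD of the centered data matrix — takes only a few lines, and a case with far fewer samples than dimensions makes the sampling-error problem visible. A minimal sketch on Brownian-motion-like realizations:

```python
import numpy as np

rng = np.random.default_rng(8)

# Brownian-motion-like realizations: dimension d, only n << d samples.
d, n = 200, 30
increments = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n, d))
fields = np.cumsum(increments, axis=1)

# Empirical KLE / PCA: SVD of the centered data matrix.
centered = fields - fields.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
eigvals = s**2 / (n - 1)          # eigenvalues of the sample covariance
modes = Vt                        # rows are the empirical KLE basis functions

# Truncated expansion: project realizations onto the first k modes.
k = 5
coeffs = centered @ modes[:k].T
recon = fields.mean(axis=0) + coeffs @ modes[:k]
err = np.linalg.norm(recon - fields) / np.linalg.norm(fields)
print(f"retained variance: {eigvals[:k].sum() / eigvals.sum():.1%}")
print(f"relative L2 reconstruction error: {err:.1%}")
```

With n = 30 samples in 200 dimensions, rerunning with a different seed visibly changes the estimated modes, which is exactly the sampling error these papers set out to quantify.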

  20. What Randomized Benchmarking Actually Measures

    DOE PAGES

    Proctor, Timothy; Rudinger, Kenneth; Young, Kevin; ...

    2017-09-28

    Randomized benchmarking (RB) is widely used to measure an error rate of a set of quantum gates, by performing random circuits that would do nothing if the gates were perfect. In the limit of no finite-sampling error, the exponential decay rate of the observable survival probabilities, versus circuit length, yields a single error metric r. For Clifford gates with arbitrary small errors described by process matrices, r was believed to reliably correspond to the mean, over all Clifford gates, of the average gate infidelity between the imperfect gates and their ideal counterparts. We show that this quantity is not a well-defined property of a physical gate set. It depends on the representations used for the imperfect and ideal gates, and the variant typically computed in the literature can differ from r by orders of magnitude. We present new theories of the RB decay that are accurate for all small errors describable by process matrices, and show that the RB decay curve is a simple exponential for all such errors. These theories allow explicit computation of the error rate that RB measures (r), but as far as we can tell it does not correspond to the infidelity of a physically allowed (completely positive) representation of the imperfect gates.
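
    The standard RB analysis the abstract refers to fits the survival probability to a single exponential, p(m) = A·f^m + B, and converts the decay parameter to an error rate r = (d−1)(1−f)/d. A sketch on synthetic single-qubit data (all constants are invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(9)

d, f_true = 2, 0.98                   # single qubit, true decay parameter
lengths = np.array([1, 2, 4, 8, 16, 32, 64, 128])
A, B = 0.5, 0.5                       # SPAM-dependent constants

# Synthetic survival probabilities, each estimated from 200 shots,
# so the data carry finite-sampling (binomial) error.
p_ideal = A * f_true**lengths + B
p_obs = rng.binomial(200, p_ideal) / 200

def model(m, A, B, f):
    return A * f**m + B

(A_fit, B_fit, f_fit), _ = curve_fit(model, lengths, p_obs,
                                     p0=(0.5, 0.5, 0.95))
r = (d - 1) / d * (1 - f_fit)         # the standard RB error-rate formula
print(f"fitted decay f = {f_fit:.4f}, RB number r = {r:.2e}")
```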

  1. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography

    DTIC Science & Technology

    1980-03-01

    …interpreting/smoothing data containing a significant percentage of gross errors, and thus is ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of the paper describes the application of…

  2. Biases and Standard Errors of Standardized Regression Coefficients

    ERIC Educational Resources Information Center

    Yuan, Ke-Hai; Chan, Wai

    2011-01-01

    The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular textbooks are consistent only when the population value of the regression coefficient is zero. The sample…

  3. Evaluation of process errors in bed load sampling using a Dune Model

    USGS Publications Warehouse

    Gomez, Basil; Troutman, Brent M.

    1997-01-01

    Reliable estimates of the streamwide bed load discharge obtained using sampling devices are dependent upon good at-a-point knowledge across the full width of the channel. Using field data and information derived from a model that describes the geometric features of a dune train in terms of a spatial process observed at a fixed point in time, we show that sampling errors decrease as the number of samples collected increases, and the number of traverses of the channel over which the samples are collected increases. It also is preferable that bed load sampling be conducted at a pace which allows a number of bed forms to pass through the sampling cross section. The situations we analyze and simulate pertain to moderate transport conditions in small rivers. In such circumstances, bed load sampling schemes typically should involve four or five traverses of a river, and the collection of 20–40 samples at a rate of five or six samples per hour. By ensuring that spatial and temporal variability in the transport process is accounted for, such a sampling design reduces both random and systematic errors and hence minimizes the total error involved in the sampling process.

  4. Predicting the random drift of MEMS gyroscope based on K-means clustering and OLS RBF Neural Network

    NASA Astrophysics Data System (ADS)

    Wang, Zhen-yu; Zhang, Li-jie

    2017-10-01

    Measurement error of a sensor can be effectively compensated by prediction. Aiming at the large random drift error of MEMS (Micro Electro Mechanical System) gyroscopes, an improved learning algorithm for Radial Basis Function (RBF) Neural Networks (NN), based on K-means clustering and Orthogonal Least Squares (OLS), is proposed in this paper. The algorithm first selects typical samples as the initial cluster centers of the RBF NN, then determines candidate centers with the K-means algorithm, and finally optimizes the candidate centers with the OLS algorithm, which makes the network structure simpler and the prediction performance better. Experimental results show that the proposed K-means clustering OLS learning algorithm can predict the random drift of a MEMS gyroscope effectively, with a prediction error of 9.8019e-7 °/s and a prediction time of 2.4169e-6 s.

  5. Error simulation of paired-comparison-based scaling methods

    NASA Astrophysics Data System (ADS)

    Cui, Chengwu

    2000-12-01

    Subjective image quality measurement usually resorts to psychophysical scaling. However, it is difficult to evaluate the inherent precision of these scaling methods. Without knowing the potential errors of the measurement, subsequent use of the data can be misleading. In this paper, the errors on scaled values derived from paired-comparison-based scaling methods are simulated with randomly introduced proportion-of-choice errors that follow the binomial distribution. Simulation results are given for various combinations of the number of stimuli and the sampling size. The errors are presented in the form of the average standard deviation of the scaled values and can be fitted reasonably well with an empirical equation that can be used for scaling error estimation and measurement design. The simulation shows that paired-comparison-based scaling methods can have large errors on the derived scaled values when the sampling size and the number of stimuli are small. Examples are also given to show the potential errors on actually scaled values of color image prints as measured by the method of paired comparison.
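
    A stripped-down version of the simulation described here: generate paired-comparison proportions under a Thurstone Case V model, perturb them with binomial sampling error, rescale via the inverse-normal transform, and record the spread of the recovered scale values. All stimulus values and sample sizes below are invented:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(10)

true_scale = np.array([0.0, 0.5, 1.0, 1.5, 2.0])   # hypothetical stimuli
k, n_obs = len(true_scale), 30                      # small sampling size

def simulate_scaling():
    # Choice probabilities under a Thurstone Case V model.
    p = norm.cdf(true_scale[:, None] - true_scale[None, :])
    # Observed proportions: binomial sampling introduces choice errors.
    obs = rng.binomial(n_obs, p) / n_obs
    obs = np.clip(obs, 1.0 / (2 * n_obs), 1 - 1.0 / (2 * n_obs))
    # Scale values: row means of the z-transformed proportion matrix.
    vals = norm.ppf(obs).mean(axis=1)
    return vals - vals[0]           # anchor the first stimulus at zero

runs = np.array([simulate_scaling() for _ in range(2000)])
print("std dev of scaled values per stimulus:", runs.std(axis=0).round(3))
```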

  6. Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes.

    PubMed

    Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H

    2015-11-30

    We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.

  7. Random and independent sampling of endogenous tryptic peptides from normal human EDTA plasma by liquid chromatography micro electrospray ionization and tandem mass spectrometry.

    PubMed

    Dufresne, Jaimie; Florentinus-Mefailoski, Angelique; Ajambo, Juliet; Ferwa, Ammara; Bowden, Peter; Marshall, John

    2017-01-01

    Normal human EDTA plasma samples were collected on ice, processed ice cold, and stored in a freezer at -80 °C prior to experiments. Plasma test samples from the -80 °C freezer were thawed on ice or intentionally warmed to room temperature. Protein content was measured by CBBR binding and the release of alcohol-soluble amines by the Cd ninhydrin assay. Plasma peptides released over time were collected over C18 for random and independent sampling by liquid chromatography micro electrospray ionization and tandem mass spectrometry (LC-ESI-MS/MS) and correlated with X!TANDEM. Fully tryptic correlations by X!TANDEM returned a similar set of proteins to, but were more computationally efficient than, "no enzyme" correlations. Plasma samples maintained on ice, or on ice with a cocktail of protease inhibitors, showed lower background amounts of plasma peptides compared to samples incubated at room temperature. Regression analysis indicated that warming plasma to room temperature, versus keeping it ice cold, resulted in a ~twofold increase in the frequency of peptide identification over hours to days of incubation at room temperature. The type I error rate of protein identification from the X!TANDEM algorithm was estimated to be low compared to a null model of computer-generated random MS/MS spectra. The peptides of human plasma were identified and quantified with low error rates by random and independent sampling that revealed thousands of peptides from hundreds of human plasma proteins from endogenous tryptic peptides.

  8. A Strategy to Use Soft Data Effectively in Randomized Controlled Clinical Trials.

    ERIC Educational Resources Information Center

    Kraemer, Helena Chmura; Thiemann, Sue

    1989-01-01

    Sees soft data, measures having substantial intrasubject variability due to errors of measurement or response inconsistency, as important measures of response in randomized clinical trials. Shows that using intensive design and slope of response on time as outcome measure maximizes sample retention and decreases within-group variability, thus…

  9. How large are the consequences of covariate imbalance in cluster randomized trials: a simulation study with a continuous outcome and a binary covariate at the cluster level.

    PubMed

    Moerbeek, Mirjam; van Schie, Sander

    2016-07-11

    The number of clusters in a cluster randomized trial is often low. It is therefore likely that random assignment of clusters to treatment conditions results in covariate imbalance. There are no studies that quantify the consequences of covariate imbalance in cluster randomized trials on parameter and standard error bias and on power to detect treatment effects. The consequences of covariate imbalance in unadjusted and adjusted linear mixed models are investigated by means of a simulation study. The factors in this study are the degree of imbalance, the covariate effect size, the cluster size and the intraclass correlation coefficient. The covariate is binary and measured at the cluster level; the outcome is continuous and measured at the individual level. The results show covariate imbalance results in negligible parameter bias and small standard error bias in adjusted linear mixed models. Ignoring the possibility of covariate imbalance while calculating the sample size at the cluster level may result in a loss in power of at most 25% in the adjusted linear mixed model. The results are more severe for the unadjusted linear mixed model: parameter biases up to 100% and standard error biases up to 200% may be observed. Power levels based on the unadjusted linear mixed model are often too low. The consequences are most severe for large clusters and/or small intraclass correlation coefficients, since then the required number of clusters to achieve a desired power level is smallest. The possibility of covariate imbalance should be taken into account while calculating the sample size of a cluster randomized trial. Otherwise, more sophisticated methods to randomize clusters to treatments should be used, such as stratification or balance algorithms. All relevant covariates should be carefully identified, actually measured and included in the statistical model to avoid severe levels of parameter and standard error bias and insufficient power levels.

  10. Measuring Data Quality Through a Source Data Verification Audit in a Clinical Research Setting.

    PubMed

    Houston, Lauren; Probst, Yasmine; Humphries, Allison

    2015-01-01

    Health data have long been scrutinised in relation to data quality and integrity problems. Currently, no internationally accepted or "gold standard" method exists for measuring data quality and error rates within datasets. We conducted a source data verification (SDV) audit on a prospective clinical trial dataset. An audit plan was applied to conduct 100% manual verification checks on a 10% random sample of participant files. A quality assurance rule was developed, whereby if >5% of data variables were incorrect, a second 10% random sample would be extracted from the trial data set. Error was coded: correct, incorrect (valid or invalid), not recorded or not entered. Audit-1 had a total error of 33% and audit-2 36%. The physiological section was the only audit section to have <5% error. Data not recorded to case report forms had the greatest impact on error calculations. A significant association (p=0.00) was found between audit-1 and audit-2 and whether or not data were deemed correct or incorrect. Our study developed a straightforward method to perform an SDV audit. An audit rule was identified and error coding was implemented. Findings demonstrate that monitoring data quality by an SDV audit can identify data quality and integrity issues within clinical research settings, allowing quality improvements to be made. The authors suggest this approach be implemented for future research.

  11. DNA Barcoding through Quaternary LDPC Codes

    PubMed Central

    Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar

    2015-01-01

    For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10⁻² per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10⁻⁹ at the expense of a rate of read losses just in the order of 10⁻⁶. PMID:26492348

  12. DNA Barcoding through Quaternary LDPC Codes.

    PubMed

    Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar

    2015-01-01

    For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10⁻² per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10⁻⁹ at the expense of a rate of read losses just in the order of 10⁻⁶.

  13. Accelerating Convergence in Molecular Dynamics Simulations of Solutes in Lipid Membranes by Conducting a Random Walk along the Bilayer Normal.

    PubMed

    Neale, Chris; Madill, Chris; Rauscher, Sarah; Pomès, Régis

    2013-08-13

    All molecular dynamics simulations are susceptible to sampling errors, which degrade the accuracy and precision of observed values. The statistical convergence of simulations containing atomistic lipid bilayers is limited by the slow relaxation of the lipid phase, which can exceed hundreds of nanoseconds. These long conformational autocorrelation times are exacerbated in the presence of charged solutes, which can induce significant distortions of the bilayer structure. Such long relaxation times represent hidden barriers that induce systematic sampling errors in simulations of solute insertion. To identify optimal methods for enhancing sampling efficiency, we quantitatively evaluate convergence rates using generalized ensemble sampling algorithms in calculations of the potential of mean force for the insertion of the ionic side chain analog of arginine in a lipid bilayer. Umbrella sampling (US) is used to restrain solute insertion depth along the bilayer normal, the order parameter commonly used in simulations of molecular solutes in lipid bilayers. When US simulations are modified to conduct random walks along the bilayer normal using a Hamiltonian exchange algorithm, systematic sampling errors are eliminated more rapidly and the rate of statistical convergence of the standard free energy of binding of the solute to the lipid bilayer is increased 3-fold. We compute the ratio of the replica flux transmitted across a defined region of the order parameter to the replica flux that entered that region in Hamiltonian exchange simulations. We show that this quantity, the transmission factor, identifies sampling barriers in degrees of freedom orthogonal to the order parameter. The transmission factor is used to estimate the depth-dependent conformational autocorrelation times of the simulation system, some of which exceed the simulation time, and thereby identify solute insertion depths that are prone to systematic sampling errors and estimate the lower bound of the amount of sampling that is required to resolve these sampling errors. Finally, we extend our simulations and verify that the conformational autocorrelation times estimated by the transmission factor accurately predict correlation times that exceed the simulation time scale-something that, to our knowledge, has never before been achieved.

  14. Comparison of Parametric and Nonparametric Bootstrap Methods for Estimating Random Error in Equipercentile Equating

    ERIC Educational Resources Information Center

    Cui, Zhongmin; Kolen, Michael J.

    2008-01-01

    This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
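
    The two bootstrap flavors being compared differ only in how replicate samples are drawn: the nonparametric bootstrap resamples the observed scores with replacement, while the parametric bootstrap samples from a fitted model. A minimal sketch estimating the standard error of a percentile point (a simple stand-in for the equipercentile equating function; all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(11)

scores = rng.normal(50, 10, size=300)     # stand-in test scores, n = 300
B, q = 1000, 75                           # bootstrap replications, percentile

# Nonparametric bootstrap: resample the observed scores with replacement.
nonpar = np.array([np.percentile(rng.choice(scores, scores.size), q)
                   for _ in range(B)])

# Parametric bootstrap: resample from a normal model fitted to the data.
mu, sd = scores.mean(), scores.std(ddof=1)
par = np.array([np.percentile(rng.normal(mu, sd, scores.size), q)
                for _ in range(B)])

print(f"nonparametric bootstrap SE: {nonpar.std(ddof=1):.3f}")
print(f"parametric bootstrap SE:    {par.std(ddof=1):.3f}")
```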

  15. The Impact of Short-Term Science Teacher Professional Development on the Evaluation of Student Understanding and Errors Related to Natural Selection

    ERIC Educational Resources Information Center

    Buschang, Rebecca Ellen

    2012-01-01

    This study evaluated the effects of a short-term professional development session. Forty volunteer high school biology teachers were randomly assigned to one of two professional development conditions: (a) developing deep content knowledge (i.e., control condition) or (b) evaluating student errors and understanding in writing samples (i.e.,…

  16. The utility of point count surveys to predict wildlife interactions with wind energy facilities: An example focused on golden eagles

    USGS Publications Warehouse

    Sur, Maitreyi; Belthoff, James R.; Bjerre, Emily R.; Millsap, Brian A.; Katzner, Todd

    2018-01-01

    Wind energy development is rapidly expanding in North America, often accompanied by requirements to survey potential facility locations for existing wildlife. Within the USA, golden eagles (Aquila chrysaetos) are among the most high-profile species of birds that are at risk from wind turbines. To minimize golden eagle fatalities in areas proposed for wind development, modified point count surveys are usually conducted to estimate use by these birds. However, it is not always clear what drives variation in the relationship between on-site point count data and actual use by eagles of a wind energy project footprint. We used existing GPS-GSM telemetry data, collected at 15 min intervals from 13 golden eagles in 2012 and 2013, to explore the relationship between point count data and eagle use of an entire project footprint. To do this, we overlaid the telemetry data on hypothetical project footprints and simulated a variety of point count sampling strategies for those footprints. We compared the time an eagle was found in the sample plots with the time it was found in the project footprint using a metric we called “error due to sampling”. Error due to sampling for individual eagles appeared to be influenced by interactions between the size of the project footprint (20, 40, 90 or 180 km2) and the sampling type (random, systematic or stratified) and was greatest on 90 km2 plots. However, use of random sampling resulted in lowest error due to sampling within intermediate sized plots. In addition sampling intensity and sampling frequency both influenced the effectiveness of point count sampling. Although our work focuses on individual eagles (not the eagle populations typically surveyed in the field), our analysis shows both the utility of simulations to identify specific influences on error and also potential improvements to sampling that consider the context-specific manner that point counts are laid out on the landscape.

  17. Sampling Errors of SSM/I and TRMM Rainfall Averages: Comparison with Error Estimates from Surface Data and a Sample Model

    NASA Technical Reports Server (NTRS)

    Bell, Thomas L.; Kundu, Prasun K.; Kummerow, Christian D.; Einaudi, Franco (Technical Monitor)

    2000-01-01

    Quantitative use of satellite-derived maps of monthly rainfall requires some measure of the accuracy of the satellite estimates. The rainfall estimate for a given map grid box is subject to both remote-sensing error and, in the case of low-orbiting satellites, sampling error due to the limited number of observations of the grid box provided by the satellite. A simple model of rain behavior predicts that root-mean-square (RMS) random error in grid-box averages should depend in a simple way on the local average rain rate, and the predicted behavior has been seen in simulations using surface rain-gauge and radar data. This relationship was examined using satellite SSM/I data obtained over the western equatorial Pacific during TOGA COARE. RMS error inferred directly from SSM/I rainfall estimates was found to be larger than predicted from surface data, and to depend less on local rain rate than was predicted. Preliminary examination of TRMM microwave estimates shows better agreement with surface data. A simple method of estimating RMS error in satellite rainfall estimates is suggested, based on quantities that can be directly computed from the satellite data.

  18. The External Quality Assessment Scheme (EQAS): Experiences of a medium sized accredited laboratory.

    PubMed

    Bhat, Vivek; Chavan, Preeti; Naresh, Chital; Poladia, Pratik

    2015-06-15

    We put forth our experiences with EQAS, analyzed the result discrepancies, reviewed the corrective actions, and also put forth strategies for risk identification and prevention of potential errors in a medical laboratory. For hematology, EQAS samples - blood, peripheral and reticulocyte smears - were received quarterly every year. All the blood samples were processed on an HMX hematology analyzer by Beckman-Coulter. For clinical chemistry, lyophilized samples were received and processed on Siemens Dimension Xpand and RXL analyzers. For microbiology, EQAS samples were received quarterly every year as lyophilized strains along with smears and serological samples. In hematology, no outliers were noted for reticulocyte and peripheral smear examination; only one outlier was noted for CBC. In clinical chemistry, outliers (SDI ≥ 2) were noted in 7 samples (23 parameters) out of a total of 36 samples (756 parameters) processed. Thirteen of these parameters were attributed to random error, three to transcriptional error, and seven to systematic error. In microbiology, one discrepancy was noted in isolate identification and in the grading of smears for AFB by Ziehl-Neelsen stain. EQAS, along with IQC, is a very important tool for maintaining optimal quality of services. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. The Use of Compressive Sensing to Reconstruct Radiation Characteristics of Wide-Band Antennas from Sparse Measurements

    DTIC Science & Technology

    2015-06-01

    …of uniform- versus nonuniform-pattern reconstruction, of the transform function used, and of the minimum number of randomly distributed measurements needed to… the radiation-frequency pattern's reconstruction using uniform and nonuniform randomly distributed samples, even though the pattern error manifests… (Figure caption fragment: "Fig. 3. The nonuniform compressive-sensing reconstruction of the radiation…")

  20. Mapping ecological systems with a random forest model: tradeoffs between errors and bias

    Treesearch

    Emilie Grossmann; Janet Ohmann; James Kagan; Heather May; Matthew Gregory

    2010-01-01

    New methods for predictive vegetation mapping allow improved estimations of plant community composition across large regions. Random Forest (RF) models limit over-fitting problems of other methods, and are known for making accurate classification predictions from noisy, nonnormal data, but can be biased when plot samples are unbalanced. We developed two contrasting...

  1. An audit strategy for time-to-event outcomes measured with error: application to five randomized controlled trials in oncology.

    PubMed

    Dodd, Lori E; Korn, Edward L; Freidlin, Boris; Gu, Wenjuan; Abrams, Jeffrey S; Bushnell, William D; Canetta, Renzo; Doroshow, James H; Gray, Robert J; Sridhara, Rajeshwari

    2013-10-01

    Measurement error in time-to-event end points complicates interpretation of treatment effects in clinical trials. Non-differential measurement error is unlikely to produce large bias [1]. When error depends on treatment arm, bias is of greater concern. Blinded-independent central review (BICR) of all images from a trial is commonly undertaken to mitigate differential measurement-error bias that may be present in hazard ratios (HRs) based on local evaluations. Similar BICR and local evaluation HRs may provide reassurance about the treatment effect, but BICR adds considerable time and expense to trials. We describe a BICR audit strategy [2] and apply it to five randomized controlled trials to evaluate its use and to provide practical guidelines. The strategy requires BICR on a subset of study subjects, rather than a complete-case BICR, and makes use of an auxiliary-variable estimator. When the effect size is relatively large, the method provides a substantial reduction in the size of the BICRs. In a trial with 722 participants and a HR of 0.48, an average audit of 28% of the data was needed and always confirmed the treatment effect as assessed by local evaluations. More moderate effect sizes and/or smaller trial sizes required larger proportions of audited images, ranging from 57% to 100% for HRs ranging from 0.55 to 0.77 and sample sizes between 209 and 737. The method is developed for a simple random sample of study subjects. In studies with low event rates, more efficient estimation may result from sampling individuals with events at a higher rate. The proposed strategy can greatly decrease the costs and time associated with BICR, by reducing the number of images undergoing review. The savings will depend on the underlying treatment effect and trial size, with larger treatment effects and larger trials requiring smaller proportions of audited data.

  2. Decorrelation of the true and estimated classifier errors in high-dimensional settings.

    PubMed

    Hanczar, Blaise; Hua, Jianping; Dougherty, Edward R

    2007-01-01

    The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity, which refers to the precision of error estimation, is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular, the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated, so that natural questions arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three commonly used error estimators (leave-one-out cross-validation, k-fold cross-validation, and .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) known feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection or using all features, with the better correlation between the latter two showing no general trend, but differing for different models.

  3. Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

    PubMed

    Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

    2005-01-01

    Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
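    A permutation test in the spirit of PACE can be sketched as follows; this is an illustrative reimplementation, not the authors' code, using a nearest-centroid classifier and 5-fold cross-validation on synthetic profiles.

```python
import numpy as np

rng = np.random.default_rng(7)

def cv_error(X, y, folds=5):
    """k-fold cross-validated error of a nearest-centroid classifier."""
    idx = rng.permutation(len(y))
    errs = []
    for part in np.array_split(idx, folds):
        tr = np.setdiff1d(idx, part)
        m0, m1 = X[tr][y[tr] == 0].mean(0), X[tr][y[tr] == 1].mean(0)
        pred = (((X[part] - m1) ** 2).sum(1) < ((X[part] - m0) ** 2).sum(1)).astype(int)
        errs.append((pred != y[part]).mean())
    return np.mean(errs)

# Hypothetical profiles: 60 samples, 100 features, weak signal in 5 features.
n, p = 60, 100
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p)); X[y == 1, :5] += 1.0

ace = cv_error(X, y)                               # achieved classification error
null = [cv_error(X, rng.permutation(y)) for _ in range(200)]
p_value = np.mean([e <= ace for e in null])        # chance of doing as well by luck
print(f"ACE = {ace:.3f}, permutation p-value = {p_value:.3f}")
```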

  4. Design and simulation study of the immunization Data Quality Audit (DQA).

    PubMed

    Woodard, Stacy; Archer, Linda; Zell, Elizabeth; Ronveaux, Olivier; Birmingham, Maureen

    2007-08-01

    The goal of the Data Quality Audit (DQA) is to assess whether the Global Alliance for Vaccines and Immunization-funded countries are adequately reporting the number of diphtheria-tetanus-pertussis immunizations given, on which the "shares" are awarded. Because this sampling design is a modified two-stage cluster sample (modified because a stratified, rather than a simple, random sample of health facilities is obtained from the selected clusters), the formula for the calculation of the standard error of the estimate is unknown. An approximated standard error has been proposed, and the first goal of this simulation is to assess the accuracy of that standard error. Results from the simulations based on hypothetical populations were found not to be representative of the actual DQAs that were conducted. Additional simulations were then conducted on the actual DQA data to better assess the precision of the DQA with both the original and the increased sample sizes.

  5. Precipitation and Latent Heating Distributions from Satellite Passive Microwave Radiometry. Part I: Method and Uncertainties

    NASA Technical Reports Server (NTRS)

    Olson, William S.; Kummerow, Christian D.; Yang, Song; Petty, Grant W.; Tao, Wei-Kuo; Bell, Thomas L.; Braun, Scott A.; Wang, Yansen; Lang, Stephen E.; Johnson, Daniel E.

    2004-01-01

    A revised Bayesian algorithm for estimating surface rain rate, convective rain proportion, and latent heating/drying profiles from satellite-borne passive microwave radiometer observations over ocean backgrounds is described. The algorithm searches a large database of cloud-radiative model simulations to find cloud profiles that are radiatively consistent with a given set of microwave radiance measurements. The properties of these radiatively consistent profiles are then composited to obtain best estimates of the observed properties. The revised algorithm is supported by an expanded and more physically consistent database of cloud-radiative model simulations. The algorithm also features a better quantification of the convective and non-convective contributions to total rainfall, a new geographic database, and an improved representation of background radiances in rain-free regions. Bias and random error estimates are derived from applications of the algorithm to synthetic radiance data, based upon a subset of cloud resolving model simulations, and from the Bayesian formulation itself. Synthetic rain rate and latent heating estimates exhibit a trend of high (low) bias for low (high) retrieved values. The Bayesian estimates of random error are propagated to represent errors at coarser time and space resolutions, based upon applications of the algorithm to TRMM Microwave Imager (TMI) data. Errors in instantaneous rain rate estimates at 0.5 deg resolution range from approximately 50% at 1 mm/h to 20% at 14 mm/h. These errors represent about 70-90% of the mean random deviation between collocated passive microwave and spaceborne radar rain rate estimates. The cumulative algorithm error in TMI estimates at monthly, 2.5 deg resolution is relatively small (less than 6% at 5 mm/day) compared to the random error due to infrequent satellite temporal sampling (8-35% at the same rain rate).

  6. Experiential Teaching Increases Medication Calculation Accuracy Among Baccalaureate Nursing Students.

    PubMed

    Hurley, Teresa V

    Safe medication administration is an international goal. Calculation errors cause patient harm despite education. The research purpose was to evaluate the effectiveness of an experiential teaching strategy to reduce errors in a sample of 78 baccalaureate nursing students at a Northeastern college. A pretest-posttest design with random assignment into equal-sized groups was used. The experiential strategy was more effective than the traditional method (t = -0.312, df = 37, p = .004, 95% CI) with a reduction in calculation errors. Evaluations of error type and teaching strategies are indicated to facilitate course and program changes.

  7. A technique for evaluating the influence of spatial sampling on the determination of global mean total columnar ozone

    NASA Technical Reports Server (NTRS)

    Tolson, R. H.

    1981-01-01

    A technique is described for evaluating the influence of spatial sampling on the determination of global mean total columnar ozone. The ozone field is represented by a spherical harmonic expansion, and first- and second-order statistics are derived for each term; these statistics are used to estimate systematic and random errors in the estimates of total ozone. A finite number of coefficients in the expansion are determined, and the truncated part of the expansion is shown to contribute an error to the estimate that depends strongly on the spatial sampling and is relatively insensitive to data noise.

  8. Role of turbulence fluctuations on uncertainties of acoustic Doppler current profiler discharge measurements

    USGS Publications Warehouse

    Tarrab, Leticia; Garcia, Carlos M.; Cantero, Mariano I.; Oberg, Kevin

    2012-01-01

    This work presents a systematic analysis quantifying the role of turbulence fluctuations in the uncertainties (random errors) of acoustic Doppler current profiler (ADCP) discharge measurements from moving platforms. Data sets of three-dimensional flow velocities with high temporal and spatial resolution were generated from direct numerical simulation (DNS) of turbulent open channel flow. Dimensionless functions relating parameters that quantify the uncertainty in discharge measurements due to flow turbulence (relative variance and relative maximum random error) to the sampling configuration were developed from the DNS results and then validated with field-scale discharge measurements. The validated functions were used to evaluate the role of flow turbulence fluctuations in the uncertainties of ADCP discharge measurements. The results indicate that random errors due to flow turbulence are significant when (a) a small number of transects is used for a discharge measurement, and (b) measurements are made in shallow rivers using high boat velocities (short times for the boat to cross a flow turbulence structure).

  9. The Number of Patients and Events Required to Limit the Risk of Overestimation of Intervention Effects in Meta-Analysis—A Simulation Study

    PubMed Central

    Thorlund, Kristian; Imberger, Georgina; Walsh, Michael; Chu, Rong; Gluud, Christian; Wetterslev, Jørn; Guyatt, Gordon; Devereaux, Philip J.; Thabane, Lehana

    2011-01-01

    Background Meta-analyses including a limited number of patients and events are prone to yield overestimated intervention effect estimates. While many assume bias is the cause of overestimation, theoretical considerations suggest that random error may be an equal or more frequent cause. The independent impact of random error on meta-analyzed intervention effects has not previously been explored. It has been suggested that surpassing the optimal information size (i.e., the required meta-analysis sample size) provides sufficient protection against overestimation due to random error, but this claim had not yet been validated. Methods We simulated a comprehensive array of meta-analysis scenarios where no intervention effect existed (i.e., relative risk reduction (RRR) = 0%) or where a small but possibly unimportant effect existed (RRR = 10%). We constructed different scenarios by varying the control group risk, the degree of heterogeneity, and the distribution of trial sample sizes. For each scenario, we calculated the probability of observing overestimates of RRR > 20% and RRR > 30% for each cumulative 500 patients and 50 events. We calculated the cumulative number of patients and events required to reduce the probability of overestimation of the intervention effect to 10%, 5%, and 1%. We calculated the optimal information size for each of the simulated scenarios and explored whether meta-analyses that surpassed their optimal information size had sufficient protection against overestimation of intervention effects due to random error. Results The risk of overestimation of intervention effects was usually high when the number of patients and events was small, and this risk decreased exponentially as the number of patients and events increased. The number of patients and events required to limit the risk of overestimation depended considerably on the underlying simulation settings. Surpassing the optimal information size generally provided sufficient protection against overestimation. Conclusions Random errors are a frequent cause of overestimation of intervention effects in meta-analyses. Surpassing the optimal information size will provide sufficient protection against overestimation. PMID:22028777
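    The core simulation is straightforward to sketch. The following numpy fragment estimates, under a deliberately simplified model (fixed trial size, no heterogeneity, pooled risk ratio), the probability that a cumulative meta-analysis with true RRR = 0% shows an observed RRR above 20%; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def overestimation_prob(n_meta=2000, n_trials=20, n_per_arm=100,
                        control_risk=0.10, rrr_true=0.0, threshold=0.20):
    """P(cumulative meta-analysis shows observed RRR > threshold) after each
    trial, when the true RRR is rrr_true (pooled risk ratio, no heterogeneity)."""
    p_c, p_t = control_risk, control_risk * (1 - rrr_true)
    hits = np.zeros(n_trials)
    for _ in range(n_meta):
        ev_c = rng.binomial(n_per_arm, p_c, n_trials).cumsum()
        ev_t = rng.binomial(n_per_arm, p_t, n_trials).cumsum()
        n_cum = n_per_arm * np.arange(1, n_trials + 1)
        rr = (ev_t / n_cum) / np.maximum(ev_c / n_cum, 1e-12)
        hits += (1.0 - rr) > threshold
    return hits / n_meta

probs = overestimation_prob()
for k in (1, 5, 10, 20):
    print(f"after {k:2d} trials of 200 patients: P(observed RRR > 20%) = {probs[k - 1]:.3f}")
```

    The overestimation probability falls steeply as patients accumulate, which is the qualitative pattern reported above.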

  10. The Bootstrap, the Jackknife, and the Randomization Test: A Sampling Taxonomy.

    PubMed

    Rodgers, J L

    1999-10-01

    A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.
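    The taxonomy's two axes (with versus without replacement; whole sample versus subset) map directly onto code. A minimal numpy illustration of all three methods on a toy dataset:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(10, 2, size=25)     # one observed sample
y = rng.normal(11, 2, size=25)     # a second group, for the randomization test
n = x.size

# Bootstrap: resample the WHOLE sample WITH replacement.
boot = np.array([rng.choice(x, size=n, replace=True).mean() for _ in range(5000)])
print("bootstrap SE of the mean:", round(boot.std(ddof=1), 3))

# Jackknife: a SUBSET (n - 1 values) WITHOUT replacement, i.e. leave-one-out.
jack = np.array([np.delete(x, i).mean() for i in range(n)])
se_jack = np.sqrt((n - 1) / n * ((jack - jack.mean()) ** 2).sum())
print("jackknife SE of the mean:", round(se_jack, 3))

# Randomization test: reshuffle the WHOLE pooled sample WITHOUT replacement.
obs = y.mean() - x.mean()
pooled = np.concatenate([x, y])
perm = []
for _ in range(5000):
    z = rng.permutation(pooled)
    perm.append(z[:n].mean() - z[n:].mean())
print("two-sided permutation p-value:", np.mean(np.abs(perm) >= abs(obs)))
```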

  11. The Impact of Short-Term Science Teacher Professional Development on the Evaluation of Student Understanding and Errors Related to Natural Selection. CRESST Report 822

    ERIC Educational Resources Information Center

    Buschang, Rebecca E.

    2012-01-01

    This study evaluated the effects of a short-term professional development session. Forty volunteer high school biology teachers were randomly assigned to one of two professional development conditions: (a) developing deep content knowledge (i.e., control condition) or (b) evaluating student errors and understanding in writing samples (i.e.,…

  12. Adjusting for multiple prognostic factors in the analysis of randomised trials

    PubMed Central

    2013-01-01

    Background When multiple prognostic factors are adjusted for in the analysis of a randomised trial, it is unclear (1) whether it is necessary to account for each of the strata formed by all combinations of the prognostic factors (stratified analysis) when randomisation has been balanced within each stratum (stratified randomisation), or whether adjusting for the main effects alone will suffice, and (2) which method of adjustment performs best in terms of type I error rate and power, irrespective of the randomisation method. Methods We used simulation to (1) determine whether a stratified analysis is necessary after stratified randomisation, and (2) compare different methods of adjustment in terms of power and type I error rate. We considered the following methods of analysis: adjusting for covariates in a regression model, adjusting for each stratum using either fixed or random effects, and the Mantel-Haenszel test or a stratified Cox model depending on the outcome. Results A stratified analysis is required after stratified randomisation to maintain correct type I error rates when (a) there are strong interactions between prognostic factors, and (b) there are approximately equal numbers of patients in each stratum. However, simulations based on real trial data found that type I error rates were unaffected by the method of analysis (stratified vs unstratified), indicating that these conditions were not met in real datasets. Comparison of different analysis methods found that with small sample sizes and a binary or time-to-event outcome, most analysis methods led to either inflated type I error rates or a reduction in power; the lone exception was a stratified analysis using random effects for strata, which gave nominal type I error rates and adequate power. Conclusions It is unlikely that a stratified analysis is necessary after stratified randomisation except in extreme scenarios; the method of analysis (accounting for the strata, or adjusting only for the covariates) will therefore not generally need to depend on the method of randomisation used. Most methods of analysis work well with large sample sizes; however, treating strata as random effects should be the analysis method of choice with binary or time-to-event outcomes and a small sample size. PMID:23898993

  13. What randomized benchmarking actually measures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Proctor, Timothy; Rudinger, Kenneth; Young, Kevin

    Randomized benchmarking (RB) is widely used to measure an error rate of a set of quantum gates, by performing random circuits that would do nothing if the gates were perfect. In the limit of no finite-sampling error, the exponential decay rate of the observable survival probabilities, versus circuit length, yields a single error metric r. For Clifford gates with arbitrary small errors described by process matrices, r was believed to reliably correspond to the mean, over all Clifford gates, of the average gate infidelity between the imperfect gates and their ideal counterparts. We show that this quantity is not a well-defined property of a physical gate set. It depends on the representations used for the imperfect and ideal gates, and the variant typically computed in the literature can differ from r by orders of magnitude. We present new theories of the RB decay that are accurate for all small errors describable by process matrices, and show that the RB decay curve is a simple exponential for all such errors. These theories allow explicit computation of the error rate that RB measures (r), but as far as we can tell it does not correspond to the infidelity of a physically allowed (completely positive) representation of the imperfect gates.

  14. Statistical considerations in evaluating pharmacogenomics-based clinical effect for confirmatory trials.

    PubMed

    Wang, Sue-Jane; O'Neill, Robert T; Hung, Hm James

    2010-10-01

    The current practice for seeking genomically favorable patients in randomized controlled clinical trials uses genomic convenience samples. We discuss the extent of imbalance, confounding, bias, design efficiency loss, type I error, and type II error that can occur in the evaluation of convenience samples, particularly when they are small; articulate statistical considerations for a reasonable sample size to minimize the chance of imbalance; and highlight the importance of replicating a subgroup finding in independent studies. Four case examples reflecting recent regulatory experiences are used to underscore the problems with convenience samples. The probability of imbalance for a pre-specified subgroup is provided to elucidate the sample size needed to minimize the chance of imbalance, and an example drug development program is used to highlight the level of scientific rigor needed, with evidence replicated for a pre-specified subgroup claim. The convenience samples evaluated ranged from 18% to 38% of the intent-to-treat samples, with sample sizes ranging from 100 to 5000 patients per arm. Baseline imbalance can occur with probability higher than 25%. Mild to moderate multiple confounders yielding the same directional bias in favor of the treated group can make the treatment groups incomparable at baseline and result in a false positive conclusion that there is a treatment difference. Conversely, if the same directional bias favors the placebo group or there is loss in design efficiency, the type II error can increase substantially. Pre-specification of a genomic subgroup hypothesis is useful only for some degree of type I error control. Complete ascertainment of genomic samples in a randomized controlled trial should be the first step to explore whether a favorable genomic patient subgroup suggests a treatment effect when there is no clear prior knowledge and understanding about how the mechanism of a drug target affects the clinical outcome of interest. When stratified randomization based on genomic biomarker status cannot be implemented in designing a pharmacogenomics confirmatory clinical trial, and there is one genomic biomarker prognostic for clinical response, then as a general rule of thumb a sample size of at least 100 patients may be needed for the lower-prevalence genomic subgroup to minimize the chance of an imbalance of 20% or more in the prevalence of the genomic marker. The sample size may need to be at least 150, 350, and 1350, respectively, if an imbalance of 15%, 10%, or 5% is of concern.
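    The imbalance probabilities quoted above are easy to approximate by simulation. A small numpy sketch, with illustrative (not the paper's) prevalence and thresholds:

```python
import numpy as np

rng = np.random.default_rng(0)

def imbalance_prob(n_per_arm, prevalence, diff, n_sim=200_000):
    """Monte Carlo P(observed marker prevalence differs between the two arms
    by at least `diff`) under 1:1 randomization."""
    a = rng.binomial(n_per_arm, prevalence, n_sim) / n_per_arm
    b = rng.binomial(n_per_arm, prevalence, n_sim) / n_per_arm
    return np.mean(np.abs(a - b) >= diff)

# Illustrative values only: a marker carried by 20% of patients.
for n in (50, 100, 150, 350, 1350):
    print(f"n = {n:4d} per arm: P(imbalance >= 10 percentage points) = "
          f"{imbalance_prob(n, 0.20, 0.10):.3f}")
```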

  15. Multi-kW coherent combining of fiber lasers seeded with pseudo random phase modulated light

    NASA Astrophysics Data System (ADS)

    Flores, Angel; Ehrehreich, Thomas; Holten, Roger; Anderson, Brian; Dajani, Iyad

    2016-03-01

    We report efficient coherent beam combining of five kilowatt-class fiber amplifiers with a diffractive optical element (DOE). Based on a master oscillator power amplifier (MOPA) configuration, the amplifiers were seeded with pseudo random phase modulated light. Each non-polarization maintaining fiber amplifier was optically path length matched and provides approximately 1.2 kW of near diffraction-limited output power (measured M2<1.1). Consequently, a low power sample of each laser was utilized for active linear polarization control. A low power sample of the combined beam after the DOE provided an error signal for active phase locking which was performed via Locking of Optical Coherence by Single-Detector Electronic-Frequency Tagging (LOCSET). After phase stabilization, the beams were coherently combined via the 1x5 DOE. A total combined output power of 4.9 kW was achieved with 82% combining efficiency and excellent beam quality (M2<1.1). The intrinsic DOE splitter loss was 5%. Similarly, losses due in part to non-ideal polarization, ASE content, uncorrelated wavefront errors, and misalignment errors contributed to the efficiency reduction.

  16. Two-sample binary phase 2 trials with low type I error and low sample size

    PubMed Central

    Litwin, Samuel; Basickes, Stanley; Ross, Eric A.

    2017-01-01

    We address the design of two-stage clinical trials comparing experimental and control patients. Our end point is success or failure, however measured, with the null hypothesis that the chance of success in both arms is p0 and the alternative that it is p0 among controls and p1 > p0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules, E ≥ m, with two-sample rules of the form E − C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than the 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. PMID:28118686
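    The operating characteristics of such a combined rule can be computed exactly by enumerating the two binomial distributions. A sketch with scipy, using hypothetical design constants m and r (and a single-stage simplification) rather than the paper's tabulated two-stage designs:

```python
import numpy as np
from scipy.stats import binom

def reject_prob(p, n_exp, n_ctl, m, r):
    """P(E >= m and E - C > r), with E ~ Bin(n_exp, p) and C ~ Bin(n_ctl, p)
    independent; one plausible way of combining the two rules above."""
    e = np.arange(n_exp + 1)
    c = np.arange(n_ctl + 1)
    pe = binom.pmf(e, n_exp, p)
    pc = binom.pmf(c, n_ctl, p)
    rejected = (e[:, None] >= m) & ((e[:, None] - c[None, :]) > r)
    return float(pe @ rejected @ pc)

# Hypothetical design with 2:1 randomization and made-up thresholds:
# 40 experimental vs 20 control patients.
p0, p1 = 0.20, 0.40
print("type I error at p0:", round(reject_prob(p0, 40, 20, m=14, r=5), 4))
print("power at p1       :", round(reject_prob(p1, 40, 20, m=14, r=5), 4))
```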

  17. Physical layer one-time-pad data encryption through synchronized semiconductor laser networks

    NASA Astrophysics Data System (ADS)

    Argyris, Apostolos; Pikasis, Evangelos; Syvridis, Dimitris

    2016-02-01

    Semiconductor lasers (SLs) have proven to be a key device in the generation of ultrafast true random bit streams. Their potential to emit chaotic signals with desirable statistics establishes them as a low-cost solution covering various needs, from large-volume key generation to real-time encrypted communications. Usually, only undemanding post-processing is needed to convert the acquired analog time series into digital sequences that pass all established tests of randomness. A novel architecture that can generate and exploit these true random sequences is a fiber network in which the nodes are semiconductor lasers coupled and synchronized to a central hub laser. In this work we show experimentally that laser nodes in such a star network topology can synchronize with each other through complex broadband signals that seed true random bit sequences (TRBS) generated at several Gb/s. The ability of each node to access random bit streams that are generated in real time and synchronized with the rest of the nodes, through the fiber-optic network, allows implementation of a one-time-pad encryption protocol that mixes the synchronized true random bit sequence with real data at Gb/s rates. Forward-error-correction methods are used to reduce the errors in the TRBS and the final error rate at the data-decoding level. An appropriate selection of the sampling methodology and properties, as well as of the physical properties of the chaotic seed signal through which the network locks in synchronization, allows error-free performance.
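    The one-time-pad step itself is a plain XOR of data with a never-reused key stream. A minimal Python sketch in which os.urandom stands in for the synchronized chaotic-laser bit sequence:

```python
import os

def xor_pad(data: bytes, key: bytes) -> bytes:
    """One-time pad: XOR each data byte with a key byte that is never reused."""
    assert len(key) >= len(data), "pad must be at least as long as the message"
    return bytes(d ^ k for d, k in zip(data, key))

# os.urandom stands in for the synchronized true random bit sequence that,
# in the scheme above, both nodes obtain from the chaotic laser network.
message = b"data at Gb/s rates"
pad = os.urandom(len(message))

ciphertext = xor_pad(message, pad)    # sender mixes data with the pad
recovered = xor_pad(ciphertext, pad)  # receiver XORs with its synchronized copy
assert recovered == message
print(ciphertext.hex())
```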

  18. Combined Uncertainty and A-Posteriori Error Bound Estimates for General CFD Calculations: Theory and Software Implementation

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2014-01-01

    This workshop presentation discusses the design and implementation of numerical methods for the quantification of statistical uncertainty, including a-posteriori error bounds, for output quantities computed using CFD methods. Hydrodynamic realizations often contain numerical error arising from finite-dimensional approximation (e.g. numerical methods using grids, basis functions, particles) and statistical uncertainty arising from incomplete information and/or statistical characterization of model parameters and random fields. The first task at hand is to derive formal error bounds for statistics given realizations containing finite-dimensional numerical error [1]. The error in computed output statistics contains contributions from both realization error and the error resulting from the calculation of statistics integrals using a numerical method. A second task is to devise computable a-posteriori error bounds by numerically approximating all terms arising in the error bound estimates. For the same reason that CFD calculations including error bounds but omitting uncertainty modeling are only of limited value, CFD calculations including uncertainty modeling but omitting error bounds are only of limited value. To gain maximum value from CFD calculations, a general software package for uncertainty quantification with quantified error bounds has been developed at NASA. The package provides implementations for a suite of numerical methods used in uncertainty quantification: Dense tensorization basis methods [3] and a subscale recovery variant [1] for non-smooth data, Sparse tensorization methods[2] utilizing node-nested hierarchies, Sampling methods[4] for high-dimensional random variable spaces.

  19. Estimation After a Group Sequential Trial.

    PubMed

    Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert

    2015-10-01

    Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size and marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite-sample unbiased, but is less efficient than the sample average and has a larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n1, n2, …, nL}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.

  20. Sampling for mercury at subnanogram per litre concentrations for load estimation in rivers

    USGS Publications Warehouse

    Colman, J.A.; Breault, R.F.

    2000-01-01

    Estimation of constituent loads in streams requires collection of stream samples that are representative of constituent concentrations, that is, composites of isokinetic multiple verticals collected along a stream transect. An all-Teflon isokinetic sampler (DH-81) cleaned in 75 °C, 4 N HCl was tested using blank, split, and replicate samples to assess systematic and random sample contamination by mercury species. Mean mercury concentrations in field-equipment blanks were low: 0.135 ng/L for total mercury (ΣHg) and 0.0086 ng/L for monomethyl mercury (MeHg). Mean square errors (MSE) for ΣHg and MeHg duplicate samples collected at eight sampling stations were not statistically different from the MSE of samples split in the laboratory, which represents the analytical and splitting error. Low field-blank concentrations and statistically equal duplicate- and split-sample MSE values indicate that no measurable contamination was occurring during sampling. Standard deviations associated with example mercury load estimations were four to five times larger, on a relative basis, than standard deviations calculated from duplicate samples, indicating that error of the load determination was primarily a function of the loading model used, not of the sampling or analytical methods.

  1. Pressing the Approach: A NASA Study of 19 Recent Accidents Yields a New Perspective on Pilot Error

    NASA Technical Reports Server (NTRS)

    Berman, Benjamin A.; Dismukes, R. Key

    2007-01-01

    This article begins with a review of two sample airplane accidents that were caused by pilot error. The analysis of these and 17 other accidents suggested that almost any experienced crew, operating in the same environment in which the accident crews were operating and knowing only what the accident crews knew at each moment of the flight, would be vulnerable to making similar decisions and similar errors. Whether a particular crew in a given situation makes errors depends on a somewhat random interaction of factors. Two themes prevalent in these cases are plan continuation bias and snowballing workload.

  2. Distributional assumptions in food and feed commodities- development of fit-for-purpose sampling protocols.

    PubMed

    Paoletti, Claudia; Esbensen, Kim H

    2015-01-01

    Material heterogeneity influences the effectiveness of sampling procedures. Most sampling guidelines used for assessment of food and/or feed commodities are based on classical statistical distribution requirements (the normal, binomial, and Poisson distributions) and almost universally rely on the assumption of randomness. However, this is unrealistic. The scientific food and feed community recognizes a strong preponderance of nonrandom distribution within commodity lots, which should be a more realistic prerequisite for the definition of effective sampling protocols. Nevertheless, these heterogeneity issues are overlooked when the prime focus is placed only on financial, time, equipment, and personnel constraints instead of mandating acquisition of documented representative samples under realistic heterogeneity conditions. This study shows how the principles promulgated in the Theory of Sampling (TOS), practically tested over 60 years, provide an effective framework for dealing with the complete set of adverse aspects of both compositional and distributional heterogeneity (material sampling errors), as well as with the errors incurred by the sampling process itself. The results of an empirical European Union study on genetically modified soybean heterogeneity (Kernel Lot Distribution Assessment) are summarized, as they have a strong bearing on the issue of proper sampling protocol development. TOS principles apply universally in the food and feed realm and must therefore be considered the only basis for development of valid sampling protocols free from distributional constraints.

  3. Irregular analytical errors in diagnostic testing - a novel concept.

    PubMed

    Vogeser, Michael; Seger, Christoph

    2018-02-23

    In laboratory medicine, routine periodic analyses for internal and external quality control measurements interpreted by statistical methods are mandatory for batch clearance. Data analysis of these process-oriented measurements allows for insight into random analytical variation and systematic calibration bias over time. However, in such a setting, any individual sample is not under individual quality control. The quality control measurements act only at the batch level. Quantitative or qualitative data derived for many effects and interferences associated with an individual diagnostic sample can compromise any analyte. It is obvious that a process for a quality-control-sample-based approach of quality assurance is not sensitive to such errors. To address the potential causes and nature of such analytical interference in individual samples more systematically, we suggest the introduction of a new term called the irregular (individual) analytical error. Practically, this term can be applied in any analytical assay that is traceable to a reference measurement system. For an individual sample an irregular analytical error is defined as an inaccuracy (which is the deviation from a reference measurement procedure result) of a test result that is so high it cannot be explained by measurement uncertainty of the utilized routine assay operating within the accepted limitations of the associated process quality control measurements. The deviation can be defined as the linear combination of the process measurement uncertainty and the method bias for the reference measurement system. Such errors should be coined irregular analytical errors of the individual sample. The measurement result is compromised either by an irregular effect associated with the individual composition (matrix) of the sample or an individual single sample associated processing error in the analytical process. Currently, the availability of reference measurement procedures is still highly limited, but LC-isotope-dilution mass spectrometry methods are increasingly used for pre-market validation of routine diagnostic assays (these tests also involve substantial sets of clinical validation samples). Based on this definition/terminology, we list recognized causes of irregular analytical error as a risk catalog for clinical chemistry in this article. These issues include reproducible individual analytical errors (e.g. caused by anti-reagent antibodies) and non-reproducible, sporadic errors (e.g. errors due to incorrect pipetting volume due to air bubbles in a sample), which can both lead to inaccurate results and risks for patients.

  4. Jackknifing Techniques for Evaluation of Equating Accuracy. Research Report. ETS RR-09-39

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Lee, Yi-Hsuan; Qian, Jiahe

    2009-01-01

    Grouped jackknifing may be used to evaluate the stability of equating procedures with respect to sampling error and with respect to changes in anchor selection. Properties of grouped jackknifing are reviewed for simple-random and stratified sampling, and its use is described for comparisons of anchor sets. Application is made to examples of item…

  5. Hazard Function Estimation with Cause-of-Death Data Missing at Random.

    PubMed

    Wang, Qihua; Dinse, Gregg E; Liu, Chunling

    2012-04-01

    Hazard function estimation is an important part of survival analysis. Interest often centers on estimating the hazard function associated with a particular cause of death. We propose three nonparametric kernel estimators for the hazard function, all of which are appropriate when death times are subject to random censorship and censoring indicators can be missing at random. Specifically, we present a regression surrogate estimator, an imputation estimator, and an inverse probability weighted estimator. All three estimators are uniformly strongly consistent and asymptotically normal. We derive asymptotic representations of the mean squared error and the mean integrated squared error for these estimators and we discuss a data-driven bandwidth selection method. A simulation study, conducted to assess finite sample behavior, demonstrates that the proposed hazard estimators perform relatively well. We illustrate our methods with an analysis of some vascular disease data.

  6. Gram-negative and -positive bacteria differentiation in blood culture samples by headspace volatile compound analysis.

    PubMed

    Dolch, Michael E; Janitza, Silke; Boulesteix, Anne-Laure; Graßmann-Lichtenauer, Carola; Praun, Siegfried; Denzer, Wolfgang; Schelling, Gustav; Schubert, Sören

    2016-12-01

    Identification of microorganisms in positive blood cultures still relies on standard techniques such as Gram staining followed by culturing with definite microorganism identification. Alternatively, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry or the analysis of the headspace volatile compound (VC) composition produced by cultures can help to differentiate between microorganisms under experimental conditions. This study assessed the efficacy of VC-based microorganism differentiation into Gram-negatives and -positives in unselected positive blood culture samples from patients. Headspace gas samples of positive blood cultures were transferred to sterilized, sealed, and evacuated 20 ml glass vials and stored at -30 °C until batch analysis. Headspace gas VC content analysis was carried out via an autosampler connected to an ion-molecule reaction mass spectrometer (IMR-MS). Measurements covered a mass range from 16 to 135 u, including CO2, H2, N2, and O2. Prediction rules for microorganism identification based on VC composition were derived using a training data set and evaluated using a validation data set within a random split validation procedure. In total, 152 aerobic samples growing 27 Gram-negatives, 106 Gram-positives, and 19 fungi, and 130 anaerobic samples growing 37 Gram-negatives, 91 Gram-positives, and two fungi were analysed. In anaerobic samples, ten discriminators were identified by the random forest method, allowing for bacteria differentiation into Gram-negative and -positive (error rate: 16.7% in the validation data set). For aerobic samples the error rate was not better than random. In anaerobic blood culture samples from patients, IMR-MS-based headspace VC composition analysis facilitates bacteria differentiation into Gram-negative and -positive.

  7. Enhancing adaptive sparse grid approximations and improving refinement strategies using adjoint-based a posteriori error estimates

    DOE PAGES

    Jakeman, J. D.; Wildey, T.

    2015-01-01

    In this paper we present an algorithm for adaptive sparse grid approximations of quantities of interest computed from discretized partial differential equations. We use adjoint-based a posteriori error estimates of the interpolation error in the sparse grid to enhance the sparse grid approximation and to drive adaptivity. We show that utilizing these error estimates provides significantly more accurate functional values for random samples of the sparse grid approximation. We also demonstrate that alternative refinement strategies based upon a posteriori error estimates can lead to further increases in accuracy in the approximation over traditional hierarchical surplus based strategies. Throughout this paper we also provide and test a framework for balancing the physical discretization error with the stochastic interpolation error of the enhanced sparse grid approximation.

  8. Two-sample binary phase 2 trials with low type I error and low sample size.

    PubMed

    Litwin, Samuel; Basickes, Stanley; Ross, Eric A

    2017-04-30

    We address design of two-stage clinical trials comparing experimental and control patients. Our end point is success or failure, however measured, with null hypothesis that the chance of success in both arms is p0 and alternative that it is p0 among controls and p1 > p0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules, E ≥ m, with two-sample rules of the form E − C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. Copyright © 2017 John Wiley & Sons, Ltd.

  9. Measurements of stem diameter: implications for individual- and stand-level errors.

    PubMed

    Paul, Keryn I; Larmour, John S; Roxburgh, Stephen H; England, Jacqueline R; Davies, Micah J; Luck, Hamish D

    2017-08-01

    Stem diameter is one of the most common measurements made to assess the growth of woody vegetation and the commercial and environmental benefits that it provides (e.g. wood or biomass products, carbon sequestration, landscape remediation). Yet inconsistency in its measurement is a continuing source of error in estimates of stand-scale measures such as basal area, biomass, and volume. Here we assessed errors in stem diameter measurement through repeated measurements of individual trees and shrubs of varying size and form (i.e. single- and multi-stemmed) across a range of contrasting stands, from complex mixed-species plantings to commercial single-species plantations. We compared a standard diameter tape with a Stepped Diameter Gauge (SDG) for time efficiency and measurement error. Measurement errors in diameter were slightly (but significantly) influenced by the size and form of the tree or shrub, and by the stem height at which the measurement was made. Compared to standard tape measurement, the mean systematic error with SDG measurement was only -0.17 cm, but varied between -0.10 and -0.52 cm. Similarly, random error was relatively small, with standard deviations (and percentage coefficients of variation) averaging only 0.36 cm (and 3.8%), but varying between 0.14 and 0.61 cm (and 1.9 and 7.1%). However, at the stand scale, sampling errors (i.e. how well individual trees or shrubs selected for measurement of diameter represented the true stand population in terms of the average and distribution of diameter) generally had at least a tenfold greater influence on random errors in basal area estimates than errors in diameter measurements. This supports the use of diameter measurement tools that have high efficiency, such as the SDG. Use of the SDG almost halved the time required for measurements compared to the diameter tape. Based on these findings, recommendations include the following: (i) use of a tape to maximise accuracy when developing allometric models, or when monitoring relatively small changes in permanent sample plots (e.g. National Forest Inventories), noting that care is required in irregular-shaped, large single-stemmed individuals, and (ii) use of a SDG to maximise efficiency when using inventory methods to assess basal area, and hence biomass or wood volume, at the stand scale (i.e. in studies of impacts of management or site quality) where there are budgetary constraints, noting the importance of sufficient sample sizes to ensure that the population sampled represents the true population.

  10. A design of experiments approach to validation sampling for logistic regression modeling with error-prone medical records.

    PubMed

    Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay

    2016-04-01

    Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
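    A greedy D-optimal selection can be sketched briefly. This is an illustrative simplification of the DSCVR idea, not the authors' algorithm: the Fisher information of the logistic model is evaluated at beta = 0, where the weights are constant and drop out of the comparisons, and candidate records are added one at a time using the matrix determinant lemma.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical EMR predictors (intercept + 5 covariates). At beta = 0 the
# logistic weights are a constant 0.25, so maximizing det of the Fisher
# information reduces to maximizing det(X_s' X_s).
n, p = 2000, 6
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

def greedy_d_optimal(X, budget):
    """Greedily choose records to validate, maximizing det(X_s' X_s)."""
    chosen = list(rng.choice(len(X), size=X.shape[1], replace=False))
    info = X[chosen].T @ X[chosen]
    for _ in range(budget - len(chosen)):
        inv = np.linalg.inv(info)
        best, best_gain = None, -np.inf
        for i in range(len(X)):
            if i in chosen:
                continue
            # Matrix determinant lemma: det(info + x x') = det(info) (1 + x' inv x)
            gain = X[i] @ inv @ X[i]
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        info += np.outer(X[best], X[best])
    return chosen

sample = greedy_d_optimal(X, budget=60)
print("records selected for chart review:", len(sample))
```

    The selected records then receive the manual chart review, and the model is fitted to the validated subset only, mirroring the strategy described above.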

  11. Enhancing adaptive sparse grid approximations and improving refinement strategies using adjoint-based a posteriori error estimates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jakeman, J.D., E-mail: jdjakem@sandia.gov; Wildey, T.

    2015-01-01

    In this paper we present an algorithm for adaptive sparse grid approximations of quantities of interest computed from discretized partial differential equations. We use adjoint-based a posteriori error estimates of the physical discretization error and the interpolation error in the sparse grid to enhance the sparse grid approximation and to drive adaptivity of the sparse grid. Utilizing these error estimates provides significantly more accurate functional values for random samples of the sparse grid approximation. We also demonstrate that alternative refinement strategies based upon a posteriori error estimates can lead to further increases in accuracy in the approximation over traditional hierarchical surplus based strategies. Throughout this paper we also provide and test a framework for balancing the physical discretization error with the stochastic interpolation error of the enhanced sparse grid approximation.

  12. Random and systematic sampling error when hooking fish to monitor skin fluke (Benedenia seriolae) and gill fluke (Zeuxapta seriolae) burden in Australian farmed yellowtail kingfish (Seriola lalandi).

    PubMed

    Fensham, J R; Bubner, E; D'Antignana, T; Landos, M; Caraguel, C G B

    2018-05-01

    The Australian farmed yellowtail kingfish (Seriola lalandi, YTK) industry monitors skin fluke (Benedenia seriolae) and gill fluke (Zeuxapta seriolae) burden by pooling the fluke counts of 10 hooked YTK. The random and systematic error of this sampling strategy was evaluated to assess its potential impact on treatment decisions. Fluke abundance (fluke count per fish) in a study cage (estimated 30,502 fish) was assessed five times using the current sampling protocol, and its repeatability was estimated using the repeatability coefficient (CR) and the coefficient of variation (CV). Individual body weight, fork length, fluke abundance, prevalence, intensity (fluke count per infested fish) and density (fluke count per kg of fish) were compared between 100 hooked and 100 seined YTK (assumed representative of the entire population) to estimate potential selection bias. Depending on the fluke species and age category, the CR (the expected difference in parasite count between two sampling iterations) ranged from 0.78 to 114 flukes per fish. Capturing YTK by hooking increased the selection of fish of a weight and length in the lowest 5th percentile of the cage (RR = 5.75, 95% CI: 2.06-16.03, P-value = 0.0001). These lower-end YTK had on average an extra 31 juvenile and 6 adult Z. seriolae per kg of fish and an extra 3 juvenile and 0.4 adult B. seriolae per kg of fish, compared to the rest of the cage population (P-value < 0.05). Hooking YTK on the edge of the study cage biases sampling towards the smallest and most heavily infested fish in the population, resulting in poor repeatability (more variability amongst sampled fish) and an overestimation of parasite burden in the population. In this particular commercial situation the findings supported the health management program, whereas an underestimation of parasite burden could have had a production impact on the study population. In instances where fish populations and parasite burdens are more homogeneous, sampling error may be less severe. Sampling error when capturing fish from a sea cage is difficult to predict; the amplitude and direction of this error should be investigated for a given cultured fish species across a range of parasite burden and fish profile scenarios. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Oscillating-flow regenerator test rig: Woven screen and metal felt results

    NASA Technical Reports Server (NTRS)

    Gedeon, D.; Wood, J. G.

    1992-01-01

    We present correlating expressions, in terms of Reynolds or Peclet numbers, for friction factors, Nusselt numbers, enhanced axial conduction ratios, and overall heat flux ratios in four porous regenerator samples representative of Stirling-cycle regenerators: two woven-screen samples and two random-wire samples. Error estimates and comparison of our data with others' suggest that our correlations are reliable, but more samples over a range of porosities must be tested before the results become generally useful.

  14. Nonconvergence of the Wang-Landau algorithms with multiple random walkers.

    PubMed

    Belardinelli, R E; Pereyra, V D

    2016-05-01

    This paper discusses convergence properties of entropic sampling Monte Carlo methods with multiple random walkers, particularly the Wang-Landau (WL) and 1/t algorithms. The classical algorithms are modified by the use of m independent random walkers in the energy landscape to calculate the density of states (DOS). The Ising model is used to show the convergence properties in the calculation of the DOS, as well as the critical temperature, while the calculation of the number π by multidimensional integration is used in the continuum approximation. In each case, the error is obtained separately for each walker at a fixed time t; then, the average over m walkers is performed. It is observed that the error goes as 1/√m. However, if the number of walkers increases above a certain critical value m > m_x, the error reaches a constant value (i.e., it saturates). This occurs for both algorithms; however, it is shown that for a given system, the 1/t algorithm is more efficient and accurate than the similar version of the WL algorithm. It follows that it makes no sense to increase the number of walkers above the critical value m_x, since doing so does not reduce the error in the calculation. Therefore, increasing the number of walkers does not guarantee convergence.

  15. Randomized clinical trials in dentistry: Risks of bias, risks of random errors, reporting quality, and methodologic quality over the years 1955–2013

    PubMed Central

    Armijo-Olivo, Susan; Cummings, Greta G.; Amin, Maryam; Flores-Mir, Carlos

    2017-01-01

    Objectives To examine the risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions and the development of these aspects over time. Methods We included 540 randomized clinical trials from 64 selected systematic reviews. We extracted, in duplicate, details from each of the selected randomized clinical trials with respect to publication and trial characteristics, reporting and methodologic characteristics, and Cochrane risk of bias domains. We analyzed data using logistic regression and Chi-square statistics. Results Sequence generation was assessed to be inadequate (at unclear or high risk of bias) in 68% (n = 367) of the trials, while allocation concealment was inadequate in the majority of trials (n = 464; 85.9%). Blinding of participants and blinding of the outcome assessment were judged to be inadequate in 28.5% (n = 154) and 40.5% (n = 219) of the trials, respectively. A sample size calculation before the initiation of the study was not performed/reported in 79.1% (n = 427) of the trials, while the sample size was assessed as adequate in only 17.6% (n = 95) of the trials. Two thirds of the trials were not described as double blinded (n = 358; 66.3%), while the method of blinding was appropriate in 53% (n = 286) of the trials. We identified a significant decrease over time (1955–2013) in the proportion of trials assessed as having inadequately addressed methodological quality items (P < 0.05) in 30 out of the 40 quality criteria, or as being inadequate (at high or unclear risk of bias) in five domains of the Cochrane risk of bias tool: sequence generation, allocation concealment, incomplete outcome data, other sources of bias, and overall risk of bias. Conclusions The risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions have improved over time; however, further efforts that contribute to the development of more stringent methodology and detailed reporting of trials are still needed. PMID:29272315

  16. Randomized clinical trials in dentistry: Risks of bias, risks of random errors, reporting quality, and methodologic quality over the years 1955-2013.

    PubMed

    Saltaji, Humam; Armijo-Olivo, Susan; Cummings, Greta G; Amin, Maryam; Flores-Mir, Carlos

    2017-01-01

    To examine the risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions and the development of these aspects over time. We included 540 randomized clinical trials from 64 selected systematic reviews. We extracted, in duplicate, details from each of the selected randomized clinical trials with respect to publication and trial characteristics, reporting and methodologic characteristics, and Cochrane risk of bias domains. We analyzed data using logistic regression and Chi-square statistics. Sequence generation was assessed to be inadequate (at unclear or high risk of bias) in 68% (n = 367) of the trials, while allocation concealment was inadequate in the majority of trials (n = 464; 85.9%). Blinding of participants and blinding of the outcome assessment were judged to be inadequate in 28.5% (n = 154) and 40.5% (n = 219) of the trials, respectively. A sample size calculation before the initiation of the study was not performed/reported in 79.1% (n = 427) of the trials, while the sample size was assessed as adequate in only 17.6% (n = 95) of the trials. Two thirds of the trials were not described as double blinded (n = 358; 66.3%), while the method of blinding was appropriate in 53% (n = 286) of the trials. We identified a significant decrease over time (1955-2013) in the proportion of trials assessed as having inadequately addressed methodological quality items (P < 0.05) in 30 out of the 40 quality criteria, or as being inadequate (at high or unclear risk of bias) in five domains of the Cochrane risk of bias tool: sequence generation, allocation concealment, incomplete outcome data, other sources of bias, and overall risk of bias. The risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions have improved over time; however, further efforts that contribute to the development of more stringent methodology and detailed reporting of trials are still needed.

  17. Investigation of technology needs for avoiding helicopter pilot error related accidents

    NASA Technical Reports Server (NTRS)

    Chais, R. I.; Simpson, W. E.

    1985-01-01

    Pilot error, which is cited as a cause or related factor in most rotorcraft accidents, was examined. Pilot error related accidents in helicopters were investigated to identify areas in which new technology could reduce or eliminate the underlying causes of these human errors. The aircraft accident data base at the U.S. Army Safety Center was used as the source of data on helicopter accidents. A randomly selected sample of 110 aircraft records was analyzed on a case-by-case basis to assess the nature of the problems which need to be resolved and the applicable technology implications. Six technology areas in which there appears to be a need for new or increased emphasis are identified.

  18. The Effect of Cluster Sampling Design in Survey Research on the Standard Error Statistic.

    ERIC Educational Resources Information Center

    Wang, Lin; Fan, Xitao

    Standard statistical methods are used to analyze data that are assumed to have been collected using a simple random sampling scheme. These methods, however, tend to underestimate variance when the data are collected with a cluster design, which is often found in educational survey research. The purposes of this paper are to demonstrate how a cluster design…
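
    As a sketch of the design-effect correction at issue here (standard survey-sampling algebra, not reproduced from the paper; the function name and example values are hypothetical): for an average cluster size m and intraclass correlation rho, the simple-random-sampling standard error is inflated by the square root of DEFF = 1 + (m - 1) * rho.

        import math

        def cluster_adjusted_se(se_srs, mean_cluster_size, icc):
            # Design effect for one-stage cluster sampling: DEFF = 1 + (m - 1) * rho.
            deff = 1.0 + (mean_cluster_size - 1) * icc
            return se_srs * math.sqrt(deff)

        # Example: classes of 25 students with ICC = 0.10 inflate the SE by ~84%.
        print(cluster_adjusted_se(0.02, 25, 0.10))  # ~0.037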

  19. Type I and Type II Error Rates and Overall Accuracy of the Revised Parallel Analysis Method for Determining the Number of Factors

    ERIC Educational Resources Information Center

    Green, Samuel B.; Thompson, Marilyn S.; Levy, Roy; Lo, Wen-Juo

    2015-01-01

    Traditional parallel analysis (T-PA) estimates the number of factors by sequentially comparing sample eigenvalues with eigenvalues for randomly generated data. Revised parallel analysis (R-PA) sequentially compares the "k"th eigenvalue for sample data to the "k"th eigenvalue for generated data sets, conditioned on"k"-…

  20. Ozone measurement system for NASA global air sampling program

    NASA Technical Reports Server (NTRS)

    Tiefermann, M. W.

    1979-01-01

    The ozone measurement system used in the NASA Global Air Sampling Program is described. The system uses a commercially available ozone concentration monitor that was modified and repackaged so as to operate unattended in an aircraft environment. The modifications required for aircraft use are described along with the calibration techniques, the measurement of ozone loss in the sample lines, and the operating procedures that were developed for use in the program. Based on calibrations with JPL's 5-meter ultraviolet photometer, all previously published GASP ozone data are biased high by 9 percent. A system error analysis showed that the total system measurement random error is from 3 to 8 percent of reading (depending on the pump diaphragm material) or 3 ppbv, whichever is greater.

  1. Filtering Drifter Trajectories Sampled at Submesoscale Resolution

    DTIC Science & Technology

    2015-07-10

    interval of 5 min and a positioning error of 1.5 m, the acceleration error is 4 × 10⁻⁴ m/s², a value comparable with the typical Coriolis acceleration of a water... 10⁻⁴ m/s², corresponding to the Coriolis acceleration experienced by a water parcel traveling at a speed of 2.2 m/s. This value corresponds to the... computed by integrating the NCOM velocity field contaminated by a random walk process whose effective dispersion coefficient (150 m²/s) was specified as the

  2. Sampling procedures for throughfall monitoring: A simulation study

    NASA Astrophysics Data System (ADS)

    Zimmermann, Beate; Zimmermann, Alexander; Lark, Richard Murray; Elsenbeer, Helmut

    2010-01-01

    What is the most appropriate sampling scheme to estimate event-based average throughfall? A satisfactory answer to this seemingly simple question has yet to be found, a failure which we attribute to previous efforts' dependence on empirical studies. Here we try to answer this question by simulating stochastic throughfall fields based on parameters for statistical models of large monitoring data sets. We subsequently sampled these fields with different sampling designs and variable sample supports. We evaluated the performance of a particular sampling scheme with respect to the uncertainty of possible estimated means of throughfall volumes. Even for a relative error limit of 20%, an impractically large number of small, funnel-type collectors would be required to estimate mean throughfall, particularly for small events. While stratification of the target area is not superior to simple random sampling, cluster random sampling involves the risk of being less efficient. A larger sample support, e.g., the use of trough-type collectors, considerably reduces the necessary sample sizes and eliminates the sensitivity of the mean to outliers. Since the gain in time associated with the manual handling of troughs versus funnels depends on the local precipitation regime, the employment of automatically recording clusters of long troughs emerges as the most promising sampling scheme. Even so, a relative error of less than 5% appears out of reach for throughfall under heterogeneous canopies. We therefore suspect a considerable uncertainty of input parameters for interception models derived from measured throughfall, in particular, for those requiring data of small throughfall events.
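
    For orientation, the sample sizes the simulations point to can be anticipated with the classical relative-error formula for a mean under simple random sampling (a textbook approximation, not the paper's simulation machinery; the CV value below is assumed):

        n ≥ (z · CV / δ)²   with z = 1.96 for 95% confidence

    With a coefficient of variation CV = 1.0, plausible for small events under heterogeneous canopies, and a tolerated relative error δ = 0.2, this gives n ≥ (1.96 × 1.0 / 0.2)² ≈ 96 funnel-type collectors, consistent with the impractically large sample sizes reported.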

  3. Hazard Function Estimation with Cause-of-Death Data Missing at Random

    PubMed Central

    Wang, Qihua; Dinse, Gregg E.; Liu, Chunling

    2010-01-01

    Hazard function estimation is an important part of survival analysis. Interest often centers on estimating the hazard function associated with a particular cause of death. We propose three nonparametric kernel estimators for the hazard function, all of which are appropriate when death times are subject to random censorship and censoring indicators can be missing at random. Specifically, we present a regression surrogate estimator, an imputation estimator, and an inverse probability weighted estimator. All three estimators are uniformly strongly consistent and asymptotically normal. We derive asymptotic representations of the mean squared error and the mean integrated squared error for these estimators and we discuss a data-driven bandwidth selection method. A simulation study, conducted to assess finite sample behavior, demonstrates that the proposed hazard estimators perform relatively well. We illustrate our methods with an analysis of some vascular disease data. PMID:22267874

  4. Entropy-Based TOA Estimation and SVM-Based Ranging Error Mitigation in UWB Ranging Systems

    PubMed Central

    Yin, Zhendong; Cui, Kai; Wu, Zhilu; Yin, Liang

    2015-01-01

    The major challenges for Ultra-wide Band (UWB) indoor ranging systems are the dense multipath and non-line-of-sight (NLOS) problems of the indoor environment. To precisely estimate the time of arrival (TOA) of the first path (FP) in such a poor environment, a novel approach of entropy-based TOA estimation and support vector machine (SVM) regression-based ranging error mitigation is proposed in this paper. The proposed method can estimate the TOA precisely by measuring the randomness of the received signals and can mitigate the ranging error without recognition of the channel conditions. The entropy is used to measure the randomness of the received signals, and the FP can be determined by identifying the sample that is followed by a large entropy decrease. The SVM regression is employed to perform the ranging-error mitigation by modeling the relationship between the characteristics of the received signals and the ranging error. The presented numerical simulation results show that the proposed approach achieves significant performance improvements in the CM1 to CM4 channels of the IEEE 802.15.4a standard, as compared to conventional approaches. PMID:26007726

  5. Effect of patient setup errors on simultaneously integrated boost head and neck IMRT treatment plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Siebers, Jeffrey V.; Keall, Paul J.; Wu Qiuwen

    2005-10-01

    Purpose: The purpose of this study is to determine dose delivery errors that could result from random and systematic setup errors for head-and-neck patients treated using the simultaneous integrated boost (SIB)-intensity-modulated radiation therapy (IMRT) technique. Methods and Materials: Twenty-four patients who participated in an intramural Phase I/II parotid-sparing IMRT dose-escalation protocol using the SIB treatment technique had their dose distributions reevaluated to assess the impact of random and systematic setup errors. The dosimetric effect of random setup error was simulated by convolving the two-dimensional fluence distribution of each beam with the random setup error probability density distribution. Random setup errors of σ = 1, 3, and 5 mm were simulated. Systematic setup errors were simulated by randomly shifting the patient isocenter along each of the three Cartesian axes, with each shift selected from a normal distribution. Systematic setup error distributions with Σ = 1.5 and 3.0 mm along each axis were simulated. Combined systematic and random setup errors were simulated for Σ = σ = 1.5 and 3.0 mm along each axis. For each dose calculation, the gross tumor volume (GTV) D98 (dose received by 98% of the volume), clinical target volume (CTV) D90, nodes D90, cord D2, parotid D50 and parotid mean dose were evaluated with respect to the plan used for treatment, both for the structure dose and for an effective planning target volume (PTV) with a 3-mm margin. Results: Simultaneous integrated boost-IMRT head-and-neck treatment plans were found to be less sensitive to random setup errors than to systematic setup errors. For random-only errors, dose errors exceeded 3% only when the random setup error σ exceeded 3 mm. Simulated systematic setup errors with Σ = 1.5 mm resulted in approximately 10% of plans having more than a 3% dose error, whereas Σ = 3.0 mm resulted in half of the plans having more than a 3% dose error and 28% with a 5% dose error. Combined random and systematic dose errors with Σ = σ = 3.0 mm resulted in more than 50% of plans having at least a 3% dose error and 38% of the plans having at least a 5% dose error. Evaluation with respect to a 3-mm expanded PTV reduced the observed dose deviations greater than 5% for the Σ = σ = 3.0 mm simulations to 5.4% of the plans simulated. Conclusions: Head-and-neck SIB-IMRT dosimetric accuracy would benefit from methods to reduce patient systematic setup errors. When GTV, CTV, or nodal volumes are used for dose evaluation, plans simulated including the effects of random and systematic errors deviate substantially from the nominal plan. The use of PTVs for dose evaluation in the nominal plan improves agreement with evaluated GTV, CTV, and nodal dose values under simulated setup errors. PTV concepts should be used for SIB-IMRT head-and-neck squamous cell carcinoma patients, although the size of the margins may be less than those used with three-dimensional conformal radiation therapy.
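
    The convolution step for random setup error, and the isocenter-shift step for systematic error, can be sketched in a few lines; the fluence map, grid spacing, and error magnitudes below are illustrative stand-ins for the study's planning data:

        import numpy as np
        from scipy.ndimage import gaussian_filter

        rng = np.random.default_rng(0)

        # Toy 2D fluence map on a 1 mm grid; a real map comes from the treatment plan.
        fluence = np.zeros((101, 101))
        fluence[30:70, 30:70] = 1.0

        # Random setup error: blur the fluence with the Gaussian setup-error PDF.
        sigma_mm, pixel_mm = 3.0, 1.0
        blurred = gaussian_filter(fluence, sigma=sigma_mm / pixel_mm)

        # Systematic setup error: one rigid isocenter shift per simulated course,
        # drawn from a normal distribution (here Sigma = 3 mm along each axis).
        Sigma_mm = 3.0
        shift = rng.normal(0.0, Sigma_mm / pixel_mm, size=2).round().astype(int)
        shifted = np.roll(blurred, tuple(shift), axis=(0, 1))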

  6. A practical method to test the validity of the standard Gumbel distribution in logit-based multinomial choice models of travel behavior

    DOE PAGES

    Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...

    2017-11-08

    Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.

  7. A practical method to test the validity of the standard Gumbel distribution in logit-based multinomial choice models of travel behavior

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ye, Xin; Garikapati, Venu M.; You, Daehyun

    Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.

  8. A multiple-objective optimal exploration strategy

    USGS Publications Warehouse

    Christakos, G.; Olea, R.A.

    1988-01-01

    Exploration for natural resources is accomplished through partial sampling of extensive domains. Such imperfect knowledge is subject to sampling error. Complex systems of equations resulting from modelling based on the theory of correlated random fields are reduced to simple analytical expressions providing global indices of estimation variance. The indices are utilized by multiple objective decision criteria to find the best sampling strategies. The approach is not limited by the geometric nature of the sampling, covers a wide range of spatial continuity and leads to a step-by-step procedure. © 1988.

  9. A two-phase sampling survey for nonresponse and its paradata to correct nonresponse bias in a health surveillance survey.

    PubMed

    Santin, G; Bénézet, L; Geoffroy-Perez, B; Bouyer, J; Guéguen, A

    2017-02-01

    The decline in participation rates in surveys, including epidemiological surveillance surveys, has become a real concern since it may increase nonresponse bias. The aim of this study is to estimate the contribution of a complementary survey among a subsample of nonrespondents, and the additional contribution of paradata, in correcting for nonresponse bias in an occupational health surveillance survey. In 2010, 10,000 workers were randomly selected and sent a postal questionnaire. Sociodemographic data were available for the whole sample. After data collection of the questionnaires, a complementary survey among a random subsample of 500 nonrespondents was performed using a questionnaire administered by an interviewer. Paradata were collected for the complete subsample of the complementary survey. Nonresponse bias in the initial sample and in the combined samples was assessed using variables from administrative databases available for the whole sample, not subject to differential measurement errors. Corrected prevalences were estimated by a reweighting technique, first using the initial survey alone and then the initial and complementary surveys combined, under several assumptions regarding the missing data process. Results were compared by computing relative errors. The response rates of the initial and complementary surveys were 23.6% and 62.6%, respectively. For both the initial and the combined surveys, the relative errors decreased after correction for nonresponse on sociodemographic variables. For the combined surveys without paradata, relative errors decreased compared with the initial survey. The contribution of the paradata was weak. When a complex descriptive survey has a low response rate, a short complementary survey among nonrespondents, with a protocol that aims to maximize response rates, is useful. The contribution of sociodemographic variables to correcting for nonresponse bias is important, whereas the additional contribution of paradata is questionable. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
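
    The reweighting idea can be sketched with inverse response-propensity weights built from variables known for the whole sample; the data-generating process and variable names below are synthetic, and the paper's estimator may differ in detail:

        import numpy as np
        import pandas as pd
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(1)
        n = 10_000
        df = pd.DataFrame({"age": rng.integers(20, 60, n),
                           "male": rng.integers(0, 2, n)})
        # Synthetic response and outcome processes (stand-ins for the survey data):
        # older workers respond more often and also report the outcome more often.
        p_resp = 1.0 / (1.0 + np.exp(-(-2.0 + 0.03 * df["age"] - 0.3 * df["male"])))
        df["responded"] = rng.random(n) < p_resp
        df["outcome"] = rng.random(n) < 0.1 + 0.004 * df["age"]

        # Model response with whole-sample variables, then weight respondents by
        # the inverse of their estimated response probability.
        model = LogisticRegression().fit(df[["age", "male"]], df["responded"])
        resp = df[df["responded"]]
        w = 1.0 / model.predict_proba(resp[["age", "male"]])[:, 1]
        print("naive:", resp["outcome"].mean())
        print("reweighted:", np.average(resp["outcome"], weights=w))
        print("full sample:", df["outcome"].mean())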

  10. Error baseline rates of five sample preparation methods used to characterize RNA virus populations.

    PubMed

    Kugelman, Jeffrey R; Wiley, Michael R; Nagle, Elyse R; Reyes, Daniel; Pfeffer, Brad P; Kuhn, Jens H; Sanchez-Lockhart, Mariano; Palacios, Gustavo F

    2017-01-01

    Individual RNA viruses typically occur as populations of genomes that differ slightly from each other due to mutations introduced by the error-prone viral polymerase. Understanding the variability of RNA virus genome populations is critical for understanding virus evolution because individual mutant genomes may gain evolutionary selective advantages and give rise to dominant subpopulations, possibly even leading to the emergence of viruses resistant to medical countermeasures. Reverse transcription of virus genome populations followed by next-generation sequencing is the only available method to characterize variation for RNA viruses. However, both steps may lead to the introduction of artificial mutations, thereby skewing the data. To better understand how such errors are introduced during sample preparation, we determined and compared error baseline rates of five different sample preparation methods by analyzing in vitro transcribed Ebola virus RNA from an artificial plasmid-based system. These methods included: shotgun sequencing from plasmid DNA or in vitro transcribed RNA as a basic "no amplification" method, amplicon sequencing from the plasmid DNA or in vitro transcribed RNA as a "targeted" amplification method, sequence-independent single-primer amplification (SISPA) as a "random" amplification method, rolling circle reverse transcription sequencing (CirSeq) as an advanced "no amplification" method, and Illumina TruSeq RNA Access as a "targeted" enrichment method. The measured error frequencies indicate that RNA Access offers the best tradeoff between sensitivity and sample preparation error (1.4 × 10⁻⁵) of all compared methods.

  11. Proportion of medication error reporting and associated factors among nurses: a cross sectional study.

    PubMed

    Jember, Abebaw; Hailu, Mignote; Messele, Anteneh; Demeke, Tesfaye; Hassen, Mohammed

    2018-01-01

    A medication error (ME) is any preventable event that may cause or lead to inappropriate medication use or patient harm. Voluntary reporting has a principal role in appreciating the extent and impact of medication errors. Thus, exploring the proportion of medication error reporting and associated factors among nurses is important to inform service providers and program implementers so as to improve the quality of healthcare services. An institution-based quantitative cross-sectional study was conducted among 397 nurses from March 6 to May 10, 2015. Stratified sampling followed by a simple random sampling technique was used to select the study participants. The data were collected using a structured self-administered questionnaire adapted from studies conducted in Australia and Jordan. A pilot study was carried out to validate the questionnaire before data collection for this study. Bivariate and multivariate logistic regression models were fitted to identify factors associated with the proportion of medication error reporting among nurses. An adjusted odds ratio with 95% confidence interval was computed to determine the level of significance. The proportion of medication error reporting among nurses was found to be 57.4%. Regression analysis showed that sex, marital status, having made a medication error and medication error experience were significantly associated with medication error reporting. The proportion of medication error reporting among nurses in this study was found to be higher than in other studies.

  12. Single molecule counting and assessment of random molecular tagging errors with transposable giga-scale error-correcting barcodes.

    PubMed

    Lau, Billy T; Ji, Hanlee P

    2017-09-21

    RNA-Seq measures gene expression by counting sequence reads belonging to unique cDNA fragments. Molecular barcodes commonly in the form of random nucleotides were recently introduced to improve gene expression measures by detecting amplification duplicates, but are susceptible to errors generated during PCR and sequencing. This results in false positive counts, leading to inaccurate transcriptome quantification especially at low input and single-cell RNA amounts where the total number of molecules present is minuscule. To address this issue, we demonstrated the systematic identification of molecular species using transposable error-correcting barcodes that are exponentially expanded to tens of billions of unique labels. We experimentally showed random-mer molecular barcodes suffer from substantial and persistent errors that are difficult to resolve. To assess our method's performance, we applied it to the analysis of known reference RNA standards. By including an inline random-mer molecular barcode, we systematically characterized the presence of sequence errors in random-mer molecular barcodes. We observed that such errors are extensive and become more dominant at low input amounts. We described the first study to use transposable molecular barcodes and its use for studying random-mer molecular barcode errors. Extensive errors found in random-mer molecular barcodes may warrant the use of error correcting barcodes for transcriptome analysis as input amounts decrease.
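
    The benefit of coded barcodes over random-mers can be illustrated with a toy nearest-codeword decoder; the codebook and mismatch budget are hypothetical, not the authors' transposable design:

        def correct_barcode(observed, codebook, max_mismatches=1):
            # Accept only if exactly one codeword lies within the mismatch budget.
            # Random-mers lack this structure, so a sequencing error simply creates
            # a new, spurious "molecule".
            hits = [c for c in codebook
                    if sum(a != b for a, b in zip(observed, c)) <= max_mismatches]
            return hits[0] if len(hits) == 1 else None

        codebook = ["AACCGG", "GGTTAA", "CCAATT"]
        print(correct_barcode("AACCGT", codebook))  # AACCGG (one error corrected)
        print(correct_barcode("AGCCGT", codebook))  # None (uncorrectable)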

  13. Estimation of population mean under systematic sampling

    NASA Astrophysics Data System (ADS)

    Noor-ul-amin, Muhammad; Javaid, Amjad

    2017-11-01

    In this study, we propose a generalized ratio estimator under non-response for systematic random sampling. We also generate a class of estimators through special cases of the generalized estimator using different combinations of the coefficients of correlation, kurtosis and variation. The mean square errors and mathematical conditions are also derived to prove the efficiency of the proposed estimators. A numerical illustration is included using three populations to support the results.
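
    For reference, the classical ratio estimator that such generalized estimators extend has the standard first-order mean square error (textbook notation, not the authors'):

        ȳ_R = ȳ · (X̄ / x̄)
        MSE(ȳ_R) ≈ ((1 − f) / n) · (S_y² + R² S_x² − 2 R ρ S_x S_y),  with R = Ȳ / X̄

    where f = n/N is the sampling fraction and ρ is the correlation between the study and auxiliary variables; the proposed estimators modify this construction to handle non-response.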

  14. Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items

    ERIC Educational Resources Information Center

    Michaelides, Michalis P.; Haertel, Edward H.

    2014-01-01

    The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…

  15. Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis

    ERIC Educational Resources Information Center

    Marin-Martinez, Fulgencio; Sanchez-Meca, Julio

    2010-01-01

    Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
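
    The two weighting schemes under comparison reduce to a one-line weighted mean; a sketch with illustrative numbers, where w_i = 1/(v_i + τ²) gives the (random-effects) inverse-variance weights and w_i = n_i gives sample-size weights:

        import numpy as np

        def pooled_effect(effects, weights):
            # Weighted mean of the effect sizes under the supplied weights.
            w = np.asarray(weights, dtype=float)
            d = np.asarray(effects, dtype=float)
            return np.sum(w * d) / np.sum(w)

        d = np.array([0.30, 0.10, 0.55])   # standardized mean differences
        v = np.array([0.02, 0.05, 0.08])   # their sampling variances
        n = np.array([200, 80, 50])        # per-study sample sizes
        tau2 = 0.03                        # assumed between-study variance
        print(pooled_effect(d, 1.0 / (v + tau2)))  # inverse-variance weighting
        print(pooled_effect(d, n))                 # sample-size weighting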

  16. Santa Clara County Survey of Drug, Alcohol, and Tobacco Use among Students in Grades 5, 7, 9, 11.

    ERIC Educational Resources Information Center

    Constantine, Norm; And Others

    This report presents findings from the Santa Clara County (California) survey of Drug, Alcohol, and Tobacco Use among Students in Grades 5, 7, 9, and 11 administered during the spring of 1991 to 5,180 students in 51 randomly selected county schools. An executive summary discusses sampling error, sample demographics, and findings on drug use…

  17. Technical Note: Introduction of variance component analysis to setup error analysis in radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matsuo, Yukinori, E-mail: ymatsuo@kuhp.kyoto-u.ac.

    Purpose: The purpose of this technical note is to introduce variance component analysis to the estimation of systematic and random components in setup error of radiotherapy. Methods: Balanced data according to the one-factor random effect model were assumed. Results: Analysis-of-variance (ANOVA)-based computation was applied to estimate the values and their confidence intervals (CIs) for systematic and random errors and the population mean of setup errors. The conventional method overestimates systematic error, especially in hypofractionated settings. The CI for systematic error becomes much wider than that for random error. The ANOVA-based estimation can be extended to a multifactor model considering multiple causes of setup errors (e.g., interpatient, interfraction, and intrafraction). Conclusions: Variance component analysis may lead to novel applications to setup error analysis in radiotherapy.
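
    A minimal sketch of the ANOVA-based computation for balanced one-factor data (patients × fractions); the array shape, names, and simulated magnitudes are illustrative:

        import numpy as np

        def setup_error_components(x):
            # x: shape (patients, fractions) of setup deviations in mm, balanced design.
            k, m = x.shape
            msw = x.var(axis=1, ddof=1).mean()       # within-patient mean square
            msb = m * x.mean(axis=1).var(ddof=1)     # between-patient mean square
            sigma2_random = msw
            sigma2_systematic = max((msb - msw) / m, 0.0)
            return np.sqrt(sigma2_systematic), np.sqrt(sigma2_random)

        rng = np.random.default_rng(2)
        x = rng.normal(0.0, 1.0, (20, 1)) + rng.normal(0.0, 2.0, (20, 30))
        print(setup_error_components(x))  # ≈ (1.0, 2.0): systematic and random parts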

  18. Forecasting the brittle failure of heterogeneous, porous geomaterials

    NASA Astrophysics Data System (ADS)

    Vasseur, Jérémie; Wadsworth, Fabian; Heap, Michael; Main, Ian; Lavallée, Yan; Dingwell, Donald

    2017-04-01

    Heterogeneity develops in magmas during ascent and is dominated by the development of crystal and, importantly, bubble populations or pore-network clusters which grow, interact, localize, coalesce, outgas and resorb. Pore-scale heterogeneity is also ubiquitous in sedimentary basin fill during diagenesis. As a first step, we construct numerical simulations in 3D in which randomly generated heterogeneous and polydisperse spheres are placed in volumes and permitted to overlap with one another, a design that represents the random growth and interaction of bubbles in a liquid volume. We use these simulated geometries to show that statistical predictions of the inter-bubble lengthscales and evolving bubble surface area or cluster densities can be made based on fundamental percolation theory. As a second step, we take a range of well constrained random heterogeneous rock samples including sandstones, andesites, synthetic partially sintered glass bead samples, and intact glass samples and subject them to a variety of stress loading conditions at a range of temperatures until failure. We record in real time the evolution of the number of acoustic events that precede failure and show that in all scenarios the acoustic event rate accelerates toward failure, consistent with previous findings. Applying tools designed to forecast the failure time based on these precursory signals, we constrain the absolute error on the forecast time. We find that for all sample types, the error associated with an accurate forecast of failure scales non-linearly with the lengthscale between the pore clusters in the material. Moreover, using a simple micromechanical model for the deformation of porous elastic bodies, we show that the ratio between the equilibrium sub-critical crack length emanating from the pore clusters and the inter-pore lengthscale provides a scaling for the error on forecast accuracy. Thus for the first time we provide a potential quantitative correction for forecasting the failure of porous brittle solids that build the Earth's crust.

  19. Medication errors versus time of admission in a subpopulation of stroke patients undergoing inpatient rehabilitation complications and considerations.

    PubMed

    Pitts, Eric P

    2011-01-01

    This study examined the medication ordering error frequency and the length of inpatient hospital stay in a subpopulation of stroke patients (n = 60) as a function of the time of patient admission to an inpatient rehabilitation hospital service. A total of 60 inpatient rehabilitation patients, 30 arriving before 4 pm and 30 arriving after 4 pm, with an admitting diagnosis of stroke were randomly selected from a larger sample (N = 426). There was a statistically significant increase in medication ordering errors and in the number of inpatient rehabilitation hospital days in the group of patients who arrived after 4 pm.

  20. Error sensitivity analysis in 10-30-day extended range forecasting by using a nonlinear cross-prediction error model

    NASA Astrophysics Data System (ADS)

    Xia, Zhiye; Xu, Lisheng; Chen, Hongbin; Wang, Yongqian; Liu, Jinbao; Feng, Wenlan

    2017-06-01

    Extended range forecasting of 10-30 days, which lies between medium-term and climate prediction in terms of timescale, plays a significant role in decision-making processes for the prevention and mitigation of disastrous meteorological events. The sensitivity of initial error, model parameter error, and random error in a nonlinear cross-prediction error (NCPE) model, and their stability in the prediction validity period in 10-30-day extended range forecasting, are analyzed quantitatively. The associated sensitivity of precipitable water, temperature, and geopotential height during cases of heavy rain and hurricane is also discussed. The results are summarized as follows. First, the initial error and random error interact. When the ratio of random error to initial error is small (10⁻⁶-10⁻²), minor variation in random error cannot significantly change the dynamic features of a chaotic system, and therefore random error has minimal effect on the prediction. When the ratio is in the range of 10⁻¹-10² (i.e., random error dominates), attention should be paid to the random error instead of only the initial error. When the ratio is around 10⁻²-10⁻¹, both influences must be considered. Their mutual effects may bring considerable uncertainty to extended range forecasting, and de-noising is therefore necessary. Second, in terms of model parameter error, the embedding dimension m should be determined by the factual nonlinear time series. The dynamic features of a chaotic system cannot be depicted because of the incomplete structure of the attractor when m is small. When m is large, prediction indicators can vanish because of the scarcity of phase points in phase space. A method for overcoming the cut-off effect (m > 4) is proposed. Third, for heavy rains, precipitable water is more sensitive to the prediction validity period than temperature or geopotential height; however, for hurricanes, geopotential height is most sensitive, followed by precipitable water.

  1. An investigation of error correcting techniques for OMV and AXAF

    NASA Technical Reports Server (NTRS)

    Ingels, Frank; Fryer, John

    1991-01-01

    The original objectives of this project were to build a test system for the NASA 255/223 Reed/Solomon encoding/decoding chip set and circuit board. This test system was then to be interfaced with a convolutional system at MSFC to examine the performance of the concatenated codes. After considerable work, it was discovered that the convolutional system could not function as needed. This report documents the design, construction, and testing of the test apparatus for the R/S chip set. The approach taken was to verify the error correcting behavior of the chip set by injecting known error patterns onto data and observing the results. Error sequences were generated using pseudo-random number generator programs, with a Poisson time distribution between errors and Gaussian burst lengths. Sample means, variances, and numbers of un-correctable errors were calculated for each data set before testing.

  2. Systematic bias in genomic classification due to contaminating non-neoplastic tissue in breast tumor samples.

    PubMed

    Elloumi, Fathi; Hu, Zhiyuan; Li, Yan; Parker, Joel S; Gulley, Margaret L; Amos, Keith D; Troester, Melissa A

    2011-06-30

    Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample comprised of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification. To assess the sensitivity of genomic classification to systematic error from normal tissue contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal tissue, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors. Simulations of normal tissue contamination caused misclassification of tumors with all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had an unpredictable direction of bias (either higher or lower risk of relapse resulted from normal contamination), one signature showed a predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination, and these corrections improved sensitivity and negative predictive value. For all three assays, quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability. Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination, and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.

  3. Carbon monoxide measurement in the global atmospheric sampling program

    NASA Technical Reports Server (NTRS)

    Dudzinski, T. J.

    1979-01-01

    The carbon monoxide measurement system used in the NASA Global Atmospheric Sampling Program (GASP) is described. The system used a modified version of a commercially available infrared absorption analyzer. The modifications increased the sensitivity of the analyzer to 1 ppmv full scale, with a limit of detectability of 0.02 ppmv. Packaging was modified for automatic, unattended operation in an aircraft environment. The GASP system is described along with analyzer operation, calibration procedures, and measurement errors. Uncertainty of the CO measurement over a 2-year period ranged from ±3 to ±13 percent of reading, plus an error due to random fluctuation of the output signal of ±3 to ±15 ppbv.

  4. Modified Bat Algorithm for Feature Selection with the Wisconsin Diagnosis Breast Cancer (WDBC) Dataset

    PubMed

    Jeyasingh, Suganthi; Veluchamy, Malathi

    2017-05-01

    Early diagnosis of breast cancer is essential to save the lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery in Databases (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered a vital step to increase classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select random instances from the dataset. Ranking was performed with the global best features to recognize the predominant features available in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer (WDBC) dataset was used for the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Matthews Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE).

  5. Estimating random errors due to shot noise in backscatter lidar observations.

    PubMed

    Liu, Zhaoyan; Hunt, William; Vaughan, Mark; Hostetler, Chris; McGill, Matthew; Powell, Kathleen; Winker, David; Hu, Yongxiang

    2006-06-20

    We discuss the estimation of random errors due to shot noise in backscatter lidar observations that use either photomultiplier tube (PMT) or avalanche photodiode (APD) detectors. The statistical characteristics of photodetection are reviewed, and photon count distributions of solar background signals and laser backscatter signals are examined using airborne lidar observations at 532 nm using a photon-counting mode APD. Both distributions appear to be Poisson, indicating that the arrival at the photodetector of photons for these signals is a Poisson stochastic process. For Poisson-distributed signals, a proportional, one-to-one relationship is known to exist between the mean of a distribution and its variance. Although the multiplied photocurrent no longer follows a strict Poisson distribution in analog-mode APD and PMT detectors, the proportionality still exists between the mean and the variance of the multiplied photocurrent. We make use of this relationship by introducing the noise scale factor (NSF), which quantifies the constant of proportionality that exists between the root mean square of the random noise in a measurement and the square root of the mean signal. Using the NSF to estimate random errors in lidar measurements due to shot noise provides a significant advantage over the conventional error estimation techniques, in that with the NSF, uncertainties can be reliably calculated from or for a single data sample. Methods for evaluating the NSF are presented. Algorithms to compute the NSF are developed for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations lidar and tested using data from the Lidar In-space Technology Experiment.
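
    The NSF relationship lends itself to a two-line estimator; a sketch under the stated mean-variance proportionality, with hypothetical function names:

        import numpy as np

        def noise_scale_factor(steady_samples):
            # From repeated measurements of a steady signal: std = NSF * sqrt(mean).
            s = np.asarray(steady_samples, dtype=float)
            return s.std(ddof=1) / np.sqrt(s.mean())

        def shot_noise_sigma(signal, nsf):
            # Predicted one-sigma random error for a single measured sample.
            return nsf * np.sqrt(signal)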

  6. Estimating Random Errors Due to Shot Noise in Backscatter Lidar Observations

    NASA Technical Reports Server (NTRS)

    Liu, Zhaoyan; Hunt, William; Vaughan, Mark A.; Hostetler, Chris A.; McGill, Matthew J.; Powell, Kathy; Winker, David M.; Hu, Yongxiang

    2006-01-01

    In this paper, we discuss the estimation of random errors due to shot noise in backscatter lidar observations that use either photomultiplier tube (PMT) or avalanche photodiode (APD) detectors. The statistical characteristics of photodetection are reviewed, and photon count distributions of solar background signals and laser backscatter signals are examined using airborne lidar observations at 532 nm using a photon-counting mode APD. Both distributions appear to be Poisson, indicating that the arrival at the photodetector of photons for these signals is a Poisson stochastic process. For Poisson-distributed signals, a proportional, one-to-one relationship is known to exist between the mean of a distribution and its variance. Although the multiplied photocurrent no longer follows a strict Poisson distribution in analog-mode APD and PMT detectors, the proportionality still exists between the mean and the variance of the multiplied photocurrent. We make use of this relationship by introducing the noise scale factor (NSF), which quantifies the constant of proportionality that exists between the root-mean-square of the random noise in a measurement and the square root of the mean signal. Using the NSF to estimate random errors in lidar measurements due to shot noise provides a significant advantage over the conventional error estimation techniques, in that with the NSF uncertainties can be reliably calculated from/for a single data sample. Methods for evaluating the NSF are presented. Algorithms to compute the NSF are developed for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) lidar and tested using data from the Lidar In-space Technology Experiment (LITE).

  7. A multiobserver study of the effects of including point-of-care patient photographs with portable radiography: a means to detect wrong-patient errors.

    PubMed

    Tridandapani, Srini; Ramamurthy, Senthil; Provenzale, James; Obuchowski, Nancy A; Evanoff, Michael G; Bhatti, Pamela

    2014-08-01

    To evaluate whether the presence of facial photographs obtained at the point of care of portable radiography leads to increased detection of wrong-patient errors. In this institutional review board-approved study, 166 radiograph-photograph combinations were obtained from 30 patients. Consecutive radiographs from the same patients resulted in 83 unique pairs (i.e., a new radiograph and a prior, comparison radiograph) for interpretation. To simulate wrong-patient errors, mismatched pairs were generated by pairing radiographs from different patients chosen randomly from the sample. Ninety radiologists each interpreted a unique randomly chosen set of 10 radiographic pairs, containing up to 10% mismatches (i.e., error pairs). Radiologists were randomly assigned to interpret radiographs with or without photographs. The number of mismatches identified and the interpretation times were recorded. Ninety radiologists with 21 ± 10 (mean ± standard deviation) years of experience were recruited to participate in this observer study. With the introduction of photographs, the proportion of errors detected increased from 31% (9 of 29) to 77% (23 of 30; P = .006). The odds ratio for detection of error with photographs relative to detection without photographs was 7.3 (95% confidence interval: 2.29-23.18). Observer qualifications, training, or practice in cardiothoracic radiology did not influence sensitivity for error detection. There was no significant difference in interpretation time between studies without photographs and those with photographs (60 ± 22 vs. 61 ± 25 seconds; P = .77). In this observer study, facial photographs obtained simultaneously with portable chest radiographs increased the identification of wrong-patient errors, without a substantial increase in interpretation time. This technique offers a potential means to increase patient safety through correct patient identification. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
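
    The reported odds ratio follows directly from the detection counts (23 of 30 with photographs, 9 of 29 without):

        OR = (23/7) / (9/20) = 3.29 / 0.45 ≈ 7.3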

  8. Evaluation of the depth-integration method of measuring water discharge in large rivers

    USGS Publications Warehouse

    Moody, J.A.; Troutman, B.M.

    1992-01-01

    The depth-integration method for measuring water discharge makes a continuous measurement of the water velocity from the water surface to the bottom at 20 to 40 locations, or verticals, across a river. It is especially practical for large rivers where river traffic makes it impractical to use boats attached to taglines strung across the river or to use current meters suspended from bridges. This method has the additional advantage over the standard two- and eight-tenths method in that a discharge-weighted suspended-sediment sample can be collected at the same time. When this method is used in large rivers such as the Missouri, Mississippi and Ohio, a microwave navigation system is used to determine the ship's position at each vertical sampling location across the river, and to make accurate velocity corrections to compensate for ship drift. An essential feature is a hydraulic winch that can lower and raise the current meter at a constant transit velocity so that the velocities at all depths are measured for equal lengths of time. Field calibration measurements show that: (1) the mean velocity measured on the upcast (bottom to surface) is within 1% of the standard mean velocity determined by 9-11 point measurements; (2) if the transit velocity is less than 25% of the mean velocity, then the average error in the mean velocity is 4% or less. The major source of bias error is a result of mounting the current meter above a sounding weight and sometimes above a suspended-sediment sampling bottle, which prevents measurement of the velocity all the way to the bottom. The measured mean velocity is slightly larger than the true mean velocity. This bias error in the discharge is largest in shallow water (approximately 8% for the Missouri River at Hermann, MO, where the mean depth was 4.3 m) and smallest in deeper water (approximately 3% for the Mississippi River at Vicksburg, MS, where the mean depth was 14.5 m). The major source of random error in the discharge is the natural variability of river velocities, which we assumed to be independent and random at each vertical. The standard error of the estimated mean velocity, at an individual vertical sampling location, may be as large as 9% for large sand-bed alluvial rivers. The computed discharge, however, is a weighted mean of these random velocities. Consequently, the standard error of the computed discharge is divided by the square root of the number of verticals, producing typical values between 1 and 2%. The discharges measured by the depth-integration method agreed within ±5% with those measured simultaneously by the standard two- and eight-tenths, six-tenths, and moving-boat methods. © 1992.
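
    The √N reduction quoted above follows from treating the verticals as independent. As a worked example with the numbers given, assuming N = 30 verticals (a typical count within the stated 20-40 range):

        SE(Q)/Q ≈ s_v / √N = 9% / √30 ≈ 1.6%

    which matches the reported 1-2% range for the computed discharge.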

  9. Comparison of error-based and errorless learning for people with severe traumatic brain injury: study protocol for a randomized control trial.

    PubMed

    Ownsworth, Tamara; Fleming, Jennifer; Tate, Robyn; Shum, David H K; Griffin, Janelle; Schmidt, Julia; Lane-Brown, Amanda; Kendall, Melissa; Chevignard, Mathilde

    2013-11-05

    Poor skills generalization poses a major barrier to successful outcomes of rehabilitation after traumatic brain injury (TBI). Error-based learning (EBL) is a relatively new intervention approach that aims to promote skills generalization by teaching people internal self-regulation skills, or how to anticipate, monitor and correct their own errors. This paper describes the protocol of a study that aims to compare the efficacy of EBL and errorless learning (ELL) for improving error self-regulation, behavioral competency, awareness of deficits and long-term outcomes after TBI. This randomized, controlled trial (RCT) has two arms (EBL and ELL); each arm entails 8 × 2 h training sessions conducted within the participants' homes. The first four sessions involve a meal preparation activity, and the final four sessions incorporate a multitasking errand activity. Based on a sample size estimate, 135 participants with severe TBI will be randomized into either the EBL or ELL condition. The primary outcome measure assesses error self-regulation skills on a task related to but distinct from training. Secondary outcomes include measures of self-monitoring and self-regulation, behavioral competency, awareness of deficits, role participation and supportive care needs. Assessments will be conducted at pre-intervention, post-intervention, and at 6 months post-intervention. This study seeks to determine the efficacy and long-term impact of EBL for training internal self-regulation strategies following severe TBI. In doing so, the study will advance theoretical understanding of the role of errors in task learning and skills generalization. EBL has the potential to reduce the length and costs of rehabilitation and lifestyle support because the techniques could enhance generalization success and lifelong application of strategies after TBI. ACTRN12613000585729.

  10. Building on crossvalidation for increasing the quality of geostatistical modeling

    USGS Publications Warehouse

    Olea, R.A.

    2012-01-01

    The random function is a mathematical model commonly used in the assessment of uncertainty associated with a spatially correlated attribute that has been partially sampled. There are multiple algorithms for modeling such random functions, all sharing the requirement of specifying various parameters that have critical influence on the results. The importance of finding ways to compare the methods and set parameters to obtain results that better model uncertainty has increased as these algorithms have grown in number and complexity. Crossvalidation has been used in spatial statistics, mostly in kriging, for the analysis of mean square errors. An appeal of this approach is its ability to work with the same empirical sample available for running the algorithms. This paper goes beyond checking estimates by formulating a function sensitive to conditional bias. Under ideal conditions, such a function turns into a straight line, which can be used as a reference for preparing measures of performance. Applied to kriging, deviations from the ideal line provide sensitivity to the semivariogram lacking in crossvalidation of kriging errors and are more sensitive to conditional bias than analyses of errors. In terms of stochastic simulation, in addition to finding better parameters, the deviations allow comparison of the realizations resulting from the applications of different methods. Examples show improvements of about 30% in the deviations and approximately 10% in the square root of mean square errors between a reasonable starting model and the solutions obtained under the new criteria. © 2011 US Government.

  11. The Applicability of Standard Error of Measurement and Minimal Detectable Change to Motor Learning Research-A Behavioral Study.

    PubMed

    Furlan, Leonardo; Sterr, Annette

    2018-01-01

    Motor learning studies face the challenge of differentiating between real changes in performance and random measurement error. While the traditional p-value-based analyses of difference (e.g., t-tests, ANOVAs) provide information on the statistical significance of a reported change in performance scores, they do not inform as to the likely cause or origin of that change, that is, the contribution of both real modifications in performance and random measurement error to the reported change. One way of differentiating between real change and random measurement error is through the utilization of the statistics of standard error of measurement (SEM) and minimal detectable change (MDC). SEM is estimated from the standard deviation of a sample of scores at baseline and a test-retest reliability index of the measurement instrument or test employed. MDC, in turn, is estimated from SEM and a degree of confidence, usually 95%. The MDC value might be regarded as the minimum amount of change that needs to be observed for it to be considered a real change, or a change to which the contribution of real modifications in performance is likely to be greater than that of random measurement error. A computer-based motor task was designed to illustrate the applicability of SEM and MDC to motor learning research. Two studies were conducted with healthy participants. Study 1 assessed the test-retest reliability of the task and Study 2 consisted of a typical motor learning study, where participants practiced the task for five consecutive days. In Study 2, the data were analyzed with a traditional p-value-based analysis of difference (ANOVA) and also with SEM and MDC. The findings showed good test-retest reliability for the task and that the p-value-based analysis alone identified statistically significant improvements in performance over time even when the observed changes could in fact have been smaller than the MDC and thereby caused mostly by random measurement error, as opposed to by learning. We suggest therefore that motor learning studies could complement their p-value-based analyses of difference with statistics such as SEM and MDC in order to inform as to the likely cause or origin of any reported changes in performance.
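
    Both statistics reduce to a pair of textbook formulas; a sketch with illustrative inputs (the task's actual baseline SD and test-retest ICC are reported in the paper):

        import numpy as np

        def sem_mdc(baseline_scores, icc, z=1.96):
            # SEM = SD * sqrt(1 - ICC); MDC = z * sqrt(2) * SEM, the sqrt(2)
            # reflecting measurement error on both test and retest occasions.
            sd = np.std(baseline_scores, ddof=1)
            sem = sd * np.sqrt(1.0 - icc)
            return sem, z * np.sqrt(2.0) * sem

        scores = np.array([12.1, 10.4, 13.8, 11.0, 12.7])  # hypothetical baselines
        print(sem_mdc(scores, icc=0.90))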

  12. Accuracy and convergence of coupled finite-volume/Monte Carlo codes for plasma edge simulations of nuclear fusion reactors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghoos, K., E-mail: kristel.ghoos@kuleuven.be; Dekeyser, W.; Samaey, G.

    2016-10-01

    The plasma and neutral transport in the plasma edge of a nuclear fusion reactor is usually simulated using coupled finite volume (FV)/Monte Carlo (MC) codes. However, under conditions of future reactors like ITER and DEMO, convergence issues become apparent. This paper examines the convergence behaviour and the numerical error contributions with a simplified FV/MC model for three coupling techniques: Correlated Sampling, Random Noise and Robbins Monro. Also, practical procedures to estimate the errors in complex codes are proposed. Moreover, first results with more complex models show that an order of magnitude speedup can be achieved without any loss in accuracy by making use of averaging in the Random Noise coupling technique.

  13. Influences of sampling size and pattern on the uncertainty of correlation estimation between soil water content and its influencing factors

    NASA Astrophysics Data System (ADS)

    Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua

    2017-12-01

    In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and the correlation coefficients between SWC and soil/terrain properties on a tea + bamboo hillslope. One strategy is global random sampling; the other six are stratified random sampling on the top, middle, toe, top + middle, top + toe, and middle + toe slope positions. For each strategy, sample sizes were gradually reduced, and each sample size contained 3000 replicates. Under each sample size of each strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and of the correlation coefficients between SWC and soil/terrain properties were calculated to quantify accuracy and uncertainty. The results showed that the uncertainty of the estimates decreased with increasing sample size. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to ensure that the estimated correlation coefficients had REs and CVs ≤10%. Compared with all other sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of the hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to ensure estimated correlation coefficients with REs and CVs ≤10%. This suggests that when designing SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for optimal SWC sampling design.
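
    The resampling scheme described above is straightforward to emulate; in this sketch the RE and CV definitions are plausible stand-ins for the paper's exact formulas:

        import numpy as np

        def subsample_re_cv(values, n_sites, n_rep=3000, seed=0):
            # Repeatedly draw n_sites without replacement and compare subsample
            # means against the full-coverage ("true") hillslope mean.
            rng = np.random.default_rng(seed)
            truth = values.mean()
            means = np.array([rng.choice(values, n_sites, replace=False).mean()
                              for _ in range(n_rep)])
            re = np.mean(np.abs(means - truth)) / truth   # mean relative error
            cv = means.std(ddof=1) / means.mean()         # coefficient of variation
            return re, cv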

  14. Prevalence of refractive errors among school children in Gondar town, northwest Ethiopia.

    PubMed

    Yared, Assefa Wolde; Belaynew, Wasie Taye; Destaye, Shiferaw; Ayanaw, Tsegaw; Zelalem, Eshete

    2012-10-01

    Many children with poor vision due to refractive error remain undiagnosed and perform poorly in school. The situation is worse in the Sub-Saharan Africa, including Ethiopia, and current information is lacking. The objective of this study is to determine the prevalence of refractive error among children enrolled in elementary schools in Gondar town, Ethiopia. This was a cross-sectional study of 1852 students in 8 elementary schools. Subjects were selected by multistage random sampling. The study parameters were visual acuity (VA) evaluation and ocular examination. VA was measured by staff optometrists with the Snellen E-chart while students with subnormal vision were examined using pinhole, retinoscopy evaluation and subjective refraction by ophthalmologists. The study cohort comprised 45.8% males and 54.2% females from 8 randomly selected elementary schools with a response rate of 93%. Refractive errors in either eye were present in 174 (9.4%) children. Of these, myopia was diagnosed in 55 (31.6%) children in the right and left eyes followed by hyperopia in 46 (26.4%) and 39 (22.4%) in the right and left eyes respectively. Low myopia was the most common refractive error in 61 (49.2%) and 68 (50%) children for the right and left eyes respectively. Refractive error among children is a common problem in Gondar town and needs to be assessed at every health evaluation of school children for timely treatment.

  15. Improved uncertainty quantification in nondestructive assay for nonproliferation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burr, Tom; Croft, Stephen; Jarman, Ken

    2016-12-01

    This paper illustrates methods to improve uncertainty quantification (UQ) for non-destructive assay (NDA) measurements used in nuclear nonproliferation. First, it is shown that current bottom-up UQ applied to calibration data is not always adequate, for three main reasons: (1) Because there are errors in both the predictors and the response, calibration involves a ratio of random quantities, and calibration data sets in NDA usually consist of only a modest number of samples (3–10); therefore, asymptotic approximations involving quantities needed for UQ such as means and variances are often not sufficiently accurate; (2) Common practice overlooks that calibration implies a partitioning of total error into random and systematic error, and (3) In many NDA applications, test items exhibit non-negligible departures in physical properties from calibration items, so model-based adjustments are used, but item-specific bias remains in some data. Therefore, improved bottom-up UQ using calibration data should predict the typical magnitude of item-specific bias, and the suggestion is to do so by including sources of item-specific bias in synthetic calibration data that is generated using a combination of modeling and real calibration data. Second, for measurements of the same nuclear material item by both the facility operator and international inspectors, current empirical (top-down) UQ is described for estimating operator and inspector systematic and random error variance components. A Bayesian alternative is introduced that easily accommodates constraints on variance components, and is more robust than current top-down methods to the underlying measurement error distributions.

  16. Elimination of Emergency Department Medication Errors Due To Estimated Weights.

    PubMed

    Greenwalt, Mary; Griffen, David; Wilkerson, Jim

    2017-01-01

    From 7/2014 through 6/2015, 10 emergency department (ED) medication dosing errors were reported through the electronic incident reporting system of an urban academic medical center. Analysis of these medication errors identified inaccurate estimated weight on patients as the root cause. The goal of this project was to reduce weight-based dosing medication errors due to inaccurate estimated weights on patients presenting to the ED. Chart review revealed that 13.8% of estimated weights documented on admitted ED patients varied more than 10% from subsequent actual admission weights recorded. A random sample of 100 charts containing estimated weights revealed 2 previously unreported significant medication dosage errors (a significant-error rate of 0.02). Key improvements included removing barriers to weighing ED patients, storytelling to engage staff and change culture, and removal of the estimated weight documentation field from the ED electronic health record (EHR) forms. With these improvements, estimated weights on ED patients, and the resulting medication errors, were eliminated.

  17. SNP selection and classification of genome-wide SNP data using stratified sampling random forests.

    PubMed

    Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K

    2012-09-01

    For high dimensional genome-wide association (GWA) case-control data of complex disease, there is usually a large portion of single-nucleotide polymorphisms (SNPs) that are irrelevant to the disease. A simple random sampling method in random forests, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. Exhaustively searching for an optimal mtry is often required in order to include useful and relevant SNPs and discard the vast number of non-informative SNPs; however, this is too time-consuming and therefore unfavorable for high-dimensional GWA data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for GWA high-dimensional data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. This stratified sampling procedure ensures that each subspace contains enough useful SNPs, avoids the very high computational cost of an exhaustive search for an optimal mtry, and maintains the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprising 408 803 SNPs and Alzheimer case-control data comprising 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and that it can generate better random forests with higher accuracy and lower error bound than those produced by Breiman's random forest generation method. For the Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders and merit further biological investigation.
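
    A sketch of the stratified feature-subspace selection step as described (equal-width discretization of an informativeness score, then an equal number of SNPs drawn at random from each group); the informativeness scores here are simulated and the group/subspace sizes are arbitrary choices:

```python
import numpy as np

def stratified_subspace(informativeness, n_groups, per_group, rng):
    """Equal-width discretization of SNP informativeness, then an equal
    number of SNPs drawn at random from each group to form one subspace."""
    edges = np.linspace(informativeness.min(), informativeness.max(), n_groups + 1)
    groups = np.clip(np.digitize(informativeness, edges[1:-1]), 0, n_groups - 1)
    subspace = []
    for g in range(n_groups):
        idx = np.flatnonzero(groups == g)
        if idx.size:
            subspace.extend(rng.choice(idx, size=min(per_group, idx.size),
                                       replace=False))
    return np.asarray(subspace)

rng = np.random.default_rng(1)
info = rng.beta(0.5, 5.0, size=10_000)   # most SNPs uninformative, a few strong
cols = stratified_subspace(info, n_groups=5, per_group=10, rng=rng)
# `cols` would then index the SNP columns used to grow one decision tree.
```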

  18. Evaluation of errors in quantitative determination of asbestos in rock

    NASA Astrophysics Data System (ADS)

    Baietto, Oliviero; Marini, Paola; Vitaliti, Martina

    2016-04-01

    The quantitative determination of the content of asbestos in rock matrices is a complex operation which is susceptible to important errors. The principal methodologies for the analysis are Scanning Electron Microscopy (SEM) and Phase Contrast Optical Microscopy (PCOM). Although the resolution of PCOM is inferior to that of SEM, PCOM analysis has several advantages, including greater representativity of the analyzed sample, more effective recognition of chrysotile and a lower cost. The DIATI LAA internal methodology for the analysis in PCOM is based on a mild grinding of a rock sample, its subdivision into 5-6 grain size classes smaller than 2 mm and a subsequent microscopic analysis of a portion of each class. PCOM is based on the optical properties of asbestos and of the liquids of known refractive index in which the particles under analysis are immersed. The error evaluation in the analysis of rock samples, contrary to the analysis of airborne filters, cannot be based on a statistical distribution. For airborne filters, a binomial distribution (Poisson), which theoretically defines the variation in the count of fibers resulting from the observation of analysis fields chosen randomly on the filter, can be applied. The analysis of rock matrices instead cannot lean on any statistical distribution, because the most important object of the analysis is the size of the asbestiform fibers and bundles of fibers observed, and the resulting ratio between the weight of the fibrous component and that of the granular one. The error evaluation generally provided by public and private institutions varies between 50 and 150 percent, but there are no specific studies that discuss the origin of the error or that link it to the asbestos content. Our work aims to provide a reliable estimation of the error in relation to the applied methodologies and to the total content of asbestos, especially for values close to the legal limits. The error assessments must be made through the repetition of the same analysis on the same sample, to estimate both the error on the representativeness of the sample and the error related to the sensitivity of the operator, in order to provide a sufficiently reliable uncertainty of the method. We used about 30 natural rock samples with different asbestos content, performing 3 analyses on each sample to obtain a trend sufficiently representative of the percentage. Furthermore, we performed 10 repetitions of the analysis on one chosen sample to define more specifically the error of the methodology.

  19. Error of the slanted edge method for measuring the modulation transfer function of imaging systems.

    PubMed

    Xie, Xufen; Fan, Hongda; Wang, Hongyuan; Wang, Zebin; Zou, Nianyu

    2018-03-01

    The slanted edge method is a basic approach for measuring the modulation transfer function (MTF) of imaging systems; however, its measurement accuracy is limited in practice. Theoretical analysis of the slanted edge MTF measurement method performed in this paper reveals that inappropriate edge angles and random noise reduce this accuracy. The error caused by edge angles is analyzed using sampling and reconstruction theory. Furthermore, an error model combining noise and edge angles is proposed. We verify the analyses and model with respect to (i) the edge angle, (ii) a statistical analysis of the measurement error, (iii) the full width at half-maximum of a point spread function, and (iv) the error model. The experimental results verify the theoretical findings. This research can serve as a reference for applications of the slanted edge MTF measurement method.

  20. Sun compass error model

    NASA Technical Reports Server (NTRS)

    Blucker, T. J.; Ferry, W. W.

    1971-01-01

    An error model is described for the Apollo 15 sun compass, a contingency navigational device. Field test data are presented along with significant results of the test. The errors reported include a random error resulting from tilt in leveling the sun compass, a random error because of observer sighting inaccuracies, a bias error because of mean tilt in compass leveling, a bias error in the sun compass itself, and a bias error because the device is leveled to the local terrain slope.

  1. Strengths and weaknesses of temporal stability analysis for monitoring and estimating grid-mean soil moisture in a high-intensity irrigated agricultural landscape

    NASA Astrophysics Data System (ADS)

    Ran, Youhua; Li, Xin; Jin, Rui; Kang, Jian; Cosh, Michael H.

    2017-01-01

    Monitoring and estimating grid-mean soil moisture is very important for assessing many hydrological, biological, and biogeochemical processes and for validating remotely sensed surface soil moisture products. Temporal stability analysis (TSA) is a valuable tool for identifying a small number of representative sampling points to estimate the grid-mean soil moisture content. This analysis was evaluated and improved using high-quality surface soil moisture data that were acquired by a wireless sensor network in a high-intensity irrigated agricultural landscape in an arid region of northwestern China. The performance of the TSA was limited in areas where the representative error was dominated by random events, such as irrigation events. This shortcoming can be effectively mitigated by using a stratified TSA (STSA) method, proposed in this paper. In addition, the following methods were proposed for rapidly and efficiently identifying representative sampling points when using TSA. (1) Instantaneous measurements can be used to identify representative sampling points to some extent; however, the error resulting from this method is significant when validating remotely sensed soil moisture products. Thus, additional representative sampling points should be considered to reduce this error. (2) The calibration period can be determined from the time span of the full range of the grid-mean soil moisture content during the monitoring period. (3) The representative error is sensitive to the number of calibration sampling points, especially when only a few representative sampling points are used. Multiple sampling points are recommended to reduce data loss and improve the likelihood of representativeness at two scales.
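
    Classical TSA ranks sampling points by the mean relative difference (MRD) of each point from the grid mean and by the standard deviation of that relative difference; a point with MRD near zero and small spread is taken as representative. A minimal sketch under that standard formulation (the data and the combined ranking rule are illustrative, not the paper's exact procedure):

```python
import numpy as np

def temporal_stability(swc):
    """swc: array of shape (n_times, n_sites). Returns the mean relative
    difference (MRD) of each site from the grid mean, and its standard
    deviation over time (SDRD)."""
    grid_mean = swc.mean(axis=1, keepdims=True)       # grid mean per date
    rel_diff = (swc - grid_mean) / grid_mean
    return rel_diff.mean(axis=0), rel_diff.std(axis=0, ddof=1)

rng = np.random.default_rng(7)
site_offset = rng.normal(0.0, 0.04, size=40)          # persistent wet/dry sites
data = 0.25 + site_offset + 0.02 * rng.standard_normal((60, 40))
mrd, sdrd = temporal_stability(data)
best = np.argmin(np.abs(mrd) + sdrd)                  # one common ranking rule
print(f"representative site {best}: MRD={mrd[best]:+.3f}, SDRD={sdrd[best]:.3f}")
```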

  2. Statistical design and analysis of environmental studies for plutonium and other transuranics at NAEG ''safety-shot'' sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, R.O.; Eberhardt, L.L.; Fowler, E.B.

    This paper is centered around the use of stratified random sampling for estimating the total amount (inventory) of ²³⁹⁻²⁴⁰Pu and uranium in surface soil at ten "safety-shot" sites on the Nevada Test Site (NTS) and Tonopah Test Range (TTR) that are currently being studied by the Nevada Applied Ecology Group (NAEG). The use of stratified random sampling has resulted in estimates of inventory at these desert study sites that have smaller standard errors than would have been the case had simple random sampling (no stratification) been used. Estimates of inventory are given for ²³⁵U, ²³⁸U, and ²³⁹⁻²⁴⁰Pu in soil at A Site of Area 11 on the NTS. Other results presented include average concentrations of one or more of these isotopes in soil and vegetation and in soil profile samples at depths to 25 cm. The regression relationship between soil and vegetation concentrations of ²³⁵U and ²³⁸U at adjacent sampling locations is also examined using three different models. The applicability of stratified random sampling to the estimation of concentration contours of ²³⁹⁻²⁴⁰Pu in surface soil using computer algorithms is also investigated. Estimates of such contours are obtained using several different methods. The planning of field sampling plans for estimating inventory and distribution is discussed. (auth)
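
    For the stratified design discussed above, the inventory estimate is the area-weighted sum of stratum means, with a variance that sums the per-stratum contributions; this is why stratification shrinks the standard error relative to simple random sampling. A sketch of the textbook estimator (strata areas and concentrations are invented, and the finite-population correction is omitted):

```python
import numpy as np

def stratified_estimate(strata):
    """strata: list of (area, sample_concentrations) per stratum.
    Returns the stratified inventory estimate and its standard error."""
    total, var = 0.0, 0.0
    for area, y in strata:
        y = np.asarray(y, dtype=float)
        total += area * y.mean()
        var += area**2 * y.var(ddof=1) / y.size
    return total, np.sqrt(var)

# Three illustrative strata: (area, measured concentration per unit area)
strata = [
    (1e4, [5.1, 4.8, 6.0, 5.5]),            # inner, high-activity stratum
    (5e4, [1.2, 0.9, 1.5, 1.1, 1.3]),
    (2e5, [0.05, 0.08, 0.04, 0.06]),
]
inventory, se = stratified_estimate(strata)
print(f"inventory = {inventory:.0f} +/- {se:.0f} (1 sigma)")
```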

  3. Aspen, climate, and sudden decline in western USA

    Treesearch

    Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston

    2009-01-01

    A bioclimate model predicting the presence or absence of aspen, Populus tremuloides, in western USA from climate variables was developed by using the Random Forests classification tree on Forest Inventory data from about 118,000 permanent sample plots. A reasonably parsimonious model used eight predictors to describe aspen's climate profile. Classification errors...

  4. Inadequate Self-Discipline as a Causal Factor in Human Error Accidents

    DTIC Science & Technology

    1991-03-01

    reasons for the effectiveness of the stimuli or incentives used are unknown. Moreover, Gebers and Peck (1987) conclude on the basis of a random sample...Research, 13, 141-156. Gebers, M. A., & Peck, R. C. (1987). Basic California traffic conviction and accident record facts. (Report No. CAL-DMV-RSS

  5. Reference List Accuracy in Social Work Journals: A Follow-Up Analysis

    ERIC Educational Resources Information Center

    Mitchell-Williams, Missy T.; Skipper, Antonius D.; Alexander, Marvin C.; Wilks, Scott E.

    2017-01-01

    Purpose: Following up a "Research on Social Work Practice" article published a decade ago, this study aimed to examine reference error rates among five, widely circulated social work journals. Methods: A stratified random sample of references was selected from the year 2013 (N = 500, 100/journal). Each was verified against the original…

  6. Ensemble-type numerical uncertainty information from single model integrations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rauser, Florian, E-mail: florian.rauser@mpimet.mpg.de; Marotzke, Jochem; Korn, Peter

    2015-07-01

    We suggest an algorithm that quantifies the discretization error of time-dependent physical quantities of interest (goals) for numerical models of geophysical fluid dynamics. The goal discretization error is estimated using a sum of weighted local discretization errors. The key feature of our algorithm is that these local discretization errors are interpreted as realizations of a random process. The random process is determined by the model and the flow state. From a class of local error random processes we select a suitable specific random process by integrating the model over a short time interval at different resolutions. The weights of the influences of the local discretization errors on the goal are modeled as goal sensitivities, which are calculated via automatic differentiation. The integration of the weighted realizations of local error random processes yields a posterior ensemble of goal approximations from a single run of the numerical model. From the posterior ensemble we derive the uncertainty information of the goal discretization error. This algorithm bypasses the requirement of detailed knowledge about the model's discretization to generate numerical error estimates. The algorithm is evaluated for the spherical shallow-water equations. For two standard test cases we successfully estimate the error of regional potential energy, track its evolution, and compare it to standard ensemble techniques. The posterior ensemble shares linear-error-growth properties with ensembles of multiple model integrations when comparably perturbed. The posterior ensemble numerical error estimates are of comparable size to those of a stochastic physics ensemble.

  7. Sampling design for spatially distributed hydrogeologic and environmental processes

    USGS Publications Warehouse

    Christakos, G.; Olea, R.A.

    1992-01-01

    A methodology for the design of sampling networks over space is proposed. The methodology is based on spatial random field representations of nonhomogeneous natural processes, and on optimal spatial estimation techniques. One of the most important results of random field theory for physical sciences is its rationalization of correlations in spatial variability of natural processes. This correlation is extremely important both for interpreting spatially distributed observations and for predictive performance. The extent of site sampling and the types of data to be collected will depend on the relationship of subsurface variability to predictive uncertainty. While hypothesis formulation and initial identification of spatial variability characteristics are based on scientific understanding (such as knowledge of the physics of the underlying phenomena, geological interpretations, intuition and experience), the support offered by field data is statistically modelled. This model is not limited by the geometric nature of sampling and covers a wide range in subsurface uncertainties. A factorization scheme of the sampling error variance is derived, which possesses certain attractive properties allowing significant savings in computations. By means of this scheme, a practical sampling design procedure providing suitable indices of the sampling error variance is established. These indices can be used by way of multiobjective decision criteria to obtain the best sampling strategy. Neither the actual implementation of the in-situ sampling nor the solution of the large spatial estimation systems of equations are necessary. The required values of the accuracy parameters involved in the network design are derived using reference charts (readily available for various combinations of data configurations and spatial variability parameters) and certain simple yet accurate analytical formulas. Insight is gained by applying the proposed sampling procedure to realistic examples related to sampling problems in two dimensions. © 1992.

  8. A rational approach to legacy data validation when transitioning between electronic health record systems.

    PubMed

    Pageler, Natalie M; Grazier G'Sell, Max Jacob; Chandler, Warren; Mailes, Emily; Yang, Christine; Longhurst, Christopher A

    2016-09-01

    The objective of this project was to use statistical techniques to determine the completeness and accuracy of data migrated during electronic health record conversion. Data validation during migration consists of mapped record testing and validation of a sample of the data for completeness and accuracy. We statistically determined a randomized sample size for each data type based on the desired confidence level and error limits. The only error identified in the post go-live period was a failure to migrate some clinical notes, which was unrelated to the validation process. No errors in the migrated data were found during the 12-month post-implementation period. Compared to the typical industry approach, we have demonstrated that a statistical approach to sampling size for data validation can ensure consistent confidence levels while maximizing efficiency of the validation process during a major electronic health record conversion. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
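
    A sketch of the kind of sample-size calculation implied above, using the standard normal-approximation formula for estimating a proportion within a chosen error limit (the article does not state which formula it used, so this is an assumption; the 0.5 default error rate is the conservative worst case):

```python
import math
from scipy.stats import norm

def validation_sample_size(confidence: float, error_limit: float,
                           expected_error_rate: float = 0.5) -> int:
    """Sample size for estimating an error rate within +/- error_limit:
    n = z^2 * p * (1 - p) / e^2."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    p = expected_error_rate
    return math.ceil(z**2 * p * (1 - p) / error_limit**2)

# 95% confidence, +/- 3% error limit -> about 1068 records per data type
print(validation_sample_size(0.95, 0.03))
```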

  9. Verification of Satellite Rainfall Estimates from the Tropical Rainfall Measuring Mission over Ground Validation Sites

    NASA Astrophysics Data System (ADS)

    Fisher, B. L.; Wolff, D. B.; Silberstein, D. S.; Marks, D. M.; Pippitt, J. L.

    2007-12-01

    The Tropical Rainfall Measuring Mission's (TRMM) Ground Validation (GV) Program was originally established with the principal long-term goal of determining the random errors and systematic biases stemming from the application of the TRMM rainfall algorithms. The GV Program has been structured around two validation strategies: 1) determining the quantitative accuracy of the integrated monthly rainfall products at GV regional sites over large areas of about 500 km2 using integrated ground measurements and 2) evaluating the instantaneous satellite and GV rain rate statistics at spatio-temporal scales compatible with the satellite sensor resolution (Simpson et al. 1988, Thiele 1988). The GV Program has continued to evolve since the launch of the TRMM satellite on November 27, 1997. This presentation will discuss current GV methods of validating TRMM operational rain products in conjunction with ongoing research. The challenge facing TRMM GV has been how to best utilize rain information from the GV system to infer the random and systematic error characteristics of the satellite rain estimates. A fundamental problem of validating space-borne rain estimates is that the true mean areal rainfall is an ideal, scale-dependent parameter that cannot be directly measured. Empirical validation uses ground-based rain estimates to determine the error characteristics of the satellite-inferred rain estimates, but ground estimates also incur measurement errors and contribute to the error covariance. Furthermore, sampling errors, associated with the discrete, discontinuous temporal sampling by the rain sensors aboard the TRMM satellite, become statistically entangled in the monthly estimates. Sampling errors complicate the task of linking biases in the rain retrievals to the physics of the satellite algorithms. The TRMM Satellite Validation Office (TSVO) has made key progress towards effective satellite validation. For disentangling the sampling and retrieval errors, TSVO has developed and applied a methodology that statistically separates the two error sources. Using TRMM monthly estimates and high-resolution radar and gauge data, this method has been used to estimate sampling and retrieval error budgets over GV sites. More recently, a multi- year data set of instantaneous rain rates from the TRMM microwave imager (TMI), the precipitation radar (PR), and the combined algorithm was spatio-temporally matched and inter-compared to GV radar rain rates collected during satellite overpasses of select GV sites at the scale of the TMI footprint. The analysis provided a more direct probe of the satellite rain algorithms using ground data as an empirical reference. TSVO has also made significant advances in radar quality control through the development of the Relative Calibration Adjustment (RCA) technique. The RCA is currently being used to provide a long-term record of radar calibration for the radar at Kwajalein, a strategically important GV site in the tropical Pacific. The RCA technique has revealed previously undetected alterations in the radar sensitivity due to engineering changes (e.g., system modifications, antenna offsets, alterations of the receiver, or the data processor), making possible the correction of the radar rainfall measurements and ensuring the integrity of nearly a decade of TRMM GV observations and resources.

  10. Random errors in interferometry with the least-squares method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang Qi

    2011-01-20

    This investigation analyzes random errors in interferometric surface profilers using the least-squares method when random noises are present. Two types of random noise are considered here: intensity noise and position noise. Two formulas have been derived for estimating the standard deviations of the surface height measurements: one is for estimating the standard deviation when only intensity noise is present, and the other is for estimating the standard deviation when only position noise is present. Measurements on simulated noisy interferometric data have been performed, and standard deviations of the simulated measurements have been compared with those theoretically derived. The relationships have also been discussed between random error and the wavelength of the light source and between random error and the amplitude of the interference fringe.

  11. Inference from clustering with application to gene-expression microarrays.

    PubMed

    Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M

    2002-01-01

    There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
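
    The toolbox's core loop, generate points as model means plus independent noise, cluster them, and count misassignments under the best matching of cluster labels to generating processes, can be sketched compactly (the algorithm choice, means, and noise level here are arbitrary stand-ins, not the toolbox itself):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Two "processes": model means plus independent noise.
means = np.array([[0.0, 0.0], [2.5, 2.5]])
labels = rng.integers(0, 2, size=200)
points = means[labels] + rng.normal(0.0, 0.8, size=(200, 2))

pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

# Clustering error: misassigned points under the best label matching.
confusion = np.zeros((2, 2), dtype=int)
for t, p in zip(labels, pred):
    confusion[t, p] += 1
rows, cols = linear_sum_assignment(-confusion)   # maximize matched counts
error = 1.0 - confusion[rows, cols].sum() / len(labels)
print(f"clustering error: {error:.3f}")
```

    Repeating this loop over replications and noise levels gives the error tables and learning-with-replication curves the abstract describes.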

  12. Refractive error and visual impairment in private school children in Ghana.

    PubMed

    Kumah, Ben D; Ebri, Anne; Abdul-Kabir, Mohammed; Ahmed, Abdul-Sadik; Koomson, Nana Ya; Aikins, Samual; Aikins, Amos; Amedo, Angela; Lartey, Seth; Naidoo, Kovin

    2013-12-01

    To assess the prevalence of refractive error and visual impairment in private school children in Ghana. A random selection of geographically defined classes in clusters was used to identify a sample of school children aged 12 to 15 years in the Ashanti Region. Children in 60 clusters were enumerated and examined in classrooms. The examination included visual acuity, retinoscopy, autorefraction under cycloplegia, and examination of anterior segment, media, and fundus. For quality assurance, a random sample of children with reduced and normal vision were selected and re-examined independently. A total of 2454 children attending 53 private schools were enumerated, and of these, 2435 (99.2%) were examined. Prevalence of uncorrected, presenting, and best visual acuity of 20/40 or worse in the better eye was 3.7, 3.5, and 0.4%, respectively. Refractive error was the cause of reduced vision in 71.7% of 152 eyes, amblyopia in 9.9%, retinal disorders in 5.9%, and corneal opacity in 4.6%. Exterior and anterior segment abnormalities occurred in 43 (1.8%) children. Myopia (at least -0.50 D) in one or both eyes was present in 3.2% of children when measured with retinoscopy and in 3.4% measured with autorefraction. Myopia was not significantly associated with gender (P = 0.82). Hyperopia (+2.00 D or more) in at least one eye was present in 0.3% of children with retinoscopy and autorefraction. The prevalence of reduced vision in Ghanaian private school children due to uncorrected refractive error was low. However, the prevalence of amblyopia, retinal disorders, and corneal opacities indicate the need for early interventions.

  13. Application of Lamendin's adult dental aging technique to a diverse skeletal sample.

    PubMed

    Prince, Debra A; Ubelaker, Douglas H

    2002-01-01

    Lamendin et al. (1) proposed a technique to estimate age at death for adults by analyzing single-rooted teeth. They expressed age as a function of two factors: translucency of the tooth root and periodontosis (gingival regression). In their study, they analyzed 306 single-rooted teeth that were extracted at autopsy from 208 individuals of known age at death, all of whom were considered as having a French ancestry. Their sample consisted of 135 males, 73 females, 198 whites, and 10 blacks. The sample ranged in age from 22 to 90 years of age. By using a simple formula (A = 0.18 x P + 0.42 x T + 25.53, where A = age in years, P = periodontosis height x 100/root height, and T = transparency height x 100/root height), Lamendin et al. were able to estimate age at death with a mean error of +/- 10 years on their working sample and +/- 8.4 years on a forensic control sample. Lamendin found this technique to work well with a French population, but did not test it outside of that sample area. This study tests the accuracy of this adult aging technique on a more diverse skeletal population, the Terry Collection housed at the Smithsonian's National Museum of Natural History. Our sample consists of 400 teeth from 94 black females, 72 white females, 98 black males, and 95 white males, ranging from 25 to 99 years. Lamendin's technique was applied to this sample to test its applicability to a population not of French origin. Providing results from a diverse skeletal population will aid in establishing the validity of this method to be used in forensic cases, its ideal purpose. Our results suggest that Lamendin's method estimates age fairly accurately outside of the French sample, yielding a mean error of 8.2 years, standard deviation 6.9 years, and standard error of the mean 0.34 years. In addition, when ancestry and sex are accounted for, the mean errors are reduced for each group (black females, white females, black males, and white males). Lamendin et al. reported an inter-observer error of 9+/-1.8 and 10+/-2 years from two independent observers. Forty teeth were randomly remeasured from the Terry Collection in order to assess an intra-observer error. From this retest, an intra-observer error of 6.5 years was detected.
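
    Lamendin's formula is simple enough to state directly in code; the sketch below just packages the published equation, with example tooth measurements that are invented:

```python
def lamendin_age(periodontosis_mm: float, transparency_mm: float,
                 root_height_mm: float) -> float:
    """Lamendin et al.: A = 0.18*P + 0.42*T + 25.53, where P and T are
    periodontosis and transparency heights as percentages of root height."""
    p = periodontosis_mm * 100.0 / root_height_mm
    t = transparency_mm * 100.0 / root_height_mm
    return 0.18 * p + 0.42 * t + 25.53

# e.g. 3 mm gingival regression and 6 mm root transparency on a 14 mm root
print(f"estimated age: {lamendin_age(3.0, 6.0, 14.0):.1f} years")
```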

  14. Prevalence of Refractive Errors Among School Children in Gondar Town, Northwest Ethiopia

    PubMed Central

    Yared, Assefa Wolde; Belaynew, Wasie Taye; Destaye, Shiferaw; Ayanaw, Tsegaw; Zelalem, Eshete

    2012-01-01

    Purpose: Many children with poor vision due to refractive error remain undiagnosed and perform poorly in school. The situation is worse in the Sub-Saharan Africa, including Ethiopia, and current information is lacking. The objective of this study is to determine the prevalence of refractive error among children enrolled in elementary schools in Gondar town, Ethiopia. Materials and Methods: This was a cross-sectional study of 1852 students in 8 elementary schools. Subjects were selected by multistage random sampling. The study parameters were visual acuity (VA) evaluation and ocular examination. VA was measured by staff optometrists with the Snellen E-chart while students with subnormal vision were examined using pinhole, retinoscopy evaluation and subjective refraction by ophthalmologists. Results: The study cohort comprised 45.8% males and 54.2% females from 8 randomly selected elementary schools with a response rate of 93%. Refractive errors in either eye were present in 174 (9.4%) children. Of these, myopia was diagnosed in 55 (31.6%) children in the right and left eyes followed by hyperopia in 46 (26.4%) and 39 (22.4%) in the right and left eyes respectively. Low myopia was the most common refractive error in 61 (49.2%) and 68 (50%) children for the right and left eyes respectively. Conclusions: Refractive error among children is a common problem in Gondar town and needs to be assessed at every health evaluation of school children for timely treatment. PMID:23248538

  15. [Statistical Process Control (SPC) can help prevent treatment errors without increasing costs in radiotherapy].

    PubMed

    Govindarajan, R; Llueguera, E; Melero, A; Molero, J; Soler, N; Rueda, C; Paradinas, C

    2010-01-01

    Statistical Process Control (SPC) was applied to monitor patient set-up in radiotherapy and, when the measured set-up error values indicated a loss of process stability, its root cause was identified and eliminated to prevent set-up errors. Set-up errors were measured for the medial-lateral (ml), cranial-caudal (cc) and anterior-posterior (ap) dimensions and then the upper control limits were calculated. Once the control limits were known and the range variability was acceptable, treatment set-up errors were monitored using sub-groups of 3 patients, three times each shift. These values were plotted on a control chart in real time. Control limit values showed that the existing variation was acceptable. Set-up errors, measured and plotted on an X chart, helped monitor the set-up process stability and, if and when the stability was lost, treatment was interrupted, the particular cause responsible for the non-random pattern was identified and corrective action was taken before proceeding with the treatment. The SPC protocol focuses on controlling the variability due to assignable causes instead of focusing on patient-to-patient variability, which normally does not exist. Compared to weekly sampling of set-up error in each and every patient, which may only ensure that just those sampled sessions were set up correctly, the SPC method enables set-up error prevention in all treatment sessions for all patients and, at the same time, reduces control costs. Copyright © 2009 SECA. Published by Elsevier Espana. All rights reserved.
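
    A sketch of the X-bar control-chart arithmetic implied above, using subgroups of 3 and the conventional A2 range-based limits (the set-up error values are simulated, and the paper does not specify its exact limit formula, so the standard one is assumed):

```python
import numpy as np

def xbar_limits(subgroups):
    """X-bar chart limits from rational subgroups (A2 = 1.023 for n = 3)."""
    subgroups = np.asarray(subgroups, dtype=float)
    center = subgroups.mean(axis=1).mean()
    rbar = (subgroups.max(axis=1) - subgroups.min(axis=1)).mean()
    a2 = 1.023                      # standard control-chart constant for n=3
    return center - a2 * rbar, center, center + a2 * rbar

rng = np.random.default_rng(5)
setup_errors = rng.normal(0.0, 2.0, size=(30, 3))   # mm; 3 patients per check
lcl, cl, ucl = xbar_limits(setup_errors)
print(f"LCL={lcl:.2f}  CL={cl:.2f}  UCL={ucl:.2f}")
# A subgroup mean outside (LCL, UCL) signals an assignable cause:
# interrupt treatment set-up, find and remove the cause, then resume.
```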

  16. Increased detection of Barrett’s esophagus and esophageal dysplasia with adjunctive use of wide-area transepithelial sample with three-dimensional computer-assisted analysis (WATS)

    PubMed Central

    Gross, Seth A; Smith, Michael S; Kaul, Vivek

    2017-01-01

    Background Barrett’s esophagus (BE) and esophageal dysplasia (ED) are frequently missed during screening and surveillance esophagoscopy because of sampling error associated with four-quadrant random forceps biopsy (FB). Aim The aim of this article is to determine if wide-area transepithelial sampling with three-dimensional computer-assisted analysis (WATS) used adjunctively with FB can increase the detection of BE and ED. Methods In this multicenter prospective trial, patients screened for suspected BE and those with known BE undergoing surveillance were enrolled. Patients at 25 community-based practices underwent WATS adjunctively to targeted FB and random four-quadrant FB. Results Of 4203 patients, 594 were diagnosed with BE by FB alone, and 493 additional cases were detected by adding WATS, increasing the overall detection of BE by 83% (493/594, 95% CI 74%–93%). Low-grade dysplasia (LGD) was diagnosed in 26 patients by FB alone, and 23 additional cases were detected by adding WATS, increasing the detection of LGD by 88.5% (23/26, 95% CI 48%–160%). Conclusions Adjunctive use of WATS to FB significantly improves the detection of both BE and ED. Sampling error, an inherent limitation associated with screening and surveillance, can be improved with WATS allowing better informed decisions to be made about the management and subsequent treatment of these patients. PMID:29881608

  17. Errors of five-day mean surface wind and temperature conditions due to inadequate sampling

    NASA Technical Reports Server (NTRS)

    Legler, David M.

    1991-01-01

    Surface meteorological reports of wind components, wind speed, air temperature, and sea-surface temperature from buoys located in equatorial and midlatitude regions are used in a simulation of random sampling to determine errors of the calculated means due to inadequate sampling. Subsampling the data with several different sample sizes leads to estimates of the accuracy of the subsampled means. The number N of random observations needed to compute mean winds with chosen accuracies of 0.5 (N_0.5) and 1.0 (N_1.0) m/s and mean air and sea surface temperatures with chosen accuracies of 0.1 (N_0.1) and 0.2 (N_0.2) °C were calculated for each 5-day and 30-day period in the buoy datasets. Mean values of N for the various accuracies and datasets are given. A second-order polynomial relation is established between N and the variability of the data record. This relationship demonstrates that for the same accuracy, N increases as the variability of the data record increases. The relationship is also independent of the data source. Volunteer-observing ship data do not satisfy the recommended minimum number of observations for obtaining 0.5 m/s and 0.2 °C accuracy for most locations. The effect of having remotely sensed data is discussed.
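
    The relation reported above between N and record variability was fitted empirically; a simpler standard-error-based approximation captures the same qualitative behaviour, with N growing as the square of the ratio of variability to desired accuracy. A hedged sketch (the 1.96 confidence factor and the example sigma are assumptions, not values from the paper):

```python
import math

def n_required(std_dev: float, accuracy: float, confidence_z: float = 1.96) -> int:
    """Observations needed so the mean's confidence half-width <= accuracy:
    n = (z * sigma / e)^2. A stand-in for the paper's fitted polynomial
    relation between N and record variability."""
    return math.ceil((confidence_z * std_dev / accuracy) ** 2)

# Wind-component record with sigma = 3 m/s: N for 0.5 and 1.0 m/s accuracy
print(n_required(3.0, 0.5), n_required(3.0, 1.0))   # 139 and 35
```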

  18. The correspondence of surface climate parameters with satellite and terrain data

    NASA Technical Reports Server (NTRS)

    Dozier, Jeff; Davis, Frank

    1987-01-01

    One of the goals of the research was to develop a ground sampling strategy for calibrating remotely sensed measurements of surface climate parameters. The initial sampling strategy involved the stratification of the terrain based on important ancillary surface variables such as slope, exposure, insolation, geology, drainage, fire history, etc. For a spatially heterogeneous population, sampling error is reduced and efficiency increased by stratification of the landscape into more homogeneous sub-areas and by employing periodic random spacing of samples. These concepts were applied in the initial stratification of the study site for the purpose of locating and allocating instrumentation.

  19. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed

    West, Brady T; Sakshaug, Joseph W; Aurelien, Guy Alain S

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.

  20. How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

    PubMed Central

    West, Brady T.; Sakshaug, Joseph W.; Aurelien, Guy Alain S.

    2016-01-01

    Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data. PMID:27355817

  1. Peak-locking centroid bias in Shack-Hartmann wavefront sensing

    NASA Astrophysics Data System (ADS)

    Anugu, Narsireddy; Garcia, Paulo J. V.; Correia, Carlos M.

    2018-05-01

    Shack-Hartmann wavefront sensing relies on accurate spot centre measurement. Several algorithms were developed with this aim, mostly focused on precision, i.e. minimizing random errors. In the solar and extended scene community, the importance of the accuracy (bias error due to peak-locking, quantization, or sampling) of the centroid determination was identified and solutions proposed. But these solutions only allow partial bias corrections. To date, no systematic study of the bias error was conducted. This article bridges the gap by quantifying the bias error for different correlation peak-finding algorithms and types of sub-aperture images and by proposing a practical solution to minimize its effects. Four classes of sub-aperture images (point source, elongated laser guide star, crowded field, and solar extended scene) together with five types of peak-finding algorithms (1D parabola, the centre of gravity, Gaussian, 2D quadratic polynomial, and pyramid) are considered, in a variety of signal-to-noise conditions. The best performing peak-finding algorithm depends on the sub-aperture image type, but none is satisfactory with respect to both bias and random errors. A practical solution is proposed that relies on the antisymmetric response of the bias to the sub-pixel position of the true centre. The solution decreases the bias by a factor of ~7 to values of ≲0.02 pix. The computational cost is typically twice that of current cross-correlation algorithms.
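
    Peak-locking can be reproduced with the 1D parabola peak-finder mentioned above: fit a parabola through the correlation maximum and its two neighbours and read off the sub-pixel vertex. The sketch below shows the bias varying with the true sub-pixel offset (the Gaussian peak model and its width are invented for illustration; this is not the article's simulation setup):

```python
import numpy as np

def parabola_peak(c: np.ndarray) -> float:
    """Sub-pixel peak position from a 1D parabola fit through the
    correlation maximum and its two neighbours."""
    i = int(np.argmax(c))
    num = c[i - 1] - c[i + 1]
    den = 2.0 * (c[i - 1] - 2.0 * c[i] + c[i + 1])
    return i + num / den

# Sweep the true sub-pixel position and watch the residual bias change sign.
x = np.arange(32, dtype=float)
for offset in (-0.4, -0.25, -0.1, 0.1, 0.25, 0.4):
    centre = 15.0 + offset
    c = np.exp(-0.5 * ((x - centre) / 2.0) ** 2)      # sampled Gaussian peak
    print(f"true offset {offset:+.2f}  bias {parabola_peak(c) - centre:+.5f}")
```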

  2. Performance of the likelihood ratio difference (G2 Diff) test for detecting unidimensionality in applications of the multidimensional Rasch model.

    PubMed

    Harrell-Williams, Leigh; Wolfe, Edward W

    2014-01-01

    Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.
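
    For reference, the likelihood ratio difference statistic itself is G²diff = −2(LL_restricted − LL_full), referred to a chi-square distribution with as many degrees of freedom as the extra parameters of the multidimensional model. A sketch (the log-likelihood values are placeholders; in practice they would come from fitting the unidimensional and multidimensional MRCMLM in dedicated IRT software):

```python
from scipy import stats

def lr_difference_test(ll_restricted: float, ll_full: float, df_diff: int):
    """G2 difference test: -2 * (LL_restricted - LL_full) ~ chi2(df_diff)
    under the restricted (here: unidimensional) model."""
    g2_diff = -2.0 * (ll_restricted - ll_full)
    p_value = stats.chi2.sf(g2_diff, df=df_diff)
    return g2_diff, p_value

# Placeholder log-likelihoods for a unidimensional vs. two-dimensional fit
g2, p = lr_difference_test(-10_512.3, -10_498.7, df_diff=2)
print(f"G2_diff = {g2:.1f}, p = {p:.4f}")
```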

  3. Uncertainty Propagation in OMFIT

    NASA Astrophysics Data System (ADS)

    Smith, Sterling; Meneghini, Orso; Sung, Choongki

    2017-10-01

    A rigorous comparison of power balance fluxes and turbulent model fluxes requires the propagation of uncertainties in the kinetic profiles and their derivatives. Making extensive use of the python uncertainties package, the OMFIT framework has been used to propagate covariant uncertainties to provide an uncertainty in the power balance calculation from the ONETWO code, as well as through the turbulent fluxes calculated by the TGLF code. The covariant uncertainties arise from fitting 1D (constant on flux surface) density and temperature profiles and associated random errors with parameterized functions such as a modified tanh. The power balance and model fluxes can then be compared with quantification of the uncertainties. No effort is made at propagating systematic errors. A case study will be shown for the effects of resonant magnetic perturbations on the kinetic profiles and fluxes at the top of the pedestal. A separate attempt at modeling the random errors with Monte Carlo sampling will be compared to the method of propagating the fitting function parameter covariant uncertainties. Work supported by US DOE under DE-FC02-04ER54698, DE-FG2-95ER-54309, DE-SC 0012656.
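
    The python uncertainties package named above propagates covariant uncertainties through ordinary arithmetic automatically. A minimal sketch of the pattern (the fit parameters, covariance matrix, and tanh-style profile below are illustrative placeholders, not OMFIT's actual parameterization):

```python
import numpy as np
from uncertainties import correlated_values
from uncertainties import unumpy as unp

# Fit-parameter nominal values and covariance, e.g. from a modified-tanh
# profile fit; the numbers here are invented.
popt = [1.2, 0.05]                        # pedestal height, width
pcov = [[4e-4, -1e-5], [-1e-5, 9e-6]]
height, width = correlated_values(popt, pcov)

# Any derived quantity now carries the covariant uncertainty with it.
gradient = height / width
print(gradient)                           # nominal value +/- propagated std dev

# Works elementwise on arrays too:
psi = np.linspace(0.0, 1.0, 5)
profile = height * unp.tanh((1.0 - psi) / width)
print(unp.std_devs(profile))
```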

  4. Honest Importance Sampling with Multiple Markov Chains

    PubMed Central

    Tan, Aixin; Doss, Hani; Hobert, James P.

    2017-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection. PMID:28701855
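
    In the iid case described above, the importance sampling estimator and its CLT-based standard error take only a few lines; the sketch below uses invented target and proposal densities. Note that the paper's point is precisely that this naive standard error is not valid when the draws come from an MCMC chain, where regenerative simulation is needed instead:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Estimate E_pi[X^2] (= 1) for pi = N(0, 1), sampling from pi1 = N(0, 2^2).
x = rng.normal(0.0, 2.0, size=100_000)
w = stats.norm.pdf(x) / stats.norm.pdf(x, scale=2.0)     # pi / pi1

terms = w * x**2
est = terms.mean()
se = terms.std(ddof=1) / np.sqrt(terms.size)             # valid for iid draws
print(f"{est:.4f} +/- {se:.4f}")
```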

  5. Honest Importance Sampling with Multiple Markov Chains.

    PubMed

    Tan, Aixin; Doss, Hani; Hobert, James P

    2015-01-01

    Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection.

  6. Comparison of Oral Reading Errors between Contextual Sentences and Random Words among Schoolchildren

    ERIC Educational Resources Information Center

    Khalid, Nursyairah Mohd; Buari, Noor Halilah; Chen, Ai-Hong

    2017-01-01

    This paper compares the oral reading errors between the contextual sentences and random words among schoolchildren. Two sets of reading materials were developed to test the oral reading errors in 30 schoolchildren (10.00±1.44 years). Set A comprised contextual sentences, while Set B comprised random words. The schoolchildren were asked to…

  7. New Directions in Apprentice Selection: Self-Perceived "On the Job" Literacy (Reading) Demands of Apprentices.

    ERIC Educational Resources Information Center

    Edwards, Peter; Gould, Warren

    A study investigated the self-perceived, on-the-job literacy tasks of electrical mechanic apprentices in Victoria, Australia. A random sample of 401 apprentices from 19 locations representing all levels of apprenticeship training were questioned about their reading needs and the consequences of making a reading error in their work. Data were…

  8. Modeling Signal-Noise Processes Supports Student Construction of a Hierarchical Image of Sample

    ERIC Educational Resources Information Center

    Lehrer, Richard

    2017-01-01

    Grade 6 (modal age 11) students invented and revised models of the variability generated as each measured the perimeter of a table in their classroom. To construct models, students represented variability as a linear composite of true measure (signal) and multiple sources of random error. Students revised models by developing sampling…

  9. Random measurement error: Why worry? An example of cardiovascular risk factors.

    PubMed

    Brakenhoff, Timo B; van Smeden, Maarten; Visseren, Frank L J; Groenwold, Rolf H H

    2018-01-01

    With the increased use of data not originally recorded for research, such as routine care data (or 'big data'), measurement error is bound to become an increasingly relevant problem in medical research. A common view among medical researchers on the influence of random measurement error (i.e. classical measurement error) is that its presence leads to some degree of systematic underestimation of studied exposure-outcome relations (i.e. attenuation of the effect estimate). For the common situation where the analysis involves at least one exposure and one confounder, we demonstrate that the direction of effect of random measurement error on the estimated exposure-outcome relations can be difficult to anticipate. Using three example studies on cardiovascular risk factors, we illustrate that random measurement error in the exposure and/or confounder can lead to underestimation as well as overestimation of exposure-outcome relations. We therefore advise medical researchers to refrain from making claims about the direction of effect of measurement error in their manuscripts, unless the appropriate inferential tools are used to study or alleviate the impact of measurement error from the analysis.
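
    The direction-of-bias point above is easy to reproduce by simulation: classical error on the exposure attenuates the adjusted estimate, while classical error on the confounder leaves residual confounding that, in this configuration, inflates it. A sketch (all coefficients and error variances are invented):

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200_000

# True model: outcome depends on exposure (effect 1.0) and a confounder.
confounder = rng.standard_normal(n)
exposure = 0.8 * confounder + rng.standard_normal(n)
outcome = 1.0 * exposure + 2.0 * confounder + rng.standard_normal(n)

def exposure_effect(x, c, y):
    """Exposure coefficient from OLS of y on [1, x, c]."""
    X = np.column_stack([np.ones_like(x), x, c])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"no measurement error : {exposure_effect(exposure, confounder, outcome):.3f}")

# Classical error on the confounder -> residual confounding: overestimation here
c_err = confounder + rng.normal(0.0, 1.0, n)
print(f"error in confounder  : {exposure_effect(exposure, c_err, outcome):.3f}")

# Classical error on the exposure -> attenuation toward zero
x_err = exposure + rng.normal(0.0, 1.0, n)
print(f"error in exposure    : {exposure_effect(x_err, confounder, outcome):.3f}")
```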

  10. Reducing Bias and Error in the Correlation Coefficient Due to Nonnormality.

    PubMed

    Bishara, Anthony J; Hittner, James B

    2015-10-01

    It is more common for educational and psychological data to be nonnormal than to be approximately normal. This tendency may lead to bias and error in point estimates of the Pearson correlation coefficient. In a series of Monte Carlo simulations, the Pearson correlation was examined under conditions of normal and nonnormal data, and it was compared with its major alternatives, including the Spearman rank-order correlation, the bootstrap estimate, the Box-Cox transformation family, and a general normalizing transformation (i.e., rankit), as well as to various bias adjustments. Nonnormality caused the correlation coefficient to be inflated by up to +.14, particularly when the nonnormality involved heavy-tailed distributions. Traditional bias adjustments worsened this problem, further inflating the estimate. The Spearman and rankit correlations eliminated this inflation and provided conservative estimates. Rankit also minimized random error for most sample sizes, except for the smallest samples ( n = 10), where bootstrapping was more effective. Overall, results justify the use of carefully chosen alternatives to the Pearson correlation when normality is violated.
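
    The following Python sketch, with an assumed sample size, correlation and heavy-tailed distribution (it does not reproduce the simulation conditions of the study), illustrates how the Pearson, Spearman and rankit correlations can be compared by Monte Carlo simulation on nonnormal data.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(1)

      def rankit(x):
          """Rank-based inverse normal (rankit) transformation."""
          return stats.norm.ppf((stats.rankdata(x) - 0.5) / len(x))

      def compare(n=50, rho=0.5, reps=2000):
          results = {"Pearson": [], "Spearman": [], "rankit": []}
          for _ in range(reps):
              z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
              x, y = np.sinh(z[:, 0]), np.sinh(z[:, 1])           # heavy-tailed data
              results["Pearson"].append(stats.pearsonr(x, y)[0])
              results["Spearman"].append(stats.spearmanr(x, y)[0])
              results["rankit"].append(stats.pearsonr(rankit(x), rankit(y))[0])
          for name, vals in results.items():
              print(f"{name:8s} mean = {np.mean(vals):.3f}, sd = {np.std(vals):.3f}")

      compare()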

  11. Reducing Bias and Error in the Correlation Coefficient Due to Nonnormality

    PubMed Central

    Hittner, James B.

    2014-01-01

    It is more common for educational and psychological data to be nonnormal than to be approximately normal. This tendency may lead to bias and error in point estimates of the Pearson correlation coefficient. In a series of Monte Carlo simulations, the Pearson correlation was examined under conditions of normal and nonnormal data, and it was compared with its major alternatives, including the Spearman rank-order correlation, the bootstrap estimate, the Box–Cox transformation family, and a general normalizing transformation (i.e., rankit), as well as to various bias adjustments. Nonnormality caused the correlation coefficient to be inflated by up to +.14, particularly when the nonnormality involved heavy-tailed distributions. Traditional bias adjustments worsened this problem, further inflating the estimate. The Spearman and rankit correlations eliminated this inflation and provided conservative estimates. Rankit also minimized random error for most sample sizes, except for the smallest samples (n = 10), where bootstrapping was more effective. Overall, results justify the use of carefully chosen alternatives to the Pearson correlation when normality is violated. PMID:29795841

  12. [Theory, method and application of method R on estimation of (co)variance components].

    PubMed

    Liu, Wen-Zhong

    2004-07-01

    The theory, method and application of Method R for estimating (co)variance components were reviewed so that the method can be used appropriately. Estimation requires R values, which are regressions of predicted random effects calculated from the complete dataset on predicted random effects calculated from random subsets of the same data. By using a multivariate iteration algorithm based on a transformation matrix, combined with the preconditioned conjugate gradient method to solve the mixed model equations, the computational efficiency of Method R is much improved. Method R is computationally inexpensive, and the sampling errors and approximate credible intervals of estimates can be obtained. Disadvantages of Method R include a larger sampling variance than other methods on the same data, and biased estimates in small datasets. As an alternative method, Method R can be used for larger datasets. It is necessary to study its theoretical properties further and to broaden its range of application.

  13. Comparison of random regression models with Legendre polynomials and linear splines for production traits and somatic cell score of Canadian Holstein cows.

    PubMed

    Bohmanova, J; Miglior, F; Jamrozik, J; Misztal, I; Sullivan, P G

    2008-09-01

    A random regression model with both random and fixed regressions fitted by Legendre polynomials of order 4 was compared with 3 alternative models fitting linear splines with 4, 5, or 6 knots. The effects common for all models were a herd-test-date effect, fixed regressions on days in milk (DIM) nested within region-age-season of calving class, and random regressions for additive genetic and permanent environmental effects. Data were test-day milk, fat and protein yields, and SCS recorded from 5 to 365 DIM during the first 3 lactations of Canadian Holstein cows. A random sample of 50 herds consisting of 96,756 test-day records was generated to estimate variance components within a Bayesian framework via Gibbs sampling. Two sets of genetic evaluations were subsequently carried out to investigate performance of the 4 models. Models were compared by graphical inspection of variance functions, goodness of fit, error of prediction of breeding values, and stability of estimated breeding values. Models with splines gave lower estimates of variances at extremes of lactations than the model with Legendre polynomials. Differences among models in goodness of fit measured by percentages of squared bias, correlations between predicted and observed records, and residual variances were small. The deviance information criterion favored the spline model with 6 knots. Smaller error of prediction and higher stability of estimated breeding values were achieved by using spline models with 5 and 6 knots compared with the model with Legendre polynomials. In general, the spline model with 6 knots had the best overall performance based upon the considered model comparison criteria.

  14. CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

    PubMed

    Ma, Li; Fan, Suohai

    2017-03-14

    The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness against overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization. We propose the CURE-SMOTE algorithm for the imbalanced data classification problem. Experiments on imbalanced UCI data reveal that combining Clustering Using Representatives (CURE) with the original synthetic minority oversampling technique (SMOTE) is more effective than classification of the original data using random sampling, Borderline-SMOTE1, safe-level SMOTE, C-SMOTE, or k-means-SMOTE. Additionally, a hybrid RF (random forests) algorithm is proposed for feature selection and parameter optimization, which uses the minimum out-of-bag (OOB) error as its objective function. Simulation results on binary and higher-dimensional data indicate that the proposed hybrid RF algorithms, the hybrid genetic-random forests algorithm, the hybrid particle swarm-random forests algorithm and the hybrid fish swarm-random forests algorithm, can achieve the minimum OOB error and show the best generalization ability. The training set produced from the proposed CURE-SMOTE algorithm is closer to the original data distribution because it contains minimal noise. Thus, better classification results are produced from this feasible and effective algorithm. Moreover, the hybrid algorithms' F-value, G-mean, AUC and OOB scores surpass the performance of the original RF algorithm. Hence, this hybrid algorithm provides a new way to perform feature selection and parameter optimization.
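
    As a hedged illustration of the objective used by the hybrid algorithms above, the short Python sketch below selects random-forest hyperparameters by minimizing the out-of-bag (OOB) error on a synthetic imbalanced dataset; the candidate parameter grid and the dataset are assumptions, and no CURE-SMOTE preprocessing or genetic/swarm search is implemented.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier

      # Synthetic imbalanced data standing in for a UCI dataset.
      X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1],
                                 random_state=0)

      best = None
      for n_trees in (100, 300):
          for max_feats in ("sqrt", 0.5):
              rf = RandomForestClassifier(n_estimators=n_trees, max_features=max_feats,
                                          oob_score=True, bootstrap=True, random_state=0)
              rf.fit(X, y)
              oob_error = 1.0 - rf.oob_score_          # OOB error used as the objective
              if best is None or oob_error < best[0]:
                  best = (oob_error, n_trees, max_feats)

      print("min OOB error %.3f with n_estimators=%d, max_features=%s" % best)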

  15. Implications of clinical trial design on sample size requirements.

    PubMed

    Leon, Andrew C

    2008-07-01

    The primary goal in designing a randomized controlled clinical trial (RCT) is to minimize bias in the estimate of treatment effect. Randomized group assignment, double-blinded assessments, and control or comparison groups reduce the risk of bias. The design must also provide sufficient statistical power to detect a clinically meaningful treatment effect and maintain a nominal level of type I error. An attempt to integrate neurocognitive science into an RCT poses additional challenges. Two particularly relevant aspects of such a design often receive insufficient attention in an RCT. Multiple outcomes inflate type I error, and an unreliable assessment process introduces bias and reduces statistical power. Here we describe how both unreliability and multiple outcomes can increase the study costs and duration and reduce the feasibility of the study. The objective of this article is to consider strategies that overcome the problems of unreliability and multiplicity.

  16. Auditing the Assignments of Top-Level Semantic Types in the UMLS Semantic Network to UMLS Concepts

    PubMed Central

    He, Zhe; Perl, Yehoshua; Elhanan, Gai; Chen, Yan; Geller, James; Bian, Jiang

    2018-01-01

    The Unified Medical Language System (UMLS) is an important terminological system. By the policy of its curators, each concept of the UMLS should be assigned the most specific Semantic Types (STs) in the UMLS Semantic Network (SN). Hence, the Semantic Types of most UMLS concepts are assigned at or near the bottom (leaves) of the UMLS Semantic Network. While most ST assignments are correct, some errors do occur. Therefore, Quality Assurance efforts of UMLS curators for ST assignments should concentrate on automatically detected sets of UMLS concepts with higher error rates than random sets. In this paper, we investigate the assignments of top-level semantic types in the UMLS semantic network to concepts, identify potential erroneous assignments, define four categories of errors, and thus provide assistance to curators of the UMLS to avoid these assignment errors. Human experts analyzed samples of concepts assigned 10 of the top-level semantic types and categorized the erroneous ST assignments into these four logical categories. Two thirds of the concepts assigned these 10 top-level semantic types are erroneous. Our results demonstrate that reviewing top-level semantic type assignments to concepts provides an effective way for UMLS quality assurance, compared with reviewing a random selection of semantic type assignments. PMID:29375930

  17. Auditing the Assignments of Top-Level Semantic Types in the UMLS Semantic Network to UMLS Concepts.

    PubMed

    He, Zhe; Perl, Yehoshua; Elhanan, Gai; Chen, Yan; Geller, James; Bian, Jiang

    2017-11-01

    The Unified Medical Language System (UMLS) is an important terminological system. By the policy of its curators, each concept of the UMLS should be assigned the most specific Semantic Types (STs) in the UMLS Semantic Network (SN). Hence, the Semantic Types of most UMLS concepts are assigned at or near the bottom (leaves) of the UMLS Semantic Network. While most ST assignments are correct, some errors do occur. Therefore, Quality Assurance efforts of UMLS curators for ST assignments should concentrate on automatically detected sets of UMLS concepts with higher error rates than random sets. In this paper, we investigate the assignments of top-level semantic types in the UMLS semantic network to concepts, identify potential erroneous assignments, define four categories of errors, and thus provide assistance to curators of the UMLS to avoid these assignment errors. Human experts analyzed samples of concepts assigned 10 of the top-level semantic types and categorized the erroneous ST assignments into these four logical categories. Two thirds of the concepts assigned these 10 top-level semantic types are erroneous. Our results demonstrate that reviewing top-level semantic type assignments to concepts provides an effective way for UMLS quality assurance, compared with reviewing a random selection of semantic type assignments.

  18. The Performance of the Date-Randomization Test in Phylogenetic Analyses of Time-Structured Virus Data.

    PubMed

    Duchêne, Sebastián; Duchêne, David; Holmes, Edward C; Ho, Simon Y W

    2015-07-01

    Rates and timescales of viral evolution can be estimated using phylogenetic analyses of time-structured molecular sequences. This involves the use of molecular-clock methods, calibrated by the sampling times of the viral sequences. However, the spread of these sampling times is not always sufficient to allow the substitution rate to be estimated accurately. We conducted Bayesian phylogenetic analyses of simulated virus data to evaluate the performance of the date-randomization test, which is sometimes used to investigate whether time-structured data sets have temporal signal. An estimate of the substitution rate passes this test if its mean does not fall within the 95% credible intervals of rate estimates obtained using replicate data sets in which the sampling times have been randomized. We find that the test sometimes fails to detect rate estimates from data with no temporal signal. This error can be minimized by using a more conservative criterion, whereby the 95% credible interval of the estimate with correct sampling times should not overlap with those obtained with randomized sampling times. We also investigated the behavior of the test when the sampling times are not uniformly distributed throughout the tree, which sometimes occurs in empirical data sets. The test performs poorly in these circumstances, such that a modification to the randomization scheme is needed. Finally, we illustrate the behavior of the test in analyses of nucleotide sequences of cereal yellow dwarf virus. Our results validate the use of the date-randomization test and allow us to propose guidelines for interpretation of its results. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0) and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.

  20. On the asymptotic standard error of a class of robust estimators of ability in dichotomous item response models.

    PubMed

    Magis, David

    2014-11-01

    In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large-sample) standard errors (ASE) for these robust estimators, however, has not yet been fully considered. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASEs, are then compared in a simulation study by generating random guessing disturbances. It is concluded that both the estimators and their ASE perform similarly in the absence of random guessing, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of random guessing with large impact on the item response process. © 2013 The British Psychological Society.

  1. Statistical considerations for grain-size analyses of tills

    USGS Publications Warehouse

    Jacobs, A.M.

    1971-01-01

    Relative percentages of sand, silt, and clay from samples of the same till unit are not identical because of different lithologies in the source areas, sorting in transport, random variation, and experimental error. Random variation and experimental error can be isolated from the other two as follows. For each particle-size class of each till unit, a standard population is determined by using a normally distributed, representative group of data. New measurements are compared with the standard population and, if they compare satisfactorily, the experimental error is not significant and random variation is within the expected range for the population. The outcome of the comparison depends on numerical criteria derived from a graphical method rather than on a more commonly used one-way analysis of variance with two treatments. If the number of samples and the standard deviation of the standard population are substituted in a t-test equation, a family of hyperbolas is generated, each of which corresponds to a specific number of subsamples taken from each new sample. The axes of the graphs of the hyperbolas are the standard deviation of new measurements (horizontal axis) and the difference between the means of the new measurements and the standard population (vertical axis). The area between the two branches of each hyperbola corresponds to a satisfactory comparison between the new measurements and the standard population. Measurements from a new sample can be tested by plotting their standard deviation vs. difference in means on axes containing a hyperbola corresponding to the specific number of subsamples used. If the point lies between the branches of the hyperbola, the measurements are considered reliable. But if the point lies outside this region, the measurements are repeated. Because the critical segment of the hyperbola is approximately a straight line parallel to the horizontal axis, the test is simplified to a comparison between the means of the standard population and the means of the subsample. The minimum number of subsamples required to prove significant variation between samples caused by different lithologies in the source areas and sorting in transport can be determined directly from the graphical method. The minimum number of subsamples required is the maximum number to be run for economy of effort. © 1971 Plenum Publishing Corporation.
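
    The simplified mean comparison described at the end of the abstract can be sketched in Python as an ordinary one-sample t-test; the standard-population mean and the subsample values below are hypothetical.

      import numpy as np
      from scipy import stats

      mu_standard = 62.0                                    # assumed standard-population mean (% sand)
      subsample = np.array([60.5, 63.1, 58.9, 61.7, 64.0])  # hypothetical new measurements

      # The simplified criterion reduces to comparing the subsample mean with the
      # standard-population mean, done here as a one-sample t-test.
      t_stat, p_value = stats.ttest_1samp(subsample, popmean=mu_standard)
      print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
      if p_value > 0.05:
          print("Satisfactory: variation within expected random/experimental error.")
      else:
          print("Outside the expected range: repeat the measurements.")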

  2. A case study of the effects of random errors in rawinsonde data on computations of ageostrophic winds

    NASA Technical Reports Server (NTRS)

    Moore, J. T.

    1985-01-01

    Data input for the AVE-SESAME I experiment are utilized to describe the effects of random errors in rawinsonde data on the computation of ageostrophic winds. Computer-generated random errors for wind direction and speed and temperature are introduced into the station soundings at 25 mb intervals from which isentropic data sets are created. Except for the isallobaric and the local wind tendency, all winds are computed for Apr. 10, 1979 at 2000 GMT. Divergence fields reveal that the isallobaric and inertial-geostrophic-advective divergences are less affected by rawinsonde random errors than the divergence of the local wind tendency or inertial-advective winds.

  3. The preliminary development and testing of a global trigger tool to detect error and patient harm in primary-care records.

    PubMed

    de Wet, C; Bowie, P

    2009-04-01

    A multi-method strategy has been proposed to understand and improve the safety of primary care. The trigger tool is a relatively new method that has shown promise in American and secondary healthcare settings. It involves the focused review of a random sample of patient records using a series of "triggers" that alert reviewers to potential errors and previously undetected adverse events. To develop and test a global trigger tool to detect errors and adverse events in primary-care records. Trigger tool development was informed by previous research and content validated by expert opinion. The tool was applied by trained reviewers who worked in pairs to conduct focused audits of 100 randomly selected electronic patient records in each of five urban general practices in central Scotland. Review of 500 records revealed 2251 consultations and 730 triggers. An adverse event was found in 47 records (9.4%), indicating that harm occurred at a rate of one event per 48 consultations. Of these, 27 were judged to be preventable (42%). A further 17 records (3.4%) contained evidence of a potential adverse event. Harm severity was low to moderate for most patients (82.9%). Error and harm rates were higher in those aged ≥60 years, and most were medication-related (59%). The trigger tool was successful in identifying undetected patient harm in primary-care records and may be the most reliable method for achieving this. However, the feasibility of its routine application is open to question. The tool may have greater utility as a research rather than an audit technique. Further testing in larger, representative study samples is required.

  4. Accelerated 1 H MRSI using randomly undersampled spiral-based k-space trajectories.

    PubMed

    Chatnuntawech, Itthi; Gagoski, Borjan; Bilgic, Berkin; Cauley, Stephen F; Setsompop, Kawin; Adalsteinsson, Elfar

    2014-07-30

    To develop and evaluate the performance of an acquisition and reconstruction method for accelerated MR spectroscopic imaging (MRSI) through undersampling of spiral trajectories. A randomly undersampled spiral acquisition and sensitivity encoding (SENSE) with total variation (TV) regularization, random SENSE+TV, is developed and evaluated on single-slice numerical phantom, in vivo single-slice MRSI, and in vivo three-dimensional (3D)-MRSI at 3 Tesla. Random SENSE+TV was compared with five alternative methods for accelerated MRSI. For the in vivo single-slice MRSI, random SENSE+TV yields up to 2.7 and 2 times reduction in root-mean-square error (RMSE) of reconstructed N-acetyl aspartate (NAA), creatine, and choline maps, compared with the denoised fully sampled and uniformly undersampled SENSE+TV methods with the same acquisition time, respectively. For the in vivo 3D-MRSI, random SENSE+TV yields up to 1.6 times reduction in RMSE, compared with uniform SENSE+TV. Furthermore, by using random SENSE+TV, we have demonstrated on the in vivo single-slice and 3D-MRSI that acceleration factors of 4.5 and 4 are achievable with the same quality as the fully sampled data, as measured by RMSE of reconstructed NAA map, respectively. With the same scan time, random SENSE+TV yields lower RMSEs of metabolite maps than other methods evaluated. Random SENSE+TV achieves up to 4.5-fold acceleration with comparable data quality as the fully sampled acquisition. Magn Reson Med, 2014. © 2014 Wiley Periodicals, Inc.

  5. Network problem threshold

    NASA Technical Reports Server (NTRS)

Gejji, Raghvendra R.

    1992-01-01

    Network transmission errors such as collisions, CRC errors, misalignment, etc. are statistical in nature. Although errors can vary randomly, a high level of errors does indicate specific network problems, e.g. equipment failure. In this project, we have studied the random nature of collisions theoretically as well as by gathering statistics, and established a numerical threshold above which a network problem is indicated with high probability.
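
    A minimal Python sketch of such a threshold, assuming collision counts per monitoring interval are approximately Poisson with a known baseline rate (both the rate and the false-alarm probability below are illustrative):

      from scipy import stats

      baseline_rate = 12.0            # assumed mean collisions per interval on a healthy network
      alpha = 1e-3                    # tolerated false-alarm probability

      # Flag a problem only when the observed count exceeds a high quantile of the
      # baseline Poisson distribution, so random fluctuation rarely triggers an alarm.
      threshold = stats.poisson.ppf(1 - alpha, mu=baseline_rate)
      observed = 31                   # hypothetical count from the current interval

      print(f"threshold = {threshold:.0f} collisions/interval")
      if observed > threshold:
          print("High probability of a genuine network problem (e.g. failing equipment).")
      else:
          print("Count consistent with normal statistical fluctuation.")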

  6. Robust Least-Squares Support Vector Machine With Minimization of Mean and Variance of Modeling Error.

    PubMed

    Lu, Xinjiang; Liu, Wenbo; Zhou, Chuang; Huang, Minghui

    2017-06-13

    The least-squares support vector machine (LS-SVM) is a popular data-driven modeling method and has been successfully applied to a wide range of applications. However, it has some disadvantages, including being ineffective at handling non-Gaussian noise as well as being sensitive to outliers. In this paper, a robust LS-SVM method is proposed and is shown to have more reliable performance when modeling a nonlinear system under conditions where Gaussian or non-Gaussian noise is present. The construction of a new objective function allows for a reduction of the mean of the modeling error as well as the minimization of its variance, and it does not constrain the mean of the modeling error to zero. This differs from the traditional LS-SVM, which uses a worst-case scenario approach in order to minimize the modeling error and constrains the mean of the modeling error to zero. In doing so, the proposed method takes the modeling error distribution information into consideration and is thus less conservative and more robust in regards to random noise. A solving method is then developed in order to determine the optimal parameters for the proposed robust LS-SVM. An additional analysis indicates that the proposed LS-SVM gives a smaller weight to a large-error training sample and a larger weight to a small-error training sample, and is thus more robust than the traditional LS-SVM. The effectiveness of the proposed robust LS-SVM is demonstrated using both artificial and real life cases.

  7. Using multivariate generalizability theory to assess the effect of content stratification on the reliability of a performance assessment.

    PubMed

    Keller, Lisa A; Clauser, Brian E; Swanson, David B

    2010-12-01

    In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for lower reliability, and in particular, low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates of reliability may not be accurate. For tests built according to a table of specifications, tasks are randomly sampled from different strata (content domains, skill areas, etc.). If these strata remain fixed in the test construction process, ignoring this stratification in the reliability analysis results in an underestimate of "parallel forms" reliability, and an overestimate of the person-by-task component. This research explores the effect of representing and misrepresenting the stratification appropriately in estimation of reliability and the standard error of measurement. Both multivariate and univariate generalizability studies are reported. Results indicate that the proper specification of the analytic design is essential in yielding the proper information both about the generalizability of the assessment and the standard error of measurement. Further, illustrative D studies present the effect under a variety of situations and test designs. Additional benefits of multivariate generalizability theory in test design and evaluation are also discussed.

  8. SU-E-T-769: T-Test Based Prior Error Estimate and Stopping Criterion for Monte Carlo Dose Calculation in Proton Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, X; Gao, H; Schuemann, J

    2015-06-15

    Purpose: The Monte Carlo (MC) method is a gold standard for dose calculation in radiotherapy. However, it is not a priori clear how many particles need to be simulated to achieve a given dose accuracy. Prior error estimate and stopping criterion are not well established for MC. This work aims to fill this gap. Methods: Due to the statistical nature of MC, our approach is based on the one-sample t-test. We design the prior error estimate method based on the t-test, and then use this t-test based error estimate to develop a simulation stopping criterion. The three major components are as follows. First, the source particles are randomized in energy, space and angle, so that the dose deposition from a particle to the voxel is independent and identically distributed (i.i.d.). Second, a sample under consideration in the t-test is the mean value of dose deposition to the voxel by a sufficiently large number of source particles. Then, according to the central limit theorem, the sample, as the mean value of i.i.d. variables, is normally distributed with the expectation equal to the true deposited dose. Third, the t-test is performed with the null hypothesis that the difference between the sample expectation (the same as the true deposited dose) and the on-the-fly calculated mean sample dose from MC is larger than a given error threshold; in addition, users have the freedom to specify the confidence probability and region of interest in the t-test based stopping criterion. Results: The method is validated for proton dose calculation. The difference between the MC result based on the t-test prior error estimate and the statistical result obtained by repeating numerous MC simulations is within 1%. Conclusion: The t-test based prior error estimate and stopping criterion are developed for MC and validated for proton dose calculation. Xiang Hong and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000) and the Shanghai Pujiang Talent Program (#14PJ1404500).
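
    A minimal Python sketch of a t-test style stopping rule of this kind is given below; the batch size, tolerance and stand-in scoring function are assumptions, and the rule is expressed in the equivalent form of stopping once the confidence-interval half-width of the running mean falls below the error threshold.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def score_batch(n):
          """Stand-in for the dose deposited per source particle in one voxel."""
          return rng.exponential(scale=1.0, size=n)

      error_threshold = 0.01          # absolute tolerance on the mean dose
      confidence = 0.95
      batch, samples = 10_000, np.array([])

      while True:
          samples = np.concatenate([samples, score_batch(batch)])
          n = samples.size
          half_width = (stats.t.ppf(0.5 + confidence / 2, df=n - 1)
                        * samples.std(ddof=1) / np.sqrt(n))
          if half_width < error_threshold:    # stop once the CI half-width is small enough
              break

      print(f"stopped after {n} histories, mean = {samples.mean():.4f} +/- {half_width:.4f}")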

  9. Accelerated Brain DCE-MRI Using Iterative Reconstruction With Total Generalized Variation Penalty for Quantitative Pharmacokinetic Analysis: A Feasibility Study.

    PubMed

    Wang, Chunhao; Yin, Fang-Fang; Kirkpatrick, John P; Chang, Zheng

    2017-08-01

    To investigate the feasibility of using undersampled k-space data and an iterative image reconstruction method with total generalized variation penalty in the quantitative pharmacokinetic analysis for clinical brain dynamic contrast-enhanced magnetic resonance imaging. Eight brain dynamic contrast-enhanced magnetic resonance imaging scans were retrospectively studied. Two k-space sparse sampling strategies were designed to achieve a simulated image acquisition acceleration factor of 4. They are (1) a golden ratio-optimized 32-ray radial sampling profile and (2) a Cartesian-based random sampling profile with spatiotemporal-regularized sampling density constraints. The undersampled data were reconstructed to yield images using the investigated reconstruction technique. In quantitative pharmacokinetic analysis on a voxel-by-voxel basis, the rate constant Ktrans in the extended Tofts model and blood flow FB and blood volume VB from the 2-compartment exchange model were analyzed. Finally, the quantitative pharmacokinetic parameters calculated from the undersampled data were compared with the corresponding calculated values from the fully sampled data. To quantify each parameter's accuracy calculated using the undersampled data, error in volume mean, total relative error, and cross-correlation were calculated. The pharmacokinetic parameter maps generated from the undersampled data appeared comparable to the ones generated from the original full sampling data. Within the region of interest, most derived error in volume mean values were about 5% or lower, and the average error in volume mean of all parameter maps generated through either sampling strategy was about 3.54%. The average total relative error value of all parameter maps in the region of interest was about 0.115, and the average cross-correlation of all parameter maps in the region of interest was about 0.962. None of the investigated pharmacokinetic parameters differed significantly between the original data and the reduced sampling data. With sparsely sampled k-space data simulating an acquisition accelerated by a factor of 4, the investigated total generalized variation-based iterative image reconstruction method can accurately estimate the dynamic contrast-enhanced magnetic resonance imaging pharmacokinetic parameters for reliable clinical application.

  10. Predicting active-layer soil thickness using topographic variables at a small watershed scale

    PubMed Central

    Li, Aidi; Tan, Xing; Wu, Wei; Liu, Hongbin; Zhu, Jie

    2017-01-01

    Knowledge about the spatial distribution of active-layer (AL) soil thickness is indispensable for ecological modeling, precision agriculture, and land resource management. However, it is difficult to obtain the details on AL soil thickness by using conventional soil survey method. In this research, the objective is to investigate the possibility and accuracy of mapping the spatial distribution of AL soil thickness through random forest (RF) model by using terrain variables at a small watershed scale. A total of 1113 soil samples collected from the slope fields were randomly divided into calibration (770 soil samples) and validation (343 soil samples) sets. Seven terrain variables including elevation, aspect, relative slope position, valley depth, flow path length, slope height, and topographic wetness index were derived from a digital elevation map (30 m). The RF model was compared with multiple linear regression (MLR), geographically weighted regression (GWR) and support vector machines (SVM) approaches based on the validation set. Model performance was evaluated by precision criteria of mean error (ME), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). Comparative results showed that RF outperformed MLR, GWR and SVM models. The RF gave better values of ME (0.39 cm), MAE (7.09 cm), and RMSE (10.85 cm) and higher R2 (62%). The sensitivity analysis demonstrated that the DEM had less uncertainty than the AL soil thickness. The outcome of the RF model indicated that elevation, flow path length and valley depth were the most important factors affecting the AL soil thickness variability across the watershed. These results demonstrated the RF model is a promising method for predicting spatial distribution of AL soil thickness using terrain parameters. PMID:28877196
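
    A minimal Python sketch of this type of analysis, using scikit-learn's RandomForestRegressor on synthetic stand-ins for the seven terrain variables, is shown below; the data, effect sizes and split sizes are assumptions, not the study's data.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      n = 1113                                            # total soil samples, as in the abstract
      X = rng.normal(size=(n, 7))                         # stand-ins for the 7 terrain variables
      thickness = 60 + 12 * X[:, 0] - 8 * X[:, 4] + 5 * X[:, 3] + rng.normal(0, 10, n)

      X_cal, X_val, y_cal, y_val = train_test_split(X, thickness, test_size=343, random_state=0)
      rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_cal, y_cal)
      pred = rf.predict(X_val)

      me = np.mean(pred - y_val)                          # mean error
      mae = mean_absolute_error(y_val, pred)
      rmse = np.sqrt(mean_squared_error(y_val, pred))
      r2 = r2_score(y_val, pred)
      print(f"ME = {me:.2f} cm, MAE = {mae:.2f} cm, RMSE = {rmse:.2f} cm, R2 = {r2:.2f}")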

  11. Spatial Variation of Soil Lead in an Urban Community Garden: Implications for Risk-Based Sampling.

    PubMed

    Bugdalski, Lauren; Lemke, Lawrence D; McElmurry, Shawn P

    2014-01-01

    Soil lead pollution is a recalcitrant problem in urban areas resulting from a combination of historical residential, industrial, and transportation practices. The emergence of urban gardening movements in postindustrial cities necessitates accurate assessment of soil lead levels to ensure safe gardening. In this study, we examined small-scale spatial variability of soil lead within a 15 × 30 m urban garden plot established on two adjacent residential lots located in Detroit, Michigan, USA. Eighty samples collected using a variably spaced sampling grid were analyzed for total, fine fraction (less than 250 μm), and bioaccessible soil lead. Measured concentrations varied at sampling scales of 1-10 m and a hot spot exceeding 400 ppm total soil lead was identified in the northwest portion of the site. An interpolated map of total lead was treated as an exhaustive data set, and random sampling was simulated to generate Monte Carlo distributions and evaluate alternative sampling strategies intended to estimate the average soil lead concentration or detect hot spots. Increasing the number of individual samples decreases the probability of overlooking the hot spot (type II error). However, the practice of compositing and averaging samples decreased the probability of overestimating the mean concentration (type I error) at the expense of increasing the chance for type II error. The results reported here suggest a need to reconsider U.S. Environmental Protection Agency sampling objectives and consequent guidelines for reclaimed city lots where soil lead distributions are expected to be nonuniform. © 2013 Society for Risk Analysis.
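
    The Monte Carlo evaluation of sampling strategies can be sketched in Python as follows; the synthetic lead surface, hot-spot location, sample sizes and the 400 ppm action level are assumptions used only to illustrate the trade-off between missing the hot spot and misjudging the site mean.

      import numpy as np

      rng = np.random.default_rng(0)

      # Synthetic 1-m grid of total soil lead (ppm) with one hot spot above 400 ppm.
      field = rng.lognormal(mean=np.log(150), sigma=0.4, size=(30, 15))
      field[22:27, 2:6] = rng.lognormal(np.log(600), 0.3, size=(5, 4))
      flat = field.ravel()

      def error_rates(n_samples=10, composite_size=1, reps=5000, action_level=400.0):
          miss_hotspot = exceed_mean = 0
          for _ in range(reps):
              draws = rng.choice(flat, size=n_samples * composite_size, replace=False)
              composites = draws.reshape(n_samples, composite_size).mean(axis=1)
              if composites.max() < action_level:
                  miss_hotspot += 1        # hot spot overlooked (type II error)
              if composites.mean() > action_level:
                  exceed_mean += 1         # site mean wrongly judged above the action level
          return miss_hotspot / reps, exceed_mean / reps

      for k in (1, 4):                     # discrete samples vs 4-point composites
          p_miss, p_exceed = error_rates(composite_size=k)
          print(f"composite size {k}: P(miss hot spot) = {p_miss:.2f}, P(mean > 400) = {p_exceed:.2f}")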

  12. Learning Bayesian Networks from Correlated Data

    NASA Astrophysics Data System (ADS)

    Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola

    2016-05-01

    Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.

  13. Verification of unfold error estimates in the unfold operator code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fehl, D.L.; Biggs, F.

Spectral unfolding is an inverse mathematical operation that attempts to obtain spectral source information from a set of response functions and data measurements. Several unfold algorithms have appeared over the past 30 years; among them is the unfold operator (UFO) code written at Sandia National Laboratories. In addition to an unfolded spectrum, the UFO code also estimates the unfold uncertainty (error) induced by estimated random uncertainties in the data. In UFO the unfold uncertainty is obtained from the error matrix. This built-in estimate has now been compared to error estimates obtained by running the code in a Monte Carlo fashion with prescribed data distributions (Gaussian deviates). In the test problem studied, data were simulated from an arbitrarily chosen blackbody spectrum (10 keV) and a set of overlapping response functions. The data were assumed to have an imprecision of 5% (standard deviation). One hundred random data sets were generated. The built-in estimate of unfold uncertainty agreed with the Monte Carlo estimate to within the statistical resolution of this relatively small sample size (95% confidence level). A possible 10% bias between the two methods was unresolved. The Monte Carlo technique is also useful in underdetermined problems, for which the error matrix method does not apply. UFO has been applied to the diagnosis of low energy x rays emitted by Z-pinch and ion-beam driven hohlraums. © 1997 American Institute of Physics.

  14. Simulation of wave propagation in three-dimensional random media

    NASA Astrophysics Data System (ADS)

    Coles, Wm. A.; Filice, J. P.; Frehlich, R. G.; Yadlowsky, M.

    1995-04-01

    Quantitative error analyses for the simulation of wave propagation in three-dimensional random media, when narrow angular scattering is assumed, are presented for plane-wave and spherical-wave geometry. This includes the errors that result from finite grid size, finite simulation dimensions, and the separation of the two-dimensional screens along the propagation direction. Simple error scalings are determined for power-law spectra of the random refractive indices of the media. The effects of a finite inner scale are also considered. The spatial spectra of the intensity errors are calculated and compared with the spatial spectra of

  15. Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys.

    PubMed

    Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R

    2017-09-14

    While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. © Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
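
    A worked Python sketch of the multiplier estimate N = M / P with an approximate, design-effect-adjusted standard error for P is shown below; it treats M as known without error, and all numerical inputs are hypothetical.

      import numpy as np

      M = 4000          # unique objects distributed (assumed known without error)
      n = 500           # respondent-driven sampling survey size
      p_hat = 0.25      # proportion of survey respondents reporting receipt
      deff = 2.0        # assumed design effect of the RDS survey

      N_hat = M / p_hat
      var_p = deff * p_hat * (1 - p_hat) / n          # design-adjusted variance of p_hat
      se_N = M * np.sqrt(var_p) / p_hat ** 2          # delta method: |dN/dp| * se(p)
      ci = (N_hat - 1.96 * se_N, N_hat + 1.96 * se_N)

      print(f"population size estimate = {N_hat:.0f}, 95% CI approx ({ci[0]:.0f}, {ci[1]:.0f})")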

  16. Scattering from binary optics

    NASA Technical Reports Server (NTRS)

    Ricks, Douglas W.

    1993-01-01

    There are a number of sources of scattering in binary optics: etch depth errors, line edge errors, quantization errors, roughness, and the binary approximation to the ideal surface. These sources of scattering can be systematic (deterministic) or random. In this paper, scattering formulas for both systematic and random errors are derived using Fourier optics. These formulas can be used to explain the results of scattering measurements and computer simulations.

  17. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments

    PubMed Central

    2013-01-01

    Background Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. Results To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. Conclusions We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs. PMID:24160725

  18. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments.

    PubMed

    Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello

    2013-10-26

    Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.
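
    A minimal simulation sketch of the beta-binomial reasoning behind the cluster-LQAS framework described above is given below; the cluster counts, decision rule, quality thresholds and intracluster correlation are illustrative assumptions, not the values used for the Rwanda assessments.

      import numpy as np
      from scipy.stats import betabinom

      rng = np.random.default_rng(0)

      def risks(m_clusters=10, k_per_cluster=6, d=45, p_hi=0.85, p_lo=0.60, icc=0.1, reps=100_000):
          """Misclassification risks for the rule: accept if total successes >= d."""
          def total_successes(p):
              a = p * (1 - icc) / icc                 # beta-binomial parameters from mean and ICC
              b = (1 - p) * (1 - icc) / icc
              counts = betabinom.rvs(k_per_cluster, a, b,
                                     size=(reps, m_clusters), random_state=rng)
              return counts.sum(axis=1)
          alpha_risk = np.mean(total_successes(p_hi) < d)   # reject a truly acceptable program
          beta_risk = np.mean(total_successes(p_lo) >= d)   # accept a truly poor program
          return alpha_risk, beta_risk

      print("alpha risk = %.3f, beta risk = %.3f" % risks())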

  19. Generalized optimal design for two-arm, randomized phase II clinical trials with endpoints from the exponential dispersion family.

    PubMed

    Jiang, Wei; Mahnken, Jonathan D; He, Jianghua; Mayo, Matthew S

    2016-11-01

    For two-arm randomized phase II clinical trials, previous literature proposed an optimal design that minimizes the total sample sizes subject to multiple constraints on the standard errors of the estimated event rates and their difference. The original design is limited to trials with dichotomous endpoints. This paper extends the original approach to be applicable to phase II clinical trials with endpoints from the exponential dispersion family distributions. The proposed optimal design minimizes the total sample sizes needed to provide estimates of population means of both arms and their difference with pre-specified precision. Its applications on data from specific distribution families are discussed under multiple design considerations. Copyright © 2016 John Wiley & Sons, Ltd.

  20. The effects of recall errors and of selection bias in epidemiologic studies of mobile phone use and cancer risk.

    PubMed

    Vrijheid, Martine; Deltour, Isabelle; Krewski, Daniel; Sanchez, Marie; Cardis, Elisabeth

    2006-07-01

    This paper examines the effects of systematic and random errors in recall and of selection bias in case-control studies of mobile phone use and cancer. These sensitivity analyses are based on Monte-Carlo computer simulations and were carried out within the INTERPHONE Study, an international collaborative case-control study in 13 countries. Recall error scenarios simulated plausible values of random and systematic, non-differential and differential recall errors in amount of mobile phone use reported by study subjects. Plausible values for the recall error were obtained from validation studies. Selection bias scenarios assumed varying selection probabilities for cases and controls, mobile phone users, and non-users. Where possible these selection probabilities were based on existing information from non-respondents in INTERPHONE. Simulations used exposure distributions based on existing INTERPHONE data and assumed varying levels of the true risk of brain cancer related to mobile phone use. Results suggest that random recall errors of plausible levels can lead to a large underestimation in the risk of brain cancer associated with mobile phone use. Random errors were found to have larger impact than plausible systematic errors. Differential errors in recall had very little additional impact in the presence of large random errors. Selection bias resulting from underselection of unexposed controls led to J-shaped exposure-response patterns, with risk apparently decreasing at low to moderate exposure levels. The present results, in conjunction with those of the validation studies conducted within the INTERPHONE study, will play an important role in the interpretation of existing and future case-control studies of mobile phone use and cancer risk, including the INTERPHONE study.
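
    The attenuating effect of non-differential random recall error can be sketched with a small simulation; in the Python example below the exposure distribution, the true effect size and the recall-error variance are assumptions, not the INTERPHONE values.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      n = 20_000

      true_use = rng.lognormal(mean=0.0, sigma=1.0, size=n)        # true call time, hours/week
      logit = -2.0 + 0.3 * np.log(true_use + 1)                    # assumed true exposure effect
      y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      recalled = true_use * rng.lognormal(0.0, 0.8, size=n)        # non-differential random recall error

      def odds_ratio(exposure):
          X = sm.add_constant(np.log(exposure + 1))
          fit = sm.Logit(y, X).fit(disp=0)
          return np.exp(fit.params[1])

      print(f"OR per log-hour, true exposure:     {odds_ratio(true_use):.2f}")
      print(f"OR per log-hour, recalled exposure: {odds_ratio(recalled):.2f}")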

  1. Development and Evaluation of Algorithms for Breath Alcohol Screening.

    PubMed

    Ljungblad, Jonas; Hök, Bertil; Ekström, Mikael

    2016-04-01

    Breath alcohol screening is important for traffic safety, access control and other areas of health promotion. A family of sensor devices useful for these purposes is being developed and evaluated. This paper focuses on algorithms for the determination of breath alcohol concentration in diluted breath samples using carbon dioxide to compensate for the dilution. The examined algorithms make use of signal averaging, weighting and personalization to reduce estimation errors. Evaluation was performed using data from a previously conducted human study. It is concluded that these features in combination will significantly reduce the random error compared to the signal averaging algorithm taken alone.
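
    A minimal Python sketch of the dilution-compensation and signal-averaging idea is shown below; the nominal alveolar CO2 level and the readings are assumptions for illustration only, not the algorithms evaluated in the study.

      import numpy as np

      ALVEOLAR_CO2 = 4.8                                   # assumed end-expiratory CO2, vol%

      def estimate_brac(alcohol_readings, co2_readings):
          alcohol = np.asarray(alcohol_readings, dtype=float)
          co2 = np.asarray(co2_readings, dtype=float)
          compensated = alcohol * (ALVEOLAR_CO2 / co2)     # undo dilution reading by reading
          return compensated.mean()                        # signal averaging reduces random error

      # Hypothetical diluted alcohol readings (mg/L) and measured CO2 (vol%).
      brac = estimate_brac([0.052, 0.047, 0.050], [1.10, 1.00, 1.05])
      print(f"estimated breath alcohol = {brac:.3f} mg/L")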

  2. Generating equilateral random polygons in confinement

    NASA Astrophysics Data System (ADS)

    Diao, Y.; Ernst, C.; Montemayor, A.; Ziegler, U.

    2011-10-01

    One challenging problem in biology is to understand the mechanism of DNA packing in a confined volume such as a cell. It is known that confined circular DNA is often knotted and hence the topology of the extracted (and relaxed) circular DNA can be used as a probe of the DNA packing mechanism. However, in order to properly estimate the topological properties of the confined circular DNA structures using mathematical models, it is necessary to generate large ensembles of simulated closed chains (i.e. polygons) of equal edge lengths that are confined in a volume such as a sphere of certain fixed radius. Finding efficient algorithms that properly sample the space of such confined equilateral random polygons is a difficult problem. In this paper, we propose a method that generates confined equilateral random polygons based on their probability distribution. This method requires the creation of a large database initially. However, once the database has been created, a confined equilateral random polygon of length n can be generated in linear time in terms of n. The errors introduced by the method can be controlled and reduced by the refinement of the database. Furthermore, our numerical simulations indicate that these errors are unbiased and tend to cancel each other in a long polygon.

  3. Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.

    PubMed

    Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N

    2017-06-21

    Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k-Nearest Neighbors (k-NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction, classifier performance (least to greatest error) was ranked as follows: SVM, Random Forest, Naïve Bayes, sPLS-DA, Neural Networks, PLS-DA and k-NN classifiers. When non-normal error distributions were introduced, the performance of PLS-DA and k-NN classifiers deteriorated further relative to the remaining techniques. Over the real datasets, a trend of better performance of the SVM and Random Forest classifiers was observed.

  4. Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates.

    PubMed

    Fottrell, Edward; Byass, Peter; Berhane, Yemane

    2008-03-25

    As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what quality control procedures are in place. The extent to which random errors in large, longitudinal datasets affect overall health and demographic profiles has important implications for the role of DSSs as platforms for public health research and clinical trials. Such knowledge is also of particular importance if the outputs of DSSs are to be extrapolated and aggregated with realistic margins of error and validity. This study uses the first 10-year dataset from the Butajira Rural Health Project (BRHP) DSS, Ethiopia, covering approximately 336,000 person-years of data. Simple programmes were written to introduce random errors and omissions into new versions of the definitive 10-year Butajira dataset. Key parameters of sex, age, death, literacy and roof material (an indicator of poverty) were selected for the introduction of errors based on their obvious importance in demographic and health surveillance and their established significant associations with mortality. Defining the original 10-year dataset as the 'gold standard' for the purposes of this investigation, population, age and sex compositions and Poisson regression models of mortality rate ratios were compared between each of the intentionally erroneous datasets and the original 'gold standard' 10-year data. The composition of the Butajira population was well represented despite introducing random errors, and differences between population pyramids based on the derived datasets were subtle. Regression analyses of well-established mortality risk factors were largely unaffected even by relatively high levels of random errors in the data. The low sensitivity of parameter estimates and regression analyses to significant amounts of randomly introduced errors indicates a high level of robustness of the dataset. This apparent inertia of population parameter estimates to simulated errors is largely due to the size of the dataset. Tolerable margins of random error in DSS data may exceed 20%. While this is not an argument in favour of poor quality data, reducing the time and valuable resources spent on detecting and correcting random errors in routine DSS operations may be justifiable as the returns from such procedures diminish with increasing overall accuracy. The money and effort currently spent on endlessly correcting DSS datasets would perhaps be better spent on increasing the surveillance population size and geographic spread of DSSs and analysing and disseminating research findings.
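
    The style of robustness check described here can be sketched in Python with a small simulation: randomly corrupt a fraction of a binary covariate and re-estimate a crude mortality rate ratio. The population size, mortality rates and error fractions below are assumptions, not the Butajira values.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 300_000                                         # person-year records
      literate = rng.binomial(1, 0.4, n)
      rate = np.where(literate == 1, 0.008, 0.015)        # assumed mortality rates by literacy
      died = rng.binomial(1, rate)

      def rate_ratio(covariate):
          """Crude mortality rate ratio, non-literate versus literate."""
          return died[covariate == 0].mean() / died[covariate == 1].mean()

      for error_frac in (0.0, 0.05, 0.20):
          corrupted = literate.copy()
          flip = rng.random(n) < error_frac               # randomly mis-recorded entries
          corrupted[flip] = 1 - corrupted[flip]
          print(f"{error_frac:4.0%} random error -> rate ratio {rate_ratio(corrupted):.2f}")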

  5. Ensemble Bayesian forecasting system Part I: Theory and algorithms

    NASA Astrophysics Data System (ADS)

    Herr, Henry D.; Krzysztofowicz, Roman

    2015-05-01

    The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of predictand, possesses a Bayesian coherence property, constitutes a random sample of the predictand, and has an acceptable sampling error, which makes it suitable for rational decision making under uncertainty.
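
    The Monte Carlo pipeline can be caricatured in a few lines. Everything here is a stand-in: a gamma input ensemble for the IEF, a linear transform for the hydrologic model, and additive Gaussian noise for the meta-Gaussian HUP; the EBFSR step is the r-fold randomization per model run.

    ```python
    # Schematic EBFS/EBFSR sketch under toy assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    def hydrologic_model(precip):
        """Placeholder deterministic rainfall-runoff transform."""
        return 2.0 + 0.8 * precip

    inputs = rng.gamma(shape=2.0, scale=5.0, size=50)   # IEF: input ensemble
    outputs = hydrologic_model(inputs)

    # EBFSR-style auxiliary randomization: r predictand members per model
    # run, so 50 runs yield a 500-member Bayesian ensemble.
    r = 10
    predictands = (outputs[:, None]
                   + rng.normal(scale=1.5, size=(outputs.size, r))).ravel()
    print(f"{predictands.size} members, 90% interval:",
          np.round(np.percentile(predictands, [5, 95]), 2))
    ```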

  6. ELLIPTICAL WEIGHTED HOLICs FOR WEAK LENSING SHEAR MEASUREMENT. III. THE EFFECT OF RANDOM COUNT NOISE ON IMAGE MOMENTS IN WEAK LENSING ANALYSIS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Okura, Yuki; Futamase, Toshifumi, E-mail: yuki.okura@nao.ac.jp, E-mail: tof@astr.tohoku.ac.jp

    This is the third paper on the improvement of systematic errors in weak lensing analysis using an elliptical weight function, referred to as E-HOLICs. In previous papers, we succeeded in avoiding errors that depend on the ellipticity of the background image. In this paper, we investigate the systematic error that depends on the signal-to-noise ratio of the background image. We find that the origin of this error is the random count noise that comes from the Poisson noise of sky counts. The random count noise introduces additional moments and a centroid-shift error; those first-order effects are canceled in averaging, but the second-order effects are not canceled. We derive formulae that correct this systematic error due to the random count noise in measuring the moments and ellipticity of the background image. The correction formulae obtained are expressed as combinations of complex moments of the image, and thus can correct the systematic errors caused by each object. We test their validity using a simulated image and find that the systematic error becomes less than 1% in the measured ellipticity for objects with an IMCAT significance threshold of ν ≈ 11.7.

  7. Random Forest-Based Recognition of Isolated Sign Language Subwords Using Data from Accelerometers and Surface Electromyographic Sensors.

    PubMed

    Su, Ruiliang; Chen, Xiang; Cao, Shuai; Zhang, Xu

    2016-01-14

    Sign language recognition (SLR) has been widely used for communication amongst the hearing-impaired and non-verbal community. This paper proposes an accurate and robust SLR framework using an improved decision tree as the base classifier of random forests. This framework was used to recognize Chinese sign language subwords using recordings from a pair of portable devices worn on both arms consisting of accelerometers (ACC) and surface electromyography (sEMG) sensors. The experimental results demonstrated the validity of the proposed random forest-based method for recognition of Chinese sign language (CSL) subwords. With the proposed method, 98.25% average accuracy was obtained for the classification of a list of 121 frequently used CSL subwords. Moreover, the random forests method demonstrated a superior performance in resisting the impact of bad training samples. When the proportion of bad samples in the training set reached 50%, the recognition error rate of the random forest-based method was only 10.67%, while that of a single decision tree adopted in our previous work was almost 27.5%. Our study offers a practical way of realizing a robust and wearable EMG-ACC-based SLR system.
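
    The label-noise robustness experiment translates directly into scikit-learn, assuming generic simulated features in place of the ACC/sEMG recordings; the forest and tree settings below are illustrative.

    ```python
    # Corrupt a growing share of training labels and compare a random
    # forest against a single decision tree (simulated data, not CSL).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=40,
                               n_informative=20, n_classes=4,
                               n_clusters_per_class=1, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    rng = np.random.default_rng(0)
    for bad_frac in [0.0, 0.25, 0.5]:
        y_bad = ytr.copy()
        idx = rng.random(len(y_bad)) < bad_frac      # "bad" training samples
        y_bad[idx] = rng.integers(0, 4, idx.sum())   # random wrong labels
        for name, clf in [("forest", RandomForestClassifier(n_estimators=200,
                                                            random_state=0)),
                          ("single tree", DecisionTreeClassifier(random_state=0))]:
            err = 1 - clf.fit(Xtr, y_bad).score(Xte, yte)
            print(f"bad={bad_frac:.0%} {name:11s} test error={err:.3f}")
    ```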

  8. Measurement of breast-tissue x-ray attenuation by spectral mammography: solid lesions

    NASA Astrophysics Data System (ADS)

    Fredenberg, Erik; Kilburn-Toppin, Fleur; Willsher, Paula; Moa, Elin; Danielsson, Mats; Dance, David R.; Young, Kenneth C.; Wallis, Matthew G.

    2016-04-01

    Knowledge of x-ray attenuation is essential for developing and evaluating x-ray imaging technologies. For instance, techniques to distinguish between cysts and solid tumours at mammography screening would be highly desirable to reduce recalls, but the development requires knowledge of the x-ray attenuation for cysts and tumours. We have previously measured the attenuation of cyst fluid using photon-counting spectral mammography. Data on x-ray attenuation for solid breast lesions are available in the literature, but cover a relatively wide range, likely caused by natural spread between samples, random measurement errors, and different experimental conditions. In this study, we have adapted a previously developed spectral method to measure the linear attenuation of solid breast lesions. A total of 56 malignant and 5 benign lesions were included in the study. The samples were placed in a holder that allowed for thickness measurement. Spectral (energy-resolved) images of the samples were acquired and the image signal was mapped to equivalent thicknesses of two known reference materials, which can be used to derive the x-ray attenuation as a function of energy. The spread in equivalent material thicknesses was relatively large between samples, which is likely to be caused mainly by natural variation and only to a minor extent by random measurement errors and sample inhomogeneity. No significant difference in attenuation was found between benign and malignant solid lesions. The separation between cyst-fluid and tumour attenuation was, however, significant, which suggests it may be possible to distinguish cystic from solid breast lesions, and the results lay the groundwork for a clinical trial. In addition, the study adds a relatively large sample set to the published data and may contribute to a reduction in the overall uncertainty in the literature.

  9. Study on the algorithm of computational ghost imaging based on discrete fourier transform measurement matrix

    NASA Astrophysics Data System (ADS)

    Zhang, Leihong; Liang, Dong; Li, Bei; Kang, Yi; Pan, Zilan; Zhang, Dawei; Gao, Xiumin; Ma, Xiuhua

    2016-07-01

    Building on an analysis of a cosine light field with a known analytic expression and on the pseudo-inverse method, the object is illuminated by a preset light field defined by a discrete Fourier transform (DFT) measurement matrix, and the object image is reconstructed by the pseudo-inverse method. The analytic expression of the algorithm of computational ghost imaging based on a DFT measurement matrix (FGI) is deduced theoretically and compared with the algorithm of compressive computational ghost imaging based on a random measurement matrix (PGI) and with the traditional computational ghost imaging (CGI) algorithm; the reconstruction process and the reconstruction error are analyzed, and simulations are performed to verify the theoretical analysis. When the number of sampling measurements is close to the number of object pixels, the rank of the DFT matrix equals that of the random measurement matrix, the PSNR of images reconstructed by the FGI and PGI algorithms is similar, and the reconstruction error of the traditional CGI algorithm is lower than that of the FGI and PGI algorithms. As the number of sampling measurements decreases, the PSNR of FGI reconstructions decreases slowly, whereas the PSNR of PGI and CGI reconstructions decreases sharply. The reconstruction time of the FGI algorithm is lower than that of the other algorithms and is not affected by the number of sampling measurements. The FGI algorithm can effectively filter out random white noise through a low-pass filter, achieving a higher denoising capability than the CGI algorithm. Overall, the FGI algorithm improves both the reconstruction accuracy and the reconstruction speed of computational ghost imaging.
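
    A numpy sketch of the core reconstruction step, under toy assumptions (a 1-D object, a plain DFT measurement matrix, additive Gaussian noise); the optical bucket-detection physics is not modeled.

    ```python
    # Measure an object with rows of a DFT matrix, recover it with the
    # pseudo-inverse, and compare full vs reduced sampling.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 64
    x = np.zeros(n); x[20:28] = 1.0                  # simple 1-D "object"

    k = np.arange(n)
    F = np.exp(-2j * np.pi * np.outer(k, k) / n)     # DFT measurement matrix

    for m in [n, n // 2]:                            # full vs reduced sampling
        A = F[:m]                                    # m illumination patterns
        y = A @ x + 0.01 * rng.standard_normal(m)    # noisy bucket signals
        x_hat = np.real(np.linalg.pinv(A) @ y)       # pseudo-inverse recovery
        err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
        print(f"m={m:3d} measurements: relative error = {err:.3f}")
    ```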

  10. North American vegetation model for land-use planning in a changing climate: A solution to large classification problems

    Treesearch

    Gerald E. Rehfeldt; Nicholas L. Crookston; Cuauhtemoc Saenz-Romero; Elizabeth M. Campbell

    2012-01-01

    Data points intensively sampling 46 North American biomes were used to predict the geographic distribution of biomes from climate variables using the Random Forests classification tree. Techniques were incorporated to accommodate a large number of classes and to predict the future occurrence of climates beyond the contemporary climatic range of the biomes. Errors of...

  11. A comparative study of clock rate and drift estimation

    NASA Technical Reports Server (NTRS)

    Breakiron, Lee A.

    1994-01-01

    Five different methods of drift determination and four different methods of rate determination were compared using months of hourly phase and frequency data from a sample of cesium clocks and active hydrogen masers. Linear least squares on frequency is selected as the optimal method of determining both drift and rate, more on the basis of parameter parsimony and confidence measures than on random and systematic errors.

  12. The Psychological Effect of Errors in Standardized Language Test Items on EFL Students' Responses to the Following Item

    ERIC Educational Resources Information Center

    Khaksefidi, Saman

    2017-01-01

    This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students selected through stratified random sampling were given 15 questions of a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…

  13. Can a combination of average of normals and "real time" External Quality Assurance replace Internal Quality Control?

    PubMed

    Badrick, Tony; Graham, Peter

    2018-03-28

    Internal Quality Control (IQC) and External Quality Assurance (EQA) are separate but related processes that have developed independently in laboratory medicine over many years. They have different sample frequencies, statistical interpretations and immediacy. Both processes have evolved, absorbing new understandings of the concept of laboratory error, sample material matrix and assay capability. However, at the coalface, we do not believe that either process has led to much improvement in patient outcomes recently. It is the increasing reliability and automation of analytical platforms, along with the improved stability of reagents, that has reduced systematic and random error, which in turn has minimised the risk of running IQC less frequently. We suggest that it is time to rethink the role of both these processes and unite them into a single approach using an Average of Normals model supported by more frequent External Quality Assurance samples. This new paradigm may lead to less confusion for laboratory staff and quicker responses to, and identification of, out-of-control situations.

  14. Regression dilution bias: tools for correction methods and sample size calculation.

    PubMed

    Berglund, Lars

    2012-08-01

    Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
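
    The slope correction itself is one line once the reliability ratio is estimated from duplicate measurements. A hedged sketch with simulated data (not the supplied software tools or the insulin-clamp example):

    ```python
    # The observed slope is attenuated by the reliability ratio
    # lambda = var(true) / var(observed); divide it out to correct.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    x_true = rng.normal(size=n)                       # true risk factor
    y = 0.5 * x_true + rng.normal(scale=0.5, size=n)
    x_obs = x_true + rng.normal(scale=0.8, size=n)    # one noisy measurement

    # Naive slope from the error-prone x (biased toward zero).
    b_naive = np.polyfit(x_obs, y, 1)[0]

    # Reliability study: duplicate measurements on a subsample. The
    # covariance of the duplicates estimates var(true).
    x1 = x_true[:100] + rng.normal(scale=0.8, size=100)
    x2 = x_true[:100] + rng.normal(scale=0.8, size=100)
    lam = np.cov(x1, x2)[0, 1] / (0.5 * (x1.var(ddof=1) + x2.var(ddof=1)))

    b_corrected = b_naive / lam
    print(f"naive slope {b_naive:.3f}, corrected {b_corrected:.3f} (true 0.5)")
    ```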

  15. Identifying sensitive areas of adaptive observations for prediction of the Kuroshio large meander using a shallow-water model

    NASA Astrophysics Data System (ADS)

    Zou, Guang'an; Wang, Qiang; Mu, Mu

    2016-09-01

    Sensitive areas for prediction of the Kuroshio large meander using a 1.5-layer, shallow-water ocean model were investigated using the conditional nonlinear optimal perturbation (CNOP) and first singular vector (FSV) methods. A series of sensitivity experiments were designed to test the sensitivity of sensitive areas within the numerical model. The following results were obtained: (1) the effect of initial CNOP and FSV patterns in their sensitive areas is greater than that of the same patterns in randomly selected areas, with the effect of the initial CNOP patterns in CNOP sensitive areas being the greatest; (2) both CNOP- and FSV-type initial errors grow more quickly than random errors; (3) the effect of random errors superimposed on the sensitive areas is greater than that of random errors introduced into randomly selected areas, and initial errors in the CNOP sensitive areas have greater effects on final forecasts. These results reveal that the sensitive areas determined using the CNOP are more sensitive than those of FSV and other randomly selected areas. In addition, ideal hindcasting experiments were conducted to examine the validity of the sensitive areas. The results indicate that reduction (or elimination) of CNOP-type errors in CNOP sensitive areas at the initial time has a greater forecast benefit than the reduction (or elimination) of FSV-type errors in FSV sensitive areas. These results suggest that the CNOP method is suitable for determining sensitive areas in the prediction of the Kuroshio large-meander path.

  16. Portable and Error-Free DNA-Based Data Storage.

    PubMed

    Yazdi, S M Hossein Tabatabaei; Gabrys, Ryan; Milenkovic, Olgica

    2017-07-10

    DNA-based data storage is an emerging nonvolatile memory technology of potentially unprecedented density, durability, and replication efficiency. The basic system implementation steps include synthesizing DNA strings that contain user information and subsequently retrieving them via high-throughput sequencing technologies. Existing architectures enable reading and writing but do not offer random-access and error-free data recovery from low-cost, portable devices, which is crucial for making the storage technology competitive with classical recorders. Here we show for the first time that a portable, random-access platform may be implemented in practice using nanopore sequencers. The novelty of our approach is to design an integrated processing pipeline that encodes data to avoid costly synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable sequencing via new iterative alignment and deletion error-correcting codes. Our work represents the only known random access DNA-based data storage system that uses error-prone nanopore sequencers, while still producing error-free readouts with the highest reported information rate/density. As such, it represents a crucial step towards practical employment of DNA molecules as storage media.

  17. The generalization ability of online SVM classification based on Markov sampling.

    PubMed

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present numerical studies of the learning ability of online SVM classification based on Markov sampling on benchmark repository datasets. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of the training set increases.

  18. Gram-stain plus MALDI-TOF MS (Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry) for a rapid diagnosis of urinary tract infection.

    PubMed

    Burillo, Almudena; Rodríguez-Sánchez, Belén; Ramiro, Ana; Cercenado, Emilia; Rodríguez-Créixems, Marta; Bouza, Emilio

    2014-01-01

    Microbiological confirmation of a urinary tract infection (UTI) takes 24-48 h. In the meantime, patients are usually given empirical antibiotics, sometimes inappropriately. We assessed the feasibility of sequentially performing a Gram stain and MALDI-TOF mass spectrometry (MS) on urine samples to anticipate clinically useful information. In May-June 2012, we randomly selected 1000 urine samples from patients with suspected UTI. All were Gram stained, and those yielding bacteria of a single morphotype were processed for MALDI-TOF MS. Our sequential algorithm was correlated with the standard semiquantitative urine culture result as follows: Match, the information provided was anticipative of the culture result; Minor error, the information provided was partially anticipative of the culture result; Major error, the information provided was incorrect, potentially leading to inappropriate changes in antimicrobial therapy. A positive culture was obtained in 242/1000 samples. The Gram stain revealed a single morphotype in 207 samples, which were subjected to MALDI-TOF MS. The diagnostic performance of the Gram stain was: sensitivity (Se) 81.3%, specificity (Sp) 93.2%, positive predictive value (PPV) 81.3%, negative predictive value (NPV) 93.2%, positive likelihood ratio (+LR) 11.91, negative likelihood ratio (-LR) 0.20 and accuracy 90.0%, while that of MALDI-TOF MS was: Se 79.2%, Sp 73.5%, +LR 2.99, -LR 0.28 and accuracy 78.3%. The use of both techniques provided information anticipative of the culture result in 82.7% of cases, information with minor errors in 13.4% and information with major errors in 3.9%. Results were available within 1 h. Our serial algorithm provided information that was consistent or showed minor errors for 96.1% of urine samples from patients with suspected UTI. The clinical impact of this rapid UTI diagnosis strategy needs to be assessed through indicators of adequacy of treatment, such as a reduced time to appropriate empirical treatment or earlier withdrawal of unnecessary antibiotics.
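
    For reference, the reported metrics all derive from a 2×2 confusion matrix; the sketch below uses made-up counts, not the study's data.

    ```python
    # How Se, Sp, PPV, NPV, likelihood ratios, and accuracy derive from
    # true/false positive and negative counts (illustrative numbers).
    def diagnostics(tp, fp, fn, tn):
        se = tp / (tp + fn)                  # sensitivity
        sp = tn / (tn + fp)                  # specificity
        return {
            "Se": se, "Sp": sp,
            "PPV": tp / (tp + fp), "NPV": tn / (tn + fn),
            "+LR": se / (1 - sp), "-LR": (1 - se) / sp,
            "Acc": (tp + tn) / (tp + fp + fn + tn),
        }

    print({k: round(v, 3) for k, v in
           diagnostics(tp=80, fp=10, fn=20, tn=90).items()})
    ```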

  19. Power/Sample Size Calculations for Assessing Correlates of Risk in Clinical Efficacy Trials

    PubMed Central

    Gilbert, Peter B.; Janes, Holly E.; Huang, Yunda

    2016-01-01

    In a randomized controlled clinical trial that assesses treatment efficacy, a common objective is to assess the association of a measured biomarker response endpoint with the primary study endpoint in the active treatment group, using a case-cohort, case-control, or two-phase sampling design. Methods for power and sample size calculations for such biomarker association analyses typically do not account for the level of treatment efficacy, precluding interpretation of the biomarker association results in terms of biomarker effect modification of treatment efficacy, with detriment that the power calculations may tacitly and inadvertently assume that the treatment harms some study participants. We develop power and sample size methods accounting for this issue, and the methods also account for inter-individual variability of the biomarker that is not biologically relevant (e.g., due to technical measurement error). We focus on a binary study endpoint and on a biomarker subject to measurement error that is normally distributed or categorical with two or three levels. We illustrate the methods with preventive HIV vaccine efficacy trials, and include an R package implementing the methods. PMID:27037797

  20. Development of multiple-eye PIV using mirror array

    NASA Astrophysics Data System (ADS)

    Maekawa, Akiyoshi; Sakakibara, Jun

    2018-06-01

    In order to reduce particle image velocimetry measurement error, we manufactured an ellipsoidal polyhedral mirror and placed it between a camera and the flow target to capture n images of identical particles from n (=80 maximum) different directions. The 3D particle positions were determined from the ensemble average of the nC2 intersecting points of pairs of back-projected lines of sight from a particle found in any combination of two of the n images. The method was then applied to a rigid-body rotating flow and a turbulent pipe flow. In the former measurement, bias error and random error fell in a range of ±0.02 pixels and 0.02–0.05 pixels, respectively; additionally, random error decreased in proportion to . In the latter measurement, in which the measured value was compared to direct numerical simulation, bias error was reduced and random error also decreased in proportion to .

  1. Composite Interval Mapping Based on Lattice Design for Error Control May Increase Power of Quantitative Trait Locus Detection.

    PubMed

    He, Jianbo; Li, Jijie; Huang, Zhongwen; Zhao, Tuanjie; Xing, Guangnan; Gai, Junyi; Guan, Rongzhan

    2015-01-01

    Experimental error control is very important in quantitative trait locus (QTL) mapping. Although numerous statistical methods have been developed for QTL mapping, a QTL detection model based on an appropriate experimental design that emphasizes error control has not been developed. Lattice design is very suitable for experiments with large sample sizes, which is usually required for accurate mapping of quantitative traits. However, the lack of a QTL mapping method based on lattice design dictates that the arithmetic mean or adjusted mean of each line of observations in the lattice design had to be used as a response variable, resulting in low QTL detection power. As an improvement, we developed a QTL mapping method termed composite interval mapping based on lattice design (CIMLD). In the lattice design, experimental errors are decomposed into random errors and block-within-replication errors. Four levels of block-within-replication errors were simulated to show the power of QTL detection under different error controls. The simulation results showed that the arithmetic mean method, which is equivalent to a method under random complete block design (RCBD), was very sensitive to the size of the block variance and with the increase of block variance, the power of QTL detection decreased from 51.3% to 9.4%. In contrast to the RCBD method, the power of CIMLD and the adjusted mean method did not change for different block variances. The CIMLD method showed 1.2- to 7.6-fold higher power of QTL detection than the arithmetic or adjusted mean methods. Our proposed method was applied to real soybean (Glycine max) data as an example and 10 QTLs for biomass were identified that explained 65.87% of the phenotypic variation, while only three and two QTLs were identified by arithmetic and adjusted mean methods, respectively.

  2. Verifying and Validating Simulation Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hemez, Francois M.

    2015-02-23

    This presentation is a high-level discussion of the Verification and Validation (V&V) of computational models. Definitions of V&V are given to emphasize that “validation” is never performed in a vacuum; it accounts, instead, for the current state-of-knowledge in the discipline considered. In particular, comparisons between physical measurements and numerical predictions should account for their respective sources of uncertainty. The differences between error (bias), aleatoric uncertainty (randomness) and epistemic uncertainty (ignorance, lack-of-knowledge) are briefly discussed. Four types of uncertainty in physics and engineering are discussed: 1) experimental variability, 2) variability and randomness, 3) numerical uncertainty and 4) model-form uncertainty. Statistical sampling methods are available to propagate, and analyze, variability and randomness. Numerical uncertainty originates from the truncation error introduced by the discretization of partial differential equations in time and space. Model-form uncertainty is introduced by assumptions often formulated to render a complex problem more tractable and amenable to modeling and simulation. The discussion concludes with high-level guidance to assess the “credibility” of numerical simulations, which stems from the level of rigor with which these various sources of uncertainty are assessed and quantified.

  3. Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.

    PubMed

    Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B

    2017-04-01

    Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.
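
    A marginal-mean GEE fit of this kind can be set up with statsmodels; the sketch below simulates a simple parallel-arm cluster trial and uses the default robust sandwich covariance. The KC and FG small-sample corrections studied in the paper are not implemented here.

    ```python
    # GEE with an exchangeable working correlation on simulated
    # cluster-randomized binary outcomes (illustrative setup only).
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    clusters, per = 10, 40
    rows = []
    for c in range(clusters):
        treat = int(c < clusters // 2)          # half the clusters treated
        b = rng.normal(scale=0.3)               # cluster-level random effect
        p = 1 / (1 + np.exp(-(-1.0 + 0.5 * treat + b)))
        for _ in range(per):
            rows.append({"cluster": c, "treat": treat,
                         "y": int(rng.random() < p)})
    df = pd.DataFrame(rows)

    model = sm.GEE(df["y"], sm.add_constant(df["treat"]),
                   groups=df["cluster"], family=sm.families.Binomial(),
                   cov_struct=sm.cov_struct.Exchangeable())
    res = model.fit()        # robust (sandwich) covariance by default
    print(res.summary())
    ```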

  4. Effects of Random Circuit Fabrication Errors on Small Signal Gain and on Output Phase In a Traveling Wave Tube

    NASA Astrophysics Data System (ADS)

    Rittersdorf, I. M.; Antonsen, T. M., Jr.; Chernin, D.; Lau, Y. Y.

    2011-10-01

    Random fabrication errors may have detrimental effects on the performance of traveling-wave tubes (TWTs) of all types. A new scaling law for the modification in the average small signal gain and in the output phase is derived from the third order ordinary differential equation that governs the forward wave interaction in a TWT in the presence of random error that is distributed along the axis of the tube. Analytical results compare favorably with numerical results, in both gain and phase modifications as a result of random error in the phase velocity of the slow wave circuit. Results on the effect of the reverse-propagating circuit mode will be reported. This work supported by AFOSR, ONR, L-3 Communications Electron Devices, and Northrop Grumman Corporation.

  5. WAMS measurements pre-processing for detecting low-frequency oscillations in power systems

    NASA Astrophysics Data System (ADS)

    Kovalenko, P. Y.

    2017-07-01

    Processing data received from measurement systems implies situations in which one or more registered values stand apart from the rest of the sample. These values are referred to as “outliers”, and their presence may significantly influence the processing results. In order to ensure accurate detection of low-frequency oscillations in power systems, an algorithm has been developed for outlier detection and elimination. The algorithm is based on the concept of the irregular component of the measurement signal. This component comprises measurement errors and is assumed to be a Gaussian-distributed random quantity. Median filtering is employed to detect values lying outside the range of the normally distributed measurement error on the basis of a 3σ criterion. The algorithm has been validated on both simulated signals and WAMS data.
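
    The screening step described here is straightforward to sketch: median-filter the signal, treat the residual as the irregular component, and apply the 3σ rule. The test signal, window length, and outlier magnitudes below are assumptions.

    ```python
    # Median-filter trend removal plus a 3-sigma outlier screen.
    import numpy as np
    from scipy.signal import medfilt

    rng = np.random.default_rng(0)
    t = np.linspace(0, 10, 1000)
    signal = 50 + 0.2 * np.sin(2 * np.pi * 0.7 * t)      # slow oscillation
    x = signal + rng.normal(scale=0.02, size=t.size)     # measurement noise
    x[[100, 400, 755]] += [0.5, -0.4, 0.6]               # injected outliers

    trend = medfilt(x, kernel_size=21)   # median filter is outlier-robust
    resid = x - trend                    # irregular (noise) component
    sigma = resid.std()
    outliers = np.flatnonzero(np.abs(resid) > 3 * sigma)
    print("flagged samples:", outliers)  # should recover 100, 400, 755
    ```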

  6. Variability And Uncertainty Analysis Of Contaminant Transport Model Using Fuzzy Latin Hypercube Sampling Technique

    NASA Astrophysics Data System (ADS)

    Kumar, V.; Nayagum, D.; Thornton, S.; Banwart, S.; Schuhmacher, M.; Lerner, D.

    2006-12-01

    Characterization of the uncertainty associated with groundwater quality models is often of critical importance, for example where environmental models are employed in risk assessment. Insufficient data, inherent variability and estimation errors of environmental model parameters introduce uncertainty into model predictions. However, uncertainty analysis using conventional methods such as standard Monte Carlo sampling (MCS) may not be efficient, or even suitable, for complex, computationally demanding models involving parametric variability and uncertainty of different natures. General MCS, or variants of MCS such as Latin Hypercube Sampling (LHS), treats variability and uncertainty as a single random entity, and the generated samples are treated as crisp, with vagueness assumed to be randomness. Also, when models are used as purely predictive tools, uncertainty and variability lead to the need to assess the plausible range of model outputs. An improved, systematic variability and uncertainty analysis can provide insight into the level of confidence in model estimates, and can aid in assessing how various possible model estimates should be weighed. The present study introduces Fuzzy Latin Hypercube Sampling (FLHS), a hybrid approach incorporating cognitive and noncognitive uncertainties. Noncognitive uncertainty, such as physical randomness and statistical uncertainty due to limited information, can be described by its own probability density function (PDF), whereas cognitive uncertainty, such as estimation error, can be described by a membership function for its fuzziness and a confidence interval by α-cuts. An important property of this theory is its ability to merge the inexactly generated data of the LHS approach to increase the quality of information. The FLHS technique ensures that the entire range of each variable is sampled with proper incorporation of uncertainty and variability. A fuzzified statistical summary of the model results will produce indices of sensitivity and uncertainty that relate the effects of heterogeneity and uncertainty of input variables to model predictions. The feasibility of the method is validated by assessing uncertainty propagation of parameter values for estimation of the contamination level of a drinking water supply well due to transport of dissolved phenolics from a contaminated site in the UK.
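
    The crisp LHS core of the approach (without the fuzzy α-cut machinery) can be sketched with scipy; the parameter ranges and the placeholder response function are assumptions, not the transport model.

    ```python
    # Plain Latin hypercube sampling for uncertainty propagation.
    import numpy as np
    from scipy.stats import qmc

    sampler = qmc.LatinHypercube(d=2, seed=0)
    u = sampler.random(n=200)                 # stratified samples in [0,1)^2
    # Scale to (hypothetical) parameter ranges, e.g. a rate constant and
    # a source strength.
    params = qmc.scale(u, l_bounds=[1e-6, 10.0], u_bounds=[1e-4, 100.0])

    def model(k, s):
        """Placeholder response, standing in for the transport model."""
        return s * np.exp(-1e4 * k)

    out = model(params[:, 0], params[:, 1])
    print(f"mean={out.mean():.3f}, 95th percentile={np.percentile(out, 95):.3f}")
    ```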

  7. Methodological considerations in using complex survey data: an applied example with the Head Start Family and Child Experiences Survey.

    PubMed

    Hahs-Vaughn, Debbie L; McWayne, Christine M; Bulotsky-Shearer, Rebecca J; Wen, Xiaoli; Faria, Ann-Marie

    2011-06-01

    Complex survey data are collected by means other than simple random samples. This creates two analytical issues: nonindependence and unequal selection probability. Failing to address these issues results in underestimated standard errors and biased parameter estimates. Using data from the nationally representative Head Start Family and Child Experiences Survey (FACES; 1997 and 2000 cohorts), three diverse multilevel models are presented that illustrate differences in results depending on addressing or ignoring the complex sampling issues. Limitations of using complex survey data are reported, along with recommendations for reporting complex sample results.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Novak, Erik; Trolinger, James D.; Lacey, Ian

    This work reports on the development of a binary pseudo-random test sample optimized to calibrate the MTF of optical microscopes. The sample consists of a number of 1-D and 2-D patterns, with different minimum sizes of spatial artifacts from 300 nm to 2 microns. We describe the mathematical background, fabrication process, and the data acquisition and analysis procedure used to return a spatial-frequency-based instrument calibration. We show that the developed samples satisfy the characteristics of a test standard: functionality, ease of specification and fabrication, reproducibility, and low sensitivity to manufacturing error.

  9. At least some errors are randomly generated (Freud was wrong)

    NASA Technical Reports Server (NTRS)

    Sellen, A. J.; Senders, J. W.

    1986-01-01

    An experiment was carried out to expose something about human error-generating mechanisms. In the context of the experiment, an error was made when a subject pressed the wrong key on a computer keyboard or pressed no key at all in the time allotted. These might be considered, respectively, errors of substitution and errors of omission. Each of seven subjects saw a sequence of three digital numbers, made an easily learned binary judgement about each, and was to press the appropriate one of two keys. Each session consisted of 1,000 presentations of randomly permuted, fixed numbers broken into 10 blocks of 100. One of two keys should have been pressed within one second of the onset of each stimulus. These data were subjected to statistical analyses in order to probe the nature of the error-generating mechanisms. Goodness-of-fit tests for a Poisson distribution of the number of errors per 50-trial interval and for an exponential distribution of the lengths of the intervals between errors were carried out. There is evidence for an endogenous mechanism that may best be described as a random error generator. Furthermore, an item analysis of the number of errors produced per stimulus suggests the existence of a second mechanism operating on task-driven factors producing exogenous errors. Some errors, at least, are the result of constant-probability generating mechanisms with an error rate idiosyncratically determined for each subject.
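
    The Poisson goodness-of-fit check can be reproduced on data simulated from exactly such a constant-probability mechanism; the block size follows the 50-trial intervals, everything else is illustrative.

    ```python
    # Chi-square goodness-of-fit of per-block error counts to a Poisson
    # distribution with the mean estimated from the data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    errors = rng.random(5000) < 0.06              # constant-probability errors
    counts = errors.reshape(100, 50).sum(axis=1)  # errors per 50-trial block

    lam = counts.mean()                           # fitted Poisson mean
    kmax = 7                                      # pool the sparse tail
    observed = np.array([(counts == k).sum() for k in range(kmax)]
                        + [(counts >= kmax).sum()])
    expected = np.append(stats.poisson.pmf(np.arange(kmax), lam),
                         stats.poisson.sf(kmax - 1, lam)) * counts.size

    chi2 = ((observed - expected) ** 2 / expected).sum()
    dof = len(observed) - 2          # cells - 1 - one estimated parameter
    p = stats.chi2.sf(chi2, dof)
    print(f"chi2 = {chi2:.2f}, p = {p:.3f}  (large p: Poisson is plausible)")
    ```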

  10. On the Calculation of Uncertainty Statistics with Error Bounds for CFD Calculations Containing Random Parameters and Fields

    NASA Technical Reports Server (NTRS)

    Barth, Timothy J.

    2016-01-01

    This chapter discusses the ongoing development of combined uncertainty and error bound estimates for computational fluid dynamics (CFD) calculations subject to imposed random parameters and random fields. An objective of this work is the construction of computable error bound formulas for output uncertainty statistics that guide CFD practitioners in systematically determining how accurately CFD realizations should be approximated and how accurately uncertainty statistics should be approximated for output quantities of interest. Formal error bounds formulas for moment statistics that properly account for the presence of numerical errors in CFD calculations and numerical quadrature errors in the calculation of moment statistics have been previously presented in [8]. In this past work, hierarchical node-nested dense and sparse tensor product quadratures are used to calculate moment statistics integrals. In the present work, a framework has been developed that exploits the hierarchical structure of these quadratures in order to simplify the calculation of an estimate of the quadrature error needed in error bound formulas. When signed estimates of realization error are available, this signed error may also be used to estimate output quantity of interest probability densities as a means to assess the impact of realization error on these density estimates. Numerical results are presented for CFD problems with uncertainty to demonstrate the capabilities of this framework.

  11. A model-based 'varimax' sampling strategy for a heterogeneous population.

    PubMed

    Akram, Nuzhat A; Farooqi, Shakeel R

    2014-01-01

    Sampling strategies are planned to enhance the homogeneity of a sample, hence to minimize confounding errors. A sampling strategy was developed to minimize the variation within population groups. Karachi, the largest urban agglomeration in Pakistan, was used as a model population. Blood groups ABO and Rh factor were determined for 3000 unrelated individuals selected through simple random sampling. Among them five population groups, namely Balochi, Muhajir, Pathan, Punjabi and Sindhi, based on paternal ethnicity were identified. An index was designed to measure the proportion of admixture at parental and grandparental levels. Population models based on index score were proposed. For validation, 175 individuals selected through stratified random sampling were genotyped for the three STR loci CSF1PO, TPOX and TH01. ANOVA showed significant differences across the population groups for blood groups and STR loci distribution. Gene diversity was higher across the sub-population model than in the agglomerated population. At parental level gene diversities are significantly higher across No admixture models than Admixture models. At grandparental level the difference was not significant. A sub-population model with no admixture at parental level was justified for sampling the heterogeneous population of Karachi.

  12. Systematic evaluation of NASA precipitation radar estimates using NOAA/NSSL National Mosaic QPE products

    NASA Astrophysics Data System (ADS)

    Kirstetter, P.; Hong, Y.; Gourley, J. J.; Chen, S.; Flamig, Z.; Zhang, J.; Howard, K.; Petersen, W. A.

    2011-12-01

    Proper characterization of the error structure of TRMM Precipitation Radar (PR) quantitative precipitation estimation (QPE) is needed for their use in TRMM combined products, water budget studies and hydrological modeling applications. Due to the variety of sources of error in spaceborne radar QPE (attenuation of the radar signal, influence of land surface, impact of off-nadir viewing angle, etc.) and the impact of correction algorithms, the problem is addressed by comparison of PR QPEs with reference values derived from ground-based measurements (GV) using NOAA/NSSL's National Mosaic QPE (NMQ) system. An investigation of this subject has been carried out at the PR estimation scale (instantaneous and 5 km) on the basis of a 3-month-long data sample. A significant effort has been carried out to derive a bias-corrected, robust reference rainfall source from NMQ. The GV processing details will be presented along with preliminary results of PR's error characteristics using contingency table statistics, probability distribution comparisons, scatter plots, semi-variograms, and systematic biases and random errors.

  13. Using Audit Information to Adjust Parameter Estimates for Data Errors in Clinical Trials

    PubMed Central

    Shepherd, Bryan E.; Shaw, Pamela A.; Dodd, Lori E.

    2013-01-01

    Background Audits are often performed to assess the quality of clinical trial data, but beyond detecting fraud or sloppiness, the audit data is generally ignored. In earlier work using data from a non-randomized study, Shepherd and Yu (2011) developed statistical methods to incorporate audit results into study estimates, and demonstrated that audit data could be used to eliminate bias. Purpose In this manuscript we examine the usefulness of audit-based error-correction methods in clinical trial settings where a continuous outcome is of primary interest. Methods We demonstrate the bias of multiple linear regression estimates in general settings with an outcome that may have errors and a set of covariates for which some may have errors and others, including treatment assignment, are recorded correctly for all subjects. We study this bias under different assumptions including independence between treatment assignment, covariates, and data errors (conceivable in a double-blinded randomized trial) and independence between treatment assignment and covariates but not data errors (possible in an unblinded randomized trial). We review moment-based estimators to incorporate the audit data and propose new multiple imputation estimators. The performance of estimators is studied in simulations. Results When treatment is randomized and unrelated to data errors, estimates of the treatment effect using the original error-prone data (i.e., ignoring the audit results) are unbiased. In this setting, both moment and multiple imputation estimators incorporating audit data are more variable than standard analyses using the original data. In contrast, in settings where treatment is randomized but correlated with data errors and in settings where treatment is not randomized, standard treatment effect estimates will be biased. And in all settings, parameter estimates for the original, error-prone covariates will be biased. Treatment and covariate effect estimates can be corrected by incorporating audit data using either the multiple imputation or moment-based approaches. Bias, precision, and coverage of confidence intervals improve as the audit size increases. Limitations The extent of bias and the performance of methods depend on the extent and nature of the error as well as the size of the audit. This work only considers methods for the linear model. Settings much different than those considered here need further study. Conclusions In randomized trials with continuous outcomes and treatment assignment independent of data errors, standard analyses of treatment effects will be unbiased and are recommended. However, if treatment assignment is correlated with data errors or other covariates, naive analyses may be biased. In these settings, and when covariate effects are of interest, approaches for incorporating audit results should be considered. PMID:22848072

  14. [Study of spatial stratified sampling strategy of Oncomelania hupensis snail survey based on plant abundance].

    PubMed

    Xun-Ping, W; An, Z

    2017-07-27

    Objective To optimize and simplify the survey method of Oncomelania hupensis snails in marshland endemic regions of schistosomiasis, so as to improve the precision, efficiency and economy of the snail survey. Methods A snail sampling strategy (Spatial Sampling Scenario of Oncomelania based on Plant Abundance, SOPA), which takes plant abundance as an auxiliary variable, was explored in an experimental study in a 50 m×50 m plot in a marshland in the Poyang Lake region. Firstly, the push-broom survey data were stratified into 5 layers by the plant abundance data; then, the required number of optimal sampling points for each layer was calculated through the Hammond-McCullagh equation; thirdly, every sample point was pinpointed in line with the Multiple Directional Interpolation (MDI) placement scheme; and finally, a comparison was performed among the outcomes of the spatial random sampling strategy, the traditional systematic sampling method, the spatial stratified sampling method, Sandwich spatial sampling and inference, and SOPA. Results The SOPA method proposed in this study had the smallest absolute error (0.213 8), while the traditional systematic sampling method had the largest estimate, with an absolute error of 0.924 4. Conclusion The snail sampling strategy (SOPA) proposed in this study obtains higher estimation accuracy than the other four methods.
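
    The allocation step can be illustrated with Neyman allocation, used here as a generic stand-in for the Hammond-McCullagh calculation; the stratum sizes and standard deviations are invented.

    ```python
    # Allocate sampling points to strata in proportion to stratum size
    # and within-stratum variability (Neyman allocation).
    import numpy as np

    N_h = np.array([400, 600, 500, 300, 200])   # stratum sizes (quadrats)
    S_h = np.array([0.2, 0.5, 0.9, 1.4, 2.0])   # stratum SDs of snail density
    n_total = 225                               # total sampling points

    weights = N_h * S_h
    n_h = np.round(n_total * weights / weights.sum()).astype(int)
    # Rounding may shift the total by a point or two.
    print("points per stratum:", n_h, "sum =", n_h.sum())
    ```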

  15. Robustness of reliable change indices to variability in Parkinson's disease with mild cognitive impairment.

    PubMed

    Turner, T H; Renfroe, J B; Elm, J; Duppstadt-Delambo, A; Hinson, V K

    2016-01-01

    Ability to identify change is crucial for measuring response to interventions and tracking disease progression. Beyond psychometrics, investigations of Parkinson's disease with mild cognitive impairment (PD-MCI) must consider fluctuating medication, motor, and mental status. One solution is to employ 90% reliable change indices (RCIs) from test manuals to account for measurement error and practice effects. The current study examined the robustness of 90% RCIs for 19 commonly used executive function tests in 14 PD-MCI subjects assigned to the placebo arm of a 10-week randomized controlled trial of atomoxetine in PD-MCI. Using 90% RCIs, the typical participant showed spurious improvement on one measure, and spurious decline on another. Reliability estimates from healthy adult standardization samples and from PD-MCI were similar. In contrast to healthy adult samples, practice effects were minimal in this PD-MCI group. Separate 90% RCIs based on the PD-MCI sample did not further reduce the error rate. In the present study, application of 90% RCIs based on healthy adult standardization samples effectively reduced misidentification of change in a PD-MCI sample. Our findings support continued application of 90% RCIs when using executive function tests to assess change in neurological populations with fluctuating status.
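
    A 90% RCI itself is a short calculation (Jacobson-Truax form with a practice-effect adjustment); the reliability, SD, and practice values below are placeholders, not those of any of the 19 tests.

    ```python
    # 90% reliable change index from test-retest scores.
    import math

    def rci_90(score_t1, score_t2, sd, reliability, practice=0.0):
        """Return the RCI and whether it exceeds the 90% criterion (1.645)."""
        sem = sd * math.sqrt(1 - reliability)   # standard error of measurement
        se_diff = sem * math.sqrt(2)            # SE of a difference score
        rci = (score_t2 - score_t1 - practice) / se_diff
        return rci, abs(rci) > 1.645

    rci, changed = rci_90(score_t1=45, score_t2=52, sd=10,
                          reliability=0.85, practice=2.0)
    print(f"RCI = {rci:.2f}; reliable change: {changed}")
    ```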

  16. Small Sample Performance of Bias-corrected Sandwich Estimators for Cluster-Randomized Trials with Binary Outcomes

    PubMed Central

    Li, Peng; Redden, David T.

    2014-01-01

    SUMMARY The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738

  17. Empirically Calibrated Asteroseismic Masses and Radii for Red Giants in the Kepler Fields

    NASA Astrophysics Data System (ADS)

    Pinsonneault, Marc; Elsworth, Yvonne; Silva Aguirre, Victor; Chaplin, William J.; Garcia, Rafael A.; Hekker, Saskia; Holtzman, Jon; Huber, Daniel; Johnson, Jennifer; Kallinger, Thomas; Mosser, Benoit; Mathur, Savita; Serenelli, Aldo; Shetrone, Matthew; Stello, Dennis; Tayar, Jamie; Zinn, Joel; APOGEE Team, KASC Team, APOKASC Team

    2018-01-01

    We report on the joint asteroseismic and spectroscopic properties of a sample of 6048 evolved stars in the fields originally observed by the Kepler satellite. We use APOGEE spectroscopic data taken from Data Release 13 of the Sloan Digital Sky Survey, combined with asteroseismic data analyzed by members of the Kepler Asteroseismic Science Consortium. With high statistical significance, the different pipelines do not have relative zero points that are the same as the solar values, and red clump stars do not have the same empirical relative zero points as red giants. We employ theoretically motivated corrections to the scaling relation for the large frequency spacing, and adjust the zero point of the frequency of maximum power scaling relation to be consistent with masses and radii for members of star clusters. The scatter in calibrator masses is consistent with our error estimation. Systematic and random mass errors are explicitly separated and identified. The measurement scatter, and random uncertainties, are three times larger for red giants where one or more technique failed to return a value than for targets where all five methods could do so, and this is a substantial fraction of the sample (20% of red giants and 25% of red clump stars). Overall trends and future prospects are discussed.

  18. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes

    PubMed Central

    Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392

  19. Field evaluation of the error arising from inadequate time averaging in the standard use of depth-integrating suspended-sediment samplers

    USGS Publications Warehouse

    Topping, David J.; Rubin, David M.; Wright, Scott A.; Melis, Theodore S.

    2011-01-01

    Several common methods for measuring suspended-sediment concentration in rivers in the United States use depth-integrating samplers to collect a velocity-weighted suspended-sediment sample in a subsample of a river cross section. Because depth-integrating samplers are always moving through the water column as they collect a sample, and can collect only a limited volume of water and suspended sediment, they collect only minimally time-averaged data. Four sources of error exist in the field use of these samplers: (1) bed contamination, (2) pressure-driven inrush, (3) inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration, and (4) inadequate time averaging. The first two of these errors arise from misuse of suspended-sediment samplers, and the third has been the subject of previous study using data collected in the sand-bedded Middle Loup River in Nebraska. Of these four sources of error, the least understood source of error arises from the fact that depth-integrating samplers collect only minimally time-averaged data. To evaluate this fourth source of error, we collected suspended-sediment data between 1995 and 2007 at four sites on the Colorado River in Utah and Arizona, using a P-61 suspended-sediment sampler deployed in both point- and one-way depth-integrating modes, and D-96-A1 and D-77 bag-type depth-integrating suspended-sediment samplers. These data indicate that the minimal duration of time averaging during standard field operation of depth-integrating samplers leads to an error that is comparable in magnitude to that arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration. This random error arising from inadequate time averaging is positively correlated with grain size and does not largely depend on flow conditions or, for a given size class of suspended sediment, on elevation above the bed. Averaging over time scales >1 minute is the likely minimum duration required to result in substantial decreases in this error. During standard two-way depth integration, a depth-integrating suspended-sediment sampler collects a sample of the water-sediment mixture during two transits at each vertical in a cross section: one transit while moving from the water surface to the bed, and another transit while moving from the bed to the water surface. As the number of transits is doubled at an individual vertical, this error is reduced by ~30 percent in each size class of suspended sediment. For a given size class of suspended sediment, the error arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration depends only on the number of verticals collected, whereas the error arising from inadequate time averaging depends on both the number of verticals collected and the number of transits collected at each vertical. Summing these two errors in quadrature yields a total uncertainty in an equal-discharge-increment (EDI) or equal-width-increment (EWI) measurement of the time-averaged velocity-weighted suspended-sediment concentration in a river cross section (exclusive of any laboratory-processing errors). By virtue of how the number of verticals and transits influences the two individual errors within this total uncertainty, the error arising from inadequate time averaging slightly dominates that arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration. 
Adding verticals to an EDI or EWI measurement is slightly more effective in reducing the total uncertainty than adding transits only at each vertical, because a new vertical contributes both temporal and spatial information. However, because collection of depth-integrated samples at more transits at each vertical is generally easier and faster than at more verticals, addition of a combination of verticals and transits is likely a more practical approach to reducing the total uncertainty in most field situations.
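
    The quadrature bookkeeping reads directly as code; the component error magnitudes and their scaling with verticals and transits below are illustrative assumptions, not the paper's fitted values.

    ```python
    # Combine spatial and temporal sampling errors in quadrature,
    # assuming each shrinks with the square root of its sample count.
    import math

    def total_uncertainty(e_space_1, e_time_1, n_verticals, transits_per_vertical):
        e_space = e_space_1 / math.sqrt(n_verticals)
        e_time = e_time_1 / math.sqrt(n_verticals * transits_per_vertical)
        return math.hypot(e_space, e_time)      # sum in quadrature

    for nv, nt in [(4, 2), (8, 2), (4, 4)]:
        print(f"verticals={nv}, transits={nt}: "
              f"total = {total_uncertainty(0.2, 0.3, nv, nt):.4f}")
    ```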

  20. [Qualitative evaluation of blood products records in a hospital].

    PubMed

    Lartigue, B; Catillon, E

    2012-02-01

    This study aimed to evaluate the qualitative performance of blood product traceability in paper and electronic medical records in a hospital. The quality of date/time documentation was assessed by detecting chronological errors and inter-source inconsistencies of 20 minutes or more in a random sample of 168 blood products transfused during 2009. A receipt date/time was confirmed in 52% of paper records; a data entry error was detected in 25% of paper records and 21% of electronic records. A transfusion date/time was recorded in 93% of paper records, with a data entry error in 26% of paper records and 25% of electronic records. The patient medical record held at least one date/time error in 18% and 17% of cases, for receipt and transfusion respectively. Environmental factors (clinical setting, urgency, blood product category) did not contribute to data error rates. Although blood product traceability achieves good quantitative results, the quality of the recorded documentation is poor. In our study, data entry errors are similar in electronic and paper records, but the overall failure rate is lower in electronic records because omissions are controlled. Copyright © 2011 Elsevier Masson SAS. All rights reserved.

  1. Nonparametric probability density estimation by optimization theoretic techniques

    NASA Technical Reports Server (NTRS)

    Scott, D. W.

    1976-01-01

    Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
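
    A minimal sketch of the kernel estimator discussed above; Silverman's rule of thumb is used here as a stand-in for the automatic scaling-factor selection the abstract proposes, which the paper derives differently:

```python
import numpy as np

def gaussian_kde(sample, x_grid, h=None):
    """Gaussian kernel density estimate evaluated on x_grid."""
    sample = np.asarray(sample)
    if h is None:
        # Silverman's rule of thumb as a simple automatic scaling factor
        h = 1.06 * sample.std(ddof=1) * len(sample) ** (-1 / 5)
    u = (x_grid[:, None] - sample[None, :]) / h
    kernels = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(sample) * h)

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=200)
xs = np.linspace(-4.0, 4.0, 81)
density = gaussian_kde(data, xs)
print(f"peak density estimate: {density.max():.3f}")
```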

  2. Simultaneous Laser Ranging and Communication from an Earth-Based Satellite Laser Ranging Station to the Lunar Reconnaissance Orbiter in Lunar Orbit

    NASA Technical Reports Server (NTRS)

    Sun, Xiaoli; Skillman, David R.; Hoffman, Evan D.; Mao, Dandan; McGarry, Jan F.; Neumann, Gregory A.; McIntire, Leva; Zellar, Ronald S.; Davidson, Frederic M.; Fong, Wai H.; hide

    2013-01-01

    We report a free-space laser communication experiment from the satellite laser ranging (SLR) station at NASA Goddard Space Flight Center (GSFC) to the Lunar Reconnaissance Orbiter (LRO) in lunar orbit through the onboard one-way Laser Ranging (LR) receiver. Pseudorandom data and sample image files were transmitted to LRO using a 4096-ary pulse position modulation (PPM) signal format. Reed-Solomon forward error correction codes were used to achieve error-free data transmission at a moderate coding overhead rate. Signal fading due to atmospheric effects was measured, and the coding gain could be estimated.

  3. Random Error in Judgment: The Contribution of Encoding and Retrieval Processes

    ERIC Educational Resources Information Center

    Pleskac, Timothy J.; Dougherty, Michael R.; Rivadeneira, A. Walkyria; Wallsten, Thomas S.

    2009-01-01

    Theories of confidence judgments have embraced the role random error plays in influencing responses. An important next step is to identify the source(s) of these random effects. To do so, we used the stochastic judgment model (SJM) to distinguish the contribution of encoding and retrieval processes. In particular, we investigated whether dividing…

  4. The random coding bound is tight for the average code.

    NASA Technical Reports Server (NTRS)

    Gallager, R. G.

    1973-01-01

    The random coding bound of information theory provides a well-known upper bound to the probability of decoding error for the best code of a given rate and block length. The bound is constructed by upperbounding the average error probability over an ensemble of codes. The bound is known to give the correct exponential dependence of error probability on block length for transmission rates above the critical rate, but it gives an incorrect exponential dependence at rates below a second lower critical rate. Here we derive an asymptotic expression for the average error probability over the ensemble of codes used in the random coding bound. The result shows that the weakness of the random coding bound at rates below the second critical rate is due not to upperbounding the ensemble average, but rather to the fact that the best codes are much better than the average at low rates.
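
    For reference, the random coding bound takes the standard Gallager form (a textbook statement, not quoted from this paper): for block length N and rate R,

```latex
P_e \le e^{-N E_r(R)}, \qquad
E_r(R) = \max_{0 \le \rho \le 1} \; \max_{Q} \; \big[ E_0(\rho, Q) - \rho R \big]
```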

  5. Self-reference and random sampling approach for label-free identification of DNA composition using plasmonic nanomaterials.

    PubMed

    Freeman, Lindsay M; Pang, Lin; Fainman, Yeshaiahu

    2018-05-09

    The analysis of DNA has led to revolutionary advancements in the fields of medical diagnostics, genomics, prenatal screening, and forensic science, with the global DNA testing market expected to reach revenues of USD 10.04 billion per year by 2020. However, current methods for DNA analysis remain dependent on fluorophores or conjugated proteins, leading to high costs associated with consumable materials and manual labor. Here, we demonstrate a potential label-free DNA composition detection method using surface-enhanced Raman spectroscopy (SERS), in which we identify the composition of cytosine and adenine within single strands of DNA. This approach exploits the fact that there is one phosphate backbone per nucleotide, which we use as a reference to compensate for systematic measurement variations. We utilize plasmonic nanomaterials with random Raman sampling to perform label-free detection of the nucleotide composition within DNA strands, generating a calibration curve from standard samples of DNA and demonstrating the capability of resolving the nucleotide composition. The work represents an innovative way to detect the nucleotide composition of DNA strands without the need for attached labels, offering a highly sensitive and reproducible method that factors in random sampling to minimize error.

  6. Sampling considerations for disease surveillance in wildlife populations

    USGS Publications Warehouse

    Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

    2008-01-01

    Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
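
    A minimal simulation in the spirit of the landscape-based approach described above, with entirely hypothetical habitat and prevalence numbers, showing how a convenience sample that over-represents accessible habitat biases the prevalence estimate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: 10,000 animals in two habitat strata.
# Disease is clustered in habitat 1, which is also harder to access.
habitat = rng.integers(0, 2, size=10_000)
infected = rng.random(10_000) < np.where(habitat == 1, 0.10, 0.02)

# Probability sample: simple random sample of 500 animals.
srs = rng.choice(10_000, size=500, replace=False)

# Convenience sample: accessible habitat-0 animals 4x as likely to be taken.
w = np.where(habitat == 0, 4.0, 1.0)
conv = rng.choice(10_000, size=500, replace=False, p=w / w.sum())

print("true prevalence:     ", infected.mean())
print("probability estimate:", infected[srs].mean())
print("convenience estimate:", infected[conv].mean())  # biased low
```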

  7. Calculating radiotherapy margins based on Bayesian modelling of patient specific random errors

    NASA Astrophysics Data System (ADS)

    Herschtal, A.; te Marvelde, L.; Mengersen, K.; Hosseinifard, Z.; Foroudi, F.; Devereux, T.; Pham, D.; Ball, D.; Greer, P. B.; Pichler, P.; Eade, T.; Kneebone, A.; Bell, L.; Caine, H.; Hindson, B.; Kron, T.

    2015-02-01

    Collected real-life clinical target volume (CTV) displacement data show that some patients undergoing external beam radiotherapy (EBRT) demonstrate significantly more fraction-to-fraction variability in their displacement (‘random error’) than others. This contrasts with the common assumption made by historical recipes for margin estimation for EBRT, that the random error is constant across patients. In this work we present statistical models of CTV displacements in which random errors are characterised by an inverse gamma (IG) distribution in order to assess the impact of random error variability on CTV-to-PTV margin widths, for eight real world patient cohorts from four institutions, and for different sites of malignancy. We considered a variety of clinical treatment requirements and penumbral widths. The eight cohorts consisted of a total of 874 patients and 27 391 treatment sessions. Compared to a traditional margin recipe that assumes constant random errors across patients, for a typical 4 mm penumbral width, the IG based margin model mandates that in order to satisfy the common clinical requirement that 90% of patients receive at least 95% of prescribed RT dose to the entire CTV, margins be increased by a median of 10% (range over the eight cohorts -19% to +35%). This substantially reduces the proportion of patients for whom margins are too small to satisfy clinical requirements.
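
    For context, the traditional recipe the authors compare against is typically the van Herk margin formula, which assumes a single population-level random error; a minimal sketch with illustrative values (not the paper's cohort statistics):

```python
import math

def van_herk_margin(big_sigma, small_sigma, penumbra_sigma=3.2):
    """Classic CTV-to-PTV margin recipe (all values in mm):
    M = 2.5 * Sigma + 1.64 * (sigma' - sigma_p), where sigma' combines
    the population random error and the penumbral width in quadrature."""
    sigma_prime = math.sqrt(small_sigma**2 + penumbra_sigma**2)
    return 2.5 * big_sigma + 1.64 * (sigma_prime - penumbra_sigma)

# Illustrative population statistics: 2 mm systematic, 3 mm random.
print(f"margin = {van_herk_margin(2.0, 3.0):.1f} mm")
```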

  8. Population-based survey of refractive error among school-aged children in rural northern China: the Heilongjiang eye study.

    PubMed

    Li, Zhijian; Xu, Keke; Wu, Shubin; Lv, Jia; Jin, Di; Song, Zhen; Wang, Zhongliang; Liu, Ping

    2014-01-01

    The prevalence of refractive error in the north of China is unknown. The study aimed to estimate the prevalence and associated factors of refractive error in school-aged children in a rural area of northern China. Cross-sectional study. The cluster random sampling method was used to select the sample. A total of 1700 subjects of 5 to 18 years of age were examined. All participants underwent ophthalmic evaluation. Refraction was performed under cycloplegia. The association of refractive errors with age, sex, and education was analysed. The main outcome measures were the prevalence rates of refractive error among school-aged children. Of the 1700 responders, 1675 were eligible. The prevalence of uncorrected, presenting, and best-corrected visual acuity of 20/40 or worse in the better eye was 6.3%, 3.0% and 1.2%, respectively. The prevalence of myopia was 5.0% (84/1675; 95% CI, 4.8%-5.4%) and of hyperopia was 1.6% (27/1675; 95% CI, 1.0%-2.2%). Astigmatism was evident in 2.0% of the subjects. Myopia increased with increasing age, whereas hyperopia and astigmatism were associated with younger age. Myopia, hyperopia and astigmatism were more common in females. We also found that the prevalence of refractive error was associated with education: myopia and astigmatism were more common in those with higher levels of education. This report has provided details of the refractive status in a rural school-aged population. Although the prevalence of refractive errors is relatively low in this population, the unmet need for spectacle correction remains a significant challenge for refractive eye-care services. © 2013 Royal Australian and New Zealand College of Ophthalmologists.

  9. Standardising analysis of carbon monoxide rebreathing for application in anti-doping.

    PubMed

    Alexander, Anthony C; Garvican, Laura A; Burge, Caroline M; Clark, Sally A; Plowman, James S; Gore, Christopher J

    2011-03-01

    Determination of total haemoglobin mass (Hbmass) via carbon monoxide (CO) rebreathing depends critically on repeatable measurement of percent carboxyhaemoglobin (%HbCO) in blood with a hemoximeter. The main aim of this study was to determine, for an OSM3 hemoximeter, the number of replicate measures as well as the theoretical change in percent carboxyhaemoglobin required to yield a random error of analysis (Analyser Error) of ≤1%. Before and after inhalation of CO, nine participants provided a total of 576 blood samples that were each analysed five times for percent carboxyhaemoglobin on one of three OSM3 hemoximeters, with approximately one-third of blood samples analysed on each OSM3. The Analyser Error was calculated for the first two (duplicate), first three (triplicate) and first four (quadruplicate) measures on each OSM3, as well as for all five measures (quintuplicates). Two methods of CO rebreathing, a 2-min and a 10-min procedure, were evaluated for Analyser Error. For duplicate analyses of blood, the Analyser Error for the 2-min method was 3.7, 4.0 and 5.0% for the three OSM3s when the percent carboxyhaemoglobin increased by two above resting values. With quintuplicate analyses of blood, the corresponding errors reduced to 0.8, 0.9 and 1.0% for the 2-min method when the percent carboxyhaemoglobin increased by 5.5 above resting values. In summary, to minimise the Analyser Error to ∼≤1% on an OSM3 hemoximeter, researchers should make ≥5 replicate measures of percent carboxyhaemoglobin, and the volume of CO administered should be sufficient to increase percent carboxyhaemoglobin by ≥5.5 above baseline levels. Crown Copyright © 2010. Published by Elsevier Ltd. All rights reserved.
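
    The replicate-count recommendation follows from the standard error of the mean; the sketch below is a simplified stand-in for the paper's Analyser Error calculation (the single-measure SD of 0.10 %HbCO is hypothetical):

```python
import math

def analyser_error_pct(sd_single, n_replicates, delta_hbco):
    """Random error (%) of a %HbCO difference between two averaged
    readings, given the single-measure SD and replicate count."""
    se_mean = sd_single / math.sqrt(n_replicates)
    se_diff = math.sqrt(2.0) * se_mean  # pre/post values are differenced
    return 100.0 * se_diff / delta_hbco

for n in (2, 3, 4, 5):
    print(n, f"{analyser_error_pct(0.10, n, 5.5):.2f} %")
```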

  10. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    PubMed Central

    Matochko, Wadim L.; Derda, Ratmir

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n_i||, where n_i is the copy number of the i-th sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing can be described as a product of an N × N matrix and a stochastic sampling operator (S_a). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of S_a and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = S_a I_N, where I_N is an N × N identity matrix. Any bias in sequencing changes I_N to a non-identity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process. PMID:24416071
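
    A minimal sketch of the operator framework: the library as a frequency vector n, S_a realized as binomial thinning (an assumption used here for illustration), and CEN zeroing out censored reads:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy library: copy numbers n_i for N = 5 possible sequences.
n = np.array([1000, 500, 50, 5, 0])

def sampling_operator(copies, read_fraction):
    """Stochastic sampling operator S_a: each copy is read
    independently with probability read_fraction (binomial thinning)."""
    return rng.binomial(copies, read_fraction)

def censorship(reads, censored):
    """Diagonal censorship operator CEN: zero out specific sequences."""
    out = reads.copy()
    out[censored] = 0
    return out

reads = sampling_operator(n, 0.1)
print("sampled: ", reads)
print("censored:", censorship(reads, [1]))  # sequence 1 lost in sequencing
```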

  11. Particle Tracking on the BNL Relativistic Heavy Ion Collider

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dell, G. F.

    1986-08-07

    Tracking studies including the effects of random multipole errors alone, as well as the combined effects of random and systematic multipole errors, have been made for RHIC. Initial results for operating at an off-diagonal working point are discussed.

  12. Generalized Sample Size Determination Formulas for Investigating Contextual Effects by a Three-Level Random Intercept Model.

    PubMed

    Usami, Satoshi

    2017-03-01

    Behavioral and psychological researchers have shown strong interest in investigating contextual effects (i.e., the influences of combinations of individual- and group-level predictors on individual-level outcomes). The present research provides generalized formulas for determining the sample size needed to investigate contextual effects according to the desired level of statistical power as well as the width of the confidence interval. These formulas are derived within a three-level random intercept model that includes one predictor/contextual variable at each level, so as to simultaneously cover the various kinds of contextual effects in which researchers may be interested. The relative influences of the indices included in the formulas on the standard errors of contextual effect estimates are investigated with the aim of further simplifying sample size determination procedures. In addition, simulation studies are performed to investigate the finite-sample behavior of the calculated statistical power, showing that sample sizes estimated from the derived formulas can be both positively and negatively biased due to the complex effects of unreliability of contextual variables, multicollinearity, and violation of assumptions regarding the known variances. Thus, it is advisable to compare estimated sample sizes under various specifications of the indices and to evaluate their potential bias, as illustrated in the example.

  13. Compressive Sampling based Image Coding for Resource-deficient Visual Communication.

    PubMed

    Liu, Xianming; Zhai, Deming; Zhou, Jiantao; Zhang, Xinfeng; Zhao, Debin; Gao, Wen

    2016-04-14

    In this paper, a new compressive sampling based image coding scheme is developed to achieve competitive coding efficiency at lower encoder computational complexity, while supporting error resilience. This technique is particularly suitable for visual communication with resource-deficient devices. At the encoder, a compact image representation is produced, which is a polyphase down-sampled version of the input image; but the conventional low-pass filter prior to down-sampling is replaced by a local random binary convolution kernel. The pixels of the resulting down-sampled pre-filtered image are local random measurements, placed in the original spatial configuration. The advantages of local random measurements are twofold: (1) they preserve high-frequency image features that would otherwise be discarded by low-pass filtering; and (2) they remain a conventional image, which can therefore be coded by any standardized codec to remove statistical redundancy at larger scales. Moreover, measurements generated by different kernels can be considered as multiple descriptions of the original image, so the proposed scheme has the added advantage of multiple description coding. At the decoder, a unified sparsity-based soft-decoding technique is developed to recover the original image from the received measurements in a compressive sensing framework. Experimental results demonstrate that the proposed scheme is competitive with existing methods, with the unique strength of recovering fine details and sharp edges at low bit-rates.
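
    A minimal sketch of the encoder front end as described, with an illustrative kernel size and down-sampling factor:

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(3)

def cs_downsample(image, factor=2, ksize=3):
    """Local random binary convolution followed by polyphase down-sampling;
    the output pixels are local random measurements of the input."""
    kernel = rng.integers(0, 2, size=(ksize, ksize)).astype(float)
    if kernel.sum() == 0:                    # avoid an all-zero kernel
        kernel[ksize // 2, ksize // 2] = 1.0
    kernel /= kernel.sum()                   # keep output in the pixel range
    measured = convolve(image, kernel, mode="reflect")
    return measured[::factor, ::factor]

image = rng.random((64, 64))
low_res = cs_downsample(image)
print(low_res.shape)  # (32, 32): still a conventional image, codable by any codec
```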

  14. Simulation of wave propagation in three-dimensional random media

    NASA Technical Reports Server (NTRS)

    Coles, William A.; Filice, J. P.; Frehlich, R. G.; Yadlowsky, M.

    1993-01-01

    A quantitative error analysis for simulation of wave propagation in three-dimensional random media, assuming narrow-angle scattering, is presented for the plane wave and spherical wave geometries. This includes the errors resulting from finite grid size, finite simulation dimensions, and the separation of the two-dimensional screens along the propagation direction. Simple error scalings are determined for power-law spectra of the random refractive index of the media. The effects of a finite inner scale are also considered. The spatial spectra of the intensity errors are calculated and compared to the spatial spectra of intensity. The numerical requirements for a simulation of given accuracy are determined for realizations of the field. The numerical requirements for accurate estimation of higher moments of the field are less stringent.

  15. Regionalized PM2.5 Community Multiscale Air Quality model performance evaluation across a continuous spatiotemporal domain.

    PubMed

    Reyes, Jeanette M; Xu, Yadong; Vizuete, William; Serre, Marc L

    2017-01-01

    The regulatory Community Multiscale Air Quality (CMAQ) model is a means of understanding the sources, concentrations and regulatory attainment of air pollutants within a model's domain. Substantial resources are allocated to the evaluation of model performance. The Regionalized Air quality Model Performance (RAMP) method introduced here explores novel ways of visualizing and evaluating CMAQ model performance and errors for daily concentrations of particulate matter ≤2.5 micrometers (PM2.5) across the continental United States. The RAMP method performs a non-homogeneous, non-linear, non-homoscedastic model performance evaluation at each CMAQ grid cell. This work demonstrates that CMAQ model performance, for a well-documented 2001 regulatory episode, is non-homogeneous across space/time. The RAMP correction of systematic errors outperforms other model evaluation methods, as demonstrated by a 22.1% reduction in mean square error compared to a constant domain-wide correction. The RAMP method is able to accurately reproduce simulated performance with a correlation of r = 76.1%. Most of the error coming from CMAQ is random error, with only a minority of error being systematic. Areas of high systematic error are collocated with areas of high random error, implying that both error types originate from similar sources. Therefore, addressing the underlying causes of systematic error will have the added benefit of also addressing the underlying causes of random error.
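
    The systematic/random split used above is, in essence, the standard decomposition of mean squared error into squared bias plus error variance; a minimal sketch with synthetic numbers (not CMAQ output):

```python
import numpy as np

rng = np.random.default_rng(5)
obs = rng.gamma(2.0, 5.0, size=1000)               # synthetic observations
model = 0.8 * obs + 3.0 + rng.normal(0, 4, 1000)   # synthetic model output

err = model - obs
mse = np.mean(err**2)
systematic = np.mean(err) ** 2   # squared bias
random_part = np.var(err)        # variance of the error
print(mse, systematic + random_part)  # identical up to floating point
```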

  16. Estimating the State of Aerodynamic Flows in the Presence of Modeling Errors

    NASA Astrophysics Data System (ADS)

    da Silva, Andre F. C.; Colonius, Tim

    2017-11-01

    The ensemble Kalman filter (EnKF) has proven successful in fields such as meteorology, in which high-dimensional nonlinear systems render classical estimation techniques impractical. When the model used to forecast state evolution misrepresents important aspects of the true dynamics, estimator performance may degrade. In this work, parametrization and state augmentation are used to track misspecified boundary conditions (e.g., free stream perturbations). The resolution error is modeled as a Gaussian-distributed random variable with the mean (bias) and variance to be determined. The dynamics of the flow past a NACA 0009 airfoil at high angles of attack and moderate Reynolds number is represented by a Navier-Stokes equations solver with immersed-boundary capabilities. The pressure distribution on the airfoil or the velocity field in the wake, both randomized by synthetic noise, are sampled as measurement data and incorporated into the estimated state and bias following Kalman's analysis scheme. Insights are conveyed about how to specify the modeling error covariance matrix and about its impact on estimator performance. This work has been supported in part by a Grant from AFOSR (FA9550-14-1-0328) with Dr. Douglas Smith as program manager, and by a Science without Borders scholarship from the Ministry of Education of Brazil (Capes Foundation - BEX 12966/13-4).

  17. Assessing the Relationship of Ancient and Modern Populations

    PubMed Central

    Schraiber, Joshua G.

    2018-01-01

    Genetic material sequenced from ancient samples is revolutionizing our understanding of the recent evolutionary past. However, ancient DNA is often degraded, resulting in low-coverage, error-prone sequencing. Several solutions to this problem exist, ranging from simple approaches, such as selecting a read at random for each site, to more complicated approaches involving genotype likelihoods. In this work, we present a novel method for assessing the relationship of an ancient sample with a modern population, while accounting for sequencing error and postmortem damage, by analyzing raw reads from multiple ancient individuals simultaneously. We show that, when analyzing SNP data, it is better to sequence more ancient samples to low coverage: two samples sequenced to 0.5× coverage provide better resolution than a single sample sequenced to 2× coverage. We also examined the power to detect whether an ancient sample is directly ancestral to a modern population, finding that, with even a few high-coverage individuals, even ancient samples that are only slightly diverged from the modern population can be detected with ease. When we applied our approach to European samples, we found that no ancient samples represent direct ancestors of modern Europeans. We also found that, as shown previously, the most ancient Europeans appear to have had the smallest effective population sizes, indicating a role for agriculture in modern population growth. PMID:29167200

  18. ICP-Forests (International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests): Quality Assurance procedure in plant diversity monitoring.

    PubMed

    Allegrini, Maria-Cristina; Canullo, Roberto; Campetella, Giandiego

    2009-04-01

    Knowledge of accuracy and precision rates is particularly important for long-term studies. Vegetation assessments include many sources of error related to overlooking and misidentification, which are usually influenced by factors such as the subjectivity of cover estimates, observer-biased species lists, and the experience of the botanist. The vegetation assessment protocol adopted in the Italian forest monitoring programme (CONECOFOR) contains a Quality Assurance programme. The paper presents the different phases of QA and identifies the five main critical points of the whole protocol as sources of random or systematic error. Examples of Measurement Quality Objectives (MQOs) expressed as Data Quality Limits (DQLs) are given for vascular plant cover estimates, in order to establish the reproducibility of the data. Quality control activities were used to determine the "distance" between the surveyor teams and the control team. Selected data were acquired during the training and inter-calibration courses. In particular, an index of average cover by species groups was used to evaluate the random error (CV 4%) as the dispersion around the "true values" of the control team. The systematic error in the evaluation of species composition, caused by overlooking or misidentification of species, was calculated from the pseudo-turnover rate; detailed species censuses on smaller sampling units were accepted, as the pseudo-turnover always fell below the established 25% threshold; species density scores recorded at community level (100 m² surface) rarely exceeded that limit.

  19. Blessing of dimensionality: mathematical foundations of the statistical physics of data.

    PubMed

    Gorban, A N; Tyukin, I Y

    2018-04-28

    The concentrations of measure phenomena were discovered as the mathematical background to statistical mechanics at the end of the nineteenth/beginning of the twentieth century and have been explored in mathematics ever since. At the beginning of the twenty-first century, it became clear that the proper utilization of these phenomena in machine learning might transform the curse of dimensionality into the blessing of dimensionality. This paper summarizes recently discovered phenomena of measure concentration which drastically simplify some machine learning problems in high dimension, and allow us to correct legacy artificial intelligence systems. The classical concentration of measure theorems state that i.i.d. random points are concentrated in a thin layer near a surface (a sphere or equators of a sphere, an average or median-level set of energy or another Lipschitz function, etc.). The new stochastic separation theorems describe the thin structure of these thin layers: the random points are not only concentrated in a thin layer but are all linearly separable from the rest of the set, even for exponentially large random sets. The linear functionals for separation of points can be selected in the form of the linear Fisher's discriminant. All artificial intelligence systems make errors. Non-destructive correction requires separation of the situations (samples) with errors from the samples corresponding to correct behaviour by a simple and robust classifier. The stochastic separation theorems provide us with such classifiers and determine a non-iterative (one-shot) procedure for their construction. This article is part of the theme issue 'Hilbert's sixth problem'. © 2018 The Author(s).

  20. Blessing of dimensionality: mathematical foundations of the statistical physics of data

    NASA Astrophysics Data System (ADS)

    Gorban, A. N.; Tyukin, I. Y.

    2018-04-01

    The concentrations of measure phenomena were discovered as the mathematical background to statistical mechanics at the end of the nineteenth/beginning of the twentieth century and have been explored in mathematics ever since. At the beginning of the twenty-first century, it became clear that the proper utilization of these phenomena in machine learning might transform the curse of dimensionality into the blessing of dimensionality. This paper summarizes recently discovered phenomena of measure concentration which drastically simplify some machine learning problems in high dimension, and allow us to correct legacy artificial intelligence systems. The classical concentration of measure theorems state that i.i.d. random points are concentrated in a thin layer near a surface (a sphere or equators of a sphere, an average or median-level set of energy or another Lipschitz function, etc.). The new stochastic separation theorems describe the thin structure of these thin layers: the random points are not only concentrated in a thin layer but are all linearly separable from the rest of the set, even for exponentially large random sets. The linear functionals for separation of points can be selected in the form of the linear Fisher's discriminant. All artificial intelligence systems make errors. Non-destructive correction requires separation of the situations (samples) with errors from the samples corresponding to correct behaviour by a simple and robust classifier. The stochastic separation theorems provide us with such classifiers and determine a non-iterative (one-shot) procedure for their construction. This article is part of the theme issue `Hilbert's sixth problem'.

  1. Accuracy of Robotic Radiosurgical Liver Treatment Throughout the Respiratory Cycle

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Winter, Jeff D.; Wong, Raimond; Swaminath, Anand

    Purpose: To quantify random uncertainties in robotic radiosurgical treatment of liver lesions with real-time respiratory motion management. Methods and Materials: We conducted a retrospective analysis of 27 liver cancer patients treated with robotic radiosurgery over 118 fractions. The robotic radiosurgical system uses orthogonal x-ray images to determine internal target position and correlates this position with an external surrogate to provide robotic corrections of linear accelerator positioning. Verification and update of this internal-external correlation model was achieved using periodic x-ray images collected throughout treatment. To quantify random uncertainties in targeting, we analyzed logged tracking information and isolated x-ray images collected immediately before beam delivery. For translational correlation errors, we quantified the difference between the correlation model-estimated target position and the actual position determined by periodic x-ray imaging. To quantify prediction errors, we computed the mean absolute difference between the predicted coordinates and the actual modeled position calculated 115 milliseconds later. We estimated overall random uncertainty by quadratically summing correlation, prediction, and end-to-end targeting errors. We also investigated relationships between tracking errors and motion amplitude using linear regression. Results: The 95th percentile absolute correlation errors in each direction were 2.1 mm left-right, 1.8 mm anterior-posterior, 3.3 mm cranio-caudal, and 3.9 mm 3-dimensional radial, whereas 95th percentile absolute radial prediction errors were 0.5 mm. Overall 95th percentile random uncertainty was 4 mm in the radial direction. Prediction errors were strongly correlated with modeled target amplitude (r=0.53-0.66, P<.001), whereas only weak correlations existed for correlation errors. Conclusions: Study results demonstrate that model correlation errors are the primary random source of uncertainty in Cyberknife liver treatment and, unlike prediction errors, are not strongly correlated with target motion amplitude. Aggregate 3-dimensional radial position errors presented here suggest the target will be within 4 mm of the target volume for 95% of the beam delivery.

  2. Effects of gustatory stimulants of salivary secretion on salivary pH and flow: a randomized controlled trial.

    PubMed

    da Mata, A D S P; da Silva Marques, D N; Silveira, J M L; Marques, J R O F; de Melo Campos Felino, E T; Guilherme, N F R P M

    2009-04-01

    To compare salivary pH changes and stimulation efficacy of two different gustatory stimulants of salivary secretion (GSSS). Portuguese Dental Faculty Clinic. Double-blind randomized controlled trial. One hundred and twenty volunteers were randomized to two intervention groups. Sample size was calculated using an alpha error of 0.05 and a beta of 0.20. Participants were randomly assigned to receive a new gustatory stimulant of salivary secretion containing a weaker malic acid, fluoride and xylitol, or a traditional citric acid-based one. Saliva collection was obtained by established methods at different times. The salivary pH of the samples was determined with a pH meter and a microelectrode. Salivary pH variations, counts of subjects with pH below 5.5 for over 1 min, and stimulated salivary flow were the main outcome measures. Both GSSS significantly stimulated salivary output, without significant differences between the two groups. The new gustatory stimulant of salivary secretion presented a risk reduction of 80 +/- 10.6% (95% CI) when compared with the traditional one. Gustatory stimulants of salivary secretion with fluoride, xylitol and lower acid content maintain similar salivary stimulation capacity while significantly reducing the predictive potential for dental erosion.

  3. Evaluating Precipitation from Orbital Data Products of TRMM and GPM over the Indian Subcontinent

    NASA Astrophysics Data System (ADS)

    Jayaluxmi, I.; Kumar, D. N.

    2015-12-01

    The rapidly growing record of microwave-based precipitation data made available from various earth observation satellites has created a pressing need to evaluate the associated uncertainty, which arises from different sources such as retrieval error, spatial/temporal sampling error and sensor-dependent error. In microwave remote sensing, most studies in the literature focus on gridded data products; fewer studies exist evaluating the uncertainty inherent in orbital data products. Evaluation of the latter is essential, as they potentially cause large uncertainties during real-time flood forecasting studies, especially at the watershed scale. The present study evaluates the uncertainty of precipitation data derived from the orbital data products of the Tropical Rainfall Measuring Mission (TRMM) satellite, namely the 2A12, 2A25 and 2B31 products. Case study results over the flood-prone basin of the Mahanadi, India, are analyzed for precipitation uncertainty through four facets, viz., (a) uncertainty quantification using the volumetric metrics from the contingency table [Aghakouchak and Mehran 2014]; (b) error characterization using additive and multiplicative error models; (c) error decomposition to identify systematic and random errors; and (d) comparative assessment with the orbital data from the GPM mission. The homoscedastic random errors from the multiplicative error models justify a better representation of precipitation estimates by the 2A12 algorithm. It can be concluded that although the radiometer-derived 2A12 precipitation data are known to suffer from many sources of uncertainty, spatial analysis over the case study region of India shows that they are in excellent agreement with the reference estimates for the data period considered [Indu and Kumar 2015]. References: A. AghaKouchak and A. Mehran (2014), Extended contingency table: Performance metrics for satellite observations and climate model simulations, Water Resources Research, vol. 49, 7144-7149; J. Indu and D. Nagesh Kumar (2015), Evaluation of Precipitation Retrievals from Orbital Data Products of TRMM over a Subtropical basin in India, IEEE Transactions on Geoscience and Remote Sensing, in press, doi: 10.1109/TGRS.2015.2440338.

  4. Patterns of technical error among surgical malpractice claims: an analysis of strategies to prevent injury to surgical patients.

    PubMed

    Regenbogen, Scott E; Greenberg, Caprice C; Studdert, David M; Lipsitz, Stuart R; Zinner, Michael J; Gawande, Atul A

    2007-11-01

    To identify the most prevalent patterns of technical errors in surgery, and to evaluate commonly recommended interventions in light of these patterns. The majority of surgical adverse events involve technical errors, but little is known about the nature and causes of these events. We examined characteristics of technical errors and common contributing factors among closed surgical malpractice claims. Surgeon reviewers analyzed 444 randomly sampled surgical malpractice claims from four liability insurers. Among 258 claims in which injuries due to error were detected, 52% (n = 133) involved technical errors. These technical errors were further analyzed with a structured review instrument designed by qualitative content analysis. Forty-nine percent of the technical errors caused permanent disability; an additional 16% resulted in death. Two-thirds (65%) of the technical errors were linked to manual error, 9% to errors in judgment, and 26% to both manual and judgment error. A minority of technical errors involved advanced procedures requiring special training ("index operations"; 16%), surgeons inexperienced with the task (14%), or poorly supervised residents (9%). The majority involved experienced surgeons (73%) and occurred in routine, rather than index, operations (84%). Patient-related complexities, including emergencies, difficult or unexpected anatomy, and previous surgery, contributed to 61% of technical errors; technology or systems failures contributed to 21%. Most technical errors occur in routine operations with experienced surgeons, under conditions of increased patient complexity or systems failure. Commonly recommended interventions, including restricting high-complexity operations to experienced surgeons, additional training for inexperienced surgeons, and stricter supervision of trainees, are likely to address only a minority of technical errors. Surgical safety research should instead focus on improving decision-making and performance in routine operations for complex patients and circumstances.

  5. Confidence intervals for a difference between lognormal means in cluster randomization trials.

    PubMed

    Poirier, Julia; Zou, G Y; Koval, John

    2017-04-01

    Cluster randomization trials, in which intact social units are randomized to different interventions, have become popular in the last 25 years. Outcomes from these trials are in many cases positively skewed, following approximately lognormal distributions. When inference is focused on the difference between treatment arm arithmetic means, existing confidence interval procedures either make restrictive assumptions or are complex to implement. We approach this problem by assuming that log-transformed outcomes from each treatment arm follow a one-way random effects model. The treatment arm means are functions of multiple parameters for which separate confidence intervals are readily available, suggesting that the method of variance estimates recovery may be applied to obtain closed-form confidence intervals. A simulation study showed that this simple approach performs well at small sample sizes in terms of empirical coverage, relatively balanced tail errors, and interval widths, as compared to existing methods. The methods are illustrated using data arising from a cluster randomization trial investigating a critical pathway for the treatment of community-acquired pneumonia.
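
    A minimal sketch of the method of variance estimates recovery (MOVER) for a single lognormal mean, the building block that the paper combines across treatment arms; the cluster-level random effect is omitted here for brevity:

```python
import numpy as np
from scipy import stats

def lognormal_mean_ci(x, alpha=0.05):
    """MOVER CI for the lognormal mean: theta = mu + sigma^2/2 on the
    log scale, built from separate CIs for mu and sigma^2/2."""
    logx = np.log(x)
    n = len(logx)
    m, s2 = logx.mean(), logx.var(ddof=1)
    tcrit = stats.t.ppf(1 - alpha / 2, n - 1)
    l1, u1 = m - tcrit * np.sqrt(s2 / n), m + tcrit * np.sqrt(s2 / n)
    l2 = (n - 1) * s2 / (2 * stats.chi2.ppf(1 - alpha / 2, n - 1))
    u2 = (n - 1) * s2 / (2 * stats.chi2.ppf(alpha / 2, n - 1))
    half = s2 / 2
    lower = m + half - np.sqrt((m - l1) ** 2 + (half - l2) ** 2)
    upper = m + half + np.sqrt((u1 - m) ** 2 + (u2 - half) ** 2)
    return np.exp(lower), np.exp(upper)

rng = np.random.default_rng(11)
sample = rng.lognormal(mean=1.0, sigma=0.8, size=40)
print(lognormal_mean_ci(sample))
```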

  6. Gossip and Distributed Kalman Filtering: Weak Consensus Under Weak Detectability

    NASA Astrophysics Data System (ADS)

    Kar, Soummya; Moura, José M. F.

    2011-04-01

    The paper presents the gossip interactive Kalman filter (GIKF) for distributed Kalman filtering for networked systems and sensor networks, where inter-sensor communication and observations occur at the same time scale. The communication among sensors is random; each sensor occasionally exchanges its filtering state information with a neighbor depending on the availability of the appropriate network link. We show that under a weak distributed detectability condition: (1) the GIKF error process remains stochastically bounded, irrespective of the instability properties of the random process dynamics; and (2) the network achieves weak consensus, i.e., the conditional estimation error covariance at a (uniformly) randomly selected sensor converges in distribution to a unique invariant measure on the space of positive semi-definite matrices (independent of the initial state). To prove these results, we interpret the filtered states (estimates and error covariances) at each node in the GIKF as stochastic particles with local interactions. We analyze the asymptotic properties of the error process by studying the associated switched (random) Riccati equation as a random dynamical system, the switching being dictated by a non-stationary Markov chain on the network graph.

  7. Designing a national soil erosion monitoring network for England and Wales

    NASA Astrophysics Data System (ADS)

    Lark, Murray; Rawlins, Barry; Anderson, Karen; Evans, Martin; Farrow, Luke; Glendell, Miriam; James, Mike; Rickson, Jane; Quine, Timothy; Quinton, John; Brazier, Richard

    2014-05-01

    Although soil erosion is recognised as a significant threat to sustainable land use and may be a priority for action in any forthcoming EU Soil Framework Directive, those responsible for setting national policy with respect to erosion are constrained by a lack of robust, representative data at large spatial scales. This reflects the process-orientated nature of much soil erosion research. Recognising this limitation, the UK Department for Environment, Food and Rural Affairs (Defra) established a project to pilot a cost-effective framework for monitoring of soil erosion in England and Wales (E&W). The pilot will compare different soil erosion monitoring methods at a site scale and provide statistical information for the final design of the full national monitoring network, which will: (1) provide unbiased estimates of the spatial mean of soil erosion rate across E&W (tonnes ha-1 yr-1) for each of three land-use classes (arable and horticultural; grassland; upland and semi-natural habitats); and (2) quantify the uncertainty of these estimates with confidence intervals. Probability (design-based) sampling provides the most efficient unbiased estimates of spatial means. In this study, a 16-hectare area (a square of 400 x 400 m) positioned at the centre of a 1-km grid cell, selected at random from mapped land use across E&W, provided the sampling support for measurement of erosion rates, with at least 94% of the support area corresponding to the target land-use classes. Very small or zero erosion rates, likely to be encountered at many sites, reduce the sampling efficiency and make it difficult to compare different methods of soil erosion monitoring. Therefore, to increase the proportion of samples with larger erosion rates without biasing our estimates, we increased the inclusion probability density in areas where the erosion rate is likely to be large by using stratified random sampling. First, each sampling domain (land-use class in E&W) was divided into strata; e.g. two sub-domains within which, respectively, small or no erosion rates, and moderate or larger erosion rates are expected. Each stratum was then sampled independently and at random. The sample density need not be equal in the two strata, but is known and is accounted for in the estimation of the mean and its standard error. To divide the domains into strata we used information on slope angle, previous interpretation of the erosion susceptibility of the soil associations that correspond to the soil map of E&W at 1:250 000 (Soil Survey of England and Wales, 1983), and visual interpretation of evidence of erosion from aerial photography. While each domain could be stratified on the basis of the first two criteria, air photo interpretation across the whole country was not feasible. For this reason we used a two-phase random sampling for stratification (TPRS) design (de Gruijter et al., 2006). First, we formed an initial random sample of 1-km grid cells from the target domain. Second, each cell was allocated to a stratum on the basis of the three criteria. A subset of the selected cells from each stratum was then selected for field survey at random, with a specified sampling density for each stratum, so as to increase the proportion of cells where moderate or larger erosion rates were expected.
Once measurements of erosion have been made, an estimate of the spatial mean of the erosion rate over the target domain, its standard error and the associated uncertainty can be calculated by an expression which accounts for the estimated proportions of the two strata within the initial random sample. de Gruijter, J.J., Brus, D.J., Bierkens, M.F.P. & Knotters, M. 2006. Sampling for Natural Resource Monitoring. Springer, Berlin. Soil Survey of England and Wales. 1983. National Soil Map NATMAP Vector 1:250,000. National Soil Research Institute, Cranfield University.
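
    A minimal sketch of the stratified estimator described above, omitting the second-phase correction for estimated stratum proportions; all numbers are hypothetical:

```python
import numpy as np

def stratified_mean(samples, weights):
    """Stratified estimate of a spatial mean and its standard error.
    samples: one 1-D array of erosion rates per stratum.
    weights: proportion of the domain in each stratum (sums to 1)."""
    means = np.array([s.mean() for s in samples])
    variances = np.array([s.var(ddof=1) / len(s) for s in samples])
    w = np.asarray(weights)
    return (w * means).sum(), np.sqrt((w**2 * variances).sum())

low = np.array([0.0, 0.1, 0.0, 0.2, 0.05])        # t/ha/yr, low-erosion stratum
high = np.array([1.2, 3.4, 0.8, 2.0, 4.1, 1.5])   # high-erosion stratum
mean, se = stratified_mean([low, high], [0.8, 0.2])
print(f"mean = {mean:.2f} t/ha/yr, SE = {se:.2f}")
```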

  8. What errors do peer reviewers detect, and does training improve their ability to detect them?

    PubMed

    Schroter, Sara; Black, Nick; Evans, Stephen; Godlee, Fiona; Osorio, Lyda; Smith, Richard

    2008-10-01

    To analyse data from a trial and report the frequencies with which major and minor errors are detected at a general medical journal, the types of errors missed, and the impact of training on error detection. 607 peer reviewers at the BMJ were randomized to two intervention groups receiving different types of training (face-to-face training or a self-taught package) and a control group. Each reviewer was sent the same three test papers over the study period, each of which had nine major and five minor methodological errors inserted. BMJ peer reviewers. The quality of review, assessed using a validated instrument, and the number and type of errors detected before and after training. The number of major errors detected varied over the three papers. The interventions had small effects. At baseline (Paper 1) reviewers found an average of 2.58 of the nine major errors, with no notable difference between the groups. The mean number of errors reported was similar for the second and third papers, 2.71 and 3.0, respectively. Biased randomization was the error detected most frequently in all three papers; over 60% of the reviewers who rejected the papers identified this error. Reviewers who did not reject the papers found fewer errors, and the proportion finding biased randomization was less than 40% for each paper. Editors should not assume that reviewers will detect most major errors, particularly those concerned with the context of the study. Short training packages have only a slight impact on improving error detection.

  9. Statistical modeling of interfractional tissue deformation and its application in radiation therapy planning

    NASA Astrophysics Data System (ADS)

    Vile, Douglas J.

    In radiation therapy, interfraction organ motion introduces a level of geometric uncertainty into the planning process. Plans, which are typically based upon a single instance of anatomy, must be robust against daily anatomical variations. For this problem, a model of the magnitude, direction, and likelihood of deformation is useful. In this thesis, principal component analysis (PCA) is used to statistically model the 3D organ motion for 19 prostate cancer patients, each with 8-13 fractional computed tomography (CT) images. Deformable image registration and the resultant displacement vector fields (DVFs) are used to quantify the interfraction systematic and random motion. By applying the PCA technique to the random DVFs, principal modes of random tissue deformation were determined for each patient, and a method for sampling synthetic random DVFs was developed. The PCA model was then extended to describe the principal modes of systematic and random organ motion for the population of patients. A leave-one-out study tested both the systematic and random motion models' ability to represent PCA training set DVFs. The random and systematic DVF PCA models allowed the reconstruction of these data with absolute mean errors between 0.5-0.9 mm and 1-2 mm, respectively. To the best of the author's knowledge, this study is the first successful effort to build a fully 3D statistical PCA model of systematic tissue deformation in a population of patients. By sampling synthetic systematic and random errors, organ occupancy maps were created for bony and prostate-centroid patient setup processes. By thresholding these maps, a PCA-based planning target volume (PTV) was created and tested against conventional margin recipes (van Herk for bony alignment and a 5 mm fixed [3 mm posterior] margin for centroid alignment) in a virtual clinical trial for low-risk prostate cancer. Deformably accumulated delivered dose served as a surrogate for clinical outcome. For the bony landmark setup subtrial, the PCA PTV significantly (p<0.05) reduced D30, D20, and D5 to the bladder and D50 to the rectum, while increasing rectal D20 and D5. For the centroid-aligned setup, the PCA PTV significantly reduced all bladder DVH metrics and trended toward lower rectal toxicity metrics. All PTVs covered the prostate with the prescription dose.
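
    A minimal sketch of sampling synthetic deformations from a PCA model of displacement vector fields; the array shapes and data are toy stand-ins for the thesis's registered CT data:

```python
import numpy as np

rng = np.random.default_rng(9)

# Toy training set: 12 fractional DVFs, each flattened to length 3*V.
V = 100
dvfs = rng.normal(0.0, 1.0, size=(12, 3 * V))

mean_dvf = dvfs.mean(axis=0)
X = dvfs - mean_dvf
U, s, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt: deformation modes
eigvals = s**2 / (len(dvfs) - 1)                  # variance explained per mode

def sample_synthetic_dvf(n_modes=3):
    """Draw mode coefficients ~ N(0, eigenvalue) and reconstruct a DVF."""
    coeffs = rng.normal(0.0, np.sqrt(eigvals[:n_modes]))
    return mean_dvf + coeffs @ Vt[:n_modes]

print(sample_synthetic_dvf().shape)  # (300,): a new random deformation field
```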

  10. An analytic technique for statistically modeling random atomic clock errors in estimation

    NASA Technical Reports Server (NTRS)

    Fell, P. J.

    1981-01-01

    Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System, and with baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first-order Markov processes is used to approximate the power spectral density obtained from the Allan variance.

  11. Sweat Sodium Concentration: Inter-Unit Variability of a Low Cost, Portable, and Battery Operated Sodium Analyzer.

    PubMed

    Goulet, Eric D B; Baker, Lindsay B

    2017-12-01

    The B-722 Laqua Twin is a low-cost, portable, and battery-operated sodium analyzer which can be used for the assessment of sweat sodium concentration. The Laqua Twin is reliable and provides a degree of accuracy similar to more expensive analyzers; however, its inter-unit measurement error remains unknown. The purpose of this study was to compare the sodium concentration values of 70 sweat samples measured using three different Laqua Twin units. Mean absolute errors, random errors and constant errors among the different Laqua Twins ranged, respectively, from 1.7 mmol/L to 3.5 mmol/L, from 2.5 mmol/L to 3.7 mmol/L, and from -0.6 mmol/L to 3.9 mmol/L. Proportional errors among Laqua Twins were all < 2%. Based on a within-subject biological variability in sweat sodium concentration of ± 12%, the maximal allowable imprecision among instruments was considered to be ≤ 6%. In that respect, the within- (2.9%), between- (4.5%), and total (5.4%) measurement error coefficients of variation were all < 6%. For a given sweat sodium concentration value, the largest observed differences in mean, lower-bound and upper-bound error of measurement among instruments were, respectively, 4.7 mmol/L, 2.3 mmol/L, and 7.0 mmol/L. In conclusion, our findings show that the inter-unit measurement error of the B-722 Laqua Twin is low and methodologically acceptable.

  12. Refractive Error in a Sample of Black High School Children in South Africa.

    PubMed

    Wajuihian, Samuel Otabor; Hansraj, Rekha

    2017-12-01

    This study focused on a cohort that has not previously been studied and that currently has limited access to eye care services. The findings, while improving the understanding of the distribution of refractive errors, also enabled identification of children requiring intervention and provided a guide for future resource allocation. The aim of the study was to determine the prevalence and distribution of refractive error and its association with gender, age, and school grade level. Using multistage random cluster sampling, 1586 children, 632 males (40%) and 954 females (60%), were selected. Their ages ranged between 13 and 18 years with a mean of 15.81 ± 1.56 years. The visual functions evaluated included visual acuity using the logarithm of minimum angle of resolution chart and refractive error measured using the autorefractor and then refined subjectively. Astigmatism axis was expressed in the vector method, where positive values of J0 indicated with-the-rule astigmatism, negative values indicated against-the-rule astigmatism, and J45 represented oblique astigmatism. Overall, participants were slightly myopic, with a mean spherical power for the right eye of -0.02 ± 0.47; the mean astigmatic cylinder power was -0.09 ± 0.27, with mainly with-the-rule astigmatism (J0 = 0.01 ± 0.11). The prevalence estimates were as follows: myopia (at least -0.50) 7% (95% confidence interval [CI], 6 to 9%), hyperopia (at least 0.5) 5% (95% CI, 4 to 6%), astigmatism (at least -0.75 cylinder) 3% (95% CI, 2 to 4%), and anisometropia 3% (95% CI, 2 to 4%). There was no significant association between refractive error and any of the categories (gender, age, and grade levels). The prevalence of refractive error in this sample of high school children was relatively low. Myopia was the most prevalent refractive error, and findings on its association with age suggest that the prevalence of myopia may be stabilizing in the late teenage years.
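
    The vector method referred to above is the standard power-vector notation: for sphere S, cylinder C and axis α (a textbook formulation, stated here for reference),

```latex
M = S + \frac{C}{2}, \qquad
J_0 = -\frac{C}{2}\cos 2\alpha, \qquad
J_{45} = -\frac{C}{2}\sin 2\alpha
```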

  13. Sampled-data H∞ filtering for Markovian jump singularly perturbed systems with time-varying delay and missing measurements

    NASA Astrophysics Data System (ADS)

    Yan, Yifang; Yang, Chunyu; Ma, Xiaoping; Zhou, Linna

    2018-02-01

    In this paper, sampled-data H∞ filtering problem is considered for Markovian jump singularly perturbed systems with time-varying delay and missing measurements. The sampled-data system is represented by a time-delay system, and the missing measurement phenomenon is described by an independent Bernoulli random process. By constructing an ɛ-dependent stochastic Lyapunov-Krasovskii functional, delay-dependent sufficient conditions are derived such that the filter error system satisfies the prescribed H∞ performance for all possible missing measurements. Then, an H∞ filter design method is proposed in terms of linear matrix inequalities. Finally, numerical examples are given to illustrate the feasibility and advantages of the obtained results.

  14. The inference of vector magnetic fields from polarization measurements with limited spectral resolution

    NASA Technical Reports Server (NTRS)

    Lites, B. W.; Skumanich, A.

    1985-01-01

    A method is presented for recovering the vector magnetic field and thermodynamic parameters from polarization measurements of photospheric line profiles obtained with filtergraphs. The method includes magneto-optic effects and may be applied to data sampled at arbitrary wavelengths within the line profile. The accuracy of the method is explored through inversion of synthetic Stokes profiles subjected to varying levels of random noise, instrumental wavelength resolution, and line profile sampling. The level of error introduced by the systematic effect of profile sampling over a finite fraction of the 5-minute oscillation cycle is also investigated. The results presented here are intended to guide instrument design and observational procedure.

  15. A study of digital holographic filters generation. Phase 2: Digital data communication system, volume 1

    NASA Technical Reports Server (NTRS)

    Ingels, F. M.; Mo, C. D.

    1978-01-01

    An empirical study of the performance of Viterbi decoders in bursty channels was carried out, and an improved algebraic decoder for nonsystematic codes was developed. The hybrid algorithm was simulated for the (2,1), k = 7 code on a computer using 20 channels having various error statistics, ranging from purely random errors to purely bursty channels. The hybrid system outperformed both the algebraic and the Viterbi decoders in every case except the 1% random-error channel, where the Viterbi decoder produced one fewer bit of decoding error.

  16. Error threshold for color codes and random three-body Ising models.

    PubMed

    Katzgraber, Helmut G; Bombin, H; Martin-Delgado, M A

    2009-08-28

    We study the error threshold of color codes, a class of topological quantum codes that allow a direct implementation of quantum Clifford gates suitable for entanglement distillation, teleportation, and fault-tolerant quantum computation. We map the error-correction process onto a statistical mechanical random three-body Ising model and study its phase diagram via Monte Carlo simulations. The obtained error threshold of p_c = 0.109(2) is very close to that of Kitaev's toric code, showing that enhanced computational capabilities do not necessarily imply lower resistance to noise.

  17. Effect of random errors in planar PIV data on pressure estimation in vortex dominated flows

    NASA Astrophysics Data System (ADS)

    McClure, Jeffrey; Yarusevych, Serhiy

    2015-11-01

    The sensitivity of pressure estimation techniques from Particle Image Velocimetry (PIV) measurements to random errors in measured velocity data is investigated using the flow over a circular cylinder as a test case. Direct numerical simulations are performed for ReD = 100, 300 and 1575, spanning laminar, transitional, and turbulent wake regimes, respectively. A range of random errors typical for PIV measurements is applied to synthetic PIV data extracted from numerical results. A parametric study is then performed using a number of common pressure estimation techniques. Optimal temporal and spatial resolutions are derived based on the sensitivity of the estimated pressure fields to the simulated random error in velocity measurements, and the results are compared to an optimization model derived from error propagation theory. It is shown that the reduction in spatial and temporal scales at higher Reynolds numbers leads to notable changes in the optimal pressure evaluation parameters. The effect of smaller scale wake structures is also quantified. The errors in the estimated pressure fields are shown to depend significantly on the pressure estimation technique employed. The results are used to provide recommendations for the use of pressure and force estimation techniques from experimental PIV measurements in vortex dominated laminar and turbulent wake flows.

  18. Statistical Analyses of Scatterplots to Identify Important Factors in Large-Scale Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kleijnen, J.P.C.; Helton, J.C.

    1999-04-01

    The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (1) linear relationships with correlation coefficients, (2) monotonic relationships with rank correlation coefficients, (3) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (4) trends in variability as defined by variances and interquartile ranges, and (5) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from the analysis include: (1) Type I errors are unavoidable, (2) Type II errors can occur when inappropriate analysis procedures are used, (3) physical explanations should always be sought for why statistical procedures identify variables as being important, and (4) the identification of important variables tends to be stable for independent Latin hypercube samples.

  19. Detecting and preventing error propagation via competitive learning.

    PubMed

    Silva, Thiago Christiano; Zhao, Liang

    2013-05-01

    Semisupervised learning is a machine learning approach which is able to employ both labeled and unlabeled samples in the training process. It is an important mechanism for autonomous systems because it exploits already acquired information while simultaneously exploring new knowledge in the learning space. In such settings, the reliability of the labels is a crucial factor, because mislabeled samples may propagate wrong labels to a portion of, or even the entire, data set. This paper addresses the error propagation problem caused by mislabeled samples by presenting a mechanism embedded in a network-based (graph-based) semisupervised learning method. The procedure is based on a combined random-preferential walk of particles in a network constructed from the input data set. Particles of the same class cooperate with each other, while particles of different classes compete to propagate class labels to the whole network. Computer simulations conducted on synthetic and real-world data sets reveal the effectiveness of the model. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Random Measurement Error as a Source of Discrepancies between the Reports of Wives and Husbands Concerning Marital Power and Task Allocation.

    ERIC Educational Resources Information Center

    Quarm, Daisy

    1981-01-01

    Findings for couples (N=119) show that the low between-spouse correlations for wife's work, money, and spare time are due in part to random measurement error. Suggests that increasing the reliability of measures by creating multi-item indices can also increase correlations. Car purchase, vacation, and child discipline were not accounted for by random measurement…

  1. A Robust Bayesian Random Effects Model for Nonlinear Calibration Problems

    PubMed Central

    Fong, Y.; Wakefield, J.; De Rosa, S.; Frahm, N.

    2013-01-01

    In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this paper, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a re-parameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods. PMID:22551415
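
    As an illustration of the calibration setting described above, the sketch below fits a five-parameter logistic (5PL) curve to hypothetical standards and numerically inverts it to predict an unknown concentration. It is a minimal least-squares sketch, not the authors' robust Bayesian random effects model, and all parameter values are made up.

        import numpy as np
        from scipy.optimize import brentq, curve_fit

        def fpl(x, a, b, c, d, g):
            # Five-parameter logistic: a is the zero-dose asymptote, d the
            # infinite-dose asymptote, c the mid-point, b the slope, g the asymmetry.
            return d + (a - d) / (1.0 + (x / c) ** b) ** g

        # Hypothetical standards: known concentrations with noisy readouts.
        conc = np.array([1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0])
        rng = np.random.default_rng(0)
        signal = fpl(conc, 30000, 1.2, 50, 100, 1.0) * rng.lognormal(0, 0.05, conc.size)

        popt, _ = curve_fit(fpl, conc, signal, p0=[30000, 1.0, 50, 100, 1.0], maxfev=10000)

        # Calibration step: numerically invert the fitted curve for a new sample.
        y_obs = 15000.0
        x_hat = brentq(lambda x: fpl(x, *popt) - y_obs, 1e-3, 1e4)
        print("estimated concentration: %.1f" % x_hat)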

  2. Modeling Errors in Daily Precipitation Measurements: Additive or Multiplicative?

    NASA Technical Reports Server (NTRS)

    Tian, Yudong; Huffman, George J.; Adler, Robert F.; Tang, Ling; Sapiano, Matthew; Maggioni, Viviana; Wu, Huan

    2013-01-01

    The definition and quantification of uncertainty depend on the error model used. For uncertainties in precipitation measurements, two types of error models have been widely adopted: the additive error model and the multiplicative error model. This leads to incompatible specifications of uncertainties and impedes intercomparison and application. In this letter, we assess the suitability of both models for satellite-based daily precipitation measurements in an effort to clarify the uncertainty representation. Three criteria were employed to evaluate the applicability of either model: (1) better separation of the systematic and random errors; (2) applicability to the large range of variability in daily precipitation; and (3) better predictive skills. It is found that the multiplicative error model is a much better choice under all three criteria. It extracted the systematic errors more cleanly, was more consistent with the large variability of precipitation measurements, and produced superior predictions of the error characteristics. The additive error model had several weaknesses, such as nonconstant variance resulting from systematic errors leaking into random errors, and the lack of prediction capability. Therefore, the multiplicative error model is a better choice.
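
    The distinction between the two error models is easy to state in code. The following sketch, with synthetic data standing in for the satellite and reference measurements, shows why multiplicative errors become additive and roughly homoscedastic on the log scale; the distributions and parameter values are assumptions for illustration only.

        import numpy as np

        rng = np.random.default_rng(1)
        truth = rng.gamma(shape=0.5, scale=10.0, size=5000)  # skewed "daily precipitation"

        # Additive model:       observed = truth + eps,      eps ~ N(0, sigma^2)
        # Multiplicative model: observed = truth * exp(eps), eps ~ N(0, sigma^2)
        obs_add = truth + rng.normal(0.0, 2.0, truth.size)
        obs_mul = truth * np.exp(rng.normal(0.0, 0.3, truth.size))

        # On the log scale the multiplicative residuals are additive Gaussian,
        # so systematic and random components separate cleanly.
        resid_add = obs_add - truth
        resid_mul = np.log(obs_mul) - np.log(truth)
        print(resid_add.std(), resid_mul.std())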

  3. One-step random mutagenesis by error-prone rolling circle amplification

    PubMed Central

    Fujii, Ryota; Kitaoka, Motomitsu; Hayashi, Kiyoshi

    2004-01-01

    In vitro random mutagenesis is a powerful tool for altering properties of enzymes. We describe here a novel random mutagenesis method using rolling circle amplification, named error-prone RCA. This method consists of only one DNA amplification step followed by transformation of the host strain, without treatment with any restriction enzymes or DNA ligases, and results in a randomly mutated plasmid library with 3–4 mutations per kilobase. Specific primers or special equipment, such as a thermal-cycler, are not required. This method permits rapid preparation of randomly mutated plasmid libraries, enabling random mutagenesis to become a more commonly used technique. PMID:15507684

  4. [Failure mode and effects analysis on computerized drug prescriptions].

    PubMed

    Paredes-Atenciano, J A; Roldán-Aviña, J P; González-García, Mercedes; Blanco-Sánchez, M C; Pinto-Melero, M A; Pérez-Ramírez, C; Calvo Rubio-Burgos, Miguel; Osuna-Navarro, F J; Jurado-Carmona, A M

    2015-01-01

    To identify and analyze errors in drug prescriptions of patients treated in a "high resolution" hospital by applying a failure mode and effects analysis (FMEA). A multidisciplinary group of medical specialties and nursing staff analyzed medical records in which drug prescriptions were held in free-text format. An FMEA was developed in which the risk priority index (RPI) was obtained from a cross-sectional observational study using an audit of the medical records, carried out in two phases: (1) pre-intervention testing, and (2) evaluation of improvement actions after the first analysis. An audit sample size of 679 medical records from a total of 2,096 patients was calculated using stratified sampling and random selection of clinical events. Prescription errors decreased by 22.2% in the second phase. The FMEA showed a greater RPI for "unspecified route of administration" and "dosage unspecified", with no significant decreases observed in the second phase, although it did detect "incorrect dosing time", "contraindication due to drug allergy", "wrong patient" and "duplicate prescription", which resulted in the improvement of prescriptions. Drug prescription errors have been identified and analyzed by the FMEA methodology, improving the clinical safety of these prescriptions. This tool allows updates of electronic prescribing to be monitored. Avoiding such errors would require the mandatory completion of all sections of a prescription. Copyright © 2014 SECA. Published by Elsevier Espana. All rights reserved.

  5. Metabolite and transcript markers for the prediction of potato drought tolerance.

    PubMed

    Sprenger, Heike; Erban, Alexander; Seddig, Sylvia; Rudack, Katharina; Thalhammer, Anja; Le, Mai Q; Walther, Dirk; Zuther, Ellen; Köhl, Karin I; Kopka, Joachim; Hincha, Dirk K

    2018-04-01

    Potato (Solanum tuberosum L.) is one of the most important food crops worldwide. Current potato varieties are highly susceptible to drought stress. In view of global climate change, selection of cultivars with improved drought tolerance and high yield potential is of paramount importance. Drought tolerance breeding of potato is currently based on direct selection according to yield and phenotypic traits and requires multiple trials under drought conditions. Marker-assisted selection (MAS) is cheaper, faster and reduces classification errors caused by noncontrolled environmental effects. We analysed 31 potato cultivars grown under optimal and reduced water supply in six independent field trials. Drought tolerance was determined as tuber starch yield. Leaf samples from young plants were screened for preselected transcript and nontargeted metabolite abundance using qRT-PCR and GC-MS profiling, respectively. Transcript marker candidates were selected from a published RNA-Seq data set. A Random Forest machine learning approach extracted metabolite and transcript markers for drought tolerance prediction with low error rates of 6% and 9%, respectively. Moreover, by combining transcript and metabolite markers, the prediction error was reduced to 4.3%. Feature selection from Random Forest models allowed model minimization, yielding a minimal combination of only 20 metabolite and transcript markers that were successfully tested for their reproducibility in 16 independent agronomic field trials. We demonstrate that a minimum combination of transcript and metabolite markers sampled at early cultivation stages predicts potato yield stability under drought largely independent of seasonal and regional agronomic conditions. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
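
    A minimal sketch of the prediction workflow described above: train a Random Forest on a marker matrix, read the out-of-bag (OOB) error, and shrink the model to the most important markers. The data here are randomly generated stand-ins, not the study's metabolite and transcript profiles, and the marker counts are illustrative.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(2)
        # Hypothetical data: 31 cultivars x 200 marker features, labeled
        # drought-tolerant (1) or drought-sensitive (0).
        X = rng.normal(size=(31, 200))
        y = (X[:, :5].sum(axis=1) + rng.normal(0, 0.5, 31) > 0).astype(int)

        rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
        rf.fit(X, y)
        print("OOB error, all markers:", 1 - rf.oob_score_)

        # Model minimization: keep only the 20 most important markers and refit,
        # mimicking the reduction to a minimal marker combination.
        top20 = np.argsort(rf.feature_importances_)[::-1][:20]
        rf_min = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
        rf_min.fit(X[:, top20], y)
        print("OOB error, 20 markers:", 1 - rf_min.oob_score_)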

  6. The effects of transcutaneous electrical nerve stimulation on joint position sense in patients with knee joint osteoarthritis.

    PubMed

    Shirazi, Zahra Rojhani; Shafaee, Razieh; Abbasi, Leila

    2014-10-01

    To study the effects of transcutaneous electrical nerve stimulation (TENS) on joint position sense (JPS) in knee osteoarthritis (OA) subjects. Thirty subjects with knee OA (40-60 years old), recruited by non-random sampling, participated in this study. The absolute error of knee joint repositioning was evaluated with the Qualysis Track Manager system, and sensory electrical stimulation was applied through the TENS device. The mean errors in repositioning of the joint, at two knee joint positions (20 and 60 degrees), were significantly decreased after applying TENS (p < 0.05). Application of TENS in subjects with knee OA could improve JPS in these subjects.

  7. Claims, errors, and compensation payments in medical malpractice litigation.

    PubMed

    Studdert, David M; Mello, Michelle M; Gawande, Atul A; Gandhi, Tejal K; Kachalia, Allen; Yoon, Catherine; Puopolo, Ann Louise; Brennan, Troyen A

    2006-05-11

    In the current debate over tort reform, critics of the medical malpractice system charge that frivolous litigation--claims that lack evidence of injury, substandard care, or both--is common and costly. Trained physicians reviewed a random sample of 1452 closed malpractice claims from five liability insurers to determine whether a medical injury had occurred and, if so, whether it was due to medical error. We analyzed the prevalence, characteristics, litigation outcomes, and costs of claims that lacked evidence of error. For 3 percent of the claims, there were no verifiable medical injuries, and 37 percent did not involve errors. Most of the claims that were not associated with errors (370 of 515 [72 percent]) or injuries (31 of 37 [84 percent]) did not result in compensation; most that involved injuries due to error did (653 of 889 [73 percent]). Payment of claims not involving errors occurred less frequently than did the converse form of inaccuracy--nonpayment of claims associated with errors. When claims not involving errors were compensated, payments were significantly lower on average than were payments for claims involving errors ($313,205 vs. $521,560, P=0.004). Overall, claims not involving errors accounted for 13 to 16 percent of the system's total monetary costs. For every dollar spent on compensation, 54 cents went to administrative expenses (including those involving lawyers, experts, and courts). Claims involving errors accounted for 78 percent of total administrative costs. Claims that lack evidence of error are not uncommon, but most are denied compensation. The vast majority of expenditures go toward litigation over errors and payment of them. The overhead costs of malpractice litigation are exorbitant. Copyright 2006 Massachusetts Medical Society.

  8. Incidence of speech recognition errors in the emergency department.

    PubMed

    Goss, Foster R; Zhou, Li; Weiner, Scott G

    2016-09-01

    Physician use of computerized speech recognition (SR) technology has risen in recent years due to its ease of use and efficiency at the point of care. However, error rates between 10 and 23% have been observed, raising concern about the number of errors being entered into the permanent medical record, their impact on quality of care and medical liability that may arise. Our aim was to determine the incidence and types of SR errors introduced by this technology in the emergency department (ED). Level 1 emergency department with 42,000 visits/year in a tertiary academic teaching hospital. A random sample of 100 notes dictated by attending emergency physicians (EPs) using SR software was collected from the ED electronic health record between January and June 2012. Two board-certified EPs annotated the notes and conducted error analysis independently. An existing classification schema was adopted to classify errors into eight errors types. Critical errors deemed to potentially impact patient care were identified. There were 128 errors in total or 1.3 errors per note, and 14.8% (n=19) errors were judged to be critical. 71% of notes contained errors, and 15% contained one or more critical errors. Annunciation errors were the highest at 53.9% (n=69), followed by deletions at 18.0% (n=23) and added words at 11.7% (n=15). Nonsense errors, homonyms and spelling errors were present in 10.9% (n=14), 4.7% (n=6), and 0.8% (n=1) of notes, respectively. There were no suffix or dictionary errors. Inter-annotator agreement was 97.8%. This is the first estimate at classifying speech recognition errors in dictated emergency department notes. Speech recognition errors occur commonly with annunciation errors being the most frequent. Error rates were comparable if not lower than previous studies. 15% of errors were deemed critical, potentially leading to miscommunication that could affect patient care. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  9. Stochastic goal-oriented error estimation with memory

    NASA Astrophysics Data System (ADS)

    Ackmann, Jan; Marotzke, Jochem; Korn, Peter

    2017-11-01

    We propose a stochastic dual-weighted error estimator for the viscous shallow-water equation with boundaries. For this purpose, previous work on memory-less stochastic dual-weighted error estimation is extended by incorporating memory effects. The memory is introduced by describing the local truncation error as a sum of time-correlated random variables. The random variables themselves represent the temporal fluctuations in local truncation errors and are estimated from high-resolution information at near-initial times. The resulting error estimator is evaluated experimentally in two classical ocean-type experiments, the Munk gyre and the flow around an island. In these experiments, the stochastic process is adapted locally to the respective dynamical flow regime. Our stochastic dual-weighted error estimator is shown to provide meaningful error bounds for a range of physically relevant goals. We prove, as well as show numerically, that our approach can be interpreted as a linearized stochastic-physics ensemble.

  10. Kernel Wiener filter and its application to pattern recognition.

    PubMed

    Yoshino, Hirokazu; Dong, Chen; Washizawa, Yoshikazu; Yamashita, Yukihiko

    2010-11-01

    The Wiener filter (WF) is widely used for inverse problems. From an observed signal, it provides the best estimated signal, among linear operators, with respect to the squared error averaged over the original and the observed signals. The kernel WF (KWF), extended directly from the WF, has the problem that additive noise must be handled through samples. Since the computational complexity of kernel methods depends on the number of samples, this case incurs a huge computational cost. By using a first-order approximation of the kernel functions, we realize a KWF that can handle such noise not through samples but as a random variable. We also propose an error estimation method for kernel filters using these approximations. To show the advantages of the proposed methods, we conducted experiments to denoise images and estimate errors. We also apply the KWF to classification, since the KWF can provide an approximated result of the maximum a posteriori classifier, which provides the best recognition accuracy. The noise term in the criterion can be used for classification in the presence of noise or as a new regularization that suppresses changes in the input space, whereas the ordinary regularization for kernel methods suppresses changes in the feature space. We conducted experiments on binary and multiclass classification and on classification in the presence of noise.
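
    For readers unfamiliar with the baseline method, the classical (non-kernel) Wiener filter is sketched below in its frequency-domain form H = S_x / (S_x + S_n). The signal, noise level, and the crude spectral-subtraction estimate are assumptions chosen for illustration; this is the linear baseline, not the kernel extension proposed in the paper.

        import numpy as np

        rng = np.random.default_rng(3)
        n = 1024
        t = np.arange(n)
        clean = np.sin(2 * np.pi * t / 64) + 0.5 * np.sin(2 * np.pi * t / 17)
        noisy = clean + rng.normal(0, 0.5, n)

        # Frequency-domain Wiener filter H = S_x / (S_x + S_n), with a flat
        # noise spectrum (E|N_k|^2 = n * sigma^2) and the signal spectrum
        # estimated by spectral subtraction from the noisy data.
        noise_power = 0.5 ** 2 * n
        S_y = np.abs(np.fft.fft(noisy)) ** 2
        S_x = np.maximum(S_y - noise_power, 0.0)
        H = S_x / (S_x + noise_power)
        denoised = np.real(np.fft.ifft(H * np.fft.fft(noisy)))

        print("MSE noisy:   ", np.mean((noisy - clean) ** 2))
        print("MSE denoised:", np.mean((denoised - clean) ** 2))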

  11. Investigation of spectral analysis techniques for randomly sampled velocimetry data

    NASA Technical Reports Server (NTRS)

    Sree, Dave

    1993-01-01

    It is well known that laser velocimetry (LV) generates individual-realization velocity data that are randomly or unevenly sampled in time. Spectral analysis of such data to obtain the turbulence spectra, and hence turbulence scale information, requires special techniques. The 'slotting' technique of Mayo et al., also described by Roberts and Ajmani, and the 'direct transform' method of Gaster and Roberts are well known in the LV community. The slotting technique is computationally faster than the direct transform method. There are practical limitations, however, on how high-frequency and accurate an estimate can be made for a given mean sampling rate. These high-frequency estimates are important for obtaining microscale information on turbulence structure. Previous studies found that reliable spectral estimates can be made up to about the mean sampling frequency (mean data rate) or less. If the data were evenly sampled, the frequency range would be half the sampling frequency (i.e., up to the Nyquist frequency); otherwise, aliasing would occur. The mean data rate and the sample size (total number of points) basically limit the frequency range. Also, there are large variabilities or errors associated with high-frequency estimates from randomly sampled signals. Roberts and Ajmani proposed certain pre-filtering techniques to reduce these variabilities, but at the cost of the low-frequency estimates; the prefiltering acts as a high-pass filter. Further, Shapiro and Silverman showed theoretically that, for Poisson-sampled signals, it is possible to obtain alias-free spectral estimates far beyond the mean sampling frequency. But the question is, how far? During his tenure under the 1993 NASA-ASEE Summer Faculty Fellowship Program, the author found from his studies of spectral analysis techniques for randomly sampled signals that the spectral estimates can be enhanced up to about 4-5 times the mean sampling frequency by using a suitable prefiltering technique, although this increased bandwidth comes at the cost of the lower frequency estimates. The studies further showed that large data sets of the order of 100,000 points or more, high data rates, and Poisson sampling are crucial for obtaining reliable spectral estimates from randomly sampled data, such as LV data. Some of the results of this study are presented.
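
    The slotting idea itself is compact: every product u(t_i)u(t_j) is accumulated into a bin ("slot") according to the lag t_j - t_i, and the slot averages estimate the autocorrelation function, from which the spectrum follows by a cosine transform. The sketch below applies it to a hypothetical Poisson-sampled signal; the data rate, slot width, and signal are illustrative assumptions, not values from the study.

        import numpy as np

        rng = np.random.default_rng(4)
        # Poisson sampling: exponential inter-arrival times, mean data rate 1 kHz.
        t = np.cumsum(rng.exponential(1e-3, 20000))
        u = np.sin(2 * np.pi * 50 * t) + 0.2 * rng.normal(size=t.size)
        u -= u.mean()

        dtau, nslots = 5e-4, 40          # slot width and number of slots
        acf = np.zeros(nslots)
        counts = np.zeros(nslots)
        for i in range(t.size):
            # consider only pairs within the maximum lag window
            j = np.searchsorted(t, t[i] + nslots * dtau)
            k = ((t[i + 1:j] - t[i]) / dtau).astype(int)
            np.add.at(acf, k, u[i] * u[i + 1:j])
            np.add.at(counts, k, 1)
        acf = np.where(counts > 0, acf / counts, 0.0)
        print(acf[:5] / acf[0])          # normalized autocorrelation estimates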

  12. Scalable randomized benchmarking of non-Clifford gates

    NASA Astrophysics Data System (ADS)

    Cross, Andrew; Magesan, Easwar; Bishop, Lev; Smolin, John; Gambetta, Jay

    Randomized benchmarking is a widely used experimental technique to characterize the average error of quantum operations. Benchmarking procedures that scale to enable characterization of n-qubit circuits rely on efficient procedures for manipulating those circuits and, as such, have been limited to subgroups of the Clifford group. However, universal quantum computers require additional, non-Clifford gates to approximate arbitrary unitary transformations. We define a scalable randomized benchmarking procedure over n-qubit unitary matrices that correspond to protected non-Clifford gates for a class of stabilizer codes. We present efficient methods for representing and composing group elements, sampling them uniformly, and synthesizing corresponding poly(n)-sized circuits. The procedure provides experimental access to two independent parameters that together characterize the average gate fidelity of a group element. We acknowledge support from ARO under Contract W911NF-14-1-0124.

  13. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.

    PubMed

    Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel

    2016-10-01

    Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level, and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For a smaller number of correlated features, with the number of features not exceeding approximately half the sample size, LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outperforms that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.
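
    The simulation logic of such comparisons can be condensed to a few lines: draw repeated training and test sets from a known population, fit each learner, and average the test errors over replications. The sketch below does this for the four methods named above on a toy two-class Gaussian problem; the sample sizes, effect size, and feature counts are arbitrary illustrative choices.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.svm import SVC

        rng = np.random.default_rng(5)

        def draw(n, p=20, effect=0.8):
            # Two Gaussian classes whose means differ in the first 5 features.
            mu = np.zeros(p)
            mu[:5] = effect
            y = rng.integers(0, 2, n)
            X = rng.normal(size=(n, p)) + np.outer(y, mu)
            return X, y

        models = {"LDA": LinearDiscriminantAnalysis(),
                  "SVM": SVC(kernel="rbf"),
                  "RF":  RandomForestClassifier(n_estimators=200, random_state=0),
                  "kNN": KNeighborsClassifier(n_neighbors=5)}

        errs = {name: [] for name in models}
        for rep in range(50):            # replications, as in a simulation study
            Xtr, ytr = draw(40)
            Xte, yte = draw(2000)
            for name, m in models.items():
                errs[name].append(1 - m.fit(Xtr, ytr).score(Xte, yte))
        for name in models:
            print(name, round(float(np.mean(errs[name])), 3))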

  14. Hospital staff registered nurses' perception of horizontal violence, peer relationships, and the quality and safety of patient care.

    PubMed

    Purpora, Christina; Blegen, Mary A; Stotts, Nancy A

    2015-01-01

    To test hypotheses from a model of horizontal violence and the quality and safety of patient care: horizontal violence (negative behavior among peers) is inversely related to peer relations and quality of care, and positively related to errors and adverse events. Additionally, the associations between horizontal violence, peer relations, quality of care, errors and adverse events, and nurse and work characteristics were determined. A random sample (n = 175) of hospital staff Registered Nurses working in California participated via survey. Bivariate and multivariate analyses tested the study hypotheses. The hypotheses were supported: horizontal violence was inversely related to peer relations and quality of care, and positively related to errors and adverse events. Including peer relations in the analyses altered the relationship between horizontal violence and quality of care but not between horizontal violence, errors and adverse events. Nurse and hospital characteristics were not related to other variables. Clinical area contributed significantly to predicting the quality of care, errors and adverse events, but not peer relationships. Horizontal violence affects peer relationships and the quality and safety of patient care as perceived by participating nurses. Supportive peer relationships are important to mitigate the impact of horizontal violence on quality of care.

  15. An Extended Objective Evaluation of the 29-km Eta Model for Weather Support to the United States Space Program

    NASA Technical Reports Server (NTRS)

    Nutter, Paul; Manobianco, John

    1998-01-01

    This report describes the Applied Meteorology Unit's objective verification of the National Centers for Environmental Prediction 29-km eta model during separate warm and cool season periods from May 1996 through January 1998. The verification of surface and upper-air point forecasts was performed at three selected stations important for 45th Weather Squadron, Spaceflight Meteorology Group, and National Weather Service, Melbourne operational weather concerns. The statistical evaluation identified model biases that may result from inadequate parameterization of physical processes. Since model biases are relatively small compared to the random error component, most of the total model error results from day-to-day variability in the forecasts and/or observations. To some extent, these nonsystematic errors reflect the variability in point observations that sample spatial and temporal scales of atmospheric phenomena that cannot be resolved by the model. On average, Meso-Eta point forecasts provide useful guidance for predicting the evolution of the larger scale environment. A more substantial challenge facing model users in real time is the discrimination of nonsystematic errors that tend to inflate the total forecast error. It is important that model users maintain awareness of ongoing model changes. Such changes are likely to modify the basic error characteristics, particularly near the surface.

  16. QUANTIFYING UNCERTAINTY DUE TO RANDOM ERRORS FOR MOMENT ANALYSES OF BREAKTHROUGH CURVES

    EPA Science Inventory

    The uncertainty in moments calculated from breakthrough curves (BTCs) is investigated as a function of random measurement errors in the data used to define the BTCs. The method presented assumes moments are calculated by numerical integration using the trapezoidal rule, and is t...
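
    Although the abstract is truncated, the underlying calculation is standard: temporal moments of a breakthrough curve are computed by trapezoidal integration, and the effect of random measurement error can be propagated by Monte Carlo perturbation of the concentration data. The sketch below illustrates this on a hypothetical Gaussian-shaped curve; all numbers are assumptions, not the study's data.

        import numpy as np

        def trapz(y, x):
            # trapezoidal rule, written out explicitly
            return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

        rng = np.random.default_rng(6)
        t = np.linspace(0.0, 100.0, 201)
        btc = np.exp(-0.5 * ((t - 40.0) / 8.0) ** 2)   # hypothetical breakthrough curve

        def moments(c):
            m0 = trapz(c, t)              # zeroth moment (mass)
            m1 = trapz(t * c, t) / m0     # first normalized moment (mean arrival time)
            return m0, m1

        # Monte Carlo propagation of random measurement errors.
        sigma = 0.02
        m1s = [moments(btc + rng.normal(0, sigma, t.size))[1] for _ in range(2000)]
        print("m1 = %.2f +/- %.2f" % (np.mean(m1s), np.std(m1s)))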

  17. Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries

    USGS Publications Warehouse

    McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.

    2013-01-01

    Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
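
    The Monte Carlo logic used in such design comparisons is straightforward to sketch: simulate a day of instantaneous angler counts, draw repeated samples under each design, and compare the bias and mean square error (MSE) of the expanded effort estimates. The count process below is a hypothetical stand-in for the observed angling populations.

        import numpy as np

        rng = np.random.default_rng(7)
        # Hypothetical counts over 96 15-minute periods: diurnal trend plus noise.
        periods = np.arange(96)
        counts = np.maximum(0, 20 * np.sin(np.pi * periods / 96) + rng.normal(0, 3, 96))
        true_effort = counts.sum()

        n, reps = 12, 10000
        est_srs, est_sys = [], []
        for _ in range(reps):
            srs = rng.choice(96, size=n, replace=False)   # simple random sampling
            start = rng.integers(0, 96 // n)              # systematic random sampling
            sys_idx = np.arange(start, 96, 96 // n)
            est_srs.append(counts[srs].mean() * 96)
            est_sys.append(counts[sys_idx].mean() * 96)

        for name, est in (("SRS", est_srs), ("SYS", est_sys)):
            est = np.asarray(est)
            print(name, "bias %.1f, MSE %.0f" % (est.mean() - true_effort,
                                                 ((est - true_effort) ** 2).mean()))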

  18. Quantum supremacy in constant-time measurement-based computation: A unified architecture for sampling and verification

    NASA Astrophysics Data System (ADS)

    Miller, Jacob; Sanders, Stephen; Miyake, Akimasa

    2017-12-01

    While quantum speed-up in solving certain decision problems by a fault-tolerant universal quantum computer has been promised, a timely research question is how far one can reduce the resource requirements for demonstrating a provable advantage in quantum devices without demanding quantum error correction, which is crucial for prolonging the coherence time of qubits. We propose a model device made of locally interacting multiple qubits, designed such that simultaneous single-qubit measurements on it can output probability distributions whose average-case sampling is classically intractable, under similar assumptions as the sampling of noninteracting bosons and instantaneous quantum circuits. Notably, in contrast to these previous unitary-based realizations, our measurement-based implementation has two distinctive features. (i) Our implementation involves no adaptation of measurement bases, leading output probability distributions to be generated in constant time, independent of the system size. Thus, it could be implemented in principle without quantum error correction. (ii) Verifying the classical intractability of our sampling is done by changing the Pauli measurement bases only at certain output qubits. Our usage of random commuting quantum circuits in place of computationally universal circuits allows a unique unification of sampling and verification, so they require the same physical resource requirements in contrast to the more demanding verification protocols seen elsewhere in the literature.

  19. Random Versus Nonrandom Peer Review: A Case for More Meaningful Peer Review.

    PubMed

    Itri, Jason N; Donithan, Adam; Patel, Sohil H

    2018-05-10

    Random peer review programs are not optimized to discover cases with diagnostic error and thus have inherent limitations with respect to educational and quality improvement value. Nonrandom peer review offers an alternative approach in which diagnostic error cases are targeted for collection during routine clinical practice. The objective of this study was to compare error cases identified through random and nonrandom peer review approaches at an academic center. During the 1-year study period, the number of discrepancy cases and their discrepancy scores were determined for each approach. The nonrandom peer review process collected 190 cases, of which 60 were scored as 2 (minor discrepancy), 94 as 3 (significant discrepancy), and 36 as 4 (major discrepancy). In the random peer review process, 1,690 cases were reviewed, of which 1,646 were scored as 1 (no discrepancy), 44 were scored as 2 (minor discrepancy), and none were scored as 3 or 4. Several teaching lessons and quality improvement measures were developed as a result of analysis of error cases collected through the nonrandom peer review process. Our experience supports the implementation of nonrandom peer review as a replacement for random peer review, with nonrandom peer review serving as a more effective method for collecting diagnostic error cases with educational and quality improvement value. Copyright © 2018 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  20. Mapping from disease-specific measures to health-state utility values in individuals with migraine.

    PubMed

    Gillard, Patrick J; Devine, Beth; Varon, Sepideh F; Liu, Lei; Sullivan, Sean D

    2012-05-01

    The objective of this study was to develop empirical algorithms that estimate health-state utility values from disease-specific quality-of-life scores in individuals with migraine. Data from a cross-sectional, multicountry study were used. Individuals with episodic and chronic migraine were randomly assigned to training or validation samples. Spearman's correlation coefficients between paired EuroQol five-dimensional (EQ-5D) questionnaire utility values and both Headache Impact Test (HIT-6) scores and Migraine-Specific Quality-of-Life Questionnaire version 2.1 (MSQ) domain scores (role restrictive, role preventive, and emotional function) were examined. Regression models were constructed to estimate EQ-5D questionnaire utility values from the HIT-6 score or the MSQ domain scores. Preferred algorithms were confirmed in the validation samples. In episodic migraine, the preferred HIT-6 and MSQ algorithms explained 22% and 25% of the variance (R²) in the training samples, respectively, and had similar prediction errors (root mean square errors of 0.30). In chronic migraine, the preferred HIT-6 and MSQ algorithms explained 36% and 45% of the variance in the training samples, respectively, and had similar prediction errors (root mean square errors 0.31 and 0.29). In episodic and chronic migraine, no statistically significant differences were observed between the mean observed and the mean estimated EQ-5D questionnaire utility values for the preferred HIT-6 and MSQ algorithms in the validation samples. The relationship between the EQ-5D questionnaire and the HIT-6 or the MSQ is adequate to use regression equations to estimate EQ-5D questionnaire utility values. The preferred HIT-6 and MSQ algorithms will be useful in estimating health-state utilities in migraine trials in which no preference-based measure is present. Copyright © 2012 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
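
    The mapping approach amounts to regressing utility values on disease-specific scores in a training sample and checking prediction error in a validation sample. The sketch below shows the skeleton of such an algorithm for the HIT-6 case with fabricated data; the coefficients, score range, and noise level are assumptions, not the published algorithms.

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(8)
        # Hypothetical paired data: HIT-6 scores (range 36-78) and EQ-5D utilities.
        hit6 = rng.uniform(36, 78, 300)
        eq5d = np.clip(1.2 - 0.012 * hit6 + rng.normal(0, 0.3, 300), -0.2, 1.0)

        train, valid = np.arange(200), np.arange(200, 300)   # training/validation split
        model = LinearRegression().fit(hit6[train, None], eq5d[train])

        pred = model.predict(hit6[valid, None])
        rmse = np.sqrt(np.mean((pred - eq5d[valid]) ** 2))
        print("validation RMSE: %.2f" % rmse)   # plays the role of the reported ~0.30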

  1. Experimental toxicology: Issues of statistics, experimental design, and replication.

    PubMed

    Briner, Wayne; Kirwan, Jeral

    2017-01-01

    The difficulty of replicating experiments has drawn considerable attention. Issues with replication occur for a variety of reasons ranging from experimental design to laboratory errors to inappropriate statistical analysis. Here we review a variety of guidelines for statistical analysis, design, and execution of experiments in toxicology. In general, replication can be improved by using hypothesis driven experiments with adequate sample sizes, randomization, and blind data collection techniques. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Assessment of pollutant mean concentrations in the Yangtze estuary based on MSN theory.

    PubMed

    Ren, Jing; Gao, Bing-Bo; Fan, Hai-Mei; Zhang, Zhi-Hong; Zhang, Yao; Wang, Jin-Feng

    2016-12-15

    Reliable assessment of water quality is a critical issue for estuaries. Nutrient concentrations show significant spatial distinctions between areas under the influence of fresh-sea water interaction and anthropogenic effects. For this situation, given the limitations of general mean estimation approaches, a new method for surfaces with non-homogeneity (MSN) was applied to obtain optimized linear unbiased estimations of the mean nutrient concentrations in the study area in the Yangtze estuary from 2011 to 2013. Other mean estimation methods, including block Kriging (BK), simple random sampling (SS) and stratified sampling (ST) inference, were applied simultaneously for comparison. Their performance was evaluated by estimation error. The results show that MSN had the highest accuracy, while SS had the highest estimation error. ST and BK were intermediate in terms of their performance. Thus, MSN is an appropriate method that can be adopted to reduce the uncertainty of mean pollutant estimation in estuaries. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Quantification of sewer system infiltration using delta(18)O hydrograph separation.

    PubMed

    Prigiobbe, V; Giulianelli, M

    2009-01-01

    The infiltration of parasitical water into two sewer systems in Rome (Italy) was quantified during a dry weather period. Infiltration was estimated using the hydrograph separation method with two water components and delta(18)O as a conservative tracer. The two water components were groundwater, the possible source of parasitical water within the sewer, and drinking water discharged into the sewer system. This method was applied at an urban catchment scale in order to test the effective water-tightness of two different sewer networks. The sampling strategy was based on an uncertainty analysis and the errors have been propagated using Monte Carlo random sampling. Our field applications showed that the method can be applied easily and quickly, but the error in the estimated infiltration rate can be up to 20%. The estimated infiltration into the recent sewer in Torraccia is 14% and can be considered negligible given the precision of the method, while the old sewer in Infernetto has an estimated infiltration of 50%.
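
    The two-component separation rests on a single mass-balance equation: if the sewage flow is a mixture of groundwater and drinking water, then delta_sewage = f * delta_gw + (1 - f) * delta_dw, so the infiltration fraction is f = (delta_sewage - delta_dw) / (delta_gw - delta_dw). The sketch below combines this with the Monte Carlo error propagation the abstract mentions; the delta(18)O values and uncertainties are hypothetical, not the Rome measurements.

        import numpy as np

        rng = np.random.default_rng(9)
        # Hypothetical delta(18)O values (permil) with analytical uncertainty.
        d_sewage = rng.normal(-6.5, 0.1, 100000)
        d_dw     = rng.normal(-7.0, 0.1, 100000)   # drinking water end-member
        d_gw     = rng.normal(-5.5, 0.1, 100000)   # groundwater end-member

        # Infiltration fraction from the two-component mixing equation.
        f = (d_sewage - d_dw) / (d_gw - d_dw)
        print("infiltration fraction: %.2f +/- %.2f" % (f.mean(), f.std()))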

  4. WE-AB-207A-04: Random Undersampled Cone Beam CT: Theoretical Analysis and a Novel Reconstruction Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, C; Chen, L; Jia, X

    2016-06-15

    Purpose: Reducing x-ray exposure and speeding up data acquisition have motivated studies on projection data undersampling. For a given undersampling ratio, an important question is what the optimal undersampling approach is. In this study, we propose a new undersampling scheme: random-ray undersampling. We mathematically analyze the properties of its projection matrix and demonstrate its advantages. We also propose a new reconstruction method that simultaneously performs CT image reconstruction and projection-domain data restoration. Methods: By representing the projection operator under the basis of singular vectors of the full projection operator, matrix representations for an undersampling case can be generated and numerical singular value decomposition can be performed. We compared matrix properties among three undersampling approaches: regular-view undersampling, regular-ray undersampling, and the proposed random-ray undersampling. To accomplish CT reconstruction for random undersampling, we developed a novel method that iteratively performs CT reconstruction and missing projection data restoration via regularization approaches. Results: For a given undersampling ratio, random-ray undersampling preserved the mathematical properties of the full projection operator better than the other two approaches. This translates into the advantage of reconstructing CT images with lower errors. Different types of image artifacts were observed depending on the undersampling strategy; these were ascribed to the unique singular vectors of the sampling operators in the image domain. We tested the proposed reconstruction algorithm on a Forbid phantom with only 30% of the projection data randomly acquired. The reconstructed image error was reduced from 9.4% with a TV method to 7.6% with the proposed method. Conclusion: The proposed random-ray undersampling is mathematically advantageous over other typical undersampling approaches and may permit better image reconstruction at the same undersampling ratio. The novel algorithm suited to this random-ray undersampling was able to reconstruct high-quality images.

  5. Efficient Measurement of Quantum Gate Error by Interleaved Randomized Benchmarking

    NASA Astrophysics Data System (ADS)

    Magesan, Easwar; Gambetta, Jay M.; Johnson, B. R.; Ryan, Colm A.; Chow, Jerry M.; Merkel, Seth T.; da Silva, Marcus P.; Keefe, George A.; Rothwell, Mary B.; Ohki, Thomas A.; Ketchen, Mark B.; Steffen, M.

    2012-08-01

    We describe a scalable experimental protocol for estimating the average error of individual quantum computational gates. This protocol consists of interleaving random Clifford gates between the gate of interest and provides an estimate as well as theoretical bounds for the average error of the gate under test, so long as the average noise variation over all Clifford gates is small. This technique takes into account both state preparation and measurement errors and is scalable in the number of qubits. We apply this protocol to a superconducting qubit system and find a bounded average error of 0.003 [0,0.016] for the single-qubit gates Xπ/2 and Yπ/2. These bounded values provide better estimates of the average error than those extracted via quantum process tomography.
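
    In the notation of this interleaved scheme, if p is the decay parameter of the reference randomized benchmarking sequences and p_C̄ is the decay parameter of the sequences with the gate C interleaved, the average gate error for n qubits is estimated from the two fitted decays as

        r_{\mathcal{C}}^{\text{est}} = \frac{(d-1)\left(1 - p_{\bar{\mathcal{C}}}/p\right)}{d}, \qquad d = 2^n.

    This rendering of the point estimate follows the published interleaved protocol; the exact expressions for the accompanying theoretical bounds should be taken from the paper itself.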

  6. Efficacy of Visual-Acoustic Biofeedback Intervention for Residual Rhotic Errors: A Single-Subject Randomization Study

    ERIC Educational Resources Information Center

    Byun, Tara McAllister

    2017-01-01

    Purpose: This study documented the efficacy of visual-acoustic biofeedback intervention for residual rhotic errors, relative to a comparison condition involving traditional articulatory treatment. All participants received both treatments in a single-subject experimental design featuring alternating treatments with blocked randomization of…

  7. Statistical Analysis Experiment for Freshman Chemistry Lab.

    ERIC Educational Resources Information Center

    Salzsieder, John C.

    1995-01-01

    Describes a laboratory experiment dissolving zinc from galvanized nails in which data can be gathered very quickly for statistical analysis. The data have sufficient significant figures and the experiment yields a nice distribution of random errors. Freshman students can gain an appreciation of the relationships between random error, number of…

  8. Evaluation of seasonal and spatial variations of lumped water balance model sensitivity to precipitation data errors

    NASA Astrophysics Data System (ADS)

    Xu, Chong-yu; Tunemar, Liselotte; Chen, Yongqin David; Singh, V. P.

    2006-06-01

    The sensitivity of hydrological models to input data errors has been reported in the literature for particular models on a single catchment or a few catchments. A more important issue, namely how a model's response to input data errors changes as catchment conditions change, has not been addressed previously. This study investigates the seasonal and spatial effects of precipitation data errors on the performance of conceptual hydrological models. For this study, a monthly conceptual water balance model, NOPEX-6, was applied to 26 catchments in the Mälaren basin in Central Sweden. Both systematic and random errors were considered. For the systematic errors, 5-15% of mean monthly precipitation values were added to the original precipitation to form the corrupted input scenarios. Random values were generated by Monte Carlo simulation and were assumed to be (1) independent between months, and (2) distributed according to a Gaussian law of zero mean and constant standard deviation, taken as 5, 10, 15, 20, and 25% of the mean monthly standard deviation of precipitation. The results show that the response of the model parameters and model performance depends, among other factors, on the type of error, the magnitude of the error, the physical characteristics of the catchment, and the season of the year. In particular, the model appears less sensitive to random errors than to systematic errors. Catchments with smaller runoff coefficients were more influenced by input data errors than were catchments with higher values. Dry months were more sensitive to precipitation errors than were wet months. Recalibration of the model with erroneous data compensated in part for the data errors by altering the model parameters.
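
    Generating the corrupted input scenarios described here takes only a few lines; the sketch below is a schematic of the two corruption schemes, with a made-up monthly precipitation record in place of the Mälaren data.

        import numpy as np

        rng = np.random.default_rng(10)
        precip = rng.gamma(2.0, 30.0, size=(30, 12))   # hypothetical record: 30 years x 12 months

        # Systematic error: add a fixed 5-15% of the mean monthly value.
        monthly_mean = precip.mean(axis=0)
        corrupted_sys = precip + 0.10 * monthly_mean   # e.g. the 10% scenario

        # Random error: zero-mean Gaussian noise, independent between months,
        # with standard deviation 5-25% of the monthly standard deviation.
        monthly_std = precip.std(axis=0)
        corrupted_rand = precip + rng.normal(0.0, 0.15 * monthly_std, precip.shape)
        print(corrupted_sys.mean(), corrupted_rand.mean())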

  9. Predicting Soil Organic Carbon and Total Nitrogen in the Russian Chernozem from Depth and Wireless Color Sensor Measurements

    NASA Astrophysics Data System (ADS)

    Mikhailova, E. A.; Stiglitz, R. Y.; Post, C. J.; Schlautman, M. A.; Sharp, J. L.; Gerard, P. D.

    2017-12-01

    Color sensor technologies offer opportunities for affordable and rapid assessment of soil organic carbon (SOC) and total nitrogen (TN) in the field, but the applicability of these technologies may vary by soil type. The objective of this study was to use an inexpensive color sensor to develop SOC and TN prediction models for the Russian Chernozem (Haplic Chernozem) in the Kursk region of Russia. Twenty-one dried soil samples were analyzed using a Nix Pro™ color sensor, controlled through a mobile application via Bluetooth, to collect CIE L*a*b* (darkness to lightness, green to red, and blue to yellow) color data. Eleven samples were randomly selected for constructing the prediction models, and the remaining ten samples were set aside for cross-validation. The data from the eleven soil samples were used to develop prediction models for the natural log of SOC (lnSOC) and TN (lnTN), using depth, L*, a*, and b* for each sample as predictor variables in regression analyses. Residual plots, root mean square errors (RMSE), mean squared prediction errors (MSPE), and coefficients of determination (R², adjusted R²) were used to assess model fit for each of the SOC and TN prediction models. Final models including depth and color variables were fit using all soil samples, for lnSOC (R² = 0.987, adj. R² = 0.981, RMSE = 0.003, p < 0.001, MSPE = 0.182) and lnTN (R² = 0.980, adj. R² = 0.972, RMSE = 0.004, p < 0.001, MSPE = 0.001). Additionally, final models including only color variables were fit for all soil samples, for lnSOC (R² = 0.959, adj. R² = 0.949, RMSE = 0.007, p < 0.001, MSPE = 0.536) and lnTN (R² = 0.912, adj. R² = 0.890, RMSE = 0.015, p < 0.001, MSPE = 0.001). The results suggest that soil color may be used for rapid assessment of SOC and TN in these agriculturally important soils.

  10. Determination of the precision error of the pulmonary artery thermodilution catheter using an in vitro continuous flow test rig.

    PubMed

    Yang, Xiao-Xing; Critchley, Lester A; Joynt, Gavin M

    2011-01-01

    Thermodilution cardiac output using a pulmonary artery catheter is the reference method against which all new methods of cardiac output measurement are judged. However, thermodilution lacks precision and has a quoted precision error of ± 20%. There is uncertainty about its true precision and this causes difficulty when validating new cardiac output technology. Our aim in this investigation was to determine the current precision error of thermodilution measurements. A test rig through which water circulated at different constant rates with ports to insert catheters into a flow chamber was assembled. Flow rate was measured by an externally placed transonic flowprobe and meter. The meter was calibrated by timed filling of a cylinder. Arrow and Edwards 7Fr thermodilution catheters, connected to a Siemens SC9000 cardiac output monitor, were tested. Thermodilution readings were made by injecting 5 mL of ice-cold water. Precision error was divided into random and systematic components, which were determined separately. Between-readings (random) variability was determined for each catheter by taking sets of 10 readings at different flow rates. Coefficient of variation (CV) was calculated for each set and averaged. Between-catheter systems (systematic) variability was derived by plotting calibration lines for sets of catheters. Slopes were used to estimate the systematic component. Performances of 3 cardiac output monitors were compared: Siemens SC9000, Siemens Sirecust 1261, and Philips MP50. Five Arrow and 5 Edwards catheters were tested using the Siemens SC9000 monitor. Flow rates between 0.7 and 7.0 L/min were studied. The CV (random error) for Arrow was 5.4% and for Edwards was 4.8%. The random precision error was ± 10.0% (95% confidence limits). CV (systematic error) was 5.8% and 6.0%, respectively. The systematic precision error was ± 11.6%. The total precision error of a single thermodilution reading was ± 15.3% and ± 13.0% for triplicate readings. Precision error increased by 45% when using the Sirecust monitor and 100% when using the Philips monitor. In vitro testing of pulmonary artery catheters enabled us to measure both the random and systematic error components of thermodilution cardiac output measurement, and thus calculate the precision error. Using the Siemens monitor, we established a precision error of ± 15.3% for single and ± 13.0% for triplicate reading, which was similar to the previous estimate of ± 20%. However, this precision error was significantly worsened by using the Sirecust and Philips monitors. Clinicians should recognize that the precision error of thermodilution cardiac output is dependent on the selection of catheter and monitor model.
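
    The reported totals are consistent with combining the random and systematic components in quadrature (root-sum-of-squares), with the random component reduced by a factor of \sqrt{3} when triplicate readings are averaged. A worked check under that standard error-propagation assumption:

        \sqrt{10.0^2 + 11.6^2}\,\% \approx \pm 15.3\%, \qquad \sqrt{(10.0/\sqrt{3})^2 + 11.6^2}\,\% \approx \pm 13.0\%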

  11. Visual disability, visual function, and myopia among rural chinese secondary school children: the Xichang Pediatric Refractive Error Study (X-PRES)--report 1.

    PubMed

    Congdon, Nathan; Wang, Yunfei; Song, Yue; Choi, Kai; Zhang, Mingzhi; Zhou, Zhongxia; Xie, Zhenling; Li, Liping; Liu, Xueyu; Sharma, Abhishek; Wu, Bin; Lam, Dennis S C

    2008-07-01

    To evaluate visual acuity, visual function, and prevalence of refractive error among Chinese secondary-school children in a cross-sectional school-based study. Uncorrected, presenting, and best corrected visual acuity, cycloplegic autorefraction with refinement, and self-reported visual function were assessed in a random, cluster sample of rural secondary school students in Xichang, China. Among the 1892 subjects (97.3% of the consenting children, 84.7% of the total sample), mean age was 14.7 ± 0.8 years, 51.2% were female, and 26.4% were wearing glasses. The proportion of children with uncorrected, presenting, and corrected visual disability (≤ 6/12 in the better eye) was 41.2%, 19.3%, and 0.5%, respectively. Myopia < -0.5, < -2.0, and < -6.0 D in both eyes was present in 62.3%, 31.1%, and 1.9% of the subjects, respectively. Among the children with visual disability when tested without correction, 98.7% was due to refractive error, while only 53.8% (414/770) of these children had appropriate correction. The girls had significantly (P < 0.001) more presenting visual disability and myopia < -2.0 D than did the boys. More myopic refractive error was associated with worse self-reported visual function (ANOVA trend test, P < 0.001). Visual disability in this population was common, highly correctable, and frequently uncorrected. The impact of refractive error on self-reported visual function was significant. Strategies and studies to understand and remove barriers to spectacle wear are needed.

  12. Laboratory issues: use of nutritional biomarkers.

    PubMed

    Blanck, Heidi Michels; Bowman, Barbara A; Cooper, Gerald R; Myers, Gary L; Miller, Dayton T

    2003-03-01

    Biomarkers of nutritional status provide alternative measures of dietary intake. As with the error and variation associated with dietary intake measures, the magnitude and impact of both biological (preanalytical) and laboratory (analytical) variability need to be considered when one is using biomarkers. When choosing a biomarker, it is important to understand how it relates to nutritional intake and the specific time frame of exposure it reflects, as well as how it is affected by sampling and laboratory procedures. Biological sources of variation that arise from genetic and disease states of an individual affect biomarkers, but biomarkers are also affected by nonbiological sources of variation arising from specimen collection and storage, seasonality, time of day, contamination, stability, and laboratory quality assurance. When choosing a laboratory for biomarker assessment, researchers should try to make sure that random and systematic errors are minimized through techniques such as blinding of laboratory staff to disease status and inclusion of external pooled standards to which laboratory staff are blinded. In addition, analytic quality control should be ensured by use of internal standards or certified materials over the entire range of possible values to control method accuracy. One must consider the effect of random laboratory error on measurement precision and also understand the method's limit of detection and the laboratory cutpoints. Choosing appropriate cutpoints and reducing error is extremely important in nutritional epidemiology, where weak associations are frequent. As part of this review, serum lipids are included as an example of a biomarker for which collaborative efforts have been made both to understand biological sources of variation and to standardize laboratory results.

  13. Estimation of infection prevalence and sensitivity in a stratified two-stage sampling design employing highly specific diagnostic tests when there is no gold standard.

    PubMed

    Miller, Ezer; Huppert, Amit; Novikov, Ilya; Warburg, Alon; Hailu, Asrat; Abbasi, Ibrahim; Freedman, Laurence S

    2015-11-10

    In this work, we describe a two-stage sampling design to estimate the infection prevalence in a population. In the first stage, an imperfect diagnostic test was performed on a random sample of the population. In the second stage, a different imperfect test was performed on a stratified random sample of the first sample. To estimate infection prevalence, we assumed conditional independence between the diagnostic tests and developed method-of-moments estimators based on expectations of the proportions of people with positive and negative results on both tests, which are functions of the tests' sensitivity, specificity, and the infection prevalence. A closed-form solution of the estimating equations was obtained assuming a specificity of 100% for both tests. We applied our method to estimate the infection prevalence of visceral leishmaniasis according to two quantitative polymerase chain reaction tests performed on blood samples taken from 4756 patients in northern Ethiopia. The sensitivities of the tests were also estimated, as well as the standard errors of all estimates, using a parametric bootstrap. We also examined the impact of departures from our assumptions of 100% specificity and conditional independence on the estimated prevalence. Copyright © 2015 John Wiley & Sons, Ltd.
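
    Under the stated assumptions (100% specificity for both tests and conditional independence given infection status), the moment equations have a simple closed form in the special case where both tests are applied to the same individuals: with x1 = P(test 1 positive) = p·s1, x2 = P(test 2 positive) = p·s2, and x12 = P(both positive) = p·s1·s2, one obtains p = x1·x2/x12, s1 = x12/x2, and s2 = x12/x1. The sketch below illustrates this single-stage simplification with a bootstrap for standard errors; the paper's actual estimators additionally account for the stratified two-stage design, and the simulated data here are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def mom_estimates(t1, t2):
    """Method-of-moments estimates assuming 100% specificity and
    conditional independence of the two tests given infection."""
    x1, x2, x12 = t1.mean(), t2.mean(), (t1 & t2).mean()
    p = x1 * x2 / x12        # prevalence
    s1 = x12 / x2            # sensitivity of test 1
    s2 = x12 / x1            # sensitivity of test 2
    return p, s1, s2

# Simulated data: true prevalence 0.20, sensitivities 0.85 and 0.70.
n, p_true, s1_true, s2_true = 5000, 0.20, 0.85, 0.70
infected = rng.random(n) < p_true
t1 = infected & (rng.random(n) < s1_true)
t2 = infected & (rng.random(n) < s2_true)

p_hat, s1_hat, s2_hat = mom_estimates(t1, t2)

# Bootstrap standard errors (nonparametric here, for illustration only).
boots = []
for _ in range(1000):
    idx = rng.integers(0, n, n)
    boots.append(mom_estimates(t1[idx], t2[idx]))
se = np.std(boots, axis=0)
print(p_hat, s1_hat, s2_hat, se)
```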

  14. On-board error correction improves IR earth sensor accuracy

    NASA Astrophysics Data System (ADS)

    Alex, T. K.; Kasturirangan, K.; Shrivastava, S. K.

    1989-10-01

    Infra-red earth sensors are used in satellites for attitude sensing. Their accuracy is limited by systematic and random errors. The sources of errors in a scanning infra-red earth sensor are analyzed in this paper. The systematic errors arising from seasonal variation of infra-red radiation, oblate shape of the earth, ambient temperature of sensor, changes in scan/spin rates have been analyzed. Simple relations are derived using least square curve fitting for on-board correction of these errors. Random errors arising out of noise from detector and amplifiers, instability of alignment and localized radiance anomalies are analyzed and possible correction methods are suggested. Sun and Moon interference on earth sensor performance has seriously affected a number of missions. The on-board processor detects Sun/Moon interference and corrects the errors on-board. It is possible to obtain eight times improvement in sensing accuracy, which will be comparable with ground based post facto attitude refinement.

  15. Application of random effects to the study of resource selection by animals

    USGS Publications Warehouse

    Gillies, C.S.; Hebblewhite, M.; Nielsen, S.E.; Krawchuk, M.A.; Aldridge, Cameron L.; Frair, J.L.; Saher, D.J.; Stevens, C.E.; Jerde, C.L.

    2006-01-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  16. Application of random effects to the study of resource selection by animals.

    PubMed

    Gillies, Cameron S; Hebblewhite, Mark; Nielsen, Scott E; Krawchuk, Meg A; Aldridge, Cameron L; Frair, Jacqueline L; Saher, D Joanne; Stevens, Cameron E; Jerde, Christopher L

    2006-07-01

    1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.

  17. Comparison of Accuracy in Intraocular Lens Power Calculation by Measuring Axial Length with Immersion Ultrasound Biometry and Partial Coherence Interferometry.

    PubMed

    Ruangsetakit, Varee

    2015-11-01

    To re-examine the relative accuracy of intraocular lens (IOL) power calculation by immersion ultrasound biometry (IUB) and partial coherence interferometry (PCI), based on a new approach that restricts attention to the cases in which the IUB and PCI IOL assignments disagree. Prospective observational study of 108 eyes that underwent cataract surgery at Taksin Hospital. The two halves of the randomly chosen sample eyes were implanted with the IUB- and PCI-assigned lenses. Postoperative refractive errors were measured in the fifth week. The more accurate calculation was the one with significantly smaller mean absolute errors (MAEs) and root mean square errors (RMSEs) away from emmetropia. The distributions of the errors were examined to ensure that the higher accuracy was significant clinically as well. The (MAE, RMSE) was smaller for PCI, (0.5106 diopter (D), 0.6037 D), than for IUB, (0.7000 D, 0.8062 D). The higher accuracy was principally attributable to negative errors, i.e., myopia. The MAEs and RMSEs for the (IUB, PCI) negative errors were (0.7955 D, 0.5185 D) and (0.8562 D, 0.5853 D), and their differences were significant. Of the PCI errors, 72.34% fell within the clinically accepted range of ± 0.50 D, whereas only 50% of IUB errors did. PCI's higher accuracy was significant both statistically and clinically, meaning that lens implantation based on PCI assignments could improve postoperative outcomes over those based on IUB assignments.

  18. Ant-inspired density estimation via random walks.

    PubMed

    Musco, Cameron; Su, Hsin-Hao; Lynch, Nancy A

    2017-10-03

    Many ant species use distributed population density estimation in applications ranging from quorum sensing to task allocation to appraisal of enemy colony strength. It has been shown that ants estimate local population density by tracking encounter rates: the higher the density, the more often the ants bump into each other. We study distributed density estimation from a theoretical perspective. We prove that a group of anonymous agents randomly walking on a grid are able to estimate their density within a small multiplicative error in a few steps by measuring their rates of encounter with other agents. Despite dependencies inherent in the fact that nearby agents may collide repeatedly (and, worse, cannot recognize when this happens), our bound nearly matches what would be required to estimate density by independently sampling grid locations. From a biological perspective, our work helps shed light on how ants and other social insects can obtain relatively accurate density estimates via encounter rates. From a technical perspective, our analysis provides tools for understanding complex dependencies in the collision probabilities of multiple random walks. We bound the strength of these dependencies using local mixing properties of the underlying graph. Our results extend beyond the grid to more general graphs, and we discuss applications to size estimation for social networks, density estimation for robot swarms, and random walk-based sampling for sensor networks.
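
    A toy simulation conveys the encounter-rate mechanism: anonymous walkers on a torus grid count how often they share a cell with another walker, and the per-step encounter rate estimates the density. This is only an illustration of the idea, not the paper's analysis (which bounds the dependencies between repeated collisions); the grid size, agent count, and step count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
W, N, STEPS = 50, 200, 400          # grid width, number of agents, steps
true_density = N / W**2

pos = rng.integers(0, W, size=(N, 2))
moves = np.array([[0, 1], [0, -1], [1, 0], [-1, 0]])
encounters = 0

for _ in range(STEPS):
    pos = (pos + moves[rng.integers(0, 4, N)]) % W   # random walk step on torus
    # Count, for each agent, how many other agents share its cell.
    cells = pos[:, 0] * W + pos[:, 1]
    _, inverse, counts = np.unique(cells, return_inverse=True, return_counts=True)
    encounters += (counts[inverse] - 1).sum()

# Per-agent, per-step encounter rate approximates the density (N-1)/W^2.
est_density = encounters / (N * STEPS)
print(true_density, est_density)
```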

  19. a Weighted Closed-Form Solution for Rgb-D Data Registration

    NASA Astrophysics Data System (ADS)

    Vestena, K. M.; Dos Santos, D. R.; Oilveira, E. M., Jr.; Pavan, N. L.; Khoshelham, K.

    2016-06-01

    Existing approaches to 3D indoor mapping with RGB-D data are predominantly point-based and feature-based methods, and in most cases the iterative closest point (ICP) algorithm and its variants are used for the pairwise registration process. Considering that the ICP algorithm requires a relatively accurate initial transformation and high overlap, a weighted closed-form solution for RGB-D data registration is proposed. In this solution, we weight and normalize the 3D points based on their theoretical random errors, and dual-number quaternions are used to represent the 3D rigid-body motion. Dual-number quaternions provide a closed-form solution by minimizing a cost function. The most important advantage of the closed-form solution is that it provides the optimal transformation in one step: it does not need good initial estimates, and it markedly decreases the demand for computing resources in contrast to iterative methods. Our method first exploits the RGB information. We employ the scale-invariant feature transform (SIFT) for extracting, detecting, and matching features; it detects and describes local features that are invariant to scaling and rotation. To detect and filter outliers, we use the random sample consensus (RANSAC) algorithm jointly with a measure of statistical dispersion, the interquartile range (IQR). Then, a new RGB-D loop-closure solution is implemented based on the volumetric information shared between pairs of point clouds and the dispersion of the random errors; loop closure consists of recognizing when the sensor revisits a region. Finally, a globally consistent map is created by minimizing the registration errors via graph-based optimization. The effectiveness of the proposed method is demonstrated with a Kinect dataset. The experimental results show that the proposed method can properly map an indoor environment with an absolute accuracy of around 1.5% of the trajectory length.
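
    The flavor of a weighted closed-form registration step can be sketched with matched 3D point pairs whose weights reflect their theoretical random errors. The paper uses dual-number quaternions; the SVD-based (Kabsch-style) solution below is an equivalent closed-form minimizer of the same weighted cost, shown here only because it is compact:

```python
import numpy as np

def weighted_rigid_fit(P, Q, w):
    """Weighted closed-form rigid registration (SVD/Kabsch variant).

    Finds R, t minimizing sum_i w_i * ||R @ P[i] + t - Q[i]||^2.
    """
    w = w / w.sum()
    p_bar = w @ P                        # weighted centroids
    q_bar = w @ Q
    H = (P - p_bar).T @ np.diag(w) @ (Q - q_bar)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ S @ U.T
    t = q_bar - R @ p_bar
    return R, t

# Toy check: recover a known motion; in practice the weights would come
# from the sensor's depth-dependent random-error model.
rng = np.random.default_rng(9)
P = rng.normal(size=(100, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
Q = P @ R_true.T + np.array([0.5, -0.2, 1.0]) + rng.normal(0, 0.01, (100, 3))
w = rng.uniform(0.5, 1.5, 100)
R, t = weighted_rigid_fit(P, Q, w)
print(np.allclose(R, R_true, atol=0.05), t)
```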

  20. A novel approach to evaluation of pest insect abundance in the presence of noise.

    PubMed

    Embleton, Nina; Petrovskaya, Natalia

    2014-03-01

    Evaluation of pest abundance is an important task of integrated pest management. It has recently been shown that evaluation of pest population size from discrete sampling data can be done by using the ideas of numerical integration. Numerical integration of the pest population density function is a computational technique that readily gives us an estimate of the pest population size, where the accuracy of the estimate depends on the number of traps installed in the agricultural field to collect the data. However, in a standard mathematical problem of numerical integration, it is assumed that the data are precise, so that the random error is zero when the data are collected. This assumption does not hold in ecological applications. An inherent random error is often present in field measurements, and therefore it may strongly affect the accuracy of evaluation. In our paper, we offer a novel approach to evaluate the pest insect population size under the assumption that the data about the pest population include a random error. The evaluation is not based on statistical methods but is done using a spatially discrete method of numerical integration where the data obtained by trapping as in pest insect monitoring are converted to values of the population density. It will be discussed in the paper how the accuracy of evaluation differs from the case where the same evaluation method is employed to handle precise data. We also consider how the accuracy of the pest insect abundance evaluation can be affected by noise when the data available from trapping are sparse. In particular, we show that, contrary to intuitive expectations, noise does not have any considerable impact on the accuracy of evaluation when the number of traps is small as is conventional in ecological applications.
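
    The core computation described here, converting trap counts to density values at discrete points and integrating numerically, can be sketched as follows. The density function, multiplicative noise model, and trap layout are illustrative assumptions rather than the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(2)

def density(x):
    # Hypothetical pest density over a 1-D field of unit length.
    return 100 * np.exp(-((x - 0.4) ** 2) / 0.02)

xs = np.linspace(0, 1, 10_000)
true_size = np.trapz(density(xs), xs)   # "exact" population size

for n_traps in (5, 9, 17, 33):
    x = np.linspace(0, 1, n_traps)              # regularly spaced traps
    exact = density(x)
    noisy = exact * (1 + rng.normal(0, 0.2, n_traps))  # 20% multiplicative noise
    err_exact = abs(np.trapz(exact, x) - true_size) / true_size
    err_noisy = abs(np.trapz(noisy, x) - true_size) / true_size
    print(n_traps, round(err_exact, 4), round(err_noisy, 4))
```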

  1. Model-based quantification of image quality

    NASA Technical Reports Server (NTRS)

    Hazra, Rajeeb; Miller, Keith W.; Park, Stephen K.

    1989-01-01

    In 1982, Park and Schowengerdt published an end-to-end analysis of a digital imaging system quantifying three principal degradation components: (1) image blur - blurring caused by the acquisition system, (2) aliasing - caused by insufficient sampling, and (3) reconstruction blur - blurring caused by the imperfect interpolative reconstruction. This analysis, which measures degradation as the square of the radiometric error, includes the sample-scene phase as an explicit random parameter and characterizes the image degradation caused by imperfect acquisition and reconstruction together with the effects of undersampling and random sample-scene phases. In a recent paper Mitchell and Netravelli displayed the visual effects of the above mentioned degradations and presented subjective analysis about their relative importance in determining image quality. The primary aim of the research is to use the analysis of Park and Schowengerdt to correlate their mathematical criteria for measuring image degradations with subjective visual criteria. Insight gained from this research can be exploited in the end-to-end design of optical systems, so that system parameters (transfer functions of the acquisition and display systems) can be designed relative to each other, to obtain the best possible results using quantitative measurements.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Lin, E-mail: godyalin@163.com; Singh, Uttam, E-mail: uttamsingh@hri.res.in; Pati, Arun K., E-mail: akpati@hri.res.in

    Compact expressions for the average subentropy and coherence are obtained for random mixed states that are generated via various probability measures. Surprisingly, our results show that the average subentropy of random mixed states approaches the maximum value of the subentropy, which is attained for the maximally mixed state, as we increase the dimension. In the special case of the random mixed states sampled from the induced measure via partial tracing of random bipartite pure states, we establish the typicality of the relative entropy of coherence for random mixed states invoking the concentration of measure phenomenon. Our results also indicate that mixed quantum states are less useful compared to pure quantum states in higher dimension when we extract quantum coherence as a resource. This is because the average coherence of random mixed states is bounded uniformly, whereas the average coherence of random pure states increases with increasing dimension. As an important application, we establish the typicality of relative entropy of entanglement and distillable entanglement for a specific class of random bipartite mixed states. In particular, most of the random states in this specific class have relative entropy of entanglement and distillable entanglement equal to some fixed number (to within an arbitrarily small error), thereby hugely reducing the complexity of computation of these entanglement measures for this specific class of mixed states.

  3. Correcting the Standard Errors of 2-Stage Residual Inclusion Estimators for Mendelian Randomization Studies

    PubMed Central

    Palmer, Tom M; Holmes, Michael V; Keating, Brendan J; Sheehan, Nuala A

    2017-01-01

    Mendelian randomization studies use genotypes as instrumental variables to test for and estimate the causal effects of modifiable risk factors on outcomes. Two-stage residual inclusion (TSRI) estimators have been used when researchers are willing to make parametric assumptions. However, researchers are currently reporting uncorrected or heteroscedasticity-robust standard errors for these estimates. We compared several different forms of the standard error for linear and logistic TSRI estimates in simulations and in real-data examples. Among others, we consider standard errors modified from the approach of Newey (1987), Terza (2016), and bootstrapping. In our simulations the Newey, Terza, bootstrap, and corrected 2-stage least squares (in the linear case) standard errors gave the best results in terms of coverage and type I error. In the real-data examples, the Newey standard errors were 0.5% and 2% larger than the unadjusted standard errors for the linear and logistic TSRI estimators, respectively. We show that TSRI estimators with modified standard errors have correct type I error under the null. Researchers should report TSRI estimates with modified standard errors instead of reporting unadjusted or heteroscedasticity-robust standard errors. PMID:29106476

  4. Flux control coefficients determined by inhibitor titration: the design and analysis of experiments to minimize errors.

    PubMed Central

    Small, J R

    1993-01-01

    This paper is a study of the effects of experimental error on the estimated values of flux control coefficients obtained using specific inhibitors. Two possible techniques for analysing the experimental data are compared: a simple extrapolation method (the so-called graph method) and a non-linear function-fitting method. For these techniques, the sources of systematic errors are identified and the effects of systematic and random errors are quantified, using both statistical analysis and numerical computation. It is shown that the graph method is very sensitive to random errors and that, under all conditions studied, the fitting method, even under conditions where the assumptions underlying the fitted function do not hold, outperformed the graph method. Possible ways of designing experiments to minimize the effects of experimental errors are analysed and discussed. PMID:8257434

  5. Instrumentation of the variable-angle magneto-optic ellipsometer and its application to M-O media and other non-magnetic films

    NASA Technical Reports Server (NTRS)

    Zhou, Andy F.; Erwin, J. Kevin; Mansuripur, M.

    1992-01-01

    A new and comprehensive dielectric tensor characterization instrument is presented for characterization of magneto-optical recording media and non-magnetic thin films. Random and systematic errors of the system are studied. A series of TbFe, TbFeCo, and Co/Pt samples with different composition and thicknesses are characterized for their optical and magneto-optical properties. The optical properties of several non-magnetic films are also measured.

  6. Effects of Multipath and Oversampling on Navigation Using Orthogonal Frequency Division Multiplexed Signals of Opportunity

    DTIC Science & Technology

    2008-03-01

    for military use. The L2 carrier frequency operates at 1227.6 MHz and transmits only the precise code. Each satellite transmits a unique pseudo-random noise (PRN) code by which it is identified. GPS receivers require a LOS to four satellite signals to accurately estimate a position in three...receiver frequency errors, noise addition, and multipath effects. He also developed four methods for estimating the cross-correlation peak within a sampled

  7. Evaluation of random errors in Williams’ series coefficients obtained with digital image correlation

    NASA Astrophysics Data System (ADS)

    Lychak, Oleh V.; Holyns'kiy, Ivan S.

    2016-03-01

    The use of the Williams’ series parameters for fracture analysis requires valid information about their error values. The aim of this investigation is the development of a method for estimating the standard deviation of the random errors of the Williams’ series parameters obtained from the measured components of the stress field. A criterion for choosing the optimal number of terms in the truncated Williams’ series, such that the derived parameters have minimal errors, is also proposed. The method was applied to the evaluation of the Williams’ parameters obtained from data measured by the digital image correlation technique on a three-point bending specimen.
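
    Once the measured field is arranged into a linear least-squares system A·c = b for the truncated-series coefficients, standard deviations of the coefficients follow from the usual covariance formula σ²(AᵀA)⁻¹, and scanning the truncation order for the minimum coefficient error mirrors the selection criterion described above. A generic sketch, in which a polynomial basis stands in for the actual Williams basis functions:

```python
import numpy as np

def fit_with_errors(A, b):
    """Least-squares fit plus standard deviations of the coefficients.

    Residual variance is estimated from the fit itself; the coefficient
    covariance is sigma^2 * (A^T A)^{-1}.
    """
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    dof = A.shape[0] - A.shape[1]
    sigma2 = np.sum((b - A @ coef) ** 2) / dof
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return coef, np.sqrt(np.diag(cov))

# Placeholder basis: polynomial terms standing in for Williams functions.
rng = np.random.default_rng(4)
x = np.linspace(0.1, 1.0, 200)
b = 3.0 * x**0.5 + rng.normal(0, 0.05, x.size)   # synthetic "measured" field
for n_terms in range(2, 8):
    A = np.vander(x, n_terms, increasing=True)
    coef, std = fit_with_errors(A, b)
    print(n_terms, std.max())   # coefficient errors grow as terms are added
```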

  8. Large Uncertainty in Estimating pCO2 From Carbonate Equilibria in Lakes

    NASA Astrophysics Data System (ADS)

    Golub, Malgorzata; Desai, Ankur R.; McKinley, Galen A.; Remucal, Christina K.; Stanley, Emily H.

    2017-11-01

    Most estimates of carbon dioxide (CO2) evasion from freshwaters rely on calculating partial pressure of aquatic CO2 (pCO2) from two out of three CO2-related parameters using carbonate equilibria. However, the pCO2 uncertainty has not been systematically evaluated across multiple lake types and equilibria. We quantified random errors in pH, dissolved inorganic carbon, alkalinity, and temperature from the North Temperate Lakes Long-Term Ecological Research site in four lake groups across a broad gradient of chemical composition. These errors were propagated onto pCO2 calculated from three carbonate equilibria, and for overlapping observations, compared against uncertainties in directly measured pCO2. The empirical random errors in CO2-related parameters were mostly below 2% of their median values. Resulting random pCO2 errors ranged from ±3.7% to ±31.5% of the median depending on alkalinity group and choice of input parameter pairs. Temperature uncertainty had a negligible effect on pCO2. When compared with direct pCO2 measurements, all parameter combinations produced biased pCO2 estimates with less than one third of total uncertainty explained by random pCO2 errors, indicating that systematic uncertainty dominates over random error. Multidecadal trend of pCO2 was difficult to reconstruct from uncertain historical observations of CO2-related parameters. Given poor precision and accuracy of pCO2 estimates derived from virtually any combination of two CO2-related parameters, we recommend direct pCO2 measurements where possible. To achieve consistently robust estimates of CO2 emissions from freshwater components of terrestrial carbon balances, future efforts should focus on improving accuracy and precision of CO2-related parameters (including direct pCO2) measurements and associated pCO2 calculations.
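
    The propagation step can be illustrated with a Monte Carlo sketch: perturb pH and DIC by their random errors and recompute pCO2 from a simplified freshwater carbonate equilibrium. The equilibrium constants below are standard textbook values for 25 °C, and the nominal values and error magnitudes are placeholders, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Approximate freshwater constants at 25 degC (illustrative values).
K1, K2 = 10**-6.35, 10**-10.33   # carbonic acid dissociation constants
KH = 10**-1.47                   # Henry's law constant, mol L^-1 atm^-1

def pco2_uatm(ph, dic_umol_l):
    """pCO2 (uatm) from pH and DIC via carbonate speciation."""
    h = 10.0**(-ph)
    dic = dic_umol_l * 1e-6                       # mol/L
    co2_star = dic / (1 + K1 / h + K1 * K2 / h**2)
    return co2_star / KH * 1e6

# Nominal sample and assumed 1-sigma random errors (placeholders).
ph0, dic0 = 7.2, 800.0
ph_err, dic_err = 0.02, 0.01 * dic0

draws = pco2_uatm(rng.normal(ph0, ph_err, 100_000),
                  rng.normal(dic0, dic_err, 100_000))
# Nominal pCO2 and the relative random error propagated onto it.
print(pco2_uatm(ph0, dic0), draws.std() / draws.mean())
```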

  9. Optimal estimation for discrete time jump processes

    NASA Technical Reports Server (NTRS)

    Vaca, M. V.; Tretter, S. A.

    1977-01-01

    Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are obtained. The approach is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. A general representation for optimum estimates and recursive equations for minimum mean squared error (MMSE) estimates are obtained. MMSE estimates are nonlinear functions of the observations. The problem of estimating the rate of a DTJP is considered for the case in which the rate is a random variable with a probability density function of the form c·x^k·(1−x)^m, and it is shown that the MMSE estimates are linear in this case. This class of density functions explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.

  10. Optimal estimation for discrete time jump processes

    NASA Technical Reports Server (NTRS)

    Vaca, M. V.; Tretter, S. A.

    1978-01-01

    Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are derived. The approach used is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. Thus a general representation is obtained for optimum estimates, and recursive equations are derived for minimum mean-squared error (MMSE) estimates. In general, MMSE estimates are nonlinear functions of the observations. The problem is considered of estimating the rate of a DTJP when the rate is a random variable with a beta probability density function and the jump amplitudes are binomially distributed. It is shown that the MMSE estimates are linear. The class of beta density functions is rather rich and explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.

  11. Automation of POST Cases via External Optimizer and "Artificial p2" Calculation

    NASA Technical Reports Server (NTRS)

    Dees, Patrick D.; Zwack, Mathew R.; Michelson, Diane K.

    2017-01-01

    During conceptual design, speed and accuracy are often at odds. Specifically in the realm of launch vehicles, optimizing the ascent trajectory requires a large pool of analytical power and expertise. Experienced analysts working on familiar vehicles can produce optimal trajectories in a short time frame; however, whenever either "experienced" or "familiar" is not applicable, the optimization process can become quite lengthy. In order to construct a vehicle-agnostic method, an established global optimization algorithm is needed. In this work the authors develop an "artificial" error term that maps arbitrary control vectors to non-zero error, by which a global method can operate. Two global methods are compared alongside Design of Experiments and random sampling, and are shown to produce results comparable to analysis done by a human expert.

  12. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer

    PubMed Central

    Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant

    2015-01-01

    Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029
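
    The extrapolation idea, estimating error rates at several small training-set sizes via repeated random sampling and then fitting a learning curve to predict performance at larger sizes, can be sketched generically. The inverse power-law curve form and the classifier and data below are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=1000,
                                                  random_state=0)

def error_at(n, repeats=20):
    """Mean test error over repeated random subsamples of size n."""
    errs = []
    for _ in range(repeats):
        idx = rng.choice(len(X_pool), n, replace=False)
        clf = KNeighborsClassifier().fit(X_pool[idx], y_pool[idx])
        errs.append(1 - clf.score(X_test, y_test))
    return np.mean(errs)

sizes = np.array([50, 100, 200, 400])
errors = np.array([error_at(n) for n in sizes])

def power_law(n, a, alpha, b):
    return a * n**(-alpha) + b   # common learning-curve form

params, _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.5, 0.05),
                      maxfev=10_000)
print("predicted error at n=2000:", power_law(2000, *params))
```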

  13. An Automatic Quality Control Pipeline for High-Throughput Screening Hit Identification.

    PubMed

    Zhai, Yufeng; Chen, Kaisheng; Zhong, Yang; Zhou, Bin; Ainscow, Edward; Wu, Ying-Ta; Zhou, Yingyao

    2016-09-01

    The correction or removal of signal errors in high-throughput screening (HTS) data is critical to the identification of high-quality lead candidates. Although a number of strategies have been previously developed to correct systematic errors and to remove screening artifacts, they are not universally effective and still require a fair amount of human intervention. We introduce a fully automated quality control (QC) pipeline that can correct generic interplate systematic errors and remove intraplate random artifacts. The new pipeline was first applied to ~100 large-scale historical HTS assays; in silico analysis showed auto-QC led to a noticeably stronger structure-activity relationship. The method was further tested in several independent HTS runs, where QC results were sampled for experimental validation. Significantly increased hit confirmation rates were obtained after the QC steps, confirming that the proposed method was effective in enriching true-positive hits. An implementation of the algorithm is available to the screening community. © 2016 Society for Laboratory Automation and Screening.

  14. Quality Assurance of NCI Thesaurus by Mining Structural-Lexical Patterns

    PubMed Central

    Abeysinghe, Rashmie; Brooks, Michael A.; Talbert, Jeffery; Licong, Cui

    2017-01-01

    Quality assurance of biomedical terminologies such as the National Cancer Institute (NCI) Thesaurus is an essential part of the terminology management lifecycle. We investigate a structural-lexical approach based on non-lattice subgraphs to automatically identify missing hierarchical relations and missing concepts in the NCI Thesaurus. We mine six structural-lexical patterns exhibited in non-lattice subgraphs: containment, union, intersection, union-intersection, inference-contradiction, and inference-union. Each pattern indicates a potential specific type of error and suggests a potential type of remediation. We found 809 non-lattice subgraphs with these patterns in the NCI Thesaurus (version 16.12d). Domain experts evaluated a random sample of 50 small non-lattice subgraphs, of which 33 were confirmed to contain errors and make correct suggestions (33/50 = 66%). Of the 25 evaluated subgraphs revealing multiple patterns, 22 were verified correct (22/25 = 88%). This shows the effectiveness of our structural-lexical-pattern-based approach in detecting errors and suggesting remediations in the NCI Thesaurus. PMID:29854100

  15. Creating a Satellite-Based Record of Tropospheric Ozone

    NASA Technical Reports Server (NTRS)

    Oetjen, Hilke; Payne, Vivienne H.; Kulawik, Susan S.; Eldering, Annmarie; Worden, John; Edwards, David P.; Francis, Gene L.; Worden, Helen M.

    2013-01-01

    The TES retrieval algorithm has been applied to IASI radiances. We compare the retrieved ozone profiles with ozone sonde profiles for mid-latitudes for the year 2008. We find a positive bias in the IASI ozone profiles in the UTLS region of up to 22 %. The spatial coverage of the IASI instrument allows sampling of effectively the same air mass with several IASI scenes simultaneously. Comparisons of the root-mean-square of an ensemble of IASI profiles to theoretical errors indicate that the measurement noise and the interference of temperature and water vapour on the retrieval together mostly explain the empirically derived random errors. The total degrees of freedom for signal of the retrieval for ozone are 3.1 +/- 0.2 and the tropospheric degrees of freedom are 1.0 +/- 0.2 for the described cases. IASI ozone profiles agree within the error bars with coincident ozone profiles derived from a TES stare sequence for the ozone sonde station at Bratt's Lake (50.2 deg N, 104.7 deg W).

  16. Obligation towards medical errors disclosure at a tertiary care hospital in Dubai, UAE

    PubMed Central

    Zaghloul, Ashraf Ahmad; Rahman, Syed Azizur; Abou El-Enein, Nagwa Younes

    2016-01-01

    OBJECTIVE: The study aimed to identify healthcare providers' obligation towards medical error disclosure as well as to study the association between the severity of the medical error and the intention to disclose the error to the patients and their families. DESIGN: A cross-sectional study design was followed to identify the magnitude of disclosure among healthcare providers in different departments at a randomly selected tertiary care hospital in Dubai. SETTING AND PARTICIPANTS: The total sample size was 106 respondents. Data were collected using a questionnaire composed of two sections, namely the demographic variables of the respondents and a section of variables relevant to medical error disclosure. RESULTS: Statistical analysis yielded significant associations of the obligation to disclose medical errors with being male (X2 = 5.1) and with being a physician (X2 = 19.3). Obligation towards medical error disclosure was also significantly associated with not having committed any medical errors during the past year (X2 = 9.8) and with disclosing any type of medical error regardless of its cause or the extent of harm (X2 = 8.7). Variables included in the binary logistic regression model were status (Exp β (Physician) = 0.39, 95% CI 0.16–0.97), gender (Exp β (Male) = 4.81, 95% CI 1.84–12.54), and medical errors during the last year (Exp β (None) = 2.11, 95% CI 0.6–2.3). CONCLUSION: Education and training of physicians about disclosure conversations needs to start as early as medical school. Like the training in other competencies required of physicians, education in communicating about medical errors could help reduce physicians' apprehension and make them more comfortable with disclosure conversations. PMID:27567766

  17. Characterization of addressability by simultaneous randomized benchmarking.

    PubMed

    Gambetta, Jay M; Córcoles, A D; Merkel, S T; Johnson, B R; Smolin, John A; Chow, Jerry M; Ryan, Colm A; Rigetti, Chad; Poletto, S; Ohki, Thomas A; Ketchen, Mark B; Steffen, M

    2012-12-14

    The control and handling of errors arising from cross talk and unwanted interactions in multiqubit systems is an important issue in quantum information processing architectures. We introduce a benchmarking protocol that provides information about the amount of addressability present in the system and implement it on coupled superconducting qubits. The protocol consists of randomized benchmarking experiments run both individually and simultaneously on pairs of qubits. A relevant figure of merit for the addressability is then related to the differences in the measured average gate fidelities in the two experiments. We present results from two similar samples with differing cross talk and unwanted qubit-qubit interactions. The results agree with predictions based on simple models of the classical cross talk and Stark shifts.
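
    The figure of merit comes from fitting the standard randomized-benchmarking decay F(m) = A·pᵐ + B to sequence-length data from the individual and simultaneous experiments and comparing the resulting average gate fidelities. A schematic sketch on synthetic decay data; the simple fidelity difference printed at the end is our illustrative summary of the comparison, not necessarily the paper's exact metric:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(6)

def decay(m, A, p, B):
    return A * p**m + B   # standard RB survival-probability model

def avg_gate_fidelity(m, survival, d=2):
    (A, p, B), _ = curve_fit(decay, m, survival, p0=(0.5, 0.98, 0.5),
                             bounds=([0, 0, 0], [1, 1, 1]))
    r = (1 - p) * (d - 1) / d          # average error per Clifford
    return 1 - r

m = np.arange(1, 200, 10)
# Synthetic survival curves: qubit benchmarked alone vs. run simultaneously
# with its neighbor (slightly faster decay from cross talk).
surv_alone = decay(m, 0.5, 0.995, 0.5) + rng.normal(0, 0.005, m.size)
surv_simult = decay(m, 0.5, 0.990, 0.5) + rng.normal(0, 0.005, m.size)

f_alone = avg_gate_fidelity(m, surv_alone)
f_simult = avg_gate_fidelity(m, surv_simult)
print(f_alone, f_simult, f_alone - f_simult)   # a drop signals poor addressability
```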

  18. Artificial neural network implementation of a near-ideal error prediction controller

    NASA Technical Reports Server (NTRS)

    Mcvey, Eugene S.; Taylor, Lynore Denise

    1992-01-01

    A theory has been developed at the University of Virginia which explains the effects of including an ideal predictor in the forward loop of a linear error-sampled system. It has been shown that the presence of this ideal predictor tends to stabilize the class of systems considered. A prediction controller is merely a system which anticipates a signal or part of a signal before it actually occurs. It is understood that an exact prediction controller is physically unrealizable. However, in systems where the input tends to be repetitive or limited (i.e., not random), near-ideal prediction is possible. In order for the controller to act as a stability compensator, the predictor must be designed in a way that allows it to learn the expected error response of the system. In this way, an unstable system will become stable by including the predicted error in the system transfer function. Previous and current prediction controller developments include pattern recognition and fast-time simulation, which are applicable to the analysis of linear sampled-data systems. The use of pattern recognition techniques, along with a template matching scheme, has been proposed as one realizable type of near-ideal prediction. Since many, if not most, systems are repeatedly subjected to similar inputs, it was proposed that an adaptive mechanism be used to 'learn' the correct predicted error response. Once the system has learned the response of all the expected inputs, it is necessary only to recognize the type of input with a template matching mechanism and then to use the correct predicted error to drive the system. Suggested here is an alternate approach to the realization of a near-ideal error prediction controller, one designed using neural networks. Neural networks are good at recognizing patterns such as system responses, and the back-propagation architecture makes use of a template matching scheme. In using this type of error prediction, it is assumed that the system error responses are known for a particular input and modeled plant. These responses are used in the error prediction controller. An analysis was done of the general dynamic behavior that results from including a digital error predictor in a control loop, and the results were compared to those obtained with the near-ideal neural network error predictor. This analysis was done for a second- and a third-order system.

  19. Prevalence of refractive errors in a Brazilian population: the Botucatu eye study.

    PubMed

    Schellini, Silvana Artioli; Durkin, Shane R; Hoyama, Erika; Hirai, Flavio; Cordeiro, Ricardo; Casson, Robert J; Selva, Dinesh; Padovani, Carlos Roberto

    2009-01-01

    To determine the prevalence and demographic associations of refractive error in Botucatu, Brazil. A population-based, cross-sectional prevalence study was conducted, which involved random, household cluster sampling of an urban Brazilian population in Botucatu. There were 3000 individuals aged 1 to 91 years (mean 38.3) who were eligible to participate in the study. Refractive error measurements were obtained by objective refraction. Objective refractive error examinations were performed on 2454 residents within this sample (81.8% of eligible participants). The mean age was 38 years (standard deviation (SD) 20.8 years, range 1 to 91) and females comprised 57.5% of the study population. Myopia (spherical equivalent (SE) < -0.5 diopters (D)) was most prevalent among those aged 30-39 years (29.7%; 95% confidence interval (CI) 24.8-35.1) and least prevalent among children under 10 years (3.8%; 95% CI 1.6-7.3). Conversely, hypermetropia (SE > 0.5 D) was most prevalent among participants under 10 years (86.9%; 95% CI 81.6-91.1) and least prevalent in the fourth decade (32.5%; 95% CI 28.2-37.0). Participants aged 70 years or older bore the largest burden of astigmatism (cylinder at least -0.5 D) and anisometropia (difference in SE of > 0.5 D), with prevalences of 71.7% (95% CI 64.8-78.0) and 55.0% (95% CI 47.6-62.2), respectively. Myopia and hypermetropia were significantly associated with age in a bimodal manner (P < 0.001), whereas anisometropia and astigmatism increased in line with age (P < 0.001). Multivariate modeling confirmed age-related risk factors for refractive error and revealed several gender-, occupation- and ethnicity-related risk factors. These results represent previously unreported data on refractive error within this Brazilian population. They signal a need to continue to screen for refractive error within this population and to ensure that people have adequate access to optical correction.

  20. Traditional Nurse Triage vs. Physician Tele-Presence in a Pediatric Emergency Department

    PubMed Central

    Marconi, Greg P.; Chang, Todd; Pham, Phung K.; Grajower, Daniel N.; Nager, Alan L.

    2014-01-01

    Objectives To compare traditional nurse triage (TNT) in a Pediatric Emergency Department (PED) to physician tele-presence (PTP). Methods Prospective, 2×2 crossover study with random assignment using a sample of walk-in patients seeking care in a PED at a large, tertiary care children’s hospital, from May 2012 to January 2013. Outcomes of triage times, documentation errors, triage scores, and survey responses were compared between TNT and PTP. Comparison between PTP to actual treating PED physicians regarding the accuracy of ordering blood and urine tests, throat cultures, and radiologic imaging was also studied. Results Paired samples t-tests showed a statistically significant difference in triage time between TNT and PTP (p=0.03), but no significant difference in documentation errors (p=0.10). Triage scores of TNT were 71% accurate, compared to PTP, which were 95% accurate. Both parents and children had favorable scores regarding PTP and the majority indicated they would prefer PTP again at their next PED visit. PTP diagnostic ordering was comparable to the actual PED physician ordering, showing no statistical differences. Conclusions Utilizing physician tele-presence technology to remotely perform triage is a feasible alternative to traditional nurse triage, with no clinically significant differences in time, triage scores, errors and patient and parent satisfaction. PMID:24445223

  1. Evaluation of single-point sampling strategies for the estimation of moclobemide exposure in depressive patients.

    PubMed

    Ignjatovic, Anita Rakic; Miljkovic, Branislava; Todorovic, Dejan; Timotijevic, Ivana; Pokrajac, Milena

    2011-05-01

    Because moclobemide pharmacokinetics vary considerably among individuals, monitoring of plasma concentrations lends insight into its pharmacokinetic behavior and enhances its rational use in clinical practice. The aim of this study was to evaluate whether single concentration-time points could adequately predict moclobemide systemic exposure. Pharmacokinetic data (full 7-point pharmacokinetic profiles), obtained from 21 depressive inpatients receiving moclobemide (150 mg 3 times daily), were randomly split into development (n = 18) and validation (n = 16) sets. Correlations between the single concentration-time points and the area under the concentration-time curve within a 6-hour dosing interval at steady-state (AUC(0-6)) were assessed by linear regression analyses. The predictive performance of single-point sampling strategies was evaluated in the validation set by mean prediction error, mean absolute error, and root mean square error. Plasma concentrations in the absorption phase yielded unsatisfactory predictions of moclobemide AUC(0-6). The best estimation of AUC(0-6) was achieved from concentrations at 4 and 6 hours following dosing. As the most reliable surrogate for moclobemide systemic exposure, concentrations at 4 and 6 hours should be used instead of predose trough concentrations as an indicator of between-patient variability and a guide for dose adjustments in specific clinical situations.
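
    The evaluation logic, regressing AUC(0-6) on a single concentration-time point in a development set and scoring the predictions in a validation set with mean prediction error, mean absolute error, and root mean square error, is straightforward to sketch. All numbers below are synthetic placeholders, not study data:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in: a single-time-point concentration vs. observed AUC(0-6),
# with development (n=18) and validation (n=16) sets as in the abstract.
c_dev = rng.uniform(0.5, 3.0, 18)
auc_dev = 5.2 * c_dev + rng.normal(0, 0.4, 18)
c_val = rng.uniform(0.5, 3.0, 16)
auc_val = 5.2 * c_val + rng.normal(0, 0.4, 16)

# Fit on the development set, predict the validation set.
slope, intercept = np.polyfit(c_dev, auc_dev, 1)
pred = slope * c_val + intercept

err = pred - auc_val
mpe = err.mean()                      # mean prediction error (bias)
mae = np.abs(err).mean()              # mean absolute error
rmse = np.sqrt((err ** 2).mean())     # root mean square error (precision)
print(mpe, mae, rmse)
```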

  2. Precipitation and Latent Heating Distributions from Satellite Passive Microwave Radiometry. Part II: Evaluation of Estimates Using Independent Data

    NASA Technical Reports Server (NTRS)

    Yang, Song; Olson, William S.; Wang, Jian-Jian; Bell, Thomas L.; Smith, Eric A.; Kummerow, Christian D.

    2006-01-01

    Rainfall rate estimates from spaceborne microwave radiometers are generally accepted as reliable by a majority of the atmospheric science community. One of the Tropical Rainfall Measuring Mission (TRMM) facility rain-rate algorithms is based upon passive microwave observations from the TRMM Microwave Imager (TMI). In Part I of this series, improvements of the TMI algorithm that are required to introduce latent heating as an additional algorithm product are described. Here, estimates of surface rain rate, convective proportion, and latent heating are evaluated using independent ground-based estimates and satellite products. Instantaneous, 0.5°-resolution estimates of surface rain rate over ocean from the improved TMI algorithm are well correlated with independent radar estimates (r ≈ 0.88 over the Tropics), but bias reduction is the most significant improvement over earlier algorithms. The bias reduction is attributed to the greater breadth of cloud-resolving model simulations that support the improved algorithm and the more consistent and specific convective/stratiform rain separation method utilized. The bias of monthly 2.5°-resolution estimates is similarly reduced, with comparable correlations to radar estimates. Although the amount of independent latent heating data is limited, TMI-estimated latent heating profiles compare favorably with instantaneous estimates based upon dual-Doppler radar observations, and time series of surface rain-rate and heating profiles are generally consistent with those derived from rawinsonde analyses. Still, some biases in profile shape are evident, and these may be resolved with (a) additional contextual information brought to the estimation problem and/or (b) physically consistent and representative databases supporting the algorithm. A model of the random error in instantaneous 0.5°-resolution rain-rate estimates appears to be consistent with the levels of error determined from TMI comparisons with collocated radar. Error model modifications for nonraining situations will be required, however. Sampling error represents only a portion of the total error in monthly 2.5°-resolution TMI estimates; the remaining error is attributed to random and systematic algorithm errors arising from the physical inconsistency and/or nonrepresentativeness of cloud-resolving-model-simulated profiles that support the algorithm.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Juan; Beltran, Chris J., E-mail: beltran.chris@mayo.edu; Herman, Michael G.

    Purpose: To quantitatively and systematically assess dosimetric effects induced by spot positioning error as a function of spot spacing (SS) on intensity-modulated proton therapy (IMPT) plan quality and to facilitate evaluation of safety tolerance limits on spot position. Methods: Spot position errors (PE) ranging from 1 to 2 mm were simulated. Simple plans were created on a water phantom, and IMPT plans were calculated on two pediatric patients with a brain tumor of 28 and 3 cc, respectively, using a commercial planning system. For the phantom, a uniform dose was delivered to targets located at different depths from 10 to 20 cm with various field sizes from 2² to 15² cm². Two nominal spot sizes, 4.0 and 6.6 mm of 1σ in water at isocenter, were used for treatment planning. The SS ranged from 0.5σ to 1.5σ, which is 2–6 mm for the small spot size and 3.3–9.9 mm for the large spot size. Various perturbation scenarios of a single spot error and systematic and random multiple spot errors were studied. To quantify the dosimetric effects, percent dose error (PDE) depth profiles and the value of percent dose error at the maximum dose difference (PDE[ΔDmax]) were used for evaluation. Results: A pair of hot and cold spots was created per spot shift. PDE[ΔDmax] is found to be a complex function of PE, SS, spot size, depth, and global spot distribution that can be well defined in simple models. For volumetric targets, the PDE[ΔDmax] is not noticeably affected by the change of field size or target volume within the studied ranges. In general, reducing SS decreased the dose error. For the facility studied, given a single spot error with a PE of 1.2 mm and for both spot sizes, a SS of 1σ resulted in a 2% maximum dose error; a SS larger than 1.25σ substantially increased the dose error and its sensitivity to PE. A similar trend was observed in multiple spot errors (both systematic and random errors). Systematic PE can lead to noticeable hot spots along the field edges, which may be near critical structures. However, random PE showed minimal dose error. Conclusions: Dose error dependence for PE was quantitatively and systematically characterized and an analytic tool was built to simulate systematic and random errors for patient-specific IMPT. This information facilitates the determination of facility-specific spot position error thresholds.

  4. On the selection of gantry and collimator angles for isocenter localization using Winston-Lutz tests.

    PubMed

    Du, Weiliang; Johnson, Jennifer L; Jiang, Wei; Kudchadker, Rajat J

    2016-01-08

    In Winston-Lutz (WL) tests, the isocenter of a linear accelerator (linac) is determined as the intersection of radiation central axes (CAX) from multiple gantry, collimator, and couch angles. It is well known that the CAX can wobble due to mechanical imperfections of the linac. Previous studies suggested that the wobble varies with gantry and collimator angles. Therefore, the isocenter determined in the WL tests has a profound dependence on the gantry and collimator angles at which CAX are sampled. In this study, we evaluated the systematic and random errors in the isocenters determined with different CAX sampling schemes. Digital WL tests were performed on six linacs. For each WL test, 63 CAX were sampled at nine gantry angles and seven collimator angles. Subsets of these data were used to simulate the effects of various CAX sampling schemes. An isocenter was calculated from each subset of CAX and compared against the reference isocenter, which was calculated from 48 opposing CAX. The differences between the calculated isocenters and the reference isocenters ranged from 0 to 0.8 mm. The differences diminished to less than 0.2 mm when 24 or more CAX were sampled. Isocenters determined with collimator 0° were vertically lower than those determined with collimator 90° and 270°. Isocenter localization errors in the longitudinal direction (along the axis of gantry rotation) showed a strong dependence on the collimator angle selected. The errors in all directions were significantly reduced when opposing collimator angles and opposing gantry angles were employed. The isocenter localization errors were less than 0.2 mm with the common CAX sampling scheme, which used four cardinal gantry angles and two opposing collimator angles. Reproducibility studies on one linac showed that the mean and maximum variations of CAX during the WL tests were 0.053 mm and 0.30 mm, respectively. The maximal variation in the resulting isocenters was 0.068 mm if 48 CAX were used, or 0.13 mm if four CAX were used. Quantitative results from this study are useful for understanding and minimizing the isocenter uncertainty in WL tests.
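
    Geometrically, each sampled CAX is a 3-D line, and the isocenter sought is the point minimizing the summed squared distance to all sampled lines, which has a closed-form normal-equation solution. A sketch of that least-squares intersection under a generic line parameterization (not tied to any vendor's analysis software):

```python
import numpy as np

def isocenter_from_cax(points, directions):
    """Least-squares intersection of 3-D lines.

    Each line i passes through points[i] with direction directions[i].
    Solves sum_i (I - d_i d_i^T) (x - p_i) = 0 for x.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(points, directions):
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)   # projector onto plane normal to d
        A += M
        b += M @ p
    return np.linalg.solve(A, b)

# Toy example: CAX lines from 4 cardinal gantry angles, wobbling ~0.1 mm.
rng = np.random.default_rng(8)
angles = np.deg2rad([0, 90, 180, 270])
dirs = np.stack([np.sin(angles), np.zeros(4), -np.cos(angles)], axis=1)
pts = rng.normal(0, 0.1, (4, 3))       # offsets from a perfect isocenter at origin
print(isocenter_from_cax(pts, dirs))   # near (0, 0, 0)
```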

  5. Stochastic sampled-data control for synchronization of complex dynamical networks with control packet loss and additive time-varying delays.

    PubMed

    Rakkiyappan, R; Sakthivel, N; Cao, Jinde

    2015-06-01

    This study examines the exponential synchronization of complex dynamical networks with control packet loss and additive time-varying delays. Additionally, a sampled-data controller with a time-varying sampling period is considered; the period is assumed to switch among m different values in a random way with given probabilities. A novel Lyapunov-Krasovskii functional (LKF) with triple integral terms is then constructed, and by using Jensen's inequality and the reciprocally convex approach, sufficient conditions under which the dynamical network is exponentially mean-square stable are derived. When applying Jensen's inequality to partition double integral terms in the derivation of the linear matrix inequality (LMI) conditions, a new kind of linear combination of positive functions weighted by the inverses of squared convex parameters appears. In order to handle such a combination, an effective method is introduced by extending the lower bound lemma. To design the sampled-data controller, the synchronization error system is represented as a switched system. Based on the derived LMI conditions and the average dwell-time method, sufficient conditions for the synchronization of the switched error system are derived in terms of LMIs. Finally, a numerical example is employed to show the effectiveness of the proposed methods. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Testing the Recognition and Perception of Errors in Context

    ERIC Educational Resources Information Center

    Brandenburg, Laura C.

    2015-01-01

    This study tests the recognition of errors in context and whether the presence of errors affects the reader's perception of the writer's ethos. In an experimental, posttest only design, participants were randomly assigned a memo to read in an online survey: one version with errors and one version without. Of the six intentional errors in version…

  7. Exploring Measurement Error with Cookies: A Real and Virtual Approach via Interactive Excel

    ERIC Educational Resources Information Center

    Sinex, Scott A; Gage, Barbara A.; Beck, Peggy J.

    2007-01-01

    A simple, guided-inquiry investigation using stacked sandwich cookies is employed to develop a simple linear mathematical model and to explore measurement error by incorporating errors as part of the investigation. Both random and systematic errors are presented. The model and errors are then investigated further by engaging with an interactive…

  8. Bayesian Methods for the Physical Sciences. Learning from Examples in Astronomy and Physics.

    NASA Astrophysics Data System (ADS)

    Andreon, Stefano; Weaver, Brian

    2015-05-01

    Chapter 1: This chapter presents some basic steps for performing a good statistical analysis, all summarized in about one page. Chapter 2: This short chapter introduces the basics of probability theory in an intuitive fashion using simple examples. It also illustrates, again with examples, how to propagate errors and the difference between marginal and profile likelihoods. Chapter 3: This chapter introduces the computational tools and methods that we use for sampling from the posterior distribution. Since all numerical computations (and Bayesian ones are no exception) may end in errors, we also provide a few tips to check that the numerical computation is sampling from the posterior distribution. Chapter 4: Many of the concepts of building, running, and summarizing the results of a Bayesian analysis are described in this step-by-step guide using a basic (Gaussian) model. The chapter also introduces examples using Poisson and Binomial likelihoods, and how to combine repeated independent measurements. Chapter 5: All statistical analyses make assumptions, and Bayesian analyses are no exception. This chapter emphasizes that results depend on data and priors (assumptions). We illustrate this concept with examples where the prior plays greatly different roles, from major to negligible. We also provide some advice on how to look for information useful for sculpting the prior. Chapter 6: In this chapter we consider examples for which we want to estimate more than a single parameter. These common problems include estimating location and spread. We also consider examples that require the modeling of two populations (one we are interested in and a nuisance population) or averaging incompatible measurements. We also introduce quite complex examples dealing with upper limits and with a larger-than-expected scatter. Chapter 7: Rarely is a sample randomly selected from the population we wish to study. Often, samples are affected by selection effects, e.g., easier-to-collect events or objects are over-represented in samples and difficult-to-collect ones are under-represented, if not missing altogether. In this chapter we show how to account for non-random data collection to infer the properties of the population from the studied sample. Chapter 8: In this chapter we introduce regression models, i.e., how to fit (regress) one or more quantities against each other through a functional relationship and estimate any unknown parameters that dictate this relationship. Questions of interest include: how to deal with samples affected by selection effects? How does a rich data structure influence the fitted parameters? And what about non-linear multiple-predictor fits, upper/lower limits, measurement errors of different amplitudes, and an intrinsic variety in the studied populations or an extra source of variability? A number of examples illustrate how to answer these questions and how to predict the value of an unavailable quantity by exploiting the existence of a trend with another, available, quantity. Chapter 9: This chapter provides some advice on how the careful scientist should perform model checking and sensitivity analysis, i.e., how to answer the following questions: is the considered model at odds with the currently available data (the fitted data), for example because it is over-simplified compared to some specific complexity pointed out by the data? Furthermore, are the data informative about the quantity being measured, or are results sensibly dependent on details of the fitted model? And, finally, what if the assumptions themselves are uncertain? A number of examples illustrate how to answer these questions. Chapter 10: This chapter compares the performance of Bayesian methods against simple, non-Bayesian alternatives, such as maximum likelihood, minimal chi square, ordinary and weighted least squares, bivariate correlated errors and intrinsic scatter, and robust estimates of location and scale. Performances are evaluated in terms of quality of the prediction, accuracy of the estimates, and fairness and noisiness of the quoted errors. We also focus on three failures of maximum likelihood methods occurring with small samples, with mixtures, and with regressions with errors in the predictor quantity.
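
    Chapter 3's advice, checking that a numerical computation really is sampling from the posterior, can be illustrated with a minimal random-walk Metropolis sketch. The Gaussian-mean example and all names below are ours, not the book's:

    ```python
    import numpy as np

    def metropolis(log_post, x0, n_steps=20000, step=0.5, seed=1):
        """Random-walk Metropolis sampling from an unnormalized log-posterior."""
        rng = np.random.default_rng(seed)
        x, lp = x0, log_post(x0)
        chain, accepted = [], 0
        for _ in range(n_steps):
            prop = x + step * rng.normal()
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
                x, lp = prop, lp_prop
                accepted += 1
            chain.append(x)
        return np.array(chain), accepted / n_steps

    # Gaussian mean with known sigma=1 and a flat prior: the posterior is
    # N(ybar, 1/n), so the chain's mean and spread are easy sanity checks.
    y = np.array([0.8, 1.3, 1.1, 0.7, 1.0])
    log_post = lambda mu: -0.5 * np.sum((y - mu) ** 2)
    chain, acc = metropolis(log_post, x0=0.0)
    # Discard burn-in; also inspect the acceptance rate as a basic diagnostic.
    print(chain[2000:].mean(), chain[2000:].std(), acc)  # ~ybar, ~1/sqrt(5)
    ```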

  9. The systematic component of phylogenetic error as a function of taxonomic sampling under parsimony.

    PubMed

    Debry, Ronald W

    2005-06-01

    The effect of taxonomic sampling on phylogenetic accuracy under parsimony is examined by simulating nucleotide sequence evolution. Random error is minimized by using very large numbers of simulated characters. This allows estimation of the consistency behavior of parsimony, even for trees with up to 100 taxa. Data were simulated on 8 distinct 100-taxon model trees and analyzed as stratified subsets containing either 25 or 50 taxa, in addition to the full 100-taxon data set. Overall accuracy decreased in a majority of cases when taxa were added. However, the magnitude of change in the cases in which accuracy increased was larger than the magnitude of change in the cases in which accuracy decreased, so, on average, overall accuracy increased as more taxa were included. A stratified sampling scheme was used to assess accuracy for an initial subsample of 25 taxa. The 25-taxon analyses were compared to 50- and 100-taxon analyses that were pruned to include only the original 25 taxa. On average, accuracy for the 25 taxa was improved by taxon addition, but there was considerable variation in the degree of improvement among the model trees and across different rates of substitution.

  10. Robustly Aligning a Shape Model and Its Application to Car Alignment of Unknown Pose.

    PubMed

    Li, Yan; Gu, Leon; Kanade, Takeo

    2011-09-01

    Precisely localizing in an image a set of feature points that form a shape of an object, such as a car or a face, is called alignment. Previous shape alignment methods attempted to fit a whole shape model to the observed data, based on the assumption of Gaussian observation noise and the associated regularization process. However, such an approach, though able to deal with Gaussian noise in feature detection, turns out not to be robust or precise because it is vulnerable to gross feature detection errors or outliers resulting from partial occlusions or spurious features from the background or neighboring objects. We address this problem by adopting a randomized hypothesis-and-test approach. First, a Bayesian inference algorithm is developed to generate a shape-and-pose hypothesis of the object from a partial shape or a subset of feature points. For alignment, a large number of hypotheses are generated by randomly sampling subsets of feature points, and then evaluated to find the one that minimizes the shape prediction error. This method of randomized subset-based matching can effectively handle outliers and recover the correct object shape. We apply this approach on a challenging data set of over 5,000 differently posed car images, spanning a wide variety of car types, lighting, background scenes, and partial occlusions. Experimental results demonstrate favorable improvements over previous methods on both accuracy and robustness.
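
    The hypothesize-and-test loop is structurally similar to RANSAC. The sketch below is our simplification: the "shape model" is reduced to a similarity transform of a mean shape fitted by least squares, standing in for the paper's Bayesian shape-and-pose inference.

    ```python
    import numpy as np

    def align_by_random_subsets(mean_shape, detections, n_hyp=200, k=3, seed=0):
        """Hypothesize-and-test alignment: fit a similarity transform to random
        k-point subsets of the detections and keep the hypothesis with the
        smallest median prediction error (robust to outlier detections)."""
        rng = np.random.default_rng(seed)
        n = len(mean_shape)
        best_err, best_T = np.inf, None
        for _ in range(n_hyp):
            idx = rng.choice(n, size=k, replace=False)
            # Linear system for (a, b, tx, ty): x' = a*x - b*y + tx,
            #                                   y' = b*x + a*y + ty.
            X = np.zeros((2 * k, 4))
            X[0::2] = np.c_[mean_shape[idx, 0], -mean_shape[idx, 1],
                            np.ones(k), np.zeros(k)]
            X[1::2] = np.c_[mean_shape[idx, 1], mean_shape[idx, 0],
                            np.zeros(k), np.ones(k)]
            params, *_ = np.linalg.lstsq(X, detections[idx].ravel(), rcond=None)
            a, b, tx, ty = params
            pred = mean_shape @ np.array([[a, b], [-b, a]]) + [tx, ty]
            err = np.median(np.sum((pred - detections) ** 2, axis=1))
            if err < best_err:
                best_err, best_T = err, params
        return best_T, best_err

    # Toy demo: recover a similarity transform despite one gross outlier.
    rng = np.random.default_rng(1)
    shape = rng.normal(size=(12, 2))
    det = shape @ np.array([[2.0, 0.3], [-0.3, 2.0]]) + [1.0, -0.5]
    det[0] += 10.0  # gross detection error
    print(align_by_random_subsets(shape, det)[0])  # ~[2.0, 0.3, 1.0, -0.5]
    ```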

  11. Perceptions of Randomness: Why Three Heads Are Better than Four

    ERIC Educational Resources Information Center

    Hahn, Ulrike; Warren, Paul A.

    2009-01-01

    A long tradition of psychological research has lamented the systematic errors and biases in people's perception of the characteristics of sequences generated by a random mechanism such as a coin toss. It is proposed that once the likely nature of people's actual experience of such processes is taken into account, these "errors" and "biases"…

  12. Syzygies, Pluricanonical Maps, and the Birational Geometry of Varieties of Maximal Albanese Dimension

    NASA Astrophysics Data System (ADS)

    Tesfagiorgis, Kibrewossen B.

    Satellite Precipitation Estimates (SPEs) may be the only available source of information for operational hydrologic and flash flood prediction due to spatial limitations of radar and gauge products in mountainous regions. The present work develops an approach to seamlessly blend satellite, available radar, climatological, and gauge precipitation products to fill gaps in the ground-based radar precipitation field. To mix different precipitation products, the error of any of the products relative to each other should be removed. For bias correction, the study uses a new ensemble-based method which aims to estimate spatially varying multiplicative biases in SPEs using a radar-gauge precipitation product. Bias factors were calculated for a randomly selected sample of rainy pixels in the study area. Spatial fields of estimated bias were generated taking into account spatial variation and random errors in the sampled values. In addition to biases, there is sometimes also spatial error between the radar and satellite precipitation estimates; one of them has to be geometrically corrected with reference to the other. A set of corresponding raining points between SPE and radar products is selected to apply linear registration using a regularized least squares technique to minimize the dislocation error in SPEs with respect to available radar products. A weighted Successive Correction Method (SCM) is used to merge the error-corrected satellite and radar precipitation estimates. In addition to SCM, we use a combination of SCM and a Bayesian spatial method for merging the rain gauge and climatological precipitation sources with radar and SPEs. We demonstrated the method using two satellite-based products, CPC Morphing (CMORPH) and Hydro-Estimator (HE), two radar-gauge based products, Stage-II and ST-IV, a climatological product, PRISM, and a rain gauge dataset for several rain events from 2006 to 2008 over different geographical locations of the United States. Results show that: (a) the method of ensembles helped reduce biases in SPEs significantly; (b) the SCM method in combination with the Bayesian spatial model produced a precipitation product in good agreement with independent measurements. The study implies that, using the available radar pixels surrounding the gap area, rain gauge, PRISM, and satellite products, a radar-like product is achievable over radar gap areas that benefits the operational meteorology and hydrology community.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Elliott, C.J.; McVey, B.; Quimby, D.C.

    The level of field errors in an FEL is an important determinant of its performance. We have computed 3D performance of a large laser subsystem subjected to field errors of various types. These calculations have been guided by simple models such as SWOOP. The technique of choice is utilization of the FELEX free electron laser code that now possesses extensive engineering capabilities. Modeling includes the ability to establish tolerances of various types: fast and slow scale field bowing, field error level, beam position monitor error level, gap errors, defocusing errors, energy slew, displacement and pointing errors. Many effects of these errors on relative gain and relative power extraction are displayed and are the essential elements of determining an error budget. The random errors also depend on the particular random number seed used in the calculation. The simultaneous display of the performance versus error level of cases with multiple seeds illustrates the variations attributable to stochasticity of this model. All these errors are evaluated numerically for comprehensive engineering of the system. In particular, gap errors are found to place requirements beyond mechanical tolerances of ±25 µm, and amelioration of these may occur by a procedure utilizing direct measurement of the magnetic fields at assembly time. 4 refs., 12 figs.

  14. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
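
    The mechanics of an unrestricted WLS meta-regression can be sketched in a few lines: point estimates use fixed-effects (inverse-variance) weights, but the multiplicative dispersion is estimated from the data rather than fixed at 1, which is how the standard errors differ from FE-MRA. This is our reading of the estimator, with illustrative simulated data:

    ```python
    import numpy as np

    def wls_mra(effect, se, X):
        """Unrestricted WLS meta-regression: inverse-variance weights as in
        fixed effects, but with the multiplicative dispersion estimated from
        the data instead of being fixed at 1."""
        w = 1.0 / se**2
        Xw = X * np.sqrt(w)[:, None]
        yw = effect * np.sqrt(w)
        beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
        resid = yw - Xw @ beta
        s2 = resid @ resid / (len(effect) - X.shape[1])  # unrestricted dispersion
        cov = s2 * np.linalg.inv(Xw.T @ Xw)
        return beta, np.sqrt(np.diag(cov))

    # Toy meta-regression: intercept plus one moderator, 30 studies.
    rng = np.random.default_rng(3)
    se = rng.uniform(0.05, 0.3, size=30)
    x = rng.normal(size=30)
    effect = 0.2 + 0.1 * x + rng.normal(scale=se)
    X = np.c_[np.ones(30), x]
    print(wls_mra(effect, se, X))
    ```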

  15. The Influence of Training Phase on Error of Measurement in Jump Performance.

    PubMed

    Taylor, Kristie-Lee; Hopkins, Will G; Chapman, Dale W; Cronin, John B

    2016-03-01

    The purpose of this study was to calculate the coefficients of variation in jump performance for individual participants in multiple trials over time to determine the extent to which there are real differences in the error of measurement between participants. The effect of training phase on measurement error was also investigated. Six subjects participated in a resistance-training intervention for 12 wk with mean power from a countermovement jump measured 6 d/wk. Using a mixed-model meta-analysis, differences between subjects, within-subject changes between training phases, and the mean error values during different phases of training were examined. Small but substantial factor differences of 1.11 were observed between subjects; however, the finding was unclear based on the width of the confidence limits. The mean error was clearly higher during overload training than baseline training, by a factor of ×/÷ 1.3 (confidence limits 1.0-1.6). The random factor representing the interaction between subjects and training phases revealed further substantial differences of ×/÷ 1.2 (1.1-1.3), indicating that on average, the error of measurement in some subjects changes more than in others when overload training is introduced. The results from this study provide the first indication that within-subject variability in performance is substantially different between training phases and, possibly, different between individuals. The implications of these findings for monitoring individuals and estimating sample size are discussed.

  16. Statistical model for speckle pattern optimization.

    PubMed

    Su, Yong; Zhang, Qingchuan; Gao, Zeren

    2017-11-27

    Image registration is the key technique of optical metrologies such as digital image correlation (DIC), particle image velocimetry (PIV), and speckle metrology. Its performance depends critically on the quality of the image pattern, and thus pattern optimization attracts extensive attention. In this article, a statistical model is built to optimize speckle patterns that are composed of randomly positioned speckles. It is found that the process of speckle pattern generation is essentially a filtered Poisson process. The dependence of measurement errors (including systematic errors, random errors, and overall errors) upon speckle pattern generation parameters is characterized analytically. By minimizing the errors, formulas of the optimal speckle radius are presented. Although the primary motivation is from the field of DIC, we believe that scholars in other optical measurement communities, such as PIV and speckle metrology, will benefit from these discussions.
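
    A filtered Poisson process of the kind described can be rendered directly: draw a Poisson-distributed number of speckle centers, place them uniformly, and superpose a Gaussian-shaped speckle at each. All parameter values below are illustrative, not the paper's optima:

    ```python
    import numpy as np

    def speckle_pattern(width=128, height=128, density=0.01, radius=3.0, seed=0):
        """Render a speckle image as a filtered Poisson process: Poisson-many
        speckles at uniform positions, each a Gaussian blob of given radius."""
        rng = np.random.default_rng(seed)
        n = rng.poisson(density * width * height)     # Poisson speckle count
        cx = rng.uniform(0, width, n)
        cy = rng.uniform(0, height, n)
        yy, xx = np.mgrid[0:height, 0:width]
        img = np.zeros((height, width))
        for x0, y0 in zip(cx, cy):                    # superpose the "filter"
            img += np.exp(-((xx - x0)**2 + (yy - y0)**2) / (2 * radius**2))
        return np.clip(img, 0, 1)

    pattern = speckle_pattern()
    print(pattern.shape, pattern.mean())
    ```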

  17. A theoretical basis for the analysis of redundant software subject to coincident errors

    NASA Technical Reports Server (NTRS)

    Eckhardt, D. E., Jr.; Lee, L. D.

    1985-01-01

    Fundamental to the development of redundant software techniques, such as fault-tolerant software, is an understanding of the impact of multiple joint occurrences of coincident errors. A theoretical basis for the study of redundant software is developed which provides a probabilistic framework for empirically evaluating the effectiveness of the general (N-Version) strategy when component versions are subject to coincident errors, and permits an analytical study of the effects of these errors. The basic assumptions of the model are: (1) independently designed software components are chosen in a random sample; and (2) in the user environment, the system is required to execute on a stationary input series. The intensity of coincident errors has a central role in the model. This function describes the propensity to introduce design faults in such a way that software components fail together when executing in the user environment. The model is used to give conditions under which an N-Version system is a better strategy for reducing system failure probability than relying on a single version of software. A condition which limits the effectiveness of a fault-tolerant strategy is studied, and it is posed whether system failure probability varies monotonically with increasing N or whether an optimal choice of N exists.
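
    The role of the intensity function can be illustrated numerically. In the hedged sketch below we assume, purely for illustration, a Beta distribution for the per-input failure probability theta(x); coincident errors enter through the spread of theta rather than through any explicit pairwise correlation term:

    ```python
    import numpy as np
    from scipy.stats import binom

    # theta(x): probability that a randomly chosen version fails on input x.
    # An illustrative Beta distribution over inputs: mostly easy inputs,
    # a few hard ones on which versions tend to fail together.
    rng = np.random.default_rng(7)
    theta = rng.beta(0.5, 50.0, size=100_000)

    p_single = theta.mean()
    N = 3
    # Majority-vote failure: at least 2 of 3 versions fail on the same input.
    p_majority = binom.sf((N - 1) // 2, N, theta).mean()

    print(f"single version:     {p_single:.5f}")
    print(f"3-version majority: {p_majority:.5f}")
    # With independent failures p_majority would be roughly 3*p^2; spread in
    # theta (coincident errors) can make N-version voting far less beneficial.
    ```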

  18. Asymmetric Memory Circuit Would Resist Soft Errors

    NASA Technical Reports Server (NTRS)

    Buehler, Martin G.; Perlman, Marvin

    1990-01-01

    Some nonlinear error-correcting codes are more efficient in the presence of asymmetry. A combination of circuit-design and coding concepts is expected to make integrated-circuit random-access memories more resistant to "soft" errors (temporary bit errors, also called "single-event upsets", due to ionizing radiation). An integrated circuit of a new type is made deliberately more susceptible to one kind of bit error than to the other, and the associated error-correcting code is adapted to exploit this asymmetry in error probabilities.

  19. Complementary nonparametric analysis of covariance for logistic regression in a randomized clinical trial setting.

    PubMed

    Tangen, C M; Koch, G G

    1999-03-01

    In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is an (unconditional) population-average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population-average treatment parameters and conditional subpopulation-average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.

  20. Theoretical analysis on the measurement errors of local 2D DIC: Part I temporal and spatial uncertainty quantification of displacement measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Yueqi; Lava, Pascal; Reu, Phillip

    This study presents a theoretical uncertainty quantification of displacement measurements by subset-based 2D-digital image correlation. A generalized solution to estimate the random error of displacement measurement is presented. The obtained solution suggests that the random error of displacement measurements is determined by the image noise, the summation of the intensity gradient in a subset, the subpixel part of displacement, and the interpolation scheme. The proposed method is validated with virtual digital image correlation tests.

  1. Theoretical analysis on the measurement errors of local 2D DIC: Part I temporal and spatial uncertainty quantification of displacement measurements

    DOE PAGES

    Wang, Yueqi; Lava, Pascal; Reu, Phillip; ...

    2015-12-23

    This study presents a theoretical uncertainty quantification of displacement measurements by subset-based 2D-digital image correlation. A generalized solution to estimate the random error of displacement measurement is presented. The obtained solution suggests that the random error of displacement measurements is determined by the image noise, the summation of the intensity gradient in a subset, the subpixel part of displacement, and the interpolation scheme. The proposed method is validated with virtual digital image correlation tests.
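
    The structure of such a solution can be sketched as a function of the quantities named in the abstract. The specific form below (noise over the root of the summed squared intensity gradient, with interpolation effects ignored) follows the general shape reported in the DIC error literature and is an assumption, not a transcription of the paper's equation:

    ```python
    import numpy as np

    def dic_displacement_noise(subset, sigma_noise):
        """Rough standard-deviation estimate of the random error of a DIC
        displacement measurement in x: image noise divided by the root of the
        summed squared intensity gradient over the subset (illustrative form,
        interpolation and subpixel effects ignored)."""
        gx = np.gradient(subset.astype(float), axis=1)  # intensity gradient in x
        ssg = np.sum(gx**2)                             # summed squared gradient
        return np.sqrt(2.0) * sigma_noise / np.sqrt(ssg)

    # A high-contrast subset tolerates more camera noise than a flat one.
    rng = np.random.default_rng(2)
    textured = rng.uniform(0, 255, size=(31, 31))
    flat = np.full((31, 31), 128.0) + rng.normal(scale=1.0, size=(31, 31))
    print(dic_displacement_noise(textured, sigma_noise=2.0))
    print(dic_displacement_noise(flat, sigma_noise=2.0))
    ```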

  2. Performance of statistical models to predict mental health and substance abuse cost.

    PubMed

    Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K

    2006-10-26

    Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice. Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample. The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples. Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.
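
    The model comparison itself is easy to reproduce in outline. The sketch below fits the paper's five candidate specifications to synthetic cost-like data with statsmodels and compares RMSE on the original scale; the data-generating process and all settings are ours:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 5000
    X = sm.add_constant(rng.normal(size=(n, 3)))
    mu = np.exp(1.0 + 0.5 * X[:, 1])                 # skewed, cost-like outcome
    y = rng.gamma(shape=2.0, scale=mu / 2.0)

    candidates = {
        "OLS on cost": sm.OLS(y, X),
        "OLS on log(cost)": sm.OLS(np.log(y), X),
        "OLS on sqrt(cost)": sm.OLS(np.sqrt(y), X),
        "Gamma, log link": sm.GLM(y, X, family=sm.families.Gamma(
            sm.families.links.Log())),
        "Gamma, sqrt link": sm.GLM(y, X, family=sm.families.Gamma(
            sm.families.links.Power(power=0.5))),
    }
    for name, model in candidates.items():
        pred = model.fit().predict(X)
        if "log(" in name:
            pred = np.exp(pred)   # naive back-transform; smearing ignored here
        elif "sqrt(" in name:
            pred = pred**2
        rmse = np.sqrt(np.mean((y - pred) ** 2))
        print(f"{name:20s} RMSE = {rmse:8.2f}")
    ```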

  3. The relationships among work stress, strain and self-reported errors in UK community pharmacy.

    PubMed

    Johnson, S J; O'Connor, E M; Jacobs, S; Hassell, K; Ashcroft, D M

    2014-01-01

    Changes in the UK community pharmacy profession including new contractual frameworks, expansion of services, and increasing levels of workload have prompted concerns about rising levels of workplace stress and overload. This has implications for pharmacist health and well-being and the occurrence of errors that pose a risk to patient safety. Despite these concerns being voiced in the profession, few studies have explored work stress in the community pharmacy context. To investigate work-related stress among UK community pharmacists and to explore its relationships with pharmacists' psychological and physical well-being, and the occurrence of self-reported dispensing errors and detection of prescribing errors. A cross-sectional postal survey of a random sample of practicing community pharmacists (n = 903) used ASSET (A Shortened Stress Evaluation Tool) and questions relating to self-reported involvement in errors. Stress data were compared to general working population norms, and regressed on well-being and self-reported errors. Analysis of the data revealed that pharmacists reported significantly higher levels of workplace stressors than the general working population, with concerns about work-life balance, the nature of the job, and work relationships being the most influential on health and well-being. Despite this, pharmacists were not found to report worse health than the general working population. Self-reported error involvement was linked to both high dispensing volume and being troubled by perceived overload (dispensing errors), and resources and communication (detection of prescribing errors). This study contributes to the literature by benchmarking community pharmacists' health and well-being, and investigating sources of stress using a quantitative approach. A further important contribution to the literature is the identification of a quantitative link between high workload and self-reported dispensing errors. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Prevalence of refractive error and visual impairment among rural school-age children of Goro District, Gurage Zone, Ethiopia.

    PubMed

    Kedir, Jafer; Girma, Abonesh

    2014-10-01

    Refractive error is one of the major causes of blindness and visual impairment in children, but community-based studies are scarce, especially in rural parts of Ethiopia. This study therefore aims to assess the prevalence of refractive error and its magnitude as a cause of visual impairment among school-age children of a rural community. This community-based cross-sectional descriptive study was conducted from March 1 to April 30, 2009 in rural villages of Goro district of Gurage Zone, located southwest of Addis Ababa, the capital of Ethiopia. A multistage cluster sampling method was used with simple random selection of representative villages in the district. Chi-square and t-tests were used in the data analysis. A total of 570 school-age children (age 7-15) were evaluated, 54% boys and 46% girls. The prevalence of refractive error was 3.5% (myopia 2.6% and hyperopia 0.9%). Refractive error was the major cause of visual impairment, accounting for 54% of all causes in the study group. No child was found wearing corrective spectacles during the study period. Refractive error was the commonest cause of visual impairment in children of the district, but no measures were taken to reduce the burden in the community. Large-scale community-level screening for refractive error should therefore be conducted and integrated with regular school eye screening programs. Effective strategies need to be devised to provide low-cost corrective spectacles in the rural community.

  5. Violation of the Sphericity Assumption and Its Effect on Type-I Error Rates in Repeated Measures ANOVA and Multi-Level Linear Models (MLM).

    PubMed

    Haverkamp, Nicolas; Beauducel, André

    2017-01-01

    We investigated the effects of violations of the sphericity assumption on Type I error rates for different methodical approaches of repeated measures analysis using a simulation approach. In contrast to previous simulation studies on this topic, up to nine measurement occasions were considered. Effects of the level of inter-correlations between measurement occasions on Type I error rates were considered for the first time. Two populations with non-violation of the sphericity assumption, one with uncorrelated measurement occasions and one with moderately correlated measurement occasions, were generated. One population with violation of the sphericity assumption combines uncorrelated with highly correlated measurement occasions. A second population with violation of the sphericity assumption combines moderately correlated and highly correlated measurement occasions. From these four populations, without any between-group effect or within-subject effect, 5,000 random samples were drawn. Finally, the mean Type I error rates for multilevel linear models (MLM) with an unstructured covariance matrix (MLM-UN), MLM with compound symmetry (MLM-CS), and for repeated measures analysis of variance (rANOVA) models (without correction, with Greenhouse-Geisser correction, and with Huynh-Feldt correction) were computed. To examine the effect of both the sample size and the number of measurement occasions, sample sizes of n = 20, 40, 60, 80, and 100 were considered, as well as measurement occasions of m = 3, 6, and 9. With respect to rANOVA, the results argue for the use of rANOVA with Huynh-Feldt correction, especially when the sphericity assumption is violated, the sample size is rather small, and the number of measurement occasions is large. For MLM-UN, the results illustrate a massive progressive bias for small sample sizes (n = 20) and m = 6 or more measurement occasions. This effect could not be found in previous simulation studies with a smaller number of measurement occasions. The proportionality of bias and number of measurement occasions should be considered when MLM-UN is used. The good news is that this proportionality can be compensated by means of large sample sizes. Accordingly, MLM-UN can be recommended even for small sample sizes for about three measurement occasions and for large sample sizes for about nine measurement occasions.

  6. Error Analysis of Indirect Broadband Monitoring of Multilayer Optical Coatings using Computer Simulations

    NASA Astrophysics Data System (ADS)

    Semenov, Z. V.; Labusov, V. A.

    2017-11-01

    Results of studying the errors of indirect monitoring by means of computer simulations are reported. The monitoring method is based on measuring spectra of reflection from additional monitoring substrates in a wide spectral range. Special software (Deposition Control Simulator) is developed, which allows one to estimate the influence of the monitoring system parameters (noise of the photodetector array, operating spectral range of the spectrometer and errors of its calibration in terms of wavelengths, drift of the radiation source intensity, and errors in the refractive index of deposited materials) on the random and systematic errors of deposited layer thickness measurements. The direct and inverse problems of multilayer coatings are solved using the OptiReOpt library. Curves of the random and systematic errors of measurements of the deposited layer thickness as functions of the layer thickness are presented for various values of the system parameters. Recommendations are given on using the indirect monitoring method for the purpose of reducing the layer thickness measurement error.

  7. Error analysis and algorithm implementation for an improved optical-electric tracking device based on MEMS

    NASA Astrophysics Data System (ADS)

    Sun, Hong; Wu, Qian-zhong

    2013-09-01

    To improve the precision of an optical-electric tracking device, an improved MEMS-based design is proposed that addresses the tracking error and random drift of the gyroscope sensor. Following the principles of time-series analysis of random sequences, an AR model of the gyro random error is established, and the gyro output signals are filtered repeatedly with a Kalman filter. An ARM microcontroller drives the servo motor under a fuzzy PID full closed-loop control algorithm, with lead-compensation and feed-forward links added to reduce the response lag to angle inputs: feed-forward allows the output to follow the input closely, while the lead-compensation link shortens the response to input signals and thereby reduces errors. A wireless video module gathers video signals and sends them to an upper computer, where remote monitoring software (Visual Basic 6.0) displays the servo motor state in real time. The main error sources are also analyzed in detail: quantitative analysis of the errors contributed by the bandwidth and the gyro sensor makes the proportion of each error in the overall error more intuitive and, consequently, helps decrease the error of the system. Simulation and experimental results show that the system has good following characteristics and is valuable for engineering applications.
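
    The core of the described gyro-drift compensation, an AR model of the random error tracked by a Kalman filter, can be sketched in scalar form. The AR(1) coefficient and noise variances below are illustrative, not the paper's identified values:

    ```python
    import numpy as np

    # AR(1) model of gyro random drift: x[k] = phi*x[k-1] + w, measured with v.
    phi, q, r = 0.995, 1e-6, 1e-3   # AR coefficient, process var, measurement var
    rng = np.random.default_rng(0)
    n = 2000
    x = np.zeros(n)
    for k in range(1, n):
        x[k] = phi * x[k - 1] + rng.normal(scale=np.sqrt(q))
    z = x + rng.normal(scale=np.sqrt(r), size=n)    # raw gyro output

    # Scalar Kalman filter tracking the drift state.
    xhat, P = 0.0, 1.0
    est = np.empty(n)
    for k in range(n):
        xhat, P = phi * xhat, phi**2 * P + q        # predict
        K = P / (P + r)                             # Kalman gain
        xhat, P = xhat + K * (z[k] - xhat), (1 - K) * P   # update
        est[k] = xhat

    print("raw RMS error     :", np.sqrt(np.mean((z - x) ** 2)))
    print("filtered RMS error:", np.sqrt(np.mean((est - x) ** 2)))
    ```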

  8. Sampling design for groundwater solute transport: Tests of methods and analysis of Cape Cod tracer test data

    USGS Publications Warehouse

    Knopman, Debra S.; Voss, Clifford I.; Garabedian, Stephen P.

    1991-01-01

    Tests of a one-dimensional sampling design methodology on measurements of bromide concentration collected during the natural gradient tracer test conducted by the U.S. Geological Survey on Cape Cod, Massachusetts, demonstrate its efficacy for field studies of solute transport in groundwater and the utility of one-dimensional analysis. The methodology was applied to design of sparse two-dimensional networks of fully screened wells typical of those often used in engineering practice. In one-dimensional analysis, designs consist of the downstream distances to rows of wells oriented perpendicular to the groundwater flow direction and the timing of sampling to be carried out on each row. The power of a sampling design is measured by its effectiveness in simultaneously meeting objectives of model discrimination, parameter estimation, and cost minimization. One-dimensional models of solute transport, differing in processes affecting the solute and assumptions about the structure of the flow field, were considered for description of tracer cloud migration. When fitting each model using nonlinear regression, additive and multiplicative error forms were allowed for the residuals which consist of both random and model errors. The one-dimensional single-layer model of a nonreactive solute with multiplicative error was judged to be the best of those tested. Results show the efficacy of the methodology in designing sparse but powerful sampling networks. Designs that sample five rows of wells at five or fewer times in any given row performed as well for model discrimination as the full set of samples taken up to eight times in a given row from as many as 89 rows. Also, designs for parameter estimation judged to be good by the methodology were as effective in reducing the variance of parameter estimates as arbitrary designs with many more samples. Results further showed that estimates of velocity and longitudinal dispersivity in one-dimensional models based on data from only five rows of fully screened wells each sampled five or fewer times were practically equivalent to values determined from moments analysis of the complete three-dimensional set of 29,285 samples taken during 16 sampling times.

  9. False-positive rate determination of protein target discovery using a covalent modification- and mass spectrometry-based proteomics platform.

    PubMed

    Strickland, Erin C; Geer, M Ariel; Hong, Jiyong; Fitzgerald, Michael C

    2014-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. Stability of proteins from rates of oxidation (SPROX) is an energetics-based technique for identifying the proteins targets of ligands in complex biological mixtures. Knowing the false-positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data is obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False-positive rates of 1.2-2.2% and <0.8% are calculated for SPROX experiments using Q-TOF and Orbitrap mass spectrometer systems, respectively. Our results indicate that the false-positive rate is largely determined by random errors associated with the mass spectral analysis of the isobaric mass tag (e.g., iTRAQ®) reporter ions used for peptide quantitation. Our results also suggest that technical replicates can be used to effectively eliminate such false positives that result from this random error, as is demonstrated in a SPROX experiment to identify yeast protein targets of the drug, manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false-positive rate of protein target discovery using SPROX is also discussed.

  10. Error Sources in Asteroid Astrometry

    NASA Technical Reports Server (NTRS)

    Owen, William M., Jr.

    2000-01-01

    Asteroid astrometry, like any other scientific measurement process, is subject to both random and systematic errors, not all of which are under the observer's control. To design an astrometric observing program or to improve an existing one requires knowledge of the various sources of error, how different errors affect one's results, and how various errors may be minimized by careful observation or data reduction techniques.

  11. Behavior of sensitivities in the one-dimensional advection-dispersion equation: Implications for parameter estimation and sampling design

    USGS Publications Warehouse

    Knopman, Debra S.; Voss, Clifford I.

    1987-01-01

    The spatial and temporal variability of sensitivities has a significant impact on parameter estimation and sampling design for studies of solute transport in porous media. Physical insight into the behavior of sensitivities is offered through an analysis of analytically derived sensitivities for the one-dimensional form of the advection-dispersion equation. When parameters are estimated in regression models of one-dimensional transport, the spatial and temporal variability in sensitivities influences variance and covariance of parameter estimates. Several principles account for the observed influence of sensitivities on parameter uncertainty. (1) Information about a physical parameter may be most accurately gained at points in space and time with a high sensitivity to the parameter. (2) As the distance of observation points from the upstream boundary increases, maximum sensitivity to velocity during passage of the solute front increases and the consequent estimate of velocity tends to have lower variance. (3) The frequency of sampling must be “in phase” with the S shape of the dispersion sensitivity curve to yield the most information on dispersion. (4) The sensitivity to the dispersion coefficient is usually at least an order of magnitude less than the sensitivity to velocity. (5) The assumed probability distribution of random error in observations of solute concentration determines the form of the sensitivities. (6) If variance in random error in observations is large, trends in sensitivities of observation points may be obscured by noise and thus have limited value in predicting variance in parameter estimates among designs. (7) Designs that minimize the variance of one parameter may not necessarily minimize the variance of other parameters. (8) The time and space interval over which an observation point is sensitive to a given parameter depends on the actual values of the parameters in the underlying physical system.
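
    The behavior of these sensitivities is easy to reproduce numerically. The sketch below uses the common erfc-type solution of the one-dimensional advection-dispersion equation (leading term only, which is our simplification) and central differences for dC/dv and dC/dD; parameter values are illustrative:

    ```python
    import numpy as np
    from scipy.special import erfc

    def conc(x, t, v, D, c0=1.0):
        """Leading term of the 1-D advection-dispersion solution for a
        continuous injection (erfc form; boundary terms omitted)."""
        return 0.5 * c0 * erfc((x - v * t) / (2.0 * np.sqrt(D * t)))

    def sensitivities(x, t, v, D, h=1e-6):
        """Central-difference sensitivities dC/dv and dC/dD."""
        dcdv = (conc(x, t, v + h, D) - conc(x, t, v - h, D)) / (2 * h)
        dcdD = (conc(x, t, v, D + h) - conc(x, t, v, D - h)) / (2 * h)
        return dcdv, dcdD

    # Sensitivities just before the solute front arrives, at two distances:
    # dC/dv grows with distance downstream, and dC/dD is much smaller,
    # consistent with principles (2) and (4) above.
    v, D = 0.5, 0.05                 # m/d and m^2/d (illustrative values)
    for x in (5.0, 20.0):
        t = 0.9 * x / v              # shortly before front arrival
        dcdv, dcdD = sensitivities(x, t, v, D)
        print(f"x={x:5.1f} m: dC/dv={dcdv:8.3f}, dC/dD={dcdD:8.3f}")
    ```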

  12. Correcting the Standard Errors of 2-Stage Residual Inclusion Estimators for Mendelian Randomization Studies.

    PubMed

    Palmer, Tom M; Holmes, Michael V; Keating, Brendan J; Sheehan, Nuala A

    2017-11-01

    Mendelian randomization studies use genotypes as instrumental variables to test for and estimate the causal effects of modifiable risk factors on outcomes. Two-stage residual inclusion (TSRI) estimators have been used when researchers are willing to make parametric assumptions. However, researchers are currently reporting uncorrected or heteroscedasticity-robust standard errors for these estimates. We compared several different forms of the standard error for linear and logistic TSRI estimates in simulations and in real-data examples. Among others, we consider standard errors modified from the approach of Newey (1987), Terza (2016), and bootstrapping. In our simulations Newey, Terza, bootstrap, and corrected 2-stage least squares (in the linear case) standard errors gave the best results in terms of coverage and type I error. In the real-data examples, the Newey standard errors were 0.5% and 2% larger than the unadjusted standard errors for the linear and logistic TSRI estimators, respectively. We show that TSRI estimators with modified standard errors have correct type I error under the null. Researchers should report TSRI estimates with modified standard errors instead of reporting unadjusted or heteroscedasticity-robust standard errors. © The Author(s) 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.
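
    A TSRI estimate with a bootstrap standard error, one of the corrected options the authors compare, can be sketched as follows. The simulated data and all names are ours; the key point is that the whole two-stage pipeline is re-run on each bootstrap resample rather than reusing the unadjusted stage-2 standard error:

    ```python
    import numpy as np
    import statsmodels.api as sm

    def tsri_logit(g, x, y):
        """Two-stage residual inclusion: stage 1 regresses exposure x on the
        instrument g; stage 2 is a logistic regression of y on x plus the
        stage-1 residual. Returns the causal log-odds-ratio estimate."""
        s1 = sm.OLS(x, sm.add_constant(g)).fit()
        r = x - s1.fittedvalues
        s2 = sm.Logit(y, sm.add_constant(np.c_[x, r])).fit(disp=0)
        return s2.params[1]

    rng = np.random.default_rng(0)
    n = 2000
    g = rng.binomial(2, 0.3, n)                  # genotype (instrument)
    u = rng.normal(size=n)                       # unmeasured confounder
    x = 0.4 * g + u + rng.normal(size=n)         # exposure
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x + u - 1))))

    est = tsri_logit(g, x, y)
    boots = [tsri_logit(g[i], x[i], y[i])
             for i in (rng.integers(0, n, n) for _ in range(200))]
    print(f"estimate {est:.3f}, bootstrap SE {np.std(boots):.3f}")
    ```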

  13. Chemistry and haematology sample rejection and clinical impact in a tertiary laboratory in Cape Town.

    PubMed

    Jacobsz, Lourens A; Zemlin, Annalise E; Roos, Mark J; Erasmus, Rajiv T

    2011-10-14

    Recent publications report that up to 70% of total laboratory errors occur in the pre-analytical phase. Identification of specific problems highlights pre-analytic processes susceptible to errors. The rejection of unsuitable samples can lead to delayed turnaround time and affect patient care. A retrospective audit was conducted investigating the rejection rate of routine blood specimens received at chemistry and haematology laboratories over a 2-week period. The reasons for rejection and potential clinical impact of these rejections were investigated. Thirty patient files were randomly selected and examined to assess the impact of these rejections on clinical care. A total of 32,910 specimens were received during the study period, of which 481 were rejected, giving a rejection rate of 1.46%. The main reasons for rejection were inappropriate clotting (30%) and inadequate sample volume (22%). Only 51.7% of rejected samples were repeated and the average time for a repeat sample to reach the laboratory was about 5 days (121 h). Of the repeated samples, 5.1% had results within critical values. Examination of patient folders showed that in 40% of cases the rejection of samples had an impact on patient care. The evaluation of pre-analytical processes in the laboratory, with regard to sample rejection, allowed one to identify problem areas where improvement is necessary. Rejected samples due to factors out of the laboratory's control had a definite impact on patient care and can thus affect customer satisfaction. Clinicians should be aware of these factors to prevent such rejections.

  14. Impact of Exposure Uncertainty on the Association between Perfluorooctanoate and Preeclampsia in the C8 Health Project Population.

    PubMed

    Avanasi, Raghavendhran; Shin, Hyeong-Moo; Vieira, Verónica M; Savitz, David A; Bartell, Scott M

    2016-01-01

    Uncertainty in exposure estimates from models can result in exposure measurement error and can potentially affect the validity of epidemiological studies. We recently used a suite of environmental models and an integrated exposure and pharmacokinetic model to estimate individual perfluorooctanoate (PFOA) serum concentrations and assess the association with preeclampsia from 1990 through 2006 for the C8 Health Project participants. The aims of the current study are to evaluate impact of uncertainty in estimated PFOA drinking-water concentrations on estimated serum concentrations and their reported epidemiological association with preeclampsia. For each individual public water district, we used Monte Carlo simulations to vary the year-by-year PFOA drinking-water concentration by randomly sampling from lognormal distributions for random error in the yearly public water district PFOA concentrations, systematic error specific to each water district, and global systematic error in the release assessment (using the estimated concentrations from the original fate and transport model as medians and a range of 2-, 5-, and 10-fold uncertainty). Uncertainty in PFOA water concentrations could cause major changes in estimated serum PFOA concentrations among participants. However, there is relatively little impact on the resulting epidemiological association in our simulations. The contribution of exposure uncertainty to the total uncertainty (including regression parameter variance) ranged from 5% to 31%, and bias was negligible. We found that correlated exposure uncertainty can substantially change estimated PFOA serum concentrations, but results in only minor impacts on the epidemiological association between PFOA and preeclampsia. Avanasi R, Shin HM, Vieira VM, Savitz DA, Bartell SM. 2016. Impact of exposure uncertainty on the association between perfluorooctanoate and preeclampsia in the C8 Health Project population. Environ Health Perspect 124:126-132; http://dx.doi.org/10.1289/ehp.1409044.
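
    The layered lognormal error structure can be sketched directly. Below, yearly random error, a district-level systematic error, and a global systematic error are drawn from lognormal distributions and multiplied onto a modeled concentration series; all numbers, including the conversion of a fold-uncertainty into a lognormal sigma, are illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_sims, n_years = 1000, 17             # 1990 through 2006
    modeled = np.full(n_years, 40.0)       # modeled PFOA water conc. (ng/L, illustrative)

    fold = 5.0                             # assumed fold-uncertainty
    sigma = np.log(fold) / 1.96            # so the 95% interval spans the fold range

    global_err = rng.lognormal(0, sigma, size=(n_sims, 1))        # shared everywhere
    district_err = rng.lognormal(0, sigma, size=(n_sims, 1))      # one water district
    yearly_err = rng.lognormal(0, sigma, size=(n_sims, n_years))  # independent per year

    simulated = modeled * global_err * district_err * yearly_err
    print("median:", np.median(simulated),
          "95% range:", np.percentile(simulated, [2.5, 97.5]))
    ```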

  15. How Do Socio-Economic Factors Influence Interest to Go to Vocational High Schools?

    NASA Astrophysics Data System (ADS)

    Utomo, N. F.; Wonggo, D.

    2018-02-01

    This study aimed to reveal the interest of junior high school students in Sangihe Islands, Indonesia, in going to vocational high schools and the factors affecting it. The study used the quantitative method with an ex-post facto approach. The population consisted of 332 students, and a sample of 178 students was established using the proportional random sampling technique, applying the Isaac table at a 5% margin of error. The results show that the family's socio-economic condition positively contributes 26% to interest in going to vocational high schools, indicating that the family's socio-economic condition influences and contributes to junior high school students' interest in going to vocational high schools.

  16. No brain expansion in Australopithecus boisei.

    PubMed

    Hawks, John

    2011-10-01

    The endocranial volumes of robust australopithecine fossils appear to have increased in size over time. Most evidence with temporal resolution is concentrated in East African Australopithecus boisei. Including the KNM-WT 17000 cranium, this sample comprises 11 endocranial volume estimates ranging in date from 2.5 million to 1.4 million years ago. But the sample presents several difficulties to a test of trend, including substantial estimation error for some specimens and an unusually low variance. This study reevaluates the evidence, using randomization methods and a related test using an explicit model of variability. None of these tests applied to the A. boisei endocranial volume sample produces significant evidence for a trend in that species, whether or not the early KNM-WT 17000 specimen is included. Copyright © 2011 Wiley-Liss, Inc.
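
    A randomization test of temporal trend of the kind described can be sketched as a permutation of volumes across dates. The data below are hypothetical placeholders, not the actual A. boisei estimates:

    ```python
    import numpy as np

    def permutation_trend_test(age_ma, volume_cc, n_perm=10_000, seed=0):
        """Permutation test for a temporal trend: shuffle volumes across dates
        and compare the observed age-volume correlation with the null."""
        rng = np.random.default_rng(seed)
        obs = np.corrcoef(age_ma, volume_cc)[0, 1]
        null = np.array([np.corrcoef(age_ma, rng.permutation(volume_cc))[0, 1]
                         for _ in range(n_perm)])
        return obs, np.mean(np.abs(null) >= np.abs(obs))  # two-sided p-value

    # Hypothetical volumes (cc) and dates (Ma) -- NOT the actual fossil data.
    age = np.array([2.5, 2.3, 2.0, 1.9, 1.8, 1.8, 1.7, 1.6, 1.5, 1.5, 1.4])
    vol = np.array([410, 475, 500, 490, 510, 480, 500, 520, 500, 545, 510])
    print(permutation_trend_test(age, vol))
    ```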

  17. An audit of the nature and impact of clinical coding subjectivity variability and error in otolaryngology.

    PubMed

    Nouraei, S A R; Hudovsky, A; Virk, J S; Chatrath, P; Sandhu, G S

    2013-12-01

    To audit the accuracy of clinical coding in otolaryngology, assess the effectiveness of previously implemented interventions, and determine ways in which it can be further improved. Prospective clinician-auditor multidisciplinary audit of clinical coding accuracy. Elective and emergency ENT admissions and day-case activity. Concordance between initial coding and the clinician-auditor multidisciplinary team's (MDT) coding in respect of primary and secondary diagnoses and procedures, health resource groupings (HRGs), and tariffs. The audit of 3131 randomly selected otolaryngology patients between 2010 and 2012 resulted in 420 instances of change to the primary diagnosis (13%) and 417 changes to the primary procedure (13%). In 1420 cases (44%), there was at least one change to the initial coding, and 514 (16%) health resource groupings changed. There was an income variance of £343,169, or £109.46 per patient. The highest rates of health resource groupings change were observed in head and neck surgery, in particular skull base surgery; laryngology, and within that tracheostomy; and emergency admissions, especially epistaxis management. A randomly selected sample of 235 patients from the audit was subjected to a second audit by a second clinician-auditor multidisciplinary team. There were 12 further health resource groupings changes (5%), and at least one further coding change occurred in 57 patients (24%). These changes were significantly lower than those observed in the pre-audit sample, but were also significantly greater than zero. Asking surgeons to 'code in theatre' and applying these codes without further quality assurance to activity resulted in a health resource groupings error rate of 45%. The full audit sample was regrouped under health resource groupings 3.5 and was compared with a previous audit of 1250 patients performed between 2007 and 2008. This comparison showed a reduction in the baseline rate of health resource groupings change from 16% during the first audit cycle to 9% in the current audit cycle (P < 0.001). Otolaryngology coding is complex and susceptible to subjectivity, variability, and error. Coding variability can be improved, but not eliminated, through regular education supported by an audit programme. © 2013 John Wiley & Sons Ltd.

  18. Enhanced orbit determination filter sensitivity analysis: Error budget development

    NASA Technical Reports Server (NTRS)

    Estefan, J. A.; Burkhart, P. D.

    1994-01-01

    An error budget analysis is presented which quantifies the effects of different error sources in the orbit determination process when the enhanced orbit determination filter, recently developed, is used to reduce radio metric data. The enhanced filter strategy differs from more traditional filtering methods in that nearly all of the principal ground system calibration errors affecting the data are represented as filter parameters. Error budget computations were performed for a Mars Observer interplanetary cruise scenario for cases in which only X-band (8.4-GHz) Doppler data were used to determine the spacecraft's orbit, X-band ranging data were used exclusively, and a combined set in which the ranging data were used in addition to the Doppler data. In all three cases, the filter model was assumed to be a correct representation of the physical world. Random nongravitational accelerations were found to be the largest source of error contributing to the individual error budgets. Other significant contributors, depending on the data strategy used, were solar-radiation pressure coefficient uncertainty, random earth-orientation calibration errors, and Deep Space Network (DSN) station location uncertainty.

  19. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale. PMID:26964095

  20. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    PubMed

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale.
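
    The SLR-versus-RF comparison workflow can be sketched with scikit-learn on synthetic data (the real study's covariates are not reproduced here); the non-linear, interacting data-generating process is our stand-in for why RF can outperform a linear model:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 276
    X = rng.normal(size=(n, 4))            # stand-ins for pH, SOM, elevation, ...
    cd = (0.3 + 0.1 * X[:, 0]
          + 0.2 * np.maximum(X[:, 1], 0) * X[:, 2]     # non-linear interaction
          + rng.normal(scale=0.15, size=n))

    # Mirror the paper's split: 222 calibration and 54 validation samples.
    X_cal, X_val, y_cal, y_val = train_test_split(X, cd, test_size=54,
                                                  random_state=1)

    for name, model in [("SLR-like", LinearRegression()),
                        ("RF", RandomForestRegressor(n_estimators=500,
                                                     random_state=1))]:
        model.fit(X_cal, y_cal)
        pred = model.predict(X_val)
        me = np.mean(pred - y_val)
        mae = np.mean(np.abs(pred - y_val))
        rmse = np.sqrt(np.mean((pred - y_val) ** 2))
        print(f"{name:8s} ME={me:7.4f}  MAE={mae:.4f}  RMSE={rmse:.4f}")
    ```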

  1. Application and testing of a procedure to evaluate transferability of habitat suitability criteria

    USGS Publications Warehouse

    Thomas, Jeff A.; Bovee, Ken D.

    1993-01-01

    A procedure designed to test the transferability of habitat suitability criteria was evaluated in the Cache la Poudre River, Colorado. Habitat suitability criteria were developed for active adult and juvenile rainbow trout in the South Platte River, Colorado. These criteria were tested by comparing microhabitat use predicted from the criteria with observed microhabitat use by adult rainbow trout in the Cache la Poudre River. A one-sided χ² test, using counts of occupied and unoccupied cells in each suitability classification, was used to test for non-random selection for optimum habitat use over usable habitat and for suitable over unsuitable habitat. Criteria for adult rainbow trout were judged to be transferable to the Cache la Poudre River, but juvenile criteria (applied to adults) were not transferable. Random subsampling of occupied and unoccupied cells was conducted to determine the effect of sample size on the reliability of the test procedure. The incidence of type I and type II errors increased rapidly as the sample size was reduced below 55 occupied and 200 unoccupied cells. Recommended modifications to the procedure included the adoption of a systematic or randomized sampling design and direct measurement of microhabitat variables. With these modifications, the procedure is economical, simple and reliable. Use of the procedure as a quality assurance device in routine applications of the instream flow incremental methodology was encouraged.
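    A hedged sketch of the one-sided χ² test idea, with invented cell counts (the paper's counts are not reproduced): test whether occupied cells fall in suitable habitat more often than chance would predict.

      import numpy as np
      from scipy.stats import chi2_contingency

      # rows: suitable / unsuitable habitat; columns: occupied / unoccupied cells
      table = np.array([[40, 120],
                        [15, 200]])
      chi2, p_two, dof, _ = chi2_contingency(table, correction=False)
      suitable_rate, unsuitable_rate = 40 / 160, 15 / 215
      # halve the two-sided p only when the effect is in the hypothesized direction
      p_one = p_two / 2 if suitable_rate > unsuitable_rate else 1 - p_two / 2
      print(f"chi2={chi2:.2f}, one-sided p={p_one:.4f}")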

  2. Sample size estimation for alternating logistic regressions analysis of multilevel randomized community trials of under-age drinking.

    PubMed

    Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark

    2012-07-01

    Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure.
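    The clustering-aware sample size logic can be illustrated with the classic design-effect approximation; this is a simplification, not the paper's alternating-logistic-regressions formulae, and all numbers below are illustrative.

      from math import ceil
      from scipy.stats import norm

      def n_per_arm(p0, p1, icc, m, alpha=0.05, power=0.8):
          """Individuals per arm to compare two proportions with clusters of size m."""
          za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
          pbar = (p0 + p1) / 2
          n_srs = (za + zb) ** 2 * 2 * pbar * (1 - pbar) / (p0 - p1) ** 2
          deff = 1 + (m - 1) * icc   # variance inflation from within-community clustering
          return ceil(n_srs * deff)

      # e.g. reduce past-month drinking from 30% to 24%, ICC 0.02, 50 youths per community
      print(n_per_arm(p0=0.30, p1=0.24, icc=0.02, m=50))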

  3. A new statistic to express the uncertainty of kriging predictions for purposes of survey planning.

    NASA Astrophysics Data System (ADS)

    Lark, R. M.; Lapworth, D. J.

    2014-05-01

    It is well-known that one advantage of kriging for spatial prediction is that, given the random effects model, the prediction error variance can be computed a priori for alternative sampling designs. This allows one to compare sampling schemes, in particular sampling at different densities, and so to decide on one which meets requirements in terms of the uncertainty of the resulting predictions. However, the planning of sampling schemes must account not only for statistical considerations, but also logistics and cost. This requires effective communication between statisticians, soil scientists and data users/sponsors such as managers, regulators or civil servants. In our experience the latter parties are not necessarily able to interpret the prediction error variance as a measure of uncertainty for decision making. In some contexts (particularly the solution of very specific problems at large cartographic scales, e.g. site remediation and precision farming) it is possible to translate uncertainty of predictions into a loss function directly comparable with the cost incurred in increasing precision. Often, however, sampling must be planned for more generic purposes (e.g. baseline or exploratory geochemical surveys). In this latter context the prediction error variance may be of limited value to a non-statistician who has to make a decision on sample intensity and associated cost. We propose an alternative criterion for these circumstances to aid communication between statisticians and data users about the uncertainty of geostatistical surveys based on different sampling intensities. The criterion is the consistency of estimates made from two non-coincident instantiations of a proposed sample design. We consider square sample grids; one instantiation is offset from the second by half the grid spacing along the rows and along the columns. If a sample grid is coarse relative to the important scales of variation in the target property then the consistency of predictions from two instantiations is expected to be small, and can be increased by reducing the grid spacing. The measure of consistency is the correlation between estimates from the two instantiations of the sample grid, averaged over a grid cell. We call this the offset correlation; it can be calculated from the variogram. We propose that this measure is easier to grasp intuitively than the prediction error variance, and has the advantage of having an upper bound (1.0) which will aid its interpretation. This quality measure is illustrated for some hypothetical examples, considering both ordinary kriging and factorial kriging of the variable of interest. It is also illustrated using data on metal concentrations in the soil of north-east England.

  4. Does McRuer's Law Hold for Heart Rate Control via Biofeedback Display?

    NASA Technical Reports Server (NTRS)

    Courter, B. J.; Jex, H. R.

    1984-01-01

    Some persons can control their pulse rate with the aid of a biofeedback display. If the biofeedback display is modified to show the error between a command pulse-rate and the measured rate, a compensatory (error correcting) heart rate tracking control loop can be created. The dynamic response characteristics of this control loop when subjected to step and quasi-random disturbances were measured. The control loop includes a beat-to-beat cardiotachometer differenced with a forcing function from a quasi-random input generator; the resulting pulse-rate error is displayed as feedback. The subject acts to null the displayed pulse-rate error, thereby closing a compensatory control loop. McRuer's Law should hold for this case. A few subjects already skilled in voluntary pulse-rate control were tested for heart-rate control response. Control-law properties are derived, such as crossover frequency, stability margins, and closed-loop bandwidth. These are evaluated for a range of forcing functions and for step as well as random disturbances.

  5. Synthesis of hover autopilots for rotary-wing VTOL aircraft

    NASA Technical Reports Server (NTRS)

    Hall, W. E.; Bryson, A. E., Jr.

    1972-01-01

    The practical situation is considered where imperfect information on only a few rotor and fuselage state variables is available. Filters are designed to estimate all the state variables from noisy measurements of fuselage pitch/roll angles and from noisy measurements of both fuselage and rotor pitch/roll angles. The mean square response of the vehicle to a very gusty, random wind is computed using various filter/controllers and is found to be quite satisfactory although, of course, not so good as when one has perfect information (idealized case). The second part of the report considers precision hover over a point on the ground. A vehicle model without rotor dynamics is used and feedback signals in position and integral of position error are added. The mean square response of the vehicle to a very gusty, random wind is computed, assuming perfect information feedback, and is found to be excellent. The integral error feedback gives zero position error for a steady wind, and smaller position error for a random wind.

  6. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
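    Both skill scores reduce to simple arithmetic on a 2x2 contingency table; a sketch on synthetic probabilities (not the ARPS-derived data) follows, comparing the 0.5 threshold with the climatological frequency.

      import numpy as np

      def pc_and_hkd(prob, obs, threshold):
          pred = prob >= threshold
          hits = np.sum(pred & obs)
          misses = np.sum(~pred & obs)
          fas = np.sum(pred & ~obs)            # false alarms
          cns = np.sum(~pred & ~obs)           # correct negatives
          pc = (hits + cns) / obs.size         # percent correct
          hkd = hits / (hits + misses) - fas / (fas + cns)   # hit rate minus false-alarm rate
          return pc, hkd

      rng = np.random.default_rng(42)
      prob = rng.beta(1, 4, size=5000)         # model probabilities, occurrence ~20%
      obs = rng.uniform(size=5000) < prob      # synthetic "truth" consistent with them
      for thr in (0.5, obs.mean()):            # 0.5 vs. climatological frequency
          pc, hkd = pc_and_hkd(prob, obs, thr)
          print(f"threshold={thr:.2f}  PC={pc:.3f}  HKD={hkd:.3f}")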

  7. Selecting Statistical Quality Control Procedures for Limiting the Impact of Increases in Analytical Random Error on Patient Safety.

    PubMed

    Yago, Martín

    2017-05-01

    QC planning based on risk management concepts can reduce the probability of harming patients due to an undetected out-of-control error condition. It does this by selecting appropriate QC procedures to decrease the number of erroneous results reported. The selection can be easily made by using published nomograms for simple QC rules when the out-of-control condition results in increased systematic error. However, increases in random error also occur frequently and are difficult to detect, which can result in erroneously reported patient results. A statistical model was used to construct charts for the 1ks and X̄/χ² rules. The charts relate the increase in the number of unacceptable patient results reported due to an increase in random error with the capability of the measurement procedure. They thus allow for QC planning based on the risk of patient harm due to the reporting of erroneous results. 1ks rules are simple, all-around rules. Their ability to deal with increases in within-run imprecision is minimally affected by the possible presence of significant, stable, between-run imprecision. X̄/χ² rules perform better when the number of controls analyzed during each QC event is increased to improve QC performance. Using nomograms simplifies the selection of statistical QC procedures to limit the number of erroneous patient results reported due to an increase in analytical random error. The selection largely depends on the presence or absence of stable between-run imprecision. © 2017 American Association for Clinical Chemistry.
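    A small sketch of the underlying arithmetic, under Gaussian assumptions: inflating the analytical SD by a factor f raises both the rejection probability of a 1ks rule and the share of patient results outside the total allowable error. The values of k and TEa below are illustrative, not taken from the paper.

      from scipy.stats import norm

      def p_reject_1ks(k, f):
          """Rejection probability of a 1ks rule (|z| > k) when the SD grows by factor f."""
          return 2 * norm.sf(k / f)

      def p_unacceptable(tea_in_sd, f):
          """Fraction of results beyond TEa, expressed in baseline-SD units."""
          return 2 * norm.sf(tea_in_sd / f)

      for f in (1.0, 1.5, 2.0):                # stable vs. inflated random error
          print(f, round(p_reject_1ks(3, f), 4), round(p_unacceptable(4, f), 4))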

  8. Meta-analysis in evidence-based healthcare: a paradigm shift away from random effects is overdue.

    PubMed

    Doi, Suhail A R; Furuya-Kanamori, Luis; Thalib, Lukman; Barendregt, Jan J

    2017-12-01

    Each year up to 20 000 systematic reviews and meta-analyses are published whose results influence healthcare decisions, thus making the robustness and reliability of meta-analytic methods one of the world's top clinical and public health priorities. The evidence synthesis makes use of either fixed-effect or random-effects statistical methods. The fixed-effect method has largely been replaced by the random-effects method as heterogeneity of study effects led to poor error estimation. However, despite the widespread use and acceptance of the random-effects method to correct this, it too remains unsatisfactory and continues to suffer from defective error estimation, posing a serious threat to decision-making in evidence-based clinical and public health practice. We discuss here the problem with the random-effects approach and demonstrate that there exist better estimators under the fixed-effect model framework that can achieve optimal error estimation. We argue for an urgent return to the earlier framework with updates that address these problems and conclude that doing so can markedly improve the reliability of meta-analytical findings and thus decision-making in healthcare.

  9. Secondary outcome analysis for data from an outcome-dependent sampling design.

    PubMed

    Pan, Yinghao; Cai, Jianwen; Longnecker, Matthew P; Zhou, Haibo

    2018-04-22

    An outcome-dependent sampling (ODS) scheme is a cost-effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented in which the expensive exposure is only measured on a simple random sample and on supplemental samples selected from the 2 tails of the primary outcome variable. With the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcomes and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second- and higher-order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method. Copyright © 2018 John Wiley & Sons, Ltd.
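    A sketch of the inverse-probability-weighting idea for secondary analysis under a tail-oversampled ODS design, on simulated data; the paper's augmented estimator and its robustness properties are not reproduced here.

      import numpy as np

      rng = np.random.default_rng(2)
      N = 20000
      x = rng.normal(size=N)                           # expensive exposure
      e1 = rng.normal(size=N)
      y1 = x + e1                                      # primary outcome
      y2 = 0.5 * x + 0.5 * e1 + rng.normal(size=N)     # secondary outcome, shared error

      lo, hi = np.quantile(y1, [0.1, 0.9])
      p_sel = np.where((y1 < lo) | (y1 > hi), 0.5, 0.05)   # oversample the tails of y1
      sel = rng.uniform(size=N) < p_sel

      Xd = np.column_stack([np.ones(sel.sum()), x[sel]])
      w = 1.0 / p_sel[sel]                             # inverse selection probabilities
      beta_ipw = np.linalg.solve(Xd.T @ (w[:, None] * Xd), Xd.T @ (w * y2[sel]))
      beta_naive = np.linalg.solve(Xd.T @ Xd, Xd.T @ y2[sel])
      print(beta_ipw, beta_naive)    # IPW slope near 0.5; the naive fit drifts away from it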

  10. A correction method for systematic error in (1)H-NMR time-course data validated through stochastic cell culture simulation.

    PubMed

    Sokolenko, Stanislav; Aucoin, Marc G

    2015-09-04

    The growing ubiquity of metabolomic techniques has facilitated high frequency time-course data collection for an increasing number of applications. While the concentration trends of individual metabolites can be modeled with common curve fitting techniques, a more accurate representation of the data needs to consider effects that act on more than one metabolite in a given sample. To this end, we present a simple algorithm that uses nonparametric smoothing carried out on all observed metabolites at once to identify and correct systematic error from dilution effects. In addition, we develop a simulation of metabolite concentration time-course trends to supplement available data and explore algorithm performance. Although we focus on nuclear magnetic resonance (NMR) analysis in the context of cell culture, a number of possible extensions are discussed. Realistic metabolic data was successfully simulated using a 4-step process. Starting with a set of metabolite concentration time-courses from a metabolomic experiment, each time-course was classified as either increasing, decreasing, concave, or approximately constant. Trend shapes were simulated from generic functions corresponding to each classification. The resulting shapes were then scaled to simulated compound concentrations. Finally, the scaled trends were perturbed using a combination of random and systematic errors. To detect systematic errors, a nonparametric fit was applied to each trend and percent deviations calculated at every timepoint. Systematic errors could be identified at time-points where the median percent deviation exceeded a threshold value, determined by the choice of smoothing model and the number of observed trends. Regardless of model, increasing the number of observations over a time-course resulted in more accurate error estimates, although the improvement was not particularly large between 10 and 20 samples per trend. The presented algorithm was able to identify systematic errors as small as 2.5 % under a wide range of conditions. Both the simulation framework and error correction method represent examples of time-course analysis that can be applied to further developments in (1)H-NMR methodology and the more general application of quantitative metabolomics.
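    A minimal sketch of the detection step, substituting a crude moving-average smoother for the paper's nonparametric fit: a deviation shared by all metabolites at one time point signals a systematic (e.g. dilution) effect rather than per-metabolite noise.

      import numpy as np
      from scipy.ndimage import uniform_filter1d

      def flag_systematic(trends, threshold=0.025, window=5):
          """trends: (n_metabolites, n_timepoints); flag time points with shared deviations."""
          smooth = uniform_filter1d(trends, size=window, axis=1, mode="nearest")
          dev = (trends - smooth) / smooth             # percent deviation from the fit
          median_dev = np.median(dev, axis=0)          # median across all metabolites
          return np.where(np.abs(median_dev) > threshold)[0]

      rng = np.random.default_rng(3)
      t = np.linspace(0, 1, 40)
      trends = rng.uniform(1, 5, size=(12, 1)) * (1 + t)   # smooth increasing trends
      trends[:, 20] *= 0.95                                # 5% dilution at one time point
      trends *= 1 + rng.normal(0, 0.005, trends.shape)     # small random error
      print(flag_systematic(trends))                       # -> [20]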

  11. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap.

    PubMed

    Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E

    2016-06-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.
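    A highly simplified sketch of the weighted Bayesian-bootstrap flavor, for a single-stage design and a population mean; the paper's two-stage synthetic-population procedure is considerably more involved than this Dirichlet-weighted version.

      import numpy as np

      rng = np.random.default_rng(0)
      y = rng.normal(50, 10, size=300)         # observed outcome
      w = rng.uniform(1, 5, size=300)          # case (design) weights

      draws = []
      for _ in range(2000):
          g = rng.dirichlet(np.ones(y.size))   # Bayesian-bootstrap weights
          p = g * w / np.sum(g * w)            # fold in the design weights
          draws.append(np.sum(p * y))          # one posterior draw of the population mean
      print(np.percentile(draws, [2.5, 50, 97.5]))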

  12. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap

    PubMed Central

    Zhou, Hanzhi; Elliott, Michael R.; Raghunathan, Trivellore E.

    2017-01-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in “Delta-V,” a key crash severity measure. PMID:29226161

  13. Evaluating data mining algorithms using molecular dynamics trajectories.

    PubMed

    Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis

    2013-01-01

    Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the μs time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations of a potential enzyme-mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using classification errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups, when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.

  14. RCT: Module 2.03, Counting Errors and Statistics, Course 8768

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hillmer, Kurt T.

    2017-04-01

    Radiological sample analysis involves the observation of a random process that may or may not occur and an estimation of the amount of radioactive material present based on that observation. Across the country, radiological control personnel are using the activity measurements to make decisions that may affect the health and safety of workers at those facilities and their surrounding environments. This course will present an overview of measurement processes, a statistical evaluation of both measurements and equipment performance, and some actions to take to minimize the sources of error in count room operations. This course will prepare the student with the skills necessary for radiological control technician (RCT) qualification by passing quizzes, tests, and the RCT Comprehensive Phase 1, Unit 2 Examination (TEST 27566) and by providing in-the-field skills.
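    The counting-statistics core of such a course reduces to Poisson error propagation; a sketch with illustrative numbers:

      from math import sqrt

      def net_rate(gross, t_gross, bkg, t_bkg):
          """Background-subtracted count rate; Poisson errors added in quadrature."""
          rate = gross / t_gross - bkg / t_bkg
          sigma = sqrt(gross / t_gross ** 2 + bkg / t_bkg ** 2)   # var(N) = N for counts
          return rate, sigma

      rate, sigma = net_rate(gross=1200, t_gross=60, bkg=300, t_bkg=60)
      print(f"net rate = {rate:.2f} +/- {sigma:.2f} counts/s")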

  15. ON NONSTATIONARY STOCHASTIC MODELS FOR EARTHQUAKES.

    USGS Publications Warehouse

    Safak, Erdal; Boore, David M.

    1986-01-01

    A seismological stochastic model for earthquake ground-motion description is presented. Seismological models are based on the physical properties of the source and the medium and have significant advantages over the widely used empirical models. The model discussed here provides a convenient form for estimating structural response by using random vibration theory. A commonly used random process for ground acceleration, filtered white noise multiplied by an envelope function, introduces some errors in response calculations for structures whose periods are longer than the faulting duration. An alternative random process, the filtered shot-noise process, eliminates these errors.

  16. Wavefront reconstruction algorithm based on Legendre polynomials for radial shearing interferometry over a square area and error analysis.

    PubMed

    Kewei, E; Zhang, Chen; Li, Mengyang; Xiong, Zhao; Li, Dahai

    2015-08-10

    Based on the expressions for the Legendre polynomials and their properties, this article proposes a new approach to reconstruct the distorted wavefront under test of a laser beam over a square area from the phase difference data obtained by an RSI system. Simulation and experimental results verify the reliability of the proposed method. The formula for the error propagation coefficients is deduced for the case in which the phase difference data of the overlapping area contain random noise. A matrix T is proposed that can be used to evaluate the impact of high-order Legendre polynomial terms on the outcomes of the low-order terms due to mode aliasing, and the magnitude of the impact can be estimated by calculating the Frobenius norm of T. In addition, the relationship between shear ratio, sampling points, polynomial terms and noise propagation coefficients, and the relationship between shear ratio, sampling points and the norm of the T matrix are both analyzed. These research results can provide an optimized design approach for a radial shearing interferometry system, with theoretical reference and instruction.

  17. Seabed roughness parameters from joint backscatter and reflection inversion at the Malta Plateau.

    PubMed

    Steininger, Gavin; Holland, Charles W; Dosso, Stan E; Dettmer, Jan

    2013-09-01

    This paper presents estimates of seabed roughness and geoacoustic parameters and uncertainties on the Malta Plateau, Mediterranean Sea, by joint Bayesian inversion of mono-static backscatter and spherical wave reflection-coefficient data. The data are modeled using homogeneous fluid sediment layers overlying an elastic basement. The scattering model assumes a randomly rough water-sediment interface with a von Karman roughness power spectrum. Scattering and reflection data are inverted simultaneously using a population of interacting Markov chains to sample roughness and geoacoustic parameters as well as residual error parameters. Trans-dimensional sampling is applied to treat the number of sediment layers and the order (zeroth or first) of an autoregressive error model (to represent potential residual correlation) as unknowns. Results are considered in terms of marginal posterior probability profiles and distributions, which quantify the effective data information content to resolve scattering/geoacoustic structure. Results indicate well-defined scattering (roughness) parameters in good agreement with existing measurements, and a multi-layer sediment profile over a high-speed (elastic) basement, consistent with independent knowledge of sand layers over limestone.

  18. Model parameter estimation approach based on incremental analysis for lithium-ion batteries without using open circuit voltage

    NASA Astrophysics Data System (ADS)

    Wu, Hongjie; Yuan, Shifei; Zhang, Xi; Yin, Chengliang; Ma, Xuerui

    2015-08-01

    To improve the suitability of a lithium-ion battery model under varying scenarios, such as fluctuating temperature and SoC variation, a dynamic model with parameters updated in real time should be developed. In this paper, an incremental analysis-based auto-regressive exogenous (I-ARX) modeling method is proposed to eliminate the modeling error caused by the OCV effect and improve the accuracy of parameter estimation. Its numerical stability, modeling error, and parametric sensitivity are then analyzed at different sampling rates (0.02, 0.1, 0.5 and 1 s). To identify the model parameters recursively, a bias-correction recursive least squares (CRLS) algorithm is applied. Finally, pseudo-random binary sequence (PRBS) and urban dynamic driving sequence (UDDS) profiles are run to verify the real-time performance and robustness of the newly proposed model and algorithm. Different sampling rates (1 Hz and 10 Hz) and multiple temperature points (5, 25, and 45 °C) are covered in our experiments. The experimental and simulation results indicate that the proposed I-ARX model can provide high accuracy and suitability for parameter identification without using the open circuit voltage.
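    A sketch of the recursive estimation step, using plain recursive least squares with forgetting on a generic ARX(1,1) model; the paper's incremental I-ARX formulation and bias correction are not reproduced here.

      import numpy as np

      def rls_update(theta, P, phi, y, lam=0.995):
          """One RLS step: theta = parameters, P = covariance, phi = regressor vector."""
          K = P @ phi / (lam + phi @ P @ phi)          # gain
          theta = theta + K * (y - phi @ theta)        # correct by the prediction error
          P = (P - np.outer(K, phi @ P)) / lam         # covariance update with forgetting
          return theta, P

      rng = np.random.default_rng(7)
      true_theta = np.array([0.9, 0.5])                # y_k = 0.9*y_{k-1} + 0.5*u_{k-1}
      theta, P = np.zeros(2), np.eye(2) * 100
      y_prev, u_prev = 0.0, 0.0
      for _ in range(2000):
          u = rng.choice([-1.0, 1.0])                  # PRBS-like excitation
          y = true_theta @ np.array([y_prev, u_prev]) + 0.01 * rng.normal()
          theta, P = rls_update(theta, P, np.array([y_prev, u_prev]), y)
          y_prev, u_prev = y, u
      print(theta)   # should approach [0.9, 0.5]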

  19. Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection.

    PubMed

    Zeng, Xueqiang; Luo, Gang

    2017-12-01

    Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era. To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values. We report an implementation of the method. We show that compared to a state of the art automatic selection method, our method can significantly reduce search time, classification error rate, and standard deviation of error rate due to randomization. This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.

  20. Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates: Focus on the False Discovery Rate and Simulation Study

    PubMed Central

    Dudoit, Sandrine; Gilbert, Houston N.; van der Laan, Mark J.

    2014-01-01

    This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(Vn, Sn) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(Vn, Sn)], for arbitrary functions g(Vn, Sn) of the numbers of false positives Vn and true positives Sn. Of particular interest are error rates based on the proportion g(Vn, Sn) = Vn/(Vn + Sn) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[Vn/(Vn + Sn)]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure. PMID:18932138

  1. Ant-inspired density estimation via random walks

    PubMed Central

    Musco, Cameron; Su, Hsin-Hao

    2017-01-01

    Many ant species use distributed population density estimation in applications ranging from quorum sensing, to task allocation, to appraisal of enemy colony strength. It has been shown that ants estimate local population density by tracking encounter rates: The higher the density, the more often the ants bump into each other. We study distributed density estimation from a theoretical perspective. We prove that a group of anonymous agents randomly walking on a grid are able to estimate their density within a small multiplicative error in few steps by measuring their rates of encounter with other agents. Despite dependencies inherent in the fact that nearby agents may collide repeatedly (and, worse, cannot recognize when this happens), our bound nearly matches what would be required to estimate density by independently sampling grid locations. From a biological perspective, our work helps shed light on how ants and other social insects can obtain relatively accurate density estimates via encounter rates. From a technical perspective, our analysis provides tools for understanding complex dependencies in the collision probabilities of multiple random walks. We bound the strength of these dependencies using local mixing properties of the underlying graph. Our results extend beyond the grid to more general graphs, and we discuss applications to size estimation for social networks, density estimation for robot swarms, and random walk-based sampling for sensor networks. PMID:28928146
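    A toy version of the encounter-rate estimator: agents random-walk on a torus grid and estimate global density from how often they share a cell. The grid size, agent count, and step count below are arbitrary.

      import numpy as np

      rng = np.random.default_rng(0)
      side, n_agents, steps = 50, 250, 400
      density = n_agents / side ** 2
      pos = rng.integers(0, side, size=(n_agents, 2))
      encounters = 0
      for _ in range(steps):
          moves = rng.integers(-1, 2, size=(n_agents, 2))   # lazy 8-neighbour walk
          pos = (pos + moves) % side                        # wrap around the torus
          _, counts = np.unique(pos, axis=0, return_counts=True)
          encounters += np.sum(counts * (counts - 1))       # ordered co-location pairs
      estimate = encounters / (n_agents * steps)            # mean others per cell per step
      print(f"true density={density:.3f}, encounter-rate estimate={estimate:.3f}")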

  2. Classification of echolocation clicks from odontocetes in the Southern California Bight.

    PubMed

    Roch, Marie A; Klinck, Holger; Baumann-Pickering, Simone; Mellinger, David K; Qui, Simon; Soldevilla, Melissa S; Hildebrand, John A

    2011-01-01

    This study presents a system for classifying echolocation clicks of six species of odontocetes in the Southern California Bight: Visually confirmed bottlenose dolphins, short- and long-beaked common dolphins, Pacific white-sided dolphins, Risso's dolphins, and presumed Cuvier's beaked whales. Echolocation clicks are represented by cepstral feature vectors that are classified by Gaussian mixture models. A randomized cross-validation experiment is designed to provide conditions similar to those found in a field-deployed system. To prevent matched conditions from inappropriately lowering the error rate, echolocation clicks associated with a single sighting are never split across the training and test data. Sightings are randomly permuted before assignment to folds in the experiment. This allows different combinations of the training and test data to be used while keeping data from each sighting entirely in the training or test set. The system achieves a mean error rate of 22% across 100 randomized three-fold cross-validation experiments. Four of the six species had mean error rates lower than the overall mean, with the presumed Cuvier's beaked whale clicks showing the best performance (<2% error rate). Long-beaked common and bottlenose dolphins proved the most difficult to classify, with mean error rates of 53% and 68%, respectively.
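    The sighting-level split can be reproduced with grouped cross-validation; a sketch with synthetic "clicks" and scikit-learn Gaussian mixtures (the cepstral features are not simulated faithfully, and three stand-in species replace the six):

      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.model_selection import GroupKFold

      rng = np.random.default_rng(1)
      n = 600
      species = rng.integers(0, 3, size=n)                 # 3 stand-in species
      sighting = species * 10 + rng.integers(0, 10, n)     # ~10 sightings per species
      X = rng.normal(size=(n, 4)) + species[:, None]       # cepstral-like features

      errors = []
      for train, test in GroupKFold(n_splits=3).split(X, species, groups=sighting):
          # one GMM per species, trained only on sightings absent from the test fold
          gmms = [GaussianMixture(2, random_state=0).fit(X[train][species[train] == s])
                  for s in range(3)]
          scores = np.column_stack([g.score_samples(X[test]) for g in gmms])
          errors.append(np.mean(scores.argmax(axis=1) != species[test]))
      print(f"mean error rate: {np.mean(errors):.2%}")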

  3. Effects of random tooth profile errors on the dynamic behaviors of planetary gears

    NASA Astrophysics Data System (ADS)

    Xun, Chao; Long, Xinhua; Hua, Hongxing

    2018-02-01

    In this paper, a nonlinear random model is built to describe the dynamics of planetary gear trains (PGTs), in which the time-varying mesh stiffness, tooth profile modification (TPM), tooth contact loss, and random tooth profile error are considered. A stochastic method based on the method of multiple scales (MMS) is extended to analyze the statistical property of the dynamic performance of PGTs. By the proposed multiple-scales based stochastic method, the distributions of the dynamic transmission errors (DTEs) are investigated, and the lower and upper bounds are determined based on the 3σ principle. Monte Carlo method is employed to verify the proposed method. Results indicate that the proposed method can be used to determine the distribution of the DTE of PGTs high efficiently and allow a link between the manufacturing precision and the dynamical response. In addition, the effects of tooth profile modification on the distributions of vibration amplitudes and the probability of tooth contact loss with different manufacturing tooth profile errors are studied. The results show that the manufacturing precision affects the distribution of dynamic transmission errors dramatically and appropriate TPMs are helpful to decrease the nominal value and the deviation of the vibration amplitudes.

  4. A multi-site analysis of random error in tower-based measurements of carbon and energy fluxes

    Treesearch

    Andrew D. Richardson; David Y. Hollinger; George G. Burba; Kenneth J. Davis; Lawrence B. Flanagan; Gabriel G. Katul; J. William Munger; Daniel M. Ricciuto; Paul C. Stoy; Andrew E. Suyker; Shashi B. Verma; Steven C. Wofsy; Steven C. Wofsy

    2006-01-01

    Measured surface-atmosphere fluxes of energy (sensible heat, H, and latent heat, LE) and CO2 (FCO2) represent the ``true?? flux plus or minus potential random and systematic measurement errors. Here, we use data from seven sites in the AmeriFlux network, including five forested sites (two of which include ``tall tower?? instrumentation), one grassland site, and one...

  5. Statistical error model for a solar electric propulsion thrust subsystem

    NASA Technical Reports Server (NTRS)

    Bantell, M. H.

    1973-01-01

    The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.

  6. Far field beam pattern of one MW combined beam of laser diode array amplifiers for space power transmission

    NASA Technical Reports Server (NTRS)

    Kwon, Jin H.; Lee, Ja H.

    1989-01-01

    The far-field beam pattern and the power-collection efficiency are calculated for a multistage laser-diode-array amplifier consisting of about 200,000 5-W laser diode arrays with random distributions of phase and orientation errors and random diode failures. From the numerical calculation it is found that the far-field beam pattern is little affected by random failures of up to 20 percent of the laser diodes, with reference to an 80 percent receiving efficiency in the center spot. The random phase differences among laser diodes due to probable manufacturing errors are allowed to be about 0.2 times the wavelength. The maximum allowable orientation error is about 20 percent of the diffraction angle of a single laser diode aperture (about 1 cm). The preliminary results indicate that the amplifier could be used for space beam-power transmission with an efficiency of about 80 percent for a moderate-size (3-m-diameter) receiver placed at a distance of less than 50,000 km.
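    A back-of-the-envelope sketch of the degradation mechanism, using the standard Strehl-like estimate for random phase errors and element failures; the array geometry and receiver model of the paper are not reproduced, so the numbers differ from its 80 percent figure.

      import numpy as np

      rng = np.random.default_rng(0)
      n_elements = 20000
      wavelength_fraction = 0.2                  # phase error sigma, in wavelengths
      fail_fraction = 0.2                        # fraction of failed diodes

      phase = rng.normal(0, 2 * np.pi * wavelength_fraction, n_elements)
      alive = rng.uniform(size=n_elements) > fail_fraction
      field = np.sum(alive * np.exp(1j * phase)) / n_elements
      print(f"on-axis intensity fraction: {np.abs(field) ** 2:.3f}")

      # analytic check: (1 - fail_fraction)^2 * exp(-sigma^2)
      sigma = 2 * np.pi * wavelength_fraction
      print(f"analytic: {(1 - fail_fraction) ** 2 * np.exp(-sigma ** 2):.3f}")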

  7. An Analysis of Computational Errors in the Use of Division Algorithms by Fourth-Grade Students.

    ERIC Educational Resources Information Center

    Stefanich, Greg P.; Rokusek, Teri

    1992-01-01

    Presents a study that analyzed errors made by randomly chosen fourth grade students (25 of 57) while using the division algorithm and investigated the effect of remediation on identified systematic errors. Results affirm that error pattern diagnosis and directed remediation lead to new learning and long-term retention. (MDH)

  8. False Positives in Multiple Regression: Unanticipated Consequences of Measurement Error in the Predictor Variables

    ERIC Educational Resources Information Center

    Shear, Benjamin R.; Zumbo, Bruno D.

    2013-01-01

    Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…

  9. Space resection model calculation based on Random Sample Consensus algorithm

    NASA Astrophysics Data System (ADS)

    Liu, Xinzhu; Kang, Zhizhong

    2016-03-01

    Resection has been one of the most important topics in photogrammetry. It aims to recover the position and attitude of the camera at the shooting point. However, in some cases the observations used in the calculation contain gross errors. This paper presents a robust algorithm that uses the RANSAC method with the DLT model to effectively avoid the difficulty of determining initial values when using the collinearity equations. The results also show that our strategy can exclude gross errors and leads to an accurate and efficient way to obtain the elements of exterior orientation.
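    A generic RANSAC loop illustrating the pattern; a 2D line stands in for the DLT resection model, and the inlier threshold and iteration count are illustrative.

      import numpy as np

      rng = np.random.default_rng(0)
      x = rng.uniform(0, 10, 100)
      y = 2.0 * x + 1.0 + rng.normal(0, 0.1, 100)
      y[:20] += rng.uniform(5, 15, 20)                  # inject gross errors (outliers)

      best_inliers = np.zeros(100, dtype=bool)
      for _ in range(200):
          idx = rng.choice(100, size=2, replace=False)  # minimal sample for a line
          a, b = np.polyfit(x[idx], y[idx], 1)
          inliers = np.abs(y - (a * x + b)) < 0.5       # consensus set
          if inliers.sum() > best_inliers.sum():
              best_inliers = inliers
      a, b = np.polyfit(x[best_inliers], y[best_inliers], 1)   # refit on inliers only
      print(f"slope={a:.3f}, intercept={b:.3f}, inliers={best_inliers.sum()}")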

  10. SU-E-J-119: What Effect Have the Volume Defined in the Alignment Clipbox for Cervical Cancer Using Automatic Registration Methods for Cone- Beam CT Verification?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, W; Yang, H; Wang, Y

    2014-06-01

    Purpose: To investigate the impact of different clipbox volumes with automated registration techniques using commercially available software with on-board volumetric imaging (OBI) for treatment verification in cervical cancer patients. Methods: Fifty cervical cancer patients who received daily CBCT scans (on-board imaging v1.5 system, Varian Medical Systems) during the first treatment week and weekly thereafter were included in this analysis. A total of 450 CBCT scans were registered to the planning CT scan using a pelvic clipbox (clipbox-Pelvic) and an around-PTV clipbox (clipbox-PTV). The translation (anterior-posterior, left-right, superior-inferior) and rotation (yaw, pitch and roll) errors for each match were recorded. The setup errors and the systematic and random errors for both clipboxes were calculated. A paired-samples t test was used to analyze the differences between clipbox-Pelvic and clipbox-PTV. Results: The SD of the systematic error (σ) was 1.0 mm, 2.0 mm, 3.2 mm and 1.9 mm, 2.3 mm, 3.0 mm in the AP, LR and SI directions for clipbox-Pelvic and clipbox-PTV, respectively. The average random error (Σ) was 1.7 mm, 2.0 mm, 4.2 mm and 1.7 mm, 3.4 mm, 4.4 mm in the AP, LR and SI directions for clipbox-Pelvic and clipbox-PTV, respectively. However, only the SI direction showed significant differences between the two image registration volumes (p=0.002 and p=0.01 for mean and SD). For rotations, significant differences between clipbox-Pelvic and clipbox-PTV were found for the yaw mean/SD and the pitch SD. Conclusion: The volume defined for image registration is important for cervical cancer when 3D/3D matching is used. The alignment clipbox can affect the setup errors obtained. Further analysis is needed to determine the optimal defined volume for image registration in cervical cancer. Conflict of interest: none.
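    The systematic/random decomposition reported above follows a standard recipe; a sketch under one common convention (SD of per-patient mean errors for the systematic component, RMS of per-patient SDs for the random component), with simulated shifts in one direction. Note that symbol conventions vary; the abstract uses σ for the systematic SD and Σ for the random term.

      import numpy as np

      rng = np.random.default_rng(5)
      n_patients, n_scans = 50, 9
      # per-patient mean offset (systematic) plus day-to-day scatter (random), in mm
      shifts = rng.normal(0, 2.0, (n_patients, 1)) + rng.normal(0, 3.0, (n_patients, n_scans))

      per_patient_mean = shifts.mean(axis=1)
      per_patient_sd = shifts.std(axis=1, ddof=1)
      systematic = per_patient_mean.std(ddof=1)          # SD of per-patient means
      random_err = np.sqrt(np.mean(per_patient_sd ** 2)) # RMS of per-patient SDs
      print(f"systematic ~ {systematic:.1f} mm, random ~ {random_err:.1f} mm")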

  11. Routing of Fatty Acids from Fresh Grass to Milk Restricts the Validation of Feeding Information Obtained by Measuring (13)C in Milk.

    PubMed

    Auerswald, Karl; Schäufele, Rudi; Bellof, Gerhard

    2015-12-09

    Dairy production systems vary widely in their feeding and livestock-keeping regimens. Both are well known to affect milk quality and consumer perceptions. Stable isotope analysis has been suggested as an easy-to-apply tool to validate a claimed feeding regimen. Although it is unambiguous that feeding influences the carbon isotope composition (δ(13)C) in milk, it is not clear whether a reported feeding regimen can be verified by measuring δ(13)C in milk without sampling and analyzing the feed. We obtained 671 milk samples from 40 farms distributed over Central Europe to measure δ(13)C and fatty acid composition. Feeding protocols by the farmers in combination with a model based on δ(13)C feed values from the literature were used to predict δ(13)C in feed and subsequently in milk. The model considered dietary contributions of C3 and C4 plants, the contribution of concentrates, altitude, seasonal variation in (12/13)CO2, the Suess effect, and diet-milk discrimination. Predicted and measured δ(13)C in milk correlated closely (r(2) = 0.93). Analyzing milk for δ(13)C allowed validation of a reported C4 component with an error of <8% in 95% of all cases. This included the error of the method (measurement and prediction) and the error of the feeding information. However, the error was not random but varied seasonally and correlated with the seasonal variation in long-chain fatty acids. This indicated a bypass of long-chain fatty acids from fresh grass to milk.

  12. Frequency position modulation using multi-spectral projections

    NASA Astrophysics Data System (ADS)

    Goodman, Joel; Bertoncini, Crystal; Moore, Michael; Nousain, Bryan; Cowart, Gregory

    2012-10-01

    In this paper we present an approach to harness multi-spectral projections (MSPs) to carefully shape and locate tones in the spectrum, enabling a new and robust modulation in which a signal's discrete frequency support is used to represent symbols. This method, called Frequency Position Modulation (FPM), is an innovative extension to MT-FSK and OFDM and can be non-uniformly spread over many GHz of instantaneous bandwidth (IBW), resulting in a communications system that is difficult to intercept and jam. The FPM symbols are recovered using adaptive projections that in part employ an analog polynomial nonlinearity paired with an analog-to-digital converter (ADC) sampling at a rate that is only a fraction of the IBW of the signal. MSPs also facilitate using commercial off-the-shelf (COTS) ADCs with uniform sampling, standing in sharp contrast to random linear projections by random sampling, which require a full Nyquist-rate sample-and-hold. Our novel communication system concept provides an order of magnitude improvement in processing gain over conventional LPI/LPD communications (e.g., FH- or DS-CDMA) and facilitates the ability to operate in interference-laden environments where conventional compressed sensing receivers would fail. We quantitatively analyze the bit error rate (BER) and processing gain (PG) for a maximum-likelihood-based FPM demodulator and demonstrate its performance in interference-laden conditions.

  13. A predictability study of Lorenz's 28-variable model as a dynamical system

    NASA Technical Reports Server (NTRS)

    Krishnamurthy, V.

    1993-01-01

    The dynamics of error growth in a two-layer nonlinear quasi-geostrophic model has been studied to gain an understanding of the mathematical theory of atmospheric predictability. The growth of random errors of varying initial magnitudes has been studied, and the relation between this classical approach and the concepts of the nonlinear dynamical systems theory has been explored. The local and global growths of random errors have been expressed partly in terms of the properties of an error ellipsoid and the Liapunov exponents determined by linear error dynamics. The local growth of small errors is initially governed by several modes of the evolving error ellipsoid but soon becomes dominated by the longest axis. The average global growth of small errors is exponential with a growth rate consistent with the largest Liapunov exponent. The duration of the exponential growth phase depends on the initial magnitude of the errors. The subsequent large errors undergo a nonlinear growth with a steadily decreasing growth rate and attain saturation that defines the limit of predictability. The degree of chaos and the largest Liapunov exponent show considerable variation with change in the forcing, which implies that the time variation in the external forcing can introduce variable character to the predictability.

  14. Adjustment of Measurements with Multiplicative Errors: Error Analysis, Estimates of the Variance of Unit Weight, and Effect on Volume Estimation from LiDAR-Type Digital Elevation Models

    PubMed Central

    Shi, Yun; Xu, Peiliang; Peng, Junhuan; Shi, Chuang; Liu, Jingnan

    2014-01-01

    Modern observation technology has verified that measurement errors can be proportional to the true values of measurements, as with GPS, VLBI baselines and LiDAR. Observational models of this type are called multiplicative error models. This paper extends the work of Xu and Shimada, published in 2000, on multiplicative error models to the analytical error analysis of quantities of practical interest and to estimates of the variance of unit weight. We analytically derive the variance-covariance matrices of the three least squares (LS) adjustments, the adjusted measurements and the corrections of measurements in multiplicative error models. For quality evaluation, we construct five estimators for the variance of unit weight in association with the three LS adjustment methods. Although LiDAR measurements are contaminated with multiplicative random errors, LiDAR-based digital elevation models (DEM) have been constructed as if they were of additive random errors. We simulate a model landslide, which is assumed to be surveyed with LiDAR, and investigate the effect of LiDAR-type multiplicative error measurements on DEM construction and its effect on the estimate of landslide mass volume from the constructed DEM. PMID:24434880

  15. Liquid Medication Dosing Errors by Hispanic Parents: Role of Health Literacy and English Proficiency

    PubMed Central

    Harris, Leslie M.; Dreyer, Benard; Mendelsohn, Alan; Bailey, Stacy C.; Sanders, Lee M.; Wolf, Michael S.; Parker, Ruth M.; Patel, Deesha A.; Kim, Kwang Youn A.; Jimenez, Jessica J.; Jacobson, Kara; Smith, Michelle; Yin, H. Shonna

    2016-01-01

    Objective: Hispanic parents in the US are disproportionately affected by low health literacy and limited English proficiency (LEP). We examined associations between health literacy, LEP, and liquid medication dosing errors in Hispanic parents. Methods: Cross-sectional analysis of data from a multisite randomized controlled experiment to identify best practices for the labeling/dosing of pediatric liquid medications (SAFE Rx for Kids study); 3 urban pediatric clinics. Analyses were limited to Hispanic parents of children <8 years, with health literacy and LEP data (n=1126). Parents were randomized to 5 groups that varied by pairing of units of measurement on the label/dosing tool. Each parent measured 9 doses [3 amounts (2.5, 5, and 7.5 mL) using 3 tools (2 syringes (0.2 and 0.5 mL increments) and 1 cup)] in random order. Dependent variable: dosing error (>20% dose deviation). Predictor variables: health literacy (Newest Vital Sign) [limited=0–3; adequate=4–6] and LEP (speaks English less than "very well"). Results: 83.1% made dosing errors (mean (SD) errors/parent=2.2 (1.9)). Parents with limited health literacy and LEP had the greatest odds of making a dosing error compared to parents with adequate health literacy who were English proficient (% trials with errors/parent=28.8 vs. 12.9%; AOR=2.2 [1.7–2.8]). Parents with limited health literacy who were English proficient were also more likely to make errors (% trials with errors/parent=18.8%; AOR=1.4 [1.1–1.9]). Conclusion: Dosing errors are common among Hispanic parents; those with both LEP and limited health literacy are at particular risk. Further study is needed to examine how the redesign of medication labels and dosing tools could reduce literacy- and language-associated disparities in dosing errors. PMID:28477800

  16. Combinatorial neural codes from a mathematical coding theory perspective.

    PubMed

    Curto, Carina; Itskov, Vladimir; Morrison, Katherine; Roth, Zachary; Walker, Judy L

    2013-07-01

    Shannon's seminal 1948 work gave rise to two distinct areas of research: information theory and mathematical coding theory. While information theory has had a strong influence on theoretical neuroscience, ideas from mathematical coding theory have received considerably less attention. Here we take a new look at combinatorial neural codes from a mathematical coding theory perspective, examining the error correction capabilities of familiar receptive field codes (RF codes). We find, perhaps surprisingly, that the high levels of redundancy present in these codes do not support accurate error correction, although the error-correcting performance of receptive field codes catches up to that of random comparison codes when a small tolerance to error is introduced. However, receptive field codes are good at reflecting distances between represented stimuli, while the random comparison codes are not. We suggest that a compromise in error-correcting capability may be a necessary price to pay for a neural code whose structure serves not only error correction, but must also reflect relationships between stimuli.

  17. Effects of learning climate and registered nurse staffing on medication errors.

    PubMed

    Chang, Yunkyung; Mark, Barbara

    2011-01-01

    Despite increasing recognition of the significance of learning from errors, little is known about how learning climate contributes to error reduction. The purpose of this study was to investigate whether learning climate moderates the relationship between error-producing conditions and medication errors. A cross-sectional descriptive study was done using data from 279 nursing units in 146 randomly selected hospitals in the United States. Error-producing conditions included work environment factors (work dynamics and nurse mix), team factors (communication with physicians and nurses' expertise), personal factors (nurses' education and experience), patient factors (age, health status, and previous hospitalization), and medication-related support services. Poisson models with random effects were used with the nursing unit as the unit of analysis. A significant negative relationship was found between learning climate and medication errors. It also moderated the relationship between nurse mix and medication errors: When learning climate was negative, having more registered nurses was associated with fewer medication errors. However, no relationship was found between nurse mix and medication errors at either positive or average levels of learning climate. Learning climate did not moderate the relationship between work dynamics and medication errors. The way nurse mix affects medication errors depends on the level of learning climate. Nursing units with fewer registered nurses and frequent medication errors should examine their learning climate. Future research should be focused on the role of learning climate as related to the relationships between nurse mix and medication errors.

  18. Random synaptic feedback weights support error backpropagation for deep learning

    NASA Astrophysics Data System (ADS)

    Lillicrap, Timothy P.; Cownden, Daniel; Tweed, Douglas B.; Akerman, Colin J.

    2016-11-01

    The brain processes information through multiple layers of neurons. This deep architecture is representationally powerful, but complicates learning because it is difficult to identify the responsible neurons when a mistake is made. In machine learning, the backpropagation algorithm assigns blame by multiplying error signals with all the synaptic weights on each neuron's axon and further downstream. However, this involves a precise, symmetric backward connectivity pattern, which is thought to be impossible in the brain. Here we demonstrate that this strong architectural constraint is not required for effective error propagation. We present a surprisingly simple mechanism that assigns blame by multiplying errors by even random synaptic weights. This mechanism can transmit teaching signals across multiple layers of neurons and performs as effectively as backpropagation on a variety of tasks. Our results help reopen questions about how the brain could use error signals and dispel long-held assumptions about algorithmic constraints on learning.

  19. Random synaptic feedback weights support error backpropagation for deep learning

    PubMed Central

    Lillicrap, Timothy P.; Cownden, Daniel; Tweed, Douglas B.; Akerman, Colin J.

    2016-01-01

    The brain processes information through multiple layers of neurons. This deep architecture is representationally powerful, but complicates learning because it is difficult to identify the responsible neurons when a mistake is made. In machine learning, the backpropagation algorithm assigns blame by multiplying error signals with all the synaptic weights on each neuron's axon and further downstream. However, this involves a precise, symmetric backward connectivity pattern, which is thought to be impossible in the brain. Here we demonstrate that this strong architectural constraint is not required for effective error propagation. We present a surprisingly simple mechanism that assigns blame by multiplying errors by even random synaptic weights. This mechanism can transmit teaching signals across multiple layers of neurons and performs as effectively as backpropagation on a variety of tasks. Our results help reopen questions about how the brain could use error signals and dispel long-held assumptions about algorithmic constraints on learning. PMID:27824044

  20. Pricing Employee Stock Options (ESOs) with Random Lattice

    NASA Astrophysics Data System (ADS)

    Chendra, E.; Chin, L.; Sukmana, A.

    2018-04-01

    Employee Stock Options (ESOs) are stock options granted by companies to their employees. Unlike standard options that can be traded by typical institutional or individual investors, employees cannot sell or transfer their ESOs to other investors. The sale restrictions may induce the ESO's holder to exercise them earlier. In a much-cited paper, Hull and White propose a binomial lattice for valuing ESOs which assumes that employees will voluntarily exercise their ESOs if the stock price reaches a horizontal psychological barrier. Due to nonlinearity errors, the numerical pricing results oscillate significantly and may lead to large pricing errors. In this paper, we use the random lattice method to price the Hull-White ESO model. This method can reduce the nonlinearity error by aligning a layer of nodes of the random lattice with the psychological barrier.
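
    The exercise-barrier idea can be made concrete with a standard lattice. The sketch below (our own illustration, not the authors' code) prices an ESO on a Cox-Ross-Rubinstein binomial lattice in which the holder voluntarily exercises as soon as the stock price reaches an assumed multiple M of the strike, in the spirit of the Hull-White model; vesting and forfeiture are omitted, and the random-lattice refinement that aligns a node layer with the barrier is not reproduced:

```python
import numpy as np

def eso_binomial(S0, K, T, r, sigma, M, steps=500):
    """Price an ESO on a CRR lattice with voluntary exercise at S >= M*K.

    A simplified sketch of the Hull-White idea: vesting, forfeiture, and
    employee exit rates are omitted; M is the psychological barrier multiple.
    """
    dt = T / steps
    u = np.exp(sigma * np.sqrt(dt))
    d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)    # risk-neutral up probability
    disc = np.exp(-r * dt)

    # terminal stock prices and payoffs
    j = np.arange(steps + 1)
    S = S0 * u**j * d**(steps - j)
    V = np.maximum(S - K, 0.0)

    # backward induction with voluntary exercise at the barrier
    for n in range(steps - 1, -1, -1):
        j = np.arange(n + 1)
        S = S0 * u**j * d**(n - j)
        V = disc * (p * V[1:] + (1 - p) * V[:-1])
        exercise = S >= M * K
        V[exercise] = S[exercise] - K     # holder exercises voluntarily
    return V[0]

print(eso_binomial(S0=50, K=50, T=10, r=0.05, sigma=0.3, M=2.0))
```

    Re-running with steps = 499 versus 500 illustrates the oscillation with lattice resolution that motivates aligning nodes with the barrier.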

  1. The influence of random element displacement on DOA estimates obtained with (Khatri-Rao-)root-MUSIC.

    PubMed

    Inghelbrecht, Veronique; Verhaevert, Jo; van Hecke, Tanja; Rogier, Hendrik

    2014-11-11

    Although a wide range of direction of arrival (DOA) estimation algorithms has been described for a diverse range of array configurations, no specific stochastic analysis framework has been established to assess the probability density function of the error on DOA estimates due to random errors in the array geometry. Therefore, we propose a stochastic collocation method that relies on a generalized polynomial chaos expansion to connect the statistical distribution of random position errors to the resulting distribution of the DOA estimates. We apply this technique to the conventional root-MUSIC and the Khatri-Rao-root-MUSIC methods. Compared with Monte Carlo simulations, this novel approach yields a speedup in CPU time by a factor of more than 100 for a one-dimensional case and by a factor of 56 for a two-dimensional case.

  2. Validation of a sampling plan to generate food composition data.

    PubMed

    Sammán, N C; Gimenez, M A; Bassett, N; Lobo, M O; Marcoleri, M E

    2016-02-15

    A methodology to develop systematic plans for food sampling was proposed. Long-life whole and skimmed milk and sunflower oil were selected to validate the methodology in Argentina. Fatty acid profiles in all foods, and proximate composition and calcium content in milk, were determined with AOAC methods. The number of samples (n) was calculated by applying Cochran's formula with coefficients of variation ⩽12% and a maximum permissible estimation error (r) ⩽5% for calcium content in milk and unsaturated fatty acids in oil. The resulting values of n were 9, 11, and 21 for long-life whole milk, skimmed milk, and sunflower oil, respectively. Sample units were randomly collected from production sites and sent to labs. The value of r calculated from the experimental data was ⩽10%, indicating high accuracy in determining the analytes of greatest variability and confirming the reliability of the proposed sampling plan. The methodology is an adequate and useful tool for developing sampling plans for food composition analysis. Copyright © 2015 Elsevier Ltd. All rights reserved.
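
    Cochran's formula for the sample size needed to estimate a mean within a relative error r, given a coefficient of variation CV, is a one-line calculation. The sketch below is our reconstruction of the computation described, with an assumed 95% confidence level (z = 1.96):

```python
import math

def cochran_n(cv, r, z=1.96):
    """Cochran's sample size for estimating a mean within relative error r.

    cv : coefficient of variation of the analyte (e.g. 0.12 for 12%)
    r  : maximum permissible relative error (e.g. 0.05 for 5%)
    z  : standard normal quantile for the confidence level (95% assumed)
    """
    return math.ceil((z * cv / r) ** 2)

# A CV near the 12% ceiling with r = 5% gives n on the order of 20,
# comparable to the n = 21 reported for sunflower oil.
print(cochran_n(cv=0.12, r=0.05))   # -> 23
```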

  3. Estimating accuracy of land-cover composition from two-stage cluster sampling

    USGS Publications Warehouse

    Stehman, S.V.; Wickham, J.D.; Fattorini, L.; Wade, T.D.; Baffetta, F.; Smith, J.H.

    2009-01-01

    Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias. © 2009 Elsevier Inc.
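
    For intuition, the four accuracy measures reduce, under simple random sampling of units, to familiar statistics of the per-unit differences between map and reference compositions. The sketch below uses hypothetical values and our own notation, not the authors' design-based estimators for the general two-stage design:

```python
import numpy as np

# map_p and ref_p: proportion of a land-cover class in each sampled unit,
# from the map and from the reference data (hypothetical values)
map_p = np.array([0.10, 0.25, 0.05, 0.40, 0.15])
ref_p = np.array([0.12, 0.20, 0.07, 0.35, 0.18])

d = map_p - ref_p
MD = d.mean()                            # mean deviation (bias)
MAD = np.abs(d).mean()                   # mean absolute deviation
RMSE = np.sqrt((d**2).mean())            # root mean square error
CORR = np.corrcoef(map_p, ref_p)[0, 1]   # correlation

print(MD, MAD, RMSE, CORR)
```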

  4. Valid statistical inference methods for a case-control study with missing data.

    PubMed

    Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun

    2018-04-01

    The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random, by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit large differences. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion of the Wald test for testing independence under the two existing sampling distributions could be completely different (even contradictory) from that of the Wald test for testing the equality of the success probabilities in the control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.

  5. Uncorrected refractive errors and spectacle utilisation rate in Tehran: the unmet need

    PubMed Central

    Fotouhi, A; Hashemi, H; Raissi, B; Mohammad, K

    2006-01-01

    Aim: To determine the prevalence of the met and unmet need for spectacles and their associated factors in the population of Tehran. Methods: 6497 Tehran citizens were enrolled through random cluster sampling and were invited to a clinic for an interview and ophthalmic examination. 4354 (70.3%) participated in the survey, and refraction measurement results of 4353 people aged 5 years and over are presented. The unmet need for spectacles was defined as the proportion of people who did not use spectacles despite a correctable visual acuity of worse than 20/40 in the better eye. Results: The need for spectacles in the studied population, standardised for age and sex, was 14.1% (95% confidence interval (CI), 12.8% to 15.4%). This need was met with appropriate spectacles in 416 people (9.3% of the total sample), while it was unmet in 230 people, representing 4.8% of the total sample population (95% CI, 4.1% to 5.4%). The spectacle coverage rate (met need/(met need + unmet need)) was 66.0%. Multivariate logistic regression showed that the variables of age, education, and type of refractive error were associated with lack of spectacle correction. The unmet need increased with older age, lesser education, and myopia. Conclusion: This survey determined the met and unmet need for spectacles in a Tehran population. It also identified high-risk groups with uncorrected refractive errors to guide intervention programmes for society. While the study showed the unmet need for spectacles and its determinants, more extensive studies into the causes of unmet need are recommended. PMID:16488929

  6. Chemical library subset selection algorithms: a unified derivation using spatial statistics.

    PubMed

    Hamprecht, Fred A; Thiel, Walter; van Gunsteren, Wilfred F

    2002-01-01

    If similar compounds have similar activity, rational subset selection becomes superior to random selection in screening for pharmacological lead discovery programs. Traditional approaches to this experimental design problem fall into two classes: (i) a linear or quadratic response function is assumed, or (ii) some space-filling criterion is optimized. The assumptions underlying the first approach are clear but not always defensible; the second approach yields more intuitive designs but lacks a clear theoretical foundation. We model activity in a bioassay as the realization of a stochastic process and use the best linear unbiased estimator to construct spatial sampling designs that optimize the integrated mean square prediction error, the maximum mean square prediction error, or the entropy. We argue that our approach constitutes a unifying framework encompassing most proposed techniques as limiting cases and sheds light on their underlying assumptions. In particular, vector quantization is obtained, in dimensions up to eight, in the limiting case of very smooth response surfaces for the integrated mean square error criterion. Closest packing is obtained for very rough surfaces under the integrated mean square error and entropy criteria. We suggest using either the integrated mean square prediction error or the entropy as the optimization criterion rather than approximations thereof, and propose a scheme for direct iterative minimization of the integrated mean square prediction error. Finally, we discuss how the quality of chemical descriptors manifests itself and clarify the assumptions underlying the selection of diverse or representative subsets.
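
    Among the space-filling criteria that this framework subsumes, the easiest to demonstrate is a greedy maximin design, which repeatedly adds the candidate compound whose nearest already-selected neighbour in descriptor space is farthest away. The sketch below is our own illustration under assumed Euclidean descriptors, not the authors' integrated-mean-square-error scheme:

```python
import numpy as np

def greedy_maximin(X, k, seed=0):
    """Select k rows of X (compound descriptors) by greedy maximin:
    start from a random compound, then repeatedly add the candidate
    whose nearest selected neighbour is farthest away."""
    rng = np.random.default_rng(seed)
    n = len(X)
    selected = [int(rng.integers(n))]
    # distance from every candidate to its nearest selected point
    d = np.linalg.norm(X - X[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))
        selected.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return selected

X = np.random.default_rng(1).random((500, 6))   # hypothetical descriptors
print(greedy_maximin(X, k=10))
```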

  7. Electronic laboratory system reduces errors in National Tuberculosis Program: a cluster randomized controlled trial.

    PubMed

    Blaya, J A; Shin, S S; Yale, G; Suarez, C; Asencios, L; Contreras, C; Rodriguez, P; Kim, J; Cegielski, P; Fraser, H S F

    2010-08-01

    To evaluate the impact of the e-Chasqui laboratory information system in reducing reporting errors compared to the current paper system. Cluster randomized controlled trial in 76 health centers (HCs) between 2004 and 2008. Baseline data were collected every 4 months for 12 months. HCs were then randomly assigned to intervention (e-Chasqui) or control (paper). Further data were collected for the same months the following year. Comparisons were made between intervention and control HCs, and before and after the intervention. Intervention HCs had respectively 82% and 87% fewer errors in reporting results for drug susceptibility tests (2.1% vs. 11.9%, P = 0.001, OR 0.17, 95%CI 0.09-0.31) and cultures (2.0% vs. 15.1%, P < 0.001, OR 0.13, 95%CI 0.07-0.24), than control HCs. Preventing missing results through online viewing accounted for at least 72% of all errors. e-Chasqui users sent on average three electronic error reports per week to the laboratories. e-Chasqui reduced the number of missing laboratory results at point-of-care health centers. Clinical users confirmed viewing electronic results not available on paper. Reporting errors to the laboratory using e-Chasqui promoted continuous quality improvement. The e-Chasqui laboratory information system is an important part of laboratory infrastructure improvements to support multidrug-resistant tuberculosis care in Peru.

  8. Application of a bioenergetics model for hatchery production: Largemouth bass fed commercial diets

    USGS Publications Warehouse

    Csargo, Isak J.; Michael L. Brown,; Chipps, Steven R.

    2012-01-01

    Fish bioenergetics models based on natural prey items have been widely used to address research and management questions. However, few attempts have been made to evaluate and apply bioenergetics models to hatchery-reared fish receiving commercial feeds that contain substantially higher energy densities than natural prey. In this study, we evaluated a bioenergetics model for age-0 largemouth bass Micropterus salmoides reared on four commercial feeds. Largemouth bass (n ≈ 3,504) were reared for 70 d at 25°C in sixteen 833-L circular tanks connected in parallel to a recirculation system. Model performance was evaluated using error components (mean, slope, and random) derived from decomposition of the mean square error obtained from regression of observed on predicted values. Mean predicted consumption was only 8.9% lower than mean observed consumption and was similar to error rates observed for largemouth bass consuming natural prey. Model evaluation showed that the 97.5% joint confidence region included the intercept of 0 (−0.43 ± 3.65) and slope of 1 (1.08 ± 0.20), which indicates the model accurately predicted consumption. Moreover, model error was similar among feeds (P = 0.98), and most error was probably attributable to sampling error (unconsumed feed), underestimated predator energy densities, or consumption-dependent error, which is common in bioenergetics models. This bioenergetics model could provide a valuable tool in hatchery production of largemouth bass. Furthermore, we believe that bioenergetics modeling could be useful in aquaculture production, particularly for species lacking historical hatchery constants or conventional growth models.
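
    The error decomposition used here partitions the mean square error from the regression of observed on predicted values into mean, slope, and random components. A minimal sketch of that partition (our reconstruction of the standard Theil-style decomposition, not the authors' code):

```python
import numpy as np

def mse_components(predicted, observed):
    """Partition MSE into mean, slope, and random components, based on
    the regression of observed (y) on predicted (x) values."""
    x, y = np.asarray(predicted, float), np.asarray(observed, float)
    mse = np.mean((y - x) ** 2)
    sx, sy = x.std(), y.std()                 # population (1/n) SDs
    r = np.corrcoef(x, y)[0, 1]
    b = r * sy / sx                           # slope of y-on-x regression
    mean_c = (y.mean() - x.mean()) ** 2       # bias of the mean
    slope_c = (b - 1.0) ** 2 * sx**2          # departure of slope from 1
    random_c = (1.0 - r**2) * sy**2           # scatter about the regression
    assert np.isclose(mse, mean_c + slope_c + random_c)
    return mean_c, slope_c, random_c

print(mse_components([1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.4, 3.9]))
```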

  9. The Impact of Subsampling on MODIS Level-3 Statistics of Cloud Optical Thickness and Effective Radius

    NASA Technical Reports Server (NTRS)

    Oreopoulos, Lazaros

    2004-01-01

    The MODIS Level-3 optical thickness and effective radius cloud product is a gridded 1° × 1° dataset derived from aggregation and 5-km subsampling of 1-km-resolution Level-2 orbital swath data (Level-2 granules). This study examines the impact of the 5-km subsampling on the mean, standard deviation, and inhomogeneity parameter statistics of optical thickness and effective radius. The methodology is simple and consists of estimating mean errors for a large collection of Terra and Aqua Level-2 granules by taking the difference of the statistics at the original and subsampled resolutions. It is shown that the Level-3 sampling does not affect the various quantities investigated to the same degree, with second-order moments suffering greater subsampling errors, as expected. Mean errors drop dramatically when averages over a sufficient number of regions (e.g., monthly and/or latitudinal averages) are taken, pointing to a dominance of errors that are random in nature. When histograms built from subsampled data with the same binning rules as in the Level-3 dataset are used to reconstruct the quantities of interest, the mean errors do not deteriorate significantly. The results in this paper provide guidance to users of MODIS Level-3 optical thickness and effective radius cloud products on the range of errors due to subsampling they should expect, and perhaps account for, in scientific work with this dataset. In general, subsampling errors should not be a serious concern when moderate temporal and/or spatial averaging is performed.
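
    The effect being quantified is easy to reproduce in miniature: compute statistics of a full-resolution field and of the field sampled at every fifth pixel, mimicking 5-km subsampling of 1-km data. The sketch below uses a synthetic spatially correlated field, not MODIS data:

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(0)
# synthetic 1-km "optical thickness" field with spatial correlation
field = uniform_filter(rng.gamma(2.0, 5.0, size=(200, 200)), size=10)

sub = field[::5, ::5]                    # mimic 5-km subsampling

for name, f in [("mean", np.mean), ("std", np.std)]:
    full, sampled = f(field), f(sub)
    print(f"{name}: full={full:.3f} subsampled={sampled:.3f} "
          f"error={sampled - full:+.3f}")
```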

  10. P-value interpretation and alpha allocation in clinical trials.

    PubMed

    Moyé, L A

    1998-08-01

    Although much value has been placed on type I error event probabilities in clinical trials, interpretive difficulties often arise that are directly related to clinical trial complexity. Deviations of the trial execution from its protocol, the presence of multiple treatment arms, and the inclusion of multiple end points complicate the interpretation of an experiment's reported alpha level. The purpose of this manuscript is to formulate the discussion of P values (and power for studies showing no significant differences) on the basis of the event whose relative frequency they represent. Experimental discordance (discrepancies between the protocol's directives and the experiment's execution) is linked to difficulty in alpha and beta interpretation. Mild experimental discordance leads to an acceptable adjustment for alpha or beta, while severe discordance results in their corruption. Finally, guidelines are provided for allocating type I error among a collection of end points in a prospectively designed, randomized controlled clinical trial. When considering secondary end point inclusion in clinical trials, investigators should increase the sample size to preserve the type I error rates at acceptable levels.
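
    The simplest prospective scheme along these lines splits the overall type I error unequally across end points, spending most of it on the primary end point. The sketch below illustrates the general idea with hypothetical weights; it is not the author's specific guideline:

```python
# Unequal prospective allocation of a 0.05 overall (familywise) alpha:
# most is spent on the primary end point, the remainder is divided
# among secondary end points. Weights here are hypothetical.
overall_alpha = 0.05
weights = {"primary": 0.8, "secondary_1": 0.1, "secondary_2": 0.1}

allocation = {ep: overall_alpha * w for ep, w in weights.items()}
assert abs(sum(allocation.values()) - overall_alpha) < 1e-12
print(allocation)  # {'primary': 0.04, 'secondary_1': 0.005, 'secondary_2': 0.005}
```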

  11. Analytic Perturbation Method for Estimating Ground Flash Fraction from Satellite Lightning Observations

    NASA Technical Reports Server (NTRS)

    Koshak, William; Solakiewicz, Richard

    2013-01-01

    An analytic perturbation method is introduced for estimating the lightning ground flash fraction in a set of N lightning flashes observed by a satellite lightning mapper. The value of N is large, typically in the thousands, and the observations consist of the maximum optical group area produced by each flash. The method is tested using simulated observations that are based on Optical Transient Detector (OTD) and Lightning Imaging Sensor (LIS) data. National Lightning Detection Network™ (NLDN) data is used to determine the flash type (ground or cloud) of the satellite-observed flashes, and provides the ground flash fraction truth for the simulation runs. It is found that the mean ground flash fraction retrieval errors are below 0.04 across the full range 0-1 under certain simulation conditions. In general, it is demonstrated that the retrieval errors depend on many factors (i.e., the number, N, of satellite observations, the magnitude of random and systematic measurement errors, and the number of samples used to form certain climate distributions employed in the model).

  12. Revision of laser-induced damage threshold evaluation from damage probability data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bataviciute, Gintare; Grigas, Povilas; Smalakys, Linas

    2013-04-15

    In this study, the applicability of the commonly used Damage Frequency Method (DFM) is addressed in the context of Laser-Induced Damage Threshold (LIDT) testing with pulsed lasers. A simplified computer model representing the statistical interaction between laser irradiation and randomly distributed damage precursors is applied for Monte Carlo experiments. The reproducibility of LIDT predicted from DFM is examined under both idealized and realistic laser irradiation conditions by performing numerical 1-on-1 tests. The widely accepted linear fitting resulted in systematic errors when estimating LIDT and its error bars. For the same purpose, a Bayesian approach was proposed. A novel concept of parametric regression based on a varying kernel and a maximum likelihood fitting technique is introduced and studied. This approach exhibited clear advantages over conventional linear fitting and led to more reproducible LIDT evaluation. Furthermore, LIDT error bars are obtained as a natural outcome of the parametric fitting and exhibit realistic values. The proposed technique has been validated on two conventionally polished fused silica samples (355 nm, 5.7 ns).

  13. Effect of MLC leaf position, collimator rotation angle, and gantry rotation angle errors on intensity-modulated radiotherapy plans for nasopharyngeal carcinoma

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bai, Sen; Li, Guangjun; Wang, Maojie

    The purpose of this study was to investigate the effect of multileaf collimator (MLC) leaf position, collimator rotation angle, and accelerator gantry rotation angle errors on intensity-modulated radiotherapy plans for nasopharyngeal carcinoma. To compare dosimetric differences between the simulating plans and the clinical plans with evaluation parameters, 6 patients with nasopharyngeal carcinoma were selected for simulation of systematic and random MLC leaf position errors, collimator rotation angle errors, and accelerator gantry rotation angle errors. There was a high sensitivity of dose distribution to systematic MLC leaf position errors in response to field size. When the systematic MLC position errors were 0.5, 1, and 2 mm, respectively, the maximum values of the mean dose deviation, observed in the parotid glands, were 4.63%, 8.69%, and 18.32%, respectively. The dosimetric effect was comparatively small for systematic MLC shift errors. For random MLC errors up to 2 mm and collimator and gantry rotation angle errors up to 0.5°, the dosimetric effect was negligible. We suggest that quality control be regularly conducted for MLC leaves, so as to ensure that systematic MLC leaf position errors are within 0.5 mm. Because the dosimetric effect of 0.5° collimator and gantry rotation angle errors is negligible, it can be concluded that setting a proper threshold for allowed errors of collimator and gantry rotation angle may increase treatment efficacy and reduce treatment time.

  14. PRECISE TULLY-FISHER RELATIONS WITHOUT GALAXY INCLINATIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Obreschkow, D.; Meyer, M.

    2013-11-10

    Power-law relations between tracers of baryonic mass and rotational velocities of disk galaxies, so-called Tully-Fisher relations (TFRs), offer a wealth of applications in galaxy evolution and cosmology. However, measurements of rotational velocities require galaxy inclinations, which are difficult to measure, thus limiting the range of TFR studies. This work introduces a maximum likelihood estimation (MLE) method for recovering the TFR in galaxy samples with limited or no information on inclinations. The robustness and accuracy of this method are demonstrated using virtual and real galaxy samples. Intriguingly, the MLE reliably recovers the TFR of all test samples, even without using any inclination measurements, that is, assuming a random sin i distribution for galaxy inclinations. Explicitly, this 'inclination-free MLE' recovers the three TFR parameters (zero-point, slope, scatter) with statistical errors only about 1.5 times larger than the best estimates based on perfectly known galaxy inclinations with zero uncertainty. Thus, given realistic uncertainties, the inclination-free MLE is highly competitive. If inclination measurements have mean errors larger than 10°, it is better not to use any inclinations than to treat the inclination measurements as exact. The inclination-free MLE opens interesting perspectives for future H I surveys by the Square Kilometer Array and its pathfinders.

  15. Psychrometric measurement of soil water potential: Stability of calibration and test of pressure-plate samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, T.L.; Gee, G.W.; Heller, P.R.

    1990-08-01

    A commercially available thermocouple psychrometer sample changer (Decagon SC-10A) was used to measure the water potential of field soils ranging in texture from sand to silty clay loam over a range of −0.5 to −20.0 MPa. The standard error of prediction based on regression statistics was generally between 0.04 and 0.14 MPa at −5 MPa. Replacing the measuring junction of the unit changed the calibration slightly; however, it did not significantly alter measurement accuracy. Calibration curves measured throughout a year of testing are consistent and indicate no systematic drift in calibration. Most measurement uncertainty is produced by shifts in the intercept of the calibration equation rather than the slope. Both the variability in intercept and the regression error seem to be random. Measurements taken with the SC-10A show that the water potential in both sand and silt loam samples removed from 1.5-MPa pressure plates was often 0.5 to 1.0 MPa greater than the 1.5-MPa applied pressure. Limited data from 0.5-MPa pressure plates show close agreement between SC-10A measurements and the pressure applied to these more permeable plates.

  16. Effectiveness of a web-based education program to improve vaccine storage conditions in primary care (Keep Cool): study protocol for a randomized controlled trial.

    PubMed

    Thielmann, Anika; Viehmann, Anja; Weltermann, Birgitta M

    2015-07-14

    Immunization programs are among the most effective public health strategies worldwide. Adequate vaccine storage is a prerequisite to assure the vaccines' effectiveness and safety. In a questionnaire survey among a random sample of German primary care physicians, we discovered vaccine storage deficits: 16% of physicians had experience with cold chain breaches either as an error or a near error, 49% did not keep a temperature log, and 21% did not use a separate refrigerator for vaccine storage. In a recent feasibility study of 21 practice refrigerators, we showed that these were outside the target range 10.2% of the total time, with some single refrigerators being outside the target range as much as 66.3% of the time. These cooling-chain deficits are consistent with the international medical literature, yet an effective, easy to disseminate, practice-centered intervention to improve storage conditions is lacking. This randomized intervention trial will be conducted in a random sample of primary care practices. Based on continuous temperature recordings over 7 days, all practices with readings outside the target range for vaccine storage (+2 °C to +8 °C) will be randomly allocated to a web-based education program or a waiting list control group. The practice physicians and their teams constitute the target population. Participants will be educated about best practices in vaccine storage and will receive a manual including storage checklists and templates for temperature documentation. In all practices, temperatures of the vaccine refrigerators will be monitored continuously using a data logger with a glycol probe as a surrogate for vaccine vial temperature. The effectiveness of the web-based education program will be determined after 6 months in terms of the proportion of refrigerators with vaccine vial temperatures within the target range (+2 °C to +8 °C) during 7-day temperature logging. Secondary outcome parameters include temperature monitoring, no critically low temperatures (≤ −0.5 °C), compliance with storage recommendations, knowledge of good vaccine storage conditions, and assignment of personnel as vaccine storage manager and backup. Keep Cool will develop and evaluate a web-based education program to improve vaccine storage conditions in primary care and thereby ensure immunization safety and effectiveness. DRKS00006561 (date of registration: 20 February 2015).

  17. Center of mass perception and inertial frames of reference.

    PubMed

    Bingham, G P; Muchisky, M M

    1993-11-01

    Center of mass perception was investigated by varying the shape, size, and orientation of planar objects. Shape was manipulated to investigate symmetries as information. The number of reflective symmetry axes, the amount of rotational symmetry, and the presence of radial symmetry were varied. Orientation affected systematic errors. Judgments tended to undershoot the center of mass. Random errors increased with size and decreased with symmetry. Size had no effect on random errors for maximally symmetric objects, although orientation did. The spatial distributions of judgments were elliptical. Distribution axes were found to align with the principal moments of inertia. Major axes tended to align with gravity in maximally symmetric objects. A functional and physical account was given in terms of the repercussions of error. Overall, judgments were very accurate.

  18. Bayesian dynamic modeling of time series of dengue disease case counts.

    PubMed

    Martínez-Bello, Daniel Adyro; López-Quílez, Antonio; Torres-Prieto, Alexander

    2017-07-01

    The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model's short-term performance for predicting dengue cases. The methodology uses dynamic Poisson log-link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random-walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random-walk time-varying coefficients. We applied Markov chain Monte Carlo simulations for parameter estimation, and the deviance information criterion (DIC) for model selection. We assessed the short-term predictive performance of the selected final model at several time points within the study period using the mean absolute percentage error. The best model included first-order random-walk time-varying coefficients for both the calendar trend and the meteorological variables. Besides the computational challenges, interpreting the results requires a complete analysis of the time series of dengue with respect to the parameter estimates of the meteorological effects. We found small mean absolute percentage errors for one- or two-week out-of-sample predictions at most prediction points, associated with low-volatility periods in the dengue counts. We discuss the advantages and limitations of dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease, producing useful models for decision-making in public health.
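
    The structure of the selected model class can be sketched generatively: weekly counts follow a Poisson law whose log rate combines a first-order random-walk trend with a random-walk coefficient on a meteorological covariate. The simulation below uses assumed innovation variances and a naive baseline forecast for the MAPE check; the actual MCMC fitting is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(42)
n_weeks = 150
temp = 25 + 3 * np.sin(2 * np.pi * np.arange(n_weeks) / 52)  # covariate
temp_c = temp - temp.mean()

# first-order random-walk states (assumed innovation SDs)
trend = np.cumsum(rng.normal(0, 0.05, n_weeks)) + np.log(100)
beta = np.cumsum(rng.normal(0, 0.02, n_weeks)) + 0.10

log_rate = trend + beta * temp_c
cases = rng.poisson(np.exp(log_rate))     # weekly dengue-like counts

def mape(actual, predicted):
    """Mean absolute percentage error, as used for the short-term checks."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    mask = actual > 0                     # guard against zero counts
    return 100 * np.mean(np.abs((actual[mask] - predicted[mask]) / actual[mask]))

# naive one-week-ahead "prediction" baseline, for illustration only
print(mape(cases[1:], cases[:-1]))
```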

  19. Impact of Internally Developed Electronic Prescription on Prescribing Errors at Discharge from the Emergency Department

    PubMed Central

    Hitti, Eveline; Tamim, Hani; Bakhti, Rinad; Zebian, Dina; Mufarrij, Afif

    2017-01-01

    Introduction: Medication errors are common, with studies reporting at least one error per patient encounter. At hospital discharge, medication errors vary from 15%–38%. However, studies assessing the effect of an internally developed electronic (E)-prescription system at discharge from an emergency department (ED) are comparatively minimal. Additionally, commercially available electronic solutions are cost-prohibitive in many resource-limited settings. We assessed the impact of introducing an internally developed, low-cost E-prescription system, with a list of commonly prescribed medications, on prescription error rates at discharge from the ED, compared to handwritten prescriptions. Methods: We conducted a pre- and post-intervention study comparing error rates in a randomly selected sample of discharge prescriptions (handwritten versus electronic) five months pre and four months post the introduction of the E-prescription. The internally developed E-prescription system included a list of 166 commonly prescribed medications with the generic name, strength, dose, frequency, and duration. We included a total of 2,883 prescriptions in this study: 1,475 in the pre-intervention phase were handwritten (HW) and 1,408 in the post-intervention phase were electronic. We calculated rates of 14 different errors and compared them between the pre- and post-intervention periods. Results: Overall, E-prescriptions included fewer prescription errors as compared to HW-prescriptions. Specifically, E-prescriptions reduced missing dose (11.3% to 4.3%, p <0.0001), missing frequency (3.5% to 2.2%, p=0.04), missing strength (32.4% to 10.2%, p <0.0001), and legibility (0.7% to 0.2%, p=0.005) errors. E-prescriptions, however, were associated with a significant increase in duplication errors, specifically with home medication (1.7% to 3%, p=0.02). Conclusion: A basic, internally developed E-prescription system, featuring commonly used medications, effectively reduced medication errors in a low-resource setting where the costs of sophisticated commercial electronic solutions are prohibitive. PMID:28874948

  20. Impact of Internally Developed Electronic Prescription on Prescribing Errors at Discharge from the Emergency Department.

    PubMed

    Hitti, Eveline; Tamim, Hani; Bakhti, Rinad; Zebian, Dina; Mufarrij, Afif

    2017-08-01

    Medication errors are common, with studies reporting at least one error per patient encounter. At hospital discharge, medication errors vary from 15%-38%. However, studies assessing the effect of an internally developed electronic (E)-prescription system at discharge from an emergency department (ED) are comparatively minimal. Additionally, commercially available electronic solutions are cost-prohibitive in many resource-limited settings. We assessed the impact of introducing an internally developed, low-cost E-prescription system, with a list of commonly prescribed medications, on prescription error rates at discharge from the ED, compared to handwritten prescriptions. We conducted a pre- and post-intervention study comparing error rates in a randomly selected sample of discharge prescriptions (handwritten versus electronic) five months pre and four months post the introduction of the E-prescription. The internally developed, E-prescription system included a list of 166 commonly prescribed medications with the generic name, strength, dose, frequency and duration. We included a total of 2,883 prescriptions in this study: 1,475 in the pre-intervention phase were handwritten (HW) and 1,408 in the post-intervention phase were electronic. We calculated rates of 14 different errors and compared them between the pre- and post-intervention period. Overall, E-prescriptions included fewer prescription errors as compared to HW-prescriptions. Specifically, E-prescriptions reduced missing dose (11.3% to 4.3%, p <0.0001), missing frequency (3.5% to 2.2%, p=0.04), missing strength errors (32.4% to 10.2%, p <0.0001) and legibility (0.7% to 0.2%, p=0.005). E-prescriptions, however, were associated with a significant increase in duplication errors, specifically with home medication (1.7% to 3%, p=0.02). A basic, internally developed E-prescription system, featuring commonly used medications, effectively reduced medication errors in a low-resource setting where the costs of sophisticated commercial electronic solutions are prohibitive.

  1. Comparison of Methods for Estimating Low Flow Characteristics of Streams

    USGS Publications Warehouse

    Tasker, Gary D.

    1987-01-01

    Four methods for estimating the 7-day, 10-year and 7-day, 20-year low flows for streams are compared by the bootstrap method. The bootstrap method is a Monte Carlo technique in which random samples are drawn from an unspecified sampling distribution defined from observed data. The nonparametric nature of the bootstrap makes it suitable for comparing methods based on a flow series for which the true distribution is unknown. Results show that the two methods based on a hypothesized distribution (log-Pearson Type III and Weibull) had lower mean square errors than did the Box-Cox transformation method or the log-Boughton method, which is based on a fit of plotting positions.
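
    The bootstrap comparison can be sketched in a few lines: resample the observed annual 7-day minimum flows with replacement, re-estimate the 10-year low flow from each resample, and accumulate the mean square error of the estimator. The toy below uses a nonparametric quantile estimator and synthetic data in place of a real flow series:

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic annual 7-day minimum flows (lognormal, hypothetical units)
flows = rng.lognormal(mean=2.0, sigma=0.5, size=40)

# treat the full-sample estimate as the reference value
q10_ref = np.quantile(flows, 0.1)    # 7Q10: nonexceeded 1 year in 10

B = 2000
boot = np.array([
    np.quantile(rng.choice(flows, size=len(flows), replace=True), 0.1)
    for _ in range(B)
])
mse = np.mean((boot - q10_ref) ** 2)
print(f"7Q10 ~ {q10_ref:.2f}, bootstrap MSE ~ {mse:.3f}")
```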

  2. Gene-targeted Random Mutagenesis to Select Heterochromatin-destabilizing Proteasome Mutants in Fission Yeast.

    PubMed

    Seo, Hogyu David; Lee, Daeyoup

    2018-05-15

    Random mutagenesis of a target gene is commonly used to identify mutations that yield the desired phenotype. Of the methods that may be used to achieve random mutagenesis, error-prone PCR is a convenient and efficient strategy for generating a diverse pool of mutants (i.e., a mutant library). Error-prone PCR is the method of choice when a researcher seeks to mutate a pre-defined region, such as the coding region of a gene, while leaving other genomic regions unaffected. After the mutant library is amplified by error-prone PCR, it must be cloned into a suitable plasmid. The size of the library generated by error-prone PCR is constrained by the efficiency of the cloning step. However, in the fission yeast, Schizosaccharomyces pombe, the cloning step can be replaced by the use of a highly efficient one-step fusion PCR to generate constructs for transformation. Mutants of desired phenotypes may then be selected using appropriate reporters. Here, we describe this strategy in detail, taking as an example a reporter inserted at centromeric heterochromatin.

  3. Smooth empirical Bayes estimation of observation error variances in linear systems

    NASA Technical Reports Server (NTRS)

    Martz, H. F., Jr.; Lian, M. W.

    1972-01-01

    A smooth empirical Bayes estimator was developed for estimating the unknown random scale component of each of a set of observation error variances. It is shown that the estimator possesses a smaller average squared error loss than other estimators for a discrete time linear system.

  4. The accuracy of the measurements in Ulugh Beg's star catalogue

    NASA Astrophysics Data System (ADS)

    Krisciunas, K.

    1992-12-01

    The star catalogue compiled by Ulugh Beg and his collaborators in Samarkand (ca. 1437) is the only catalogue primarily based on original observations between the times of Ptolemy and Tycho Brahe. Evans (1987) has given convincing evidence that Ulugh Beg's star catalogue was based on measurements made with a zodiacal armillary sphere graduated to 15′, with interpolation to 0.2 units. He and Shevchenko (1990) were primarily interested in the systematic errors in ecliptic longitude. Shevchenko's analysis of the random errors was limited to the twelve zodiacal constellations. We have analyzed all 843 ecliptic longitudes and latitudes attributed to Ulugh Beg by Knobel (1917). This required multiplying all the longitude errors by the respective values of the cosine of the celestial latitudes. We find a random error of ±17.7′ in ecliptic longitude and ±16.5′ in ecliptic latitude. On the whole, the random errors are largest near the ecliptic, decreasing towards the ecliptic poles. For all of Ulugh Beg's measurements (excluding outliers), the mean systematic error is −10.8′ ± 0.8′ in ecliptic longitude and +7.5′ ± 0.7′ in ecliptic latitude, with the errors in the sense "computed minus Ulugh Beg". For the brighter stars (those designated alpha, beta, and gamma in the respective constellations), the mean systematic errors are −11.3′ ± 1.9′ in ecliptic longitude and +9.4′ ± 1.5′ in ecliptic latitude. Within the errors, this matches the systematic error in both coordinates for alpha Vir. With greater confidence we may conclude that alpha Vir was the principal reference star in the catalogues of Ulugh Beg and Ptolemy. References: Evans, J. 1987, J. Hist. Astr., 18, 155. Knobel, E. B. 1917, Ulugh Beg's Catalogue of Stars, Washington, D.C.: Carnegie Institution. Shevchenko, M. 1990, J. Hist. Astr., 21, 187.

  5. Forecasting Space Weather-Induced GPS Performance Degradation Using Random Forest

    NASA Astrophysics Data System (ADS)

    Filjar, R.; Filic, M.; Milinkovic, F.

    2017-12-01

    Space weather and ionospheric dynamics have a profound effect on the positioning performance of Global Navigation Satellite Systems (GNSS). However, the quantification of that effect is still the subject of scientific activities around the world. In the latest contribution to the understanding of space weather and ionospheric effects on satellite-based positioning performance, we conducted a study of several candidate methods for forecasting space weather-induced GPS positioning performance deterioration. First, a 5-day set of experimentally collected data was established, encompassing space weather and ionospheric activity indices (including readings of Sudden Ionospheric Disturbance (SID) monitors, components of geomagnetic field strength, the global Kp index, the Dst index, GPS-derived Total Electron Content (TEC) samples, the standard deviation of TEC samples, and sunspot number) and observations of GPS positioning error components (northing, easting, and height positioning error) derived from the Adriatic Sea IGS reference stations' RINEX raw pseudorange files in quiet space weather periods. This data set was split into training and test sub-sets. Then, a selected set of supervised machine learning methods based on Random Forest was applied to the experimentally collected data set in order to establish appropriate regional (Adriatic Sea) forecasting models for space weather-induced GPS positioning performance deterioration. The forecasting models were developed in the R/rattle statistical programming environment. The forecasting quality of the regional models was assessed, and conclusions were drawn on their advantages and shortcomings for forecasting space weather-caused GNSS positioning performance deterioration.
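
    A minimal Python stand-in for the forecasting setup is sketched below, with synthetic substitutes for the space weather indices and positioning errors; the study itself used the R/rattle environment, and the predictor set and noise model here are our assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 2000
# synthetic space weather predictors: Kp, Dst, TEC, TEC std, sunspot number
X = np.column_stack([
    rng.uniform(0, 9, n),        # Kp index
    rng.normal(-20, 15, n),      # Dst (nT)
    rng.uniform(5, 60, n),       # TEC (TECU)
    rng.uniform(0.5, 5, n),      # std of TEC
    rng.integers(0, 150, n),     # sunspot number
])
# synthetic northing error driven nonlinearly by TEC and Kp, plus noise
y = 0.02 * X[:, 2] + 0.1 * X[:, 0] ** 1.5 + rng.normal(0, 0.3, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
model = RandomForestRegressor(n_estimators=300, random_state=1).fit(X_tr, y_tr)
print("MAE [m]:", mean_absolute_error(y_te, model.predict(X_te)))
```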

  6. Assessment of Visual Status of the Aeta, a Hunter-Gatherer Population of the Philippines (An AOS Thesis)

    PubMed Central

    Allingham, R. Rand

    2008-01-01

    Purpose: A screening study was performed to assess levels of visual impairment and blindness among a representative sample of older members of the Aeta, an indigenous hunter-gatherer population living on the island of Luzon in the Philippines. Methods: Unrelated older Aeta couples were randomly invited to participate in a visual screening study. All consented individuals had ocular history, medical history, complete ophthalmic examination, height, weight, and blood pressure taken. Results: A total of 225 individuals were screened from 4 villages. Visual acuity, both uncorrected and pinhole-corrected, was significantly worse among older vs younger age-groups for women, men, and when combined (P < .001). Visual impairment was present in 48% of uncorrected and 43% of pinhole-corrected eyes in the oldest age-group. Six percent of the screened population was bilaterally blind. The major causes of blindness were readily treatable. The most common etiologies as a proportion of blind eyes were cataract (66%), refractive error (20%), and trauma (7%). No cases of primary open-angle, primary angle-closure, or exfoliation glaucoma were observed in this population. Discussion: Visual impairment and blindness were common in the Aeta population. Primary forms of glaucoma, a major cause of blindness found in most population-based studies, were not observed. The absence of primary glaucoma in this population may reflect random sampling error. However, based on similar findings in the Australian Aborigine, this raises the possibility that these two populations may share genetic and/or environmental factors that are protective against glaucoma. PMID:19277240

  7. Estimating current and future streamflow characteristics at ungaged sites, central and eastern Montana, with application to evaluating effects of climate change on fish populations

    USGS Publications Warehouse

    Sando, Roy; Chase, Katherine J.

    2017-03-23

    A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods. Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.

  8. Functional Mixed Effects Model for Small Area Estimation.

    PubMed

    Maiti, Tapabrata; Sinha, Samiran; Zhong, Ping-Shou

    2016-09-01

    Functional data analysis has become an important area of research due to its ability to handle high-dimensional and complex data structures. However, development has been limited in the context of linear mixed effect models and, in particular, for small area estimation. Linear mixed effect models are the backbone of small area estimation. In this article, we consider area-level data and fit a varying coefficient linear mixed effect model in which the varying coefficients are semi-parametrically modeled via B-splines. We propose a method of estimating the fixed effect parameters and consider prediction of random effects that can be implemented using standard software. For measuring prediction uncertainties, we derive an analytical expression for the mean squared errors and propose a method of estimating them. The procedure is illustrated via a real data example, and operating characteristics of the method are judged using finite-sample simulation studies.

  9. Improved Equivalent Linearization Implementations Using Nonlinear Stiffness Evaluation

    NASA Technical Reports Server (NTRS)

    Rizzi, Stephen A.; Muravyov, Alexander A.

    2001-01-01

    This report documents two new implementations of equivalent linearization for solving geometrically nonlinear random vibration problems of complicated structures. The implementations are given the acronym ELSTEP, for "Equivalent Linearization using a STiffness Evaluation Procedure." Both implementations of ELSTEP are fundamentally the same in that they use a novel nonlinear stiffness evaluation procedure to numerically compute otherwise inaccessible nonlinear stiffness terms from commercial finite element programs. The commercial finite element program MSC/NASTRAN (NASTRAN) was chosen as the core of ELSTEP. The FORTRAN implementation calculates the nonlinear stiffness terms and performs the equivalent linearization analysis outside of NASTRAN. The Direct Matrix Abstraction Program (DMAP) implementation performs these operations within NASTRAN. Both provide nearly identical results. Within each implementation, two error minimization approaches for the equivalent linearization procedure are available: force and strain energy error minimization. Sample results for a simply supported rectangular plate are included to illustrate the analysis procedure.
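
    The equivalent linearization idea itself fits in a few lines. For a Duffing oscillator under white-noise forcing, minimizing the mean-square force error with a Gaussian response gives an equivalent stiffness k_eq = k(1 + 3·eps·sigma_x^2), and the response variance follows by fixed-point iteration. This is a textbook single-degree-of-freedom sketch, not the ELSTEP finite element procedure:

```python
import math

# Equivalent linearization of m*x'' + c*x' + k*(x + eps*x**3) = F(t), with F
# white noise of constant two-sided force PSD S0. For the linearized system,
# the stationary displacement variance is sigma_x^2 = pi*S0/(c*k_eq) (the
# mass cancels in the variance integral), and a Gaussian response with
# E[x^4] = 3*sigma^4 gives k_eq = k*(1 + 3*eps*sigma_x^2).
c, k, eps, S0 = 0.05, 1.0, 0.5, 1e-3

sigma2 = math.pi * S0 / (c * k)        # start from the linear solution
for _ in range(100):                   # fixed-point iteration
    k_eq = k * (1.0 + 3.0 * eps * sigma2)
    sigma2_next = math.pi * S0 / (c * k_eq)
    if abs(sigma2_next - sigma2) < 1e-14:
        break
    sigma2 = sigma2_next

print(f"equivalent stiffness k_eq = {k_eq:.4f}, sigma_x^2 = {sigma2:.6f}")
```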

  10. The Pitfalls of Thesaurus Ontologization – the Case of the NCI Thesaurus

    PubMed Central

    Schulz, Stefan; Schober, Daniel; Tudose, Ilinca; Stenzhorn, Holger

    2010-01-01

    Thesauri that are "ontologized" into OWL-DL semantics are highly prone to modeling errors resulting from falsely interpreting existential restrictions. We investigated the OWL-DL representation of the NCI Thesaurus (NCIT) in order to assess the correctness of its existential restrictions. A random sample of 354 axioms using the someValuesFrom operator was taken. According to a rating performed by two domain experts, roughly half of these examples, and in consequence more than 76,000 axioms in the OWL-DL version, make incorrect assertions if interpreted according to description logics semantics. These axioms therefore constitute a huge source of unintended models, rendering most logic-based reasoning unreliable. After identifying typical error patterns, we discuss some possible improvements. Our recommendation is to either amend the problematic axioms in the OWL-DL formalization or to consider a less strict representational format. PMID:21347074

  11. Satellite Sampling and Retrieval Errors in Regional Monthly Rain Estimates from TMI AMSR-E, SSM/I, AMSU-B and the TRMM PR

    NASA Technical Reports Server (NTRS)

    Fisher, Brad; Wolff, David B.

    2010-01-01

    Passive and active microwave rain sensors onboard earth-orbiting satellites estimate monthly rainfall from the instantaneous rain statistics collected during satellite overpasses. It is well known that climate-scale rain estimates from meteorological satellites incur sampling errors resulting from the process of discrete temporal sampling and statistical averaging. Sampling and retrieval errors ultimately become entangled in the estimation of the mean monthly rain rate. The sampling component of the error budget effectively introduces statistical noise into climate-scale rain estimates that obscures the error component associated with the instantaneous rain retrieval. Estimating the accuracy of the retrievals on monthly scales therefore necessitates a decomposition of the total error budget into sampling and retrieval error quantities. This paper presents results from a statistical evaluation of the sampling and retrieval errors for five different space-borne rain sensors on board nine orbiting satellites. Using an error decomposition methodology developed by one of the authors, sampling and retrieval errors were estimated at 0.25° resolution within 150 km of ground-based weather radars located at Kwajalein, Marshall Islands, and Melbourne, Florida. Error and bias statistics were calculated according to the land, ocean, and coast classifications of the surface terrain mask developed for the Goddard Profiling (GPROF) rain algorithm. Variations in the comparative error statistics are attributed to various factors related to differences in the swath geometry of each rain sensor, the orbital and instrument characteristics of the satellite, and the regional climatology. The most significant result of this study is that each of the satellites incurred negative long-term oceanic retrieval biases of 10 to 30%.

  12. Estimation of genetic connectedness diagnostics based on prediction errors without the prediction error variance-covariance matrix.

    PubMed

    Holmes, John B; Dodds, Ken G; Lee, Michael A

    2017-03-02

    An important issue in genetic evaluation is the comparability of random effects (breeding values), particularly between pairs of animals in different contemporary groups. This is usually referred to as genetic connectedness. While various measures of connectedness have been proposed in the literature, there is general agreement that the most appropriate measure is some function of the prediction error variance-covariance matrix. However, obtaining the prediction error variance-covariance matrix is computationally demanding for large-scale genetic evaluations. Many alternative statistics have been proposed that avoid the computational cost of obtaining the prediction error variance-covariance matrix, such as counts of genetic links between contemporary groups, gene flow matrices, and functions of the variance-covariance matrix of estimated contemporary group fixed effects. In this paper, we show that a correction to the variance-covariance matrix of estimated contemporary group fixed effects will produce the exact prediction error variance-covariance matrix averaged by contemporary group for univariate models in the presence of single or multiple fixed effects and one random effect. We demonstrate the correction for a series of models and show that approximations to the prediction error matrix based solely on the variance-covariance matrix of estimated contemporary group fixed effects are inappropriate in certain circumstances. Our method allows for the calculation of a connectedness measure based on the prediction error variance-covariance matrix by calculating only the variance-covariance matrix of estimated fixed effects. Since the number of fixed effects in genetic evaluation is usually orders of magnitude smaller than the number of random effect levels, the computational requirements of our method should be reduced.
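
    For a toy model, the object in question can be computed directly: build Henderson's mixed model equations for y = Xb + Zu + e, invert the coefficient matrix, and read off the prediction error variance-covariance matrix of the random effects. The dense sketch below is our illustration; the paper's contribution is precisely how to avoid this inversion at scale:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 30, 2, 5           # records, fixed effects (groups), random effects
X = np.zeros((n, p)); X[np.arange(n), rng.integers(0, p, n)] = 1.0
Z = np.zeros((n, q)); Z[np.arange(n), rng.integers(0, q, n)] = 1.0

sigma2_e, sigma2_u = 1.0, 0.5
lam = sigma2_e / sigma2_u    # variance ratio

# Henderson's mixed model equations coefficient matrix
C = np.block([
    [X.T @ X,            X.T @ Z],
    [Z.T @ X,  Z.T @ Z + lam * np.eye(q)],
])
Cinv = np.linalg.pinv(C)     # pinv in case X is rank-deficient

# PEV of the random effects: sigma2_e times the lower-right block of C^{-1}
PEV = sigma2_e * Cinv[p:, p:]
print(np.round(PEV, 3))
```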

  13. Prevalence and types of preanalytical error in hematology laboratory of a tertiary care hospital in South India.

    PubMed

    Arul, Pitchaikaran; Pushparaj, Magesh; Pandian, Kanmani; Chennimalai, Lingasamy; Rajendran, Karthika; Selvaraj, Eniya; Masilamani, Suresh

    2018-01-01

    An important component of laboratory medicine is the preanalytical phase. Since the laboratory report plays a major role in patient management, more importance should be given to the quality of laboratory tests. The present study was undertaken to find the prevalence and types of preanalytical errors at a tertiary care hospital in South India. In this cross-sectional study, a total of 118,732 samples (62,474 from the outpatient department [OPD] and 56,258 from the inpatient department [IPD]) were received in the hematology laboratory. These samples were analyzed for preanalytical errors such as misidentification, incorrect vials, inadequate samples, clotted samples, diluted samples, and hemolyzed samples. Preanalytical errors were found in 513 samples, i.e. 0.43% of the total number of samples received. The most common preanalytical error observed was inadequate samples, followed by clotted samples. The overall frequencies (OPD and IPD combined) of misidentification, incorrect vials, inadequate samples, clotted samples, diluted samples, and hemolyzed samples were 0.02%, 0.05%, 0.2%, 0.12%, 0.02%, and 0.03%, respectively. The present study concluded that incorrect phlebotomy techniques due to lack of awareness are the main reason for preanalytical errors. This can be avoided by proper communication and coordination between the laboratory and wards, proper training and continuing medical education programs for laboratory and paramedical staff, and knowledge of the intervening factors that can influence laboratory results.

  14. Design and analysis of group-randomized trials in cancer: A review of current practices.

    PubMed

    Murray, David M; Pals, Sherri L; George, Stephanie M; Kuzmichev, Andrey; Lai, Gabriel Y; Lee, Jocelyn A; Myles, Ranell L; Nelson, Shakira M

    2018-06-01

    The purpose of this paper is to summarize current practices for the design and analysis of group-randomized trials involving cancer-related risk factors or outcomes and to offer recommendations to improve future trials. We searched for group-randomized trials involving cancer-related risk factors or outcomes that were published or online in peer-reviewed journals in 2011-15. During 2016-17, in Bethesda MD, we reviewed 123 articles from 76 journals to characterize their design and their methods for sample size estimation and data analysis. Only 66 (53.7%) of the articles reported appropriate methods for sample size estimation. Only 63 (51.2%) reported exclusively appropriate methods for analysis. These findings suggest that many investigators do not adequately attend to the methodological challenges inherent in group-randomized trials. These practices can lead to underpowered studies, to an inflated type 1 error rate, and to inferences that mislead readers. Investigators should work with biostatisticians or other methodologists familiar with these issues. Funders and editors should ensure careful methodological review of applications and manuscripts. Reviewers should ensure that studies are properly planned and analyzed. These steps are needed to improve the rigor and reproducibility of group-randomized trials. The Office of Disease Prevention (ODP) at the National Institutes of Health (NIH) has taken several steps to address these issues. ODP offers an online course on the design and analysis of group-randomized trials. ODP is working to increase the number of methodologists who serve on grant review panels. ODP has developed standard language for the Application Guide and the Review Criteria to draw investigators' attention to these issues. Finally, ODP has created a new Research Methods Resources website to help investigators, reviewers, and NIH staff better understand these issues. Published by Elsevier Inc.

  15. Sampling Methods for Detection and Monitoring of the Asian Citrus Psyllid (Hemiptera: Psyllidae).

    PubMed

    Monzo, C; Arevalo, H A; Jones, M M; Vanaclocha, P; Croxton, S D; Qureshi, J A; Stansly, P A

    2015-06-01

    The Asian citrus psyllid (ACP), Diaphorina citri Kuwayama is a key pest of citrus due to its role as vector of citrus greening disease or "huanglongbing." ACP monitoring is considered an indispensable tool for management of vector and disease. In the present study, datasets collected between 2009 and 2013 from 245 citrus blocks were used to evaluate precision, sensitivity for detection, and efficiency of five sampling methods. The number of samples needed to reach a 0.25 standard error to mean ratio was estimated using Taylor's power law and used to compare precision among sampling methods. Comparison of detection sensitivity and time expenditure (cost) between stem-tap and other sampling methodologies conducted consecutively at the same location was also carried out. Stem-tap sampling was the most efficient sampling method when ACP densities were moderate to high and served as the basis for comparison with all other methods. Protocols that grouped trees near randomly selected locations across the block were more efficient than sampling trees at random across the block. Sweep net sampling was similar to stem-taps in number of captures per sampled unit, but less precise at any ACP density. Yellow sticky traps were 14 times more sensitive than stem-taps but much more time consuming and thus less efficient except at very low population densities. Visual sampling was efficient for detecting and monitoring ACP at low densities. Suction sampling was time consuming and taxing but the most sensitive of all methods for detection of sparse populations. This information can be used to optimize ACP monitoring efforts. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
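
    A sketch of the fixed-precision calculation described above, assuming Taylor's power law s² = a·mᵇ; the coefficients a and b below are hypothetical placeholders, not the study's fitted values:

    ```python
    def samples_needed(mean_density, a, b, D=0.25):
        """Fixed-precision sample size from Taylor's power law s**2 = a * m**b:
        n = a * m**(b - 2) / D**2 for a target standard error to mean ratio D."""
        return a * mean_density ** (b - 2) / D ** 2

    # Hypothetical Taylor coefficients; a and b would be fitted to field counts.
    print(round(samples_needed(mean_density=0.8, a=2.5, b=1.4), 1))   # ~45.7 taps
    ```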

  16. Random access to mobile networks with advanced error correction

    NASA Technical Reports Server (NTRS)

    Dippold, Michael

    1990-01-01

    A random access scheme for unreliable data channels is investigated in conjunction with an adaptive Hybrid-II Automatic Repeat Request (ARQ) scheme using Rate Compatible Punctured Codes (RCPC) Forward Error Correction (FEC). A simple scheme with fixed frame length and equal slot sizes is chosen and reservation is implicit by the first packet transmitted randomly in a free slot, similar to Reservation Aloha. This allows the further transmission of redundancy if the last decoding attempt failed. Results show that a high channel utilization and superior throughput can be achieved with this scheme that shows a quite low implementation complexity. For the example of an interleaved Rayleigh channel and soft decision utilization and mean delay are calculated. A utilization of 40 percent may be achieved for a frame with the number of slots being equal to half the station number under high traffic load. The effects of feedback channel errors and some countermeasures are discussed.

  17. Exposure assessment models for elemental components of particulate matter in an urban environment: A comparison of regression and random forest approaches

    NASA Astrophysics Data System (ADS)

    Brokamp, Cole; Jandarov, Roman; Rao, M. B.; LeMasters, Grace; Ryan, Patrick

    2017-02-01

    Exposure assessment for elemental components of particulate matter (PM) using land use modeling is a complex problem due to the high spatial and temporal variations in pollutant concentrations at the local scale. Land use regression (LUR) models may fail to capture complex interactions and non-linear relationships between pollutant concentrations and land use variables. The increasing availability of big spatial data and machine learning methods present an opportunity for improvement in PM exposure assessment models. In this manuscript, our objective was to develop a novel land use random forest (LURF) model and compare its accuracy and precision to a LUR model for elemental components of PM in the urban city of Cincinnati, Ohio. PM smaller than 2.5 μm (PM2.5) and eleven elemental components were measured at 24 sampling stations from the Cincinnati Childhood Allergy and Air Pollution Study (CCAAPS). Over 50 different predictors associated with transportation, physical features, community socioeconomic characteristics, greenspace, land cover, and emission point sources were used to construct LUR and LURF models. Cross validation was used to quantify and compare model performance. LURF and LUR models were created for aluminum (Al), copper (Cu), iron (Fe), potassium (K), manganese (Mn), nickel (Ni), lead (Pb), sulfur (S), silicon (Si), vanadium (V), zinc (Zn), and total PM2.5 in the CCAAPS study area. LURF utilized a more diverse and greater number of predictors than LUR, and the LURF models for Al, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all showed a decrease in fractional predictive error of at least 5% compared with their LUR counterparts. LURF models for Al, Cu, Fe, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all had a cross-validated fractional predictive error of less than 30%. Furthermore, LUR models showed a differential exposure assessment bias and had a higher prediction error variance. Random forest and other machine learning methods may provide more accurate exposure assessment.
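
    The LUR-versus-LURF comparison can be sketched generically with scikit-learn; the data below are synthetic stand-ins for the CCAAPS predictors, and defining fractional predictive error as cross-validated RMSE divided by the mean observation is an assumption:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(0)
    X = rng.normal(size=(24, 8))                    # stand-in land use predictors
    y = 10 + 2 * X[:, 0] + np.maximum(X[:, 1], 0) * X[:, 2] + rng.normal(0, 0.3, 24)

    for name, model in [("LUR", LinearRegression()),
                        ("LURF", RandomForestRegressor(n_estimators=500, random_state=0))]:
        pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
        fpe = np.sqrt(np.mean((pred - y) ** 2)) / y.mean()
        print(f"{name}: fractional predictive error = {fpe:.2f}")
    ```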

  20. Physical Validation of TRMM TMI and PR Monthly Rain Products Over Oklahoma

    NASA Technical Reports Server (NTRS)

    Fisher, Brad L.

    2004-01-01

    The Tropical Rainfall Measuring Mission (TRMM) provides monthly rainfall estimates using data collected by the TRMM satellite. These estimates cover a substantial fraction of the earth's surface. The physical validation of TRMM estimates involves corroborating the accuracy of spaceborne estimates of areal rainfall by inferring errors and biases from ground-based rain estimates. The TRMM error budget consists of two major sources of error: retrieval and sampling. Sampling errors are intrinsic to the process of estimating monthly rainfall and occur because the satellite extrapolates monthly rainfall from a small subset of measurements collected only during satellite overpasses. Retrieval errors, on the other hand, are related to the process of collecting measurements while the satellite is overhead. One of the big challenges confronting the TRMM validation effort is how to best estimate these two main components of the TRMM error budget, which are not easily decoupled. This four-year study computed bulk sampling and retrieval errors for the TRMM microwave imager (TMI) and the precipitation radar (PR) by applying a technique that sub-samples gauge data at TRMM overpass times. Gridded monthly rain estimates are then computed from the monthly bulk statistics of the collected samples, providing a sensor-dependent gauge rain estimate that is assumed to include a TRMM-equivalent sampling error. The sub-sampled gauge rain estimates are then used in conjunction with the monthly satellite and gauge (without sub-sampling) estimates to decouple retrieval and sampling errors. The computed mean sampling errors for the TMI and PR were 5.9% and 7.7%, respectively, in good agreement with theoretical predictions. The PR year-to-year retrieval biases exceeded corresponding TMI biases, but it was found that these differences were partially due to negative TMI biases during cold months and positive TMI biases during warm months.
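
    A toy version of the decoupling technique, with an invented gauge record and overpass schedule: sub-sampling the gauge at overpass times isolates a TRMM-like sampling error.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    rain = rng.gamma(shape=0.05, scale=4.0, size=30 * 24)   # hourly gauge record (mm/h)
    truth = rain.mean()                                     # "true" monthly mean rate

    overpasses = rng.choice(rain.size, size=30, replace=False)  # ~1 snapshot per day
    subsampled = rain[overpasses].mean()

    # The gap between the two gauge-based estimates plays the role of the
    # TRMM-equivalent sampling error; any further gap between the satellite and
    # the sub-sampled gauge estimate is attributed to retrieval error.
    print(f"sampling error: {100 * (subsampled - truth) / truth:+.1f}%")
    ```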

  1. Quantifying Errors in TRMM-Based Multi-Sensor QPE Products Over Land in Preparation for GPM

    NASA Technical Reports Server (NTRS)

    Peters-Lidard, Christa D.; Tian, Yudong

    2011-01-01

    Determining uncertainties in satellite-based multi-sensor quantitative precipitation estimates over land is of fundamental importance to both data producers and hydroclimatological applications. Evaluating TRMM-era products also lays the groundwork and sets the direction for algorithm and applications development for future missions including GPM. QPE uncertainties result mostly from the interplay of systematic errors and random errors. In this work, we will synthesize our recent results quantifying the error characteristics of satellite-based precipitation estimates. Both systematic errors and total uncertainties have been analyzed for six different TRMM-era precipitation products (3B42, 3B42RT, CMORPH, PERSIANN, NRL and GSMap). For systematic errors, we devised an error decomposition scheme to separate errors in precipitation estimates into three independent components: hit biases, missed precipitation, and false precipitation. This decomposition scheme reveals hydroclimatologically relevant error features and provides a better link to the error sources than conventional analysis, because in the latter these error components tend to cancel one another when aggregated or averaged in space or time. For the random errors, we calculated the measurement spread from the ensemble of these six quasi-independent products, and thus produced a global map of measurement uncertainties. The map yields a global view of the error characteristics and their regional and seasonal variations, reveals many undocumented error features over areas with no validation data available, and provides better guidance to global assimilation of satellite-based precipitation data. Insights gained from these results and how they could help with GPM will be highlighted.
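
    A minimal sketch of this kind of error decomposition, assuming a simple rain/no-rain threshold; the arrays and cutoff are invented for illustration:

    ```python
    import numpy as np

    def decompose(sat, ref, thresh=0.1):
        """Split total satellite-minus-reference bias into hit bias, missed
        precipitation, and false precipitation; `thresh` is a hypothetical
        rain/no-rain cutoff (mm/h)."""
        hit = (sat >= thresh) & (ref >= thresh)
        miss = (sat < thresh) & (ref >= thresh)
        false = (sat >= thresh) & (ref < thresh)
        hit_bias = np.where(hit, sat - ref, 0.0).sum()
        missed = -np.where(miss, ref, 0.0).sum()   # rain present but not reported
        false_p = np.where(false, sat, 0.0).sum()  # rain reported but not present
        return hit_bias, missed, false_p           # ~ total bias (sub-threshold amounts ignored)

    rng = np.random.default_rng(0)
    ref = rng.gamma(0.1, 3.0, size=10_000)                     # "truth" field
    sat = np.where(rng.random(10_000) < 0.9, 1.1 * ref, 0.0)   # hits biased +10%, 10% misses
    print([round(float(v), 1) for v in decompose(sat, ref)])
    ```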

  2. Sloppy-slotted ALOHA

    NASA Technical Reports Server (NTRS)

    Crozier, Stewart N.

    1990-01-01

    Random access signaling, which allows slotted packets to spill over into adjacent slots, is investigated. It is shown that sloppy-slotted ALOHA can always provide higher throughput than conventional slotted ALOHA. The degree of improvement depends on the timing error distribution. Throughput performance is presented for Gaussian timing error distributions, modified to include timing error corrections. A general channel capacity lower bound, independent of the specific timing error distribution, is also presented.
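
    For reference, the conventional slotted ALOHA baseline that sloppy-slotted ALOHA is compared against has the textbook throughput S = G·e⁻ᴳ under Poisson arrivals, peaking at 1/e; a quick numeric check (the standard model, not the paper's modified one):

    ```python
    import numpy as np

    G = np.linspace(0.01, 3.0, 300)      # offered load (packets per slot)
    S = G * np.exp(-G)                   # conventional slotted ALOHA throughput
    print(f"peak S = {S.max():.3f} at G = {G[S.argmax()]:.2f}")   # ~0.368 at G ~ 1
    ```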

  3. A general method for the definition of margin recipes depending on the treatment technique applied in helical tomotherapy prostate plans.

    PubMed

    Sevillano, David; Mínguez, Cristina; Sánchez, Alicia; Sánchez-Reyes, Alberto

    2016-01-01

    To obtain specific margin recipes that take into account the dosimetric characteristics of the treatment plans used in a single institution. We obtained dose-population histograms (DPHs) of 20 helical tomotherapy treatment plans for prostate cancer by simulating the effects of different systematic errors (Σ) and random errors (σ) on these plans. We obtained dosimetric margins and margin reductions due to random errors (random margins) by fitting the theoretical results of coverages for Gaussian distributions with coverages of the planned D99% obtained from the DPHs. The dosimetric margins obtained for helical tomotherapy prostate treatments were 3.3 mm, 3 mm, and 1 mm in the lateral (Lat), anterior-posterior (AP), and superior-inferior (SI) directions. Random margins showed parabolic dependencies, yielding expressions of 0.16σ², 0.13σ², and 0.15σ² for the Lat, AP, and SI directions, respectively. When focusing on values up to σ = 5 mm, random margins could be fitted considering Gaussian penumbras with standard deviations (σp) equal to 4.5 mm Lat, 6 mm AP, and 5.5 mm SI. Despite complex dose distributions in helical tomotherapy treatment plans, we were able to simplify the behaviour of our plans against treatment errors to single values of dosimetric and random margins for each direction. These margins allowed us to develop specific margin recipes for the respective treatment technique. The method is general and could be used for any treatment technique provided that DPHs can be obtained. Copyright © 2015 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
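
    For orientation, the fitted penumbra widths can be plugged into the widely cited van Herk recipe, which combines systematic and random errors as 2.5Σ plus a penumbra-dependent random term; this is the generic literature formula, not the institution-specific recipe derived here, and the Σ and σ values below are arbitrary:

    ```python
    import math

    def margin_mm(sigma_sys, sigma_rand, sigma_p):
        """Van Herk-style recipe: 2.5*Sigma for systematic errors plus
        1.64*(sqrt(sigma^2 + sigma_p^2) - sigma_p) for random errors blurred
        by a Gaussian penumbra of width sigma_p (all values in mm)."""
        return 2.5 * sigma_sys + 1.64 * (math.hypot(sigma_rand, sigma_p) - sigma_p)

    # Fitted penumbra widths from the study, with arbitrary Sigma = sigma = 2 mm:
    for axis, sp in [("Lat", 4.5), ("AP", 6.0), ("SI", 5.5)]:
        print(axis, round(margin_mm(2.0, 2.0, sp), 1))
    ```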

  4. Model studies of the beam-filling error for rain-rate retrieval with microwave radiometers

    NASA Technical Reports Server (NTRS)

    Ha, Eunho; North, Gerald R.

    1995-01-01

    Low-frequency (less than 20 GHz) single-channel microwave retrievals of rain rate encounter the problem of beam-filling error. This error stems from the fact that the relationship between microwave brightness temperature and rain rate is nonlinear, coupled with the fact that the field of view is large or comparable to important scales of variability of the rain field. This means that one may not simply insert the area average of the brightness temperature into the formula for rain rate without incurring both bias and random error. The statistical heterogeneity of the rain-rate field in the footprint of the instrument is key to determining the nature of these errors. This paper makes use of a series of random rain-rate fields to study the size of the bias and random error associated with beam filling. A number of examples are analyzed in detail: the binomially distributed field, the gamma, the Gaussian, the mixed gamma, the lognormal, and the mixed lognormal ('mixed' here means there is a finite probability of no rain rate at a point of space-time). Of particular interest are the applicability of a simple error formula due to Chiu and collaborators and a formula that might hold in the large field of view limit. It is found that the simple formula holds for Gaussian rain-rate fields but begins to fail for highly skewed fields such as the mixed lognormal. While not conclusively demonstrated here, it is suggested that the notion of climatologically adjusting the retrievals to remove the beam-filling bias is a reasonable proposition.
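
    The origin of the bias is Jensen's inequality: inverting the footprint-averaged brightness temperature is not the same as averaging the rain rate. A toy demonstration with a mixed-lognormal field and a hypothetical saturating Tb(R) relation:

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    wet = rng.random(100_000) < 0.3                    # mixed field: 70% of footprint dry
    R = np.where(wet, rng.lognormal(1.0, 1.0, size=wet.size), 0.0)

    # Hypothetical saturating brightness-temperature relation and its inverse.
    Tb = lambda r: 280.0 - 120.0 * np.exp(-0.08 * r)
    R_of = lambda t: -np.log((280.0 - t) / 120.0) / 0.08

    retrieved = R_of(Tb(R).mean())                     # invert the footprint-mean temperature
    print(f"true mean {R.mean():.2f} mm/h, retrieved {retrieved:.2f} mm/h")  # biased low
    ```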

  5. Technical Note: Millimeter precision in ultrasound based patient positioning: Experimental quantification of inherent technical limitations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ballhausen, Hendrik, E-mail: hendrik.ballhausen@med.uni-muenchen.de; Hieber, Sheila; Li, Minglun

    2014-08-15

    Purpose: To identify the relevant technical sources of error of a system based on three-dimensional ultrasound (3D US) for patient positioning in external beam radiotherapy. To quantify these sources of error in a controlled laboratory setting. To estimate the resulting end-to-end geometric precision of the intramodality protocol. Methods: Two identical free-hand 3D US systems at both the planning-CT and the treatment room were calibrated to the laboratory frame of reference. Every step of the calibration chain was repeated multiple times to estimate its contribution to overall systematic and random error. Optimal margins were computed given the identified and quantified systematic and random errors. Results: In descending order of magnitude, the identified and quantified sources of error were: alignment of calibration phantom to laser marks 0.78 mm, alignment of lasers in treatment vs planning room 0.51 mm, calibration and tracking of 3D US probe 0.49 mm, alignment of stereoscopic infrared camera to calibration phantom 0.03 mm. Under ideal laboratory conditions, these errors are expected to limit ultrasound-based positioning to an accuracy of 1.05 mm radially. Conclusions: The investigated 3D ultrasound system achieves an intramodal accuracy of about 1 mm radially in a controlled laboratory setting. The identified systematic and random errors require an optimal clinical tumor volume to planning target volume margin of about 3 mm. These inherent technical limitations do not prevent clinical use, including hypofractionation or stereotactic body radiation therapy.
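
    The quoted 1.05 mm radial figure is consistent with combining the four independent contributions in quadrature, which appears to be how it was obtained:

    ```python
    import math
    sources_mm = [0.78, 0.51, 0.49, 0.03]   # phantom/laser, room lasers, probe, IR camera
    print(round(math.sqrt(sum(e * e for e in sources_mm)), 2))   # 1.05 mm radial
    ```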

  6. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference

    PubMed Central

    Maples, Brian K.; Gravel, Simon; Kenny, Eimear E.; Bustamante, Carlos D.

    2013-01-01

    Local-ancestry inference is an important step in the genetic analysis of fully sequenced human genomes. Current methods can only detect continental-level ancestry (i.e., European versus African versus Asian) accurately even when using millions of markers. Here, we present RFMix, a powerful discriminative modeling approach that is faster (∼30×) and more accurate than existing methods. We accomplish this by using a conditional random field parameterized by random forests trained on reference panels. RFMix is capable of learning from the admixed samples themselves to boost performance and autocorrect phasing errors. RFMix shows high sensitivity and specificity in simulated Hispanics/Latinos and African Americans and admixed Europeans, Africans, and Asians. Finally, we demonstrate that African Americans in HapMap contain modest (but nonzero) levels of Native American ancestry (∼0.4%). PMID:23910464

  7. Preanalytical Errors in Hematology Laboratory- an Avoidable Incompetence.

    PubMed

    Kaur, Harsimran; Narang, Vikram; Selhi, Pavneet Kaur; Sood, Neena; Singh, Aminder

    2016-01-01

    Quality assurance in the hematology laboratory is a must to ensure laboratory users of reliable test results with a high degree of precision and accuracy. Even after so many advances in hematology laboratory practice, pre-analytical errors remain a challenge for practicing pathologists. This study was undertaken with an objective to evaluate the types and frequency of preanalytical errors in the hematology laboratory of our center. All the samples received in the Hematology Laboratory of Dayanand Medical College and Hospital, Ludhiana, India over a period of one year (July 2013-July 2014) were included in the study, and preanalytical variables (clotted samples, quantity not sufficient, wrong sample, without label, and wrong label) were studied. Of 471,006 samples received in the laboratory, preanalytical errors, as per the above mentioned categories, were found in 1802 samples. The most common error was clotted samples (1332 samples, 0.28% of the total samples) followed by quantity not sufficient (328 samples, 0.06%), wrong sample (96 samples, 0.02%), without label (24 samples, 0.005%) and wrong label (22 samples, 0.005%). Preanalytical errors are frequent in laboratories and can be corrected by regular analysis of the variables involved. Rectification can be done by regular education of the staff.

  8. Ranked set sampling: cost and optimal set size.

    PubMed

    Nahhas, Ramzi W; Wolfe, Douglas A; Chen, Haiying

    2002-12-01

    McIntyre (1952, Australian Journal of Agricultural Research 3, 385-390) introduced ranked set sampling (RSS) as a method for improving estimation of a population mean in settings where sampling and ranking of units from the population are inexpensive when compared with actual measurement of the units. Two of the major factors in the usefulness of RSS are the set size and the relative costs of the various operations of sampling, ranking, and measurement. In this article, we consider ranking error models and cost models that enable us to assess the effect of different cost structures on the optimal set size for RSS. For reasonable cost structures, we find that the optimal RSS set sizes are generally larger than had been anticipated previously. These results will provide a useful tool for determining whether RSS is likely to lead to an improvement over simple random sampling in a given setting and, if so, what RSS set size is best to use in this case.
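
    A minimal simulation of the RSS idea, assuming perfect ranking and a normal population: for the same measurement budget, the RSS estimator of the mean is less variable than simple random sampling (SRS):

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    draw = lambda n: rng.normal(10.0, 2.0, size=n)   # cheap-to-rank, costly-to-measure units

    def rss_mean(m, cycles):
        """Balanced RSS: for each rank r, draw a fresh set of m units, rank them
        (here perfectly, by true value) and measure only the r-th smallest."""
        measured = [np.sort(draw(m))[r] for _ in range(cycles) for r in range(m)]
        return float(np.mean(measured))

    n_meas = 4 * 25                                   # same measurement budget for both
    rss_err = [rss_mean(m=4, cycles=25) - 10.0 for _ in range(500)]
    srs_err = [float(draw(n_meas).mean()) - 10.0 for _ in range(500)]
    print(f"SD of error  RSS: {np.std(rss_err):.3f}   SRS: {np.std(srs_err):.3f}")
    ```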

  9. Exploring the impact of forcing error characteristics on physically based snow simulations within a global sensitivity analysis framework

    NASA Astrophysics Data System (ADS)

    Raleigh, M. S.; Lundquist, J. D.; Clark, M. P.

    2015-07-01

    Physically based models provide insights into key hydrologic processes but are associated with uncertainties due to deficiencies in forcing data, model parameters, and model structure. Forcing uncertainty is enhanced in snow-affected catchments, where weather stations are scarce and prone to measurement errors, and meteorological variables exhibit high variability. Hence, there is limited understanding of how forcing error characteristics affect simulations of cold region hydrology and which error characteristics are most important. Here we employ global sensitivity analysis to explore how (1) different error types (i.e., bias, random errors), (2) different error probability distributions, and (3) different error magnitudes influence physically based simulations of four snow variables (snow water equivalent, ablation rates, snow disappearance, and sublimation). We use the Sobol' global sensitivity analysis, which is typically used for model parameters but adapted here for testing model sensitivity to coexisting errors in all forcings. We quantify the Utah Energy Balance model's sensitivity to forcing errors with 1 840 000 Monte Carlo simulations across four sites and five different scenarios. Model outputs were (1) consistently more sensitive to forcing biases than random errors, (2) generally less sensitive to forcing error distributions, and (3) critically sensitive to different forcings depending on the relative magnitude of errors. For typical error magnitudes found in areas with drifting snow, precipitation bias was the most important factor for snow water equivalent, ablation rates, and snow disappearance timing, but other forcings had a more dominant impact when precipitation uncertainty was due solely to gauge undercatch. Additionally, the relative importance of forcing errors depended on the model output of interest. Sensitivity analysis can reveal which forcing error characteristics matter most for hydrologic modeling.
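
    A scaled-down sketch of a Sobol' analysis in the same spirit, using the SALib package (API as of recent 1.x releases) and an invented two-factor toy snow model; it is not the Utah Energy Balance setup:

    ```python
    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    # Toy "snow model": accumulated SWE responds to a multiplicative
    # precipitation bias and to random noise added to the temperature forcing.
    problem = {
        "num_vars": 2,
        "names": ["precip_bias", "temp_noise_sd"],
        "bounds": [[0.7, 1.3], [0.0, 2.0]],
    }

    def swe(params):
        bias, noise_sd = params
        r = np.random.default_rng(0)                     # fixed weather realization
        precip = bias * r.gamma(2.0, 2.0, size=180)      # daily precipitation (mm)
        temp = r.normal(-3.0, 3.0, size=180) + r.normal(0.0, noise_sd, size=180)
        return float(precip[temp < 0.0].sum())           # crude snow accumulation

    X = saltelli.sample(problem, 1024)                   # N * (2D + 2) parameter sets
    Y = np.array([swe(x) for x in X])
    Si = sobol.analyze(problem, Y)
    print(dict(zip(problem["names"], Si["S1"].round(2))))  # first-order indices
    ```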

  10. Effect of Random Circuit Fabrication Errors on Small Signal Gain and Phase in Helix Traveling Wave Tubes

    NASA Astrophysics Data System (ADS)

    Pengvanich, P.; Chernin, D. P.; Lau, Y. Y.; Luginsland, J. W.; Gilgenbach, R. M.

    2007-11-01

    Motivated by the current interest in mm-wave and THz sources, which use miniature, difficult-to-fabricate circuit components, we evaluate the statistical effects of random fabrication errors on a helix traveling wave tube amplifier's small signal characteristics. The small signal theory is treated in a continuum model in which the electron beam is assumed to be monoenergetic and axially symmetric about the helix axis. Perturbations that vary randomly along the beam axis are introduced in the dimensionless Pierce parameters: b, the beam-wave velocity mismatch; C, the gain parameter; and d, the cold tube circuit loss. Our study shows, as expected, that perturbation in b dominates the other two. The extensive numerical data have been confirmed by our analytic theory. They show in particular that the standard deviation of the output phase is linearly proportional to the standard deviation of the individual perturbations in b, C, and d. Simple formulas have been derived which yield the output phase variations in terms of the statistical random manufacturing errors. This work was supported by AFOSR and by ONR.

  11. Optimization of planar PIV-based pressure estimates in laminar and turbulent wakes

    NASA Astrophysics Data System (ADS)

    McClure, Jeffrey; Yarusevych, Serhiy

    2017-05-01

    The performance of four pressure estimation techniques using Eulerian material acceleration estimates from planar, two-component Particle Image Velocimetry (PIV) data was evaluated in a bluff body wake. To allow for ground-truth comparison of the pressure estimates, direct numerical simulations of flow over a circular cylinder were used to obtain synthetic velocity fields. Direct numerical simulations were performed for Re_D = 100, 300, and 1575, spanning laminar, transitional, and turbulent wake regimes, respectively. A parametric study encompassing a range of temporal and spatial resolutions was performed for each Re_D. The effect of random noise typical of experimental velocity measurements was also evaluated. The results identified optimal temporal and spatial resolutions that minimize the propagation of random and truncation errors to the pressure field estimates. A model derived from linear error propagation through the material acceleration central difference estimators was developed to predict these optima, and showed good agreement with the results from common pressure estimation techniques. The results of the model are also shown to provide acceptable first-order approximations for sampling parameters that reduce error propagation when Lagrangian estimations of material acceleration are employed. For pressure integration based on planar PIV, the effect of flow three-dimensionality was also quantified, and shown to be most pronounced at higher Reynolds numbers downstream of the vortex formation region, where dominant vortices undergo substantial three-dimensional deformations. The results of the present study provide a priori recommendations for the use of pressure estimation techniques from experimental PIV measurements in vortex-dominated laminar and turbulent wake flows.

  12. Accuracy of indirect estimation of power output from uphill performance in cycling.

    PubMed

    Millet, Grégoire P; Tronche, Cyrille; Grappe, Frédéric

    2014-09-01

    To use measurements from cycling power meters (Pmes) to evaluate the accuracy of commonly used models for estimating uphill cycling power (Pest). Experiments were designed to explore the influence of wind speed and steepness of climb on the accuracy of Pest. The authors hypothesized that the random error in Pest would be largely influenced by windy conditions, that the bias would be diminished in steeper climbs, and that windy conditions would induce larger bias in Pest. Sixteen well-trained cyclists performed 15 uphill-cycling trials (range: length 1.3-6.3 km, slope 4.4-10.7%) in a random order. Trials included different riding positions in a group (lead or follow) and different wind speeds. Pmes was quantified using a power meter, and Pest was calculated with a methodology used by journalists reporting on the Tour de France. Overall, the difference between Pmes and Pest was -0.95% (95%CI: -10.4%, +8.5%) for all trials and 0.24% (-6.1%, +6.6%) in conditions without wind (<2 m/s). The relationship between percent slope and the error between Pest and Pmes was considered trivial. Aerodynamic drag (affected by wind velocity and orientation, frontal area, drafting, and speed) is the most confounding factor. The mean estimated values are close to the power-output values measured by power meters, but the random error is between ±6% and ±10%. Moreover, at the power outputs (>400 W) produced by professional riders, this error is likely to be higher. This observation calls into question the validity of releasing individual values without reporting the range of random errors.
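
    The indirect estimate rests on a power balance of gravity, rolling resistance, and aerodynamic drag; a hedged sketch with illustrative parameter values (not the journalists' exact coefficients):

    ```python
    import math

    def estimate_power(mass_kg, speed_ms, grade, cda=0.35, crr=0.004,
                       rho=1.1, wind_ms=0.0, efficiency=0.975):
        """Indirect uphill power estimate; all parameter defaults are
        illustrative assumptions."""
        theta = math.atan(grade)
        gravity = mass_kg * 9.81 * math.sin(theta) * speed_ms
        rolling = crr * mass_kg * 9.81 * math.cos(theta) * speed_ms
        aero = 0.5 * rho * cda * (speed_ms + wind_ms) ** 2 * speed_ms
        return (gravity + rolling + aero) / efficiency

    # 75 kg rider+bike at 6 m/s up an 8% grade, calm air:
    print(round(estimate_power(75.0, 6.0, 0.08)))   # ~ 420 W
    ```

    Because the aerodynamic term grows with the cube of speed, small errors in wind or drafting assumptions dominate the random error, consistent with the abstract's conclusion.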

  13. The use of mini-samples in palaeomagnetism

    NASA Astrophysics Data System (ADS)

    Böhnel, Harald; Michalk, Daniel; Nowaczyk, Norbert; Naranjo, Gildardo Gonzalez

    2009-10-01

    Rock cores of ~25 mm diameter are widely used in palaeomagnetism. Occasionally smaller diameters have been used as well, which presents distinct advantages in terms of throughput, weight of equipment, and size of core collections. How their orientation precision compares to 25 mm cores, however, has not been evaluated in detail before. Here we compare the site mean directions and their statistical parameters for 12 lava flows sampled with 25 mm cores (standard samples, typically 8 cores per site) and with 12 mm drill cores (mini-samples, typically 14 cores per site). The site-mean directions for both sample sizes appear to be indistinguishable in most cases. For the mini-samples, site dispersion parameters k on average are slightly lower than for the standard samples, reflecting their larger orienting and measurement errors. Applying the Wilcoxon signed-rank test, the probability that k or α95 have the same distribution for both sizes is acceptable only at the 17.4 or 66.3 per cent level, respectively. The larger number of mini-cores per site appears to outweigh the lower k values, yielding also slightly smaller confidence limits α95. Further, both k and α95 are less variable for mini-samples than for standard size samples. This is also interpreted to result from the larger number of mini-samples per site, which better averages out the detrimental effect of undetected abnormal remanence directions. Sampling of volcanic rocks with mini-samples therefore does not present a disadvantage in terms of the overall obtainable uncertainty of site mean directions. Apart from this, mini-samples do present clear advantages during field work, as about twice the number of drill cores can be recovered compared to 25 mm cores, and the sampled rock unit is then more widely covered, which reduces the contribution of natural random errors produced, for example, by fractures, cooling joints, and palaeofield inhomogeneities. Mini-samples may also be processed faster in the laboratory, which is of particular advantage when carrying out palaeointensity experiments.
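
    The site statistics referred to here are Fisher statistics; a small sketch computing k and α95 from synthetic unit vectors shows why a larger N per site can shrink α95 even when k is slightly lower:

    ```python
    import numpy as np

    def fisher_stats(dirs):
        """Fisher (1953) statistics for N unit vectors (rows of `dirs`):
        precision parameter k and 95% confidence angle alpha95 (degrees)."""
        N = len(dirs)
        R = np.linalg.norm(dirs.sum(axis=0))
        k = (N - 1) / (N - R)
        cos_a95 = 1.0 - (N - R) / R * ((1.0 / 0.05) ** (1.0 / (N - 1)) - 1.0)
        return k, float(np.degrees(np.arccos(cos_a95)))

    rng = np.random.default_rng(4)
    v = rng.normal([0.0, 0.0, 1.0], 0.1, size=(14, 3))   # 14 "mini-cores" near vertical
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    k, a95 = fisher_stats(v)
    print(f"k = {k:.0f}, alpha95 = {a95:.1f} deg")
    ```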

  14. Derivation and Application of a Global Albedo yielding an Optical Brightness To Physical Size Transformation Free of Systematic Errors

    NASA Technical Reports Server (NTRS)

    Mulrooney, Dr. Mark K.; Matney, Dr. Mark J.

    2007-01-01

    Orbital object data acquired via optical telescopes can play a crucial role in accurately defining the space environment. Radar systems probe the characteristics of small debris by measuring the reflected electromagnetic energy from an object of the same order of size as the wavelength of the radiation. This signal is affected by electrical conductivity of the bulk of the debris object, as well as its shape and orientation. Optical measurements use reflected solar radiation with wavelengths much smaller than the size of the objects. Just as with radar, the shape and orientation of an object are important, but we only need to consider the surface electrical properties of the debris material (i.e., the surface albedo), not the bulk electromagnetic properties. As a result, these two methods are complementary in that they measure somewhat independent physical properties to estimate the same thing, debris size. Short arc optical observations such as are typical of NASA's Liquid Mirror Telescope (LMT) give enough information to estimate an Assumed Circular Orbit (ACO) and an associated range. This information, combined with the apparent magnitude, can be used to estimate an "absolute" brightness (scaled to a fixed range and phase angle). This absolute magnitude is what is used to estimate debris size. However, the shape and surface albedo effects make the size estimates subject to systematic and random errors, such that it is impossible to ascertain the size of an individual object with any certainty. However, as has been shown with radar debris measurements, that does not preclude the ability to estimate the size distribution of a number of objects statistically. After systematic errors have been eliminated (range errors, phase function assumptions, photometry) there remains a random geometric albedo distribution that relates object size to absolute magnitude. Measurements by the LMT of a subset of tracked debris objects with sizes estimated from their radar cross sections indicate that the random variations in the albedo follow a log-normal distribution quite well. In addition, this distribution appears to be independent of object size over a considerable range in size. Note that this relation appears to hold for debris only, where the shapes and other properties are not primarily the result of human manufacture, but of random processes. With this information in hand, it now becomes possible to estimate the actual size distribution we are sampling from. We have identified two characteristics of the space debris population that make this process tractable and by extension have developed a methodology for performing the transformation.

  15. Using regression methods to estimate stream phosphorus loads at the Illinois River, Arkansas

    USGS Publications Warehouse

    Haggard, B.E.; Soerens, T.S.; Green, W.R.; Richards, R.P.

    2003-01-01

    The development of total maximum daily loads (TMDLs) requires evaluating existing constituent loads in streams. Accurate estimates of constituent loads are needed to calibrate watershed and reservoir models for TMDL development. The best approach to estimating constituent loads is high-frequency sampling, particularly during storm events, and mass integration of constituents passing a point in a stream. Most often, resources are limited and discrete water quality samples are collected at fixed intervals, sometimes supplemented with directed sampling during storm events. When resources are limited, mass integration is not an accurate means to determine constituent loads, and other load estimation techniques such as regression models are used. The objective of this work was to determine the minimum number of water-quality samples needed to provide constituent concentration data adequate to estimate constituent loads at a large stream. Twenty sets of water quality samples, with and without supplemental storm samples, were randomly selected at various fixed intervals from a database at the Illinois River, northwest Arkansas. The random sets were used to estimate total phosphorus (TP) loads using regression models. The regression-based annual TP loads were compared to the integrated annual TP load estimated using all the data. At a minimum, monthly sampling plus supplemental storm samples (six samples per year) was needed to produce a root mean square error of less than 15%. Water quality samples should be collected at least semi-monthly (every 15 days) in studies shorter than two years if seasonal time factors are to be used in the regression models. Annual TP loads estimated from independently collected discrete water quality samples further demonstrated the utility of using regression models to estimate annual TP loads in this stream system.
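
    The regression models in question are rating-curve style fits of log load against log discharge plus seasonal terms (a common LOADEST-type formulation); a toy fit with synthetic samples:

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    t = rng.uniform(0, 2, size=24)                      # decimal years of sampling
    Q = np.exp(rng.normal(2.0, 0.8, size=24))           # discharge at sample times
    ln_load = 0.5 + 1.3 * np.log(Q) + 0.3 * np.sin(2 * np.pi * t) + rng.normal(0, 0.2, 24)

    # Rating-curve regression: ln(L) ~ ln(Q) + seasonal sine/cosine terms.
    X = np.column_stack([np.ones_like(t), np.log(Q),
                         np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
    coef, *_ = np.linalg.lstsq(X, ln_load, rcond=None)
    print(coef.round(2))   # apply to the full daily discharge record to sum annual load
    ```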

  16. Evaluating mixed samples as a source of error in non-invasive genetic studies using microsatellites

    USGS Publications Warehouse

    Roon, David A.; Thomas, M.E.; Kendall, K.C.; Waits, L.P.

    2005-01-01

    The use of noninvasive genetic sampling (NGS) for surveying wild populations is increasing rapidly. Currently, only a limited number of studies have evaluated potential biases associated with NGS. This paper evaluates the potential errors associated with analysing mixed samples drawn from multiple animals. Most NGS studies assume that mixed samples will be identified and removed during the genotyping process. We evaluated this assumption by creating 128 mixed samples of extracted DNA from brown bear (Ursus arctos) hair samples. These mixed samples were genotyped and screened for errors at six microsatellite loci according to protocols consistent with those used in other NGS studies. Five mixed samples produced acceptable genotypes after the first screening. However, all mixed samples produced multiple alleles at one or more loci, amplified as only one of the source samples, or yielded inconsistent electropherograms by the final stage of the error-checking process. These processes could potentially reduce the number of individuals observed in NGS studies, but errors should be conservative within demographic estimates. Researchers should be aware of the potential for mixed samples and carefully design gel analysis criteria and error checking protocols to detect mixed samples.

  17. Big Data and Large Sample Size: A Cautionary Note on the Potential for Bias

    PubMed Central

    Chambers, David A.; Glasgow, Russell E.

    2014-01-01

    A number of commentaries have suggested that large studies are more reliable than smaller studies and there is a growing interest in the analysis of “big data” that integrates information from many thousands of persons and/or different data sources. We consider a variety of biases that are likely in the era of big data, including sampling error, measurement error, multiple comparisons errors, aggregation error, and errors associated with the systematic exclusion of information. Using examples from epidemiology, health services research, studies on determinants of health, and clinical trials, we conclude that it is necessary to exercise greater caution to be sure that big sample size does not lead to big inferential errors. Despite the advantages of big studies, large sample size can magnify the bias associated with error resulting from sampling or study design. Clin Trans Sci 2014; Volume #: 1–5 PMID:25043853
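
    A small demonstration of the core point: with a systematically exclusionary sampling frame, the standard error shrinks as n grows while the bias stays put, so a huge sample is just confidently wrong (the frame construction below is invented):

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    pop = rng.normal(0.0, 1.0, size=1_000_000)   # population mean is 0
    frame = pop[pop > -0.5]                      # systematic exclusion: biased frame

    for n in (100, 10_000, 500_000):
        x = rng.choice(frame, size=n)
        se = x.std(ddof=1) / np.sqrt(n)
        print(f"n={n:>7}: mean={x.mean():+.3f} +/- {se:.3f}")
    # The SE shrinks with n, but the estimate converges to the biased frame
    # mean (~+0.51), not the population mean (0): big n, wrong answer.
    ```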

  18. Refractive error and presbyopia among adults in Fiji.

    PubMed

    Brian, Garry; Pearce, Matthew G; Ramke, Jacqueline

    2011-04-01

    To characterize refractive error, presbyopia and their correction among adults aged ≥ 40 years in Fiji, and contribute to a regional overview of these conditions. A population-based cross-sectional survey using multistage cluster random sampling. Presenting distance and near vision were measured and dilated slitlamp examination performed. The survey achieved 73.0% participation (n=1381). Presenting binocular distance vision ≥ 6/18 was achieved by 1223 participants. Another 79 had vision impaired by refractive error. Three of these were blind. At threshold 6/18, 204 participants had refractive error. Among these, 125 had spectacle-corrected presenting vision ≥ 6/18 ("met refractive error need"); 79 presented wearing no (n=74) or under-correcting (n=5) distance spectacles ("unmet refractive error need"). Presenting binocular near vision ≥ N8 was achieved by 833 participants. At threshold N8, 811 participants had presbyopia. Among these, 336 attained N8 with presenting near spectacles ("met presbyopia need"); 475 presented with no (n=402) or under-correcting (n=73) near spectacles ("unmet presbyopia need"). Rural residence was predictive of unmet refractive error (p=0.040) and presbyopia (p=0.016) need. Gender and household income source were not. Ethnicity-gender-age-domicile-adjusted to the Fiji population aged ≥ 40 years, "met refractive error need" was 10.3% (95% confidence interval [CI] 8.7-11.9%), "unmet refractive error need" was 4.8% (95%CI 3.6-5.9%), "refractive error correction coverage" was 68.3% (95%CI 54.4-82.2%),"met presbyopia need" was 24.6% (95%CI 22.4-26.9%), "unmet presbyopia need" was 33.8% (95%CI 31.3-36.3%), and "presbyopia correction coverage" was 42.2% (95%CI 37.6-46.8%). Fiji refraction and dispensing services should encourage uptake by rural dwellers and promote presbyopia correction. Lack of comparable data from neighbouring countries prevents a regional overview.

  19. Model-based VQ for image data archival, retrieval and distribution

    NASA Technical Reports Server (NTRS)

    Manohar, Mareboyana; Tilton, James C.

    1995-01-01

    An ideal image compression technique for image data archival, retrieval and distribution would be one with the asymmetrical computational requirements of Vector Quantization (VQ), but without the complications arising from VQ codebooks. Codebook generation and maintenance are stumbling blocks which have limited the use of VQ as a practical image compression algorithm. Model-based VQ (MVQ), a variant of VQ described here, has the computational properties of VQ but does not require explicit codebooks. The codebooks are internally generated using mean-removed error and Human Visual System (HVS) models. The assumed error model is a Laplacian distribution whose mean, lambda, is computed from a sample of the input image. Laplacian random numbers with mean lambda are then generated with a uniform random number generator and grouped into vectors. These vectors are further conditioned to make them perceptually meaningful by filtering the DCT coefficients of each vector. The DCT coefficients are filtered by multiplying by a weight matrix that is found to be optimal for human perception. The inverse DCT is performed to produce the conditioned vectors for the codebook. The only image-dependent parameter used in the generation of the codebook is the mean, lambda, which is included in the coded file so that the codebook generation process can be repeated for decoding.
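
    The uniform-to-Laplacian step can be done by inverse-transform sampling; a sketch in which the abstract's "mean" lambda is treated as the scale parameter of a zero-mean Laplacian, and the HVS/DCT weighting is omitted:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def laplacian(lam, size):
        """Zero-mean Laplacian samples by inverse-transform sampling of a
        uniform generator; lam acts as the scale (mean absolute deviation)."""
        u = rng.random(size) - 0.5
        return -lam * np.sign(u) * np.log(1.0 - 2.0 * np.abs(u))

    # Group into 4x4 blocks as raw codebook vectors (DCT weighting omitted).
    codebook = laplacian(8.0, 256 * 16).reshape(256, 4, 4)
    print(codebook.shape, float(np.abs(codebook).mean()))   # mean |x| ~ lam
    ```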

  20. Modeling and Prediction of Solvent Effect on Human Skin Permeability using Support Vector Regression and Random Forest.

    PubMed

    Baba, Hiromi; Takahara, Jun-ichi; Yamashita, Fumiyoshi; Hashida, Mitsuru

    2015-11-01

    The solvent effect on skin permeability is important for assessing the effectiveness and toxicological risk of new dermatological formulations in pharmaceuticals and cosmetics development. The solvent effect occurs by diverse mechanisms, which could be elucidated by efficient and reliable prediction models. However, such prediction models have been hampered by the small variety of permeants and mixture components archived in databases and by low predictive performance. Here, we propose a solution to both problems. We first compiled a novel large database of 412 samples from 261 structurally diverse permeants and 31 solvents reported in the literature. The data were carefully screened to ensure their collection under consistent experimental conditions. To construct a high-performance predictive model, we then applied support vector regression (SVR) and random forest (RF) with greedy stepwise descriptor selection to our database. The models were internally and externally validated. The SVR achieved higher performance statistics than RF. The (externally validated) determination coefficient, root mean square error, and mean absolute error of SVR were 0.899, 0.351, and 0.268, respectively. Moreover, because all descriptors are fully computational, our method can predict as-yet unsynthesized compounds. Our high-performance prediction model offers an attractive alternative to permeability experiments for pharmaceutical and cosmetic candidate screening and optimizing skin-permeable topical formulations.
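
    A generic sketch of SVR with the three reported validation metrics, on synthetic stand-in data (not the compiled permeability database, and not the authors' descriptor-selection procedure):

    ```python
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

    rng = np.random.default_rng(8)
    X = rng.normal(size=(412, 20))                          # stand-in descriptors
    y = X[:, :3].sum(axis=1) + 0.3 * rng.normal(size=412)   # stand-in log permeability

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1)).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"R2   {r2_score(y_te, pred):.3f}")
    print(f"RMSE {np.sqrt(mean_squared_error(y_te, pred)):.3f}")
    print(f"MAE  {mean_absolute_error(y_te, pred):.3f}")
    ```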
