Relationships of Measurement Error and Prediction Error in Observed-Score Regression
ERIC Educational Resources Information Center
Moses, Tim
2012-01-01
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
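A hedged simulation sketch (not the paper's measures) of the underlying decomposition: under a classical true-score model, measurement error in both the predictor and the criterion inflates the prediction error variance of the observed-score regression relative to the true-score regression. All variance components and the slope below are illustrative assumptions.

```python
# Hedged sketch: simulate the classical true-score model X = Tx + ex,
# Y = Ty + ey with Ty linearly related to Tx, and compare the prediction
# error variance of the observed-score regression with the error-free
# (true-score) regression. Values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
var_true_x, var_ex, var_ey = 1.0, 0.25, 0.16   # assumed variance components
beta = 0.8                                     # assumed true-score slope

Tx = rng.normal(0.0, np.sqrt(var_true_x), n)
Ty = beta * Tx + rng.normal(0.0, 0.3, n)       # structural (true-score) noise
X = Tx + rng.normal(0.0, np.sqrt(var_ex), n)   # observed predictor
Y = Ty + rng.normal(0.0, np.sqrt(var_ey), n)   # observed criterion

def pred_error_var(x, y):
    b, a = np.polyfit(x, y, 1)                 # slope, intercept
    return np.var(y - (a + b * x))

print("observed-score prediction error variance:", pred_error_var(X, Y))
print("true-score prediction error variance:   ", pred_error_var(Tx, Ty))
```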
Holmes, John B; Dodds, Ken G; Lee, Michael A
2017-03-02
An important issue in genetic evaluation is the comparability of random effects (breeding values), particularly between pairs of animals in different contemporary groups. This is usually referred to as genetic connectedness. While various measures of connectedness have been proposed in the literature, there is general agreement that the most appropriate measure is some function of the prediction error variance-covariance matrix. However, obtaining the prediction error variance-covariance matrix is computationally demanding for large-scale genetic evaluations. Many alternative statistics have been proposed that avoid the computational cost of obtaining the prediction error variance-covariance matrix, such as counts of genetic links between contemporary groups, gene flow matrices, and functions of the variance-covariance matrix of estimated contemporary group fixed effects. In this paper, we show that a correction to the variance-covariance matrix of estimated contemporary group fixed effects will produce the exact prediction error variance-covariance matrix averaged by contemporary group for univariate models in the presence of single or multiple fixed effects and one random effect. We demonstrate the correction for a series of models and show that approximations to the prediction error matrix based solely on the variance-covariance matrix of estimated contemporary group fixed effects are inappropriate in certain circumstances. Our method allows for the calculation of a connectedness measure based on the prediction error variance-covariance matrix by calculating only the variance-covariance matrix of estimated fixed effects. Since the number of fixed effects in genetic evaluation is usually orders of magnitude smaller than the number of random effect levels, the computational requirements for our method should be reduced.
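For context, a minimal sketch (not the authors' correction) of where the prediction error variance-covariance matrix comes from in a univariate, single-random-effect model: it is the random-effect block of the inverse of Henderson's mixed-model equations, scaled by the residual variance. All dimensions and variance components below are illustrative assumptions.

```python
# Hedged sketch: prediction error (co)variances of random effects from the
# inverse of Henderson's mixed-model equations for y = Xb + Zu + e, with
# var(u) = I*sigma_u^2 and var(e) = I*sigma_e^2. This is the quantity the
# paper approximates from fixed-effect (co)variances, not the correction.
import numpy as np

rng = np.random.default_rng(1)
n, n_groups, n_animals = 60, 3, 20
X = np.zeros((n, n_groups)); X[np.arange(n), rng.integers(0, n_groups, n)] = 1.0
Z = np.zeros((n, n_animals)); Z[np.arange(n), rng.integers(0, n_animals, n)] = 1.0
sigma_e2, sigma_u2 = 1.0, 0.5
lam = sigma_e2 / sigma_u2

# Coefficient matrix of the mixed-model equations
C = np.block([[X.T @ X,            X.T @ Z],
              [Z.T @ X,  Z.T @ Z + lam * np.eye(n_animals)]])
Cinv = np.linalg.pinv(C)

# PEV matrix of u-hat is the random-effect block of C^-1 times sigma_e^2
PEV = Cinv[n_groups:, n_groups:] * sigma_e2
print("PEV of first three animals:\n", PEV[:3, :3])
```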
Prediction-error variance in Bayesian model updating: a comparative study
NASA Astrophysics Data System (ADS)
Asadollahi, Parisa; Li, Jian; Huang, Yong
2017-04-01
In Bayesian model updating, the likelihood function is commonly formulated by stochastic embedding, in which the maximum information entropy probability model of the prediction error variances plays an important role; it is a Gaussian distribution subject to the first two moments as constraints. The selection of prediction error variances can be formulated as a model class selection problem, which automatically involves a trade-off between the average data-fit of the model class and the information it extracts from the data. It is therefore critical for robust updating of the structural model, especially in the presence of modeling errors. To date, three ways of treating the prediction error variances have been seen in the literature: 1) setting constant values empirically, 2) estimating them based on the goodness-of-fit of the measured data, and 3) updating them as uncertain parameters by applying Bayes' Theorem at the model class level. In this paper, the effect of these different strategies for handling the prediction error variances on model updating performance is investigated explicitly. A six-story shear building model with six uncertain stiffness parameters is employed as an illustrative example. Transitional Markov Chain Monte Carlo is used to draw samples of the posterior probability density function of the structural model parameters as well as the uncertain prediction error variances. Different levels of modeling uncertainty and complexity are represented by three FE models: a true model, a model with added complexity, and a model with modeling error. Bayesian updating is performed for the three FE models considering the three aforementioned treatments of the prediction error variances. The effect of the number of measurements on model updating performance is also examined. The results are compared based on model class assessment and indicate that updating the prediction error variances as uncertain parameters at the model class level produces more robust results, especially when the number of measurements is small.
Effect of correlated observation error on parameters, predictions, and uncertainty
Tiedeman, Claire; Green, Christopher T.
2013-01-01
Correlations among observation errors are typically omitted when calculating observation weights for model calibration by inverse methods. We explore the effects of omitting these correlations on estimates of parameters, predictions, and uncertainties. First, we develop a new analytical expression for the difference in parameter variance estimated with and without error correlations for a simple one-parameter two-observation inverse model. Results indicate that omitting error correlations from both the weight matrix and the variance calculation can either increase or decrease the parameter variance, depending on the values of error correlation (ρ) and the ratio of dimensionless scaled sensitivities (rdss). For small ρ, the difference in variance is always small, but for large ρ, the difference varies widely depending on the sign and magnitude of rdss. Next, we consider a groundwater reactive transport model of denitrification with four parameters and correlated geochemical observation errors that are computed by an error-propagation approach that is new for hydrogeologic studies. We compare parameter estimates, predictions, and uncertainties obtained with and without the error correlations. Omitting the correlations modestly to substantially changes parameter estimates, and causes both increases and decreases of parameter variances, consistent with the analytical expression. Differences in predictions for the models calibrated with and without error correlations can be greater than parameter differences when both are considered relative to their respective confidence intervals. These results indicate that including observation error correlations in weighting for nonlinear regression can have important effects on parameter estimates, predictions, and their respective uncertainties.
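A minimal linear-regression sketch of the effect the authors study: the covariance of a weighted least-squares estimator when the true observation-error covariance contains correlations but the weight matrix may or may not. The AR(1)-style covariance and all numbers are assumptions for illustration, not the paper's groundwater model.

```python
# Hedged sketch: effect of omitting error correlations from the weight matrix
# on the parameter covariance of a linear regression (a simplified linear
# analogue of the paper's nonlinear-regression setting).
import numpy as np

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), np.linspace(0, 1, n)])
rho, sigma = 0.8, 0.1
# AR(1)-style observation-error covariance with correlation rho
V = sigma**2 * rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

def param_cov(weight_cov):
    """Covariance of the weighted-least-squares estimator when the true
    error covariance is V but weights are based on weight_cov."""
    W = np.linalg.inv(weight_cov)
    A = np.linalg.inv(X.T @ W @ X) @ X.T @ W
    return A @ V @ A.T

cov_full = param_cov(V)                    # correlations included in weights
cov_diag = param_cov(np.diag(np.diag(V)))  # correlations omitted
print("slope variance, full weights:", cov_full[1, 1])
print("slope variance, diag weights:", cov_diag[1, 1])
```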
Generalized Variance Function Applications in Forestry
James Alegria; Charles T. Scott
1991-01-01
Adequately predicting the sampling errors of tabular data can reduce printing costs by eliminating the need to publish separate sampling error tables. Two generalized variance functions (GVFs) found in the literature and three GVFs derived for this study were evaluated for their ability to predict the sampling error of tabular forestry estimates. The recommended GVFs...
Bernard R. Parresol
1993-01-01
In the context of forest modeling, it is often reasonable to assume a multiplicative heteroscedastic error structure for the data. Under such circumstances, ordinary least squares no longer provides minimum variance estimates of the model parameters. Through study of the error structure, a suitable error variance model can be specified and its parameters estimated. This...
A log-sinh transformation for data normalization and variance stabilization
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Shrestha, D. L.; Robertson, D. E.; Pokhrel, P.
2012-05-01
When quantifying model prediction uncertainty, it is statistically convenient to represent model errors as normally distributed with a constant variance. The Box-Cox transformation is the most widely used technique to normalize data and stabilize variance, but it is not without limitations. In this paper, a log-sinh transformation is derived based on a pattern of errors commonly seen in hydrological model predictions. It is suited to applications where prediction variables are positively skewed and the spread of errors first increases rapidly, then slowly, and eventually approaches a constant as the prediction variable becomes greater. The log-sinh transformation is applied in two case studies, and the results are compared with one- and two-parameter Box-Cox transformations.
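A minimal sketch of a log-sinh transform, assuming the two-parameter form z = (1/b)·ln(sinh(a + b·y)) commonly attributed to this work; the parameter values and data are illustrative only.

```python
# Hedged sketch of a log-sinh transform, assuming the two-parameter form
# z = (1/b) * ln(sinh(a + b*y)); parameter values are purely illustrative.
# For large y the transform becomes nearly linear, consistent with the error
# spread approaching a constant.
import numpy as np

def log_sinh(y, a, b):
    return np.log(np.sinh(a + b * y)) / b

def inv_log_sinh(z, a, b):
    return (np.arcsinh(np.exp(b * z)) - a) / b

a, b = 0.1, 0.5                       # illustrative parameters
y = np.array([0.5, 2.0, 10.0, 50.0])  # e.g. positively skewed flows
z = log_sinh(y, a, b)
print("transformed:", z)
print("round trip :", inv_log_sinh(z, a, b))  # recovers y
```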
Eaton, Jeffrey W.; Bao, Le
2017-01-01
Objectives The aim of the study was to propose and demonstrate an approach to allow for additional nonsampling uncertainty about HIV prevalence measured at antenatal clinic sentinel surveillance (ANC-SS) in model-based inferences about trends in HIV incidence and prevalence. Design Mathematical model fitted to surveillance data with Bayesian inference. Methods We introduce a variance inflation parameter σ²infl that accounts for the uncertainty of nonsampling errors in ANC-SS prevalence. It is additive to the sampling error variance. Three approaches are tested for estimating σ²infl using ANC-SS and household survey data from 40 subnational regions in nine countries in sub-Saharan Africa, as defined in UNAIDS 2016 estimates. Methods were compared using in-sample fit and out-of-sample prediction of ANC-SS data, fit to household survey prevalence data, and the computational implications. Results Introducing the additional variance parameter σ²infl increased the error variance around ANC-SS prevalence observations by a median of 2.7 times (interquartile range 1.9–3.8). Using only sampling error in ANC-SS prevalence (σ²infl = 0), coverage of 95% prediction intervals was 69% in out-of-sample prediction tests. This increased to 90% after introducing the additional variance parameter σ²infl. The revised probabilistic model improved model fit to household survey prevalence and increased epidemic uncertainty intervals most during the early epidemic period before 2005. Estimating σ²infl did not increase the computational cost of model fitting. Conclusions We recommend estimating nonsampling error in ANC-SS as an additional parameter in Bayesian inference using the Estimation and Projection Package model. This approach may prove useful for incorporating other data sources, such as routine prevalence from prevention of mother-to-child transmission testing, into future epidemic estimates. PMID:28296801
NASA Astrophysics Data System (ADS)
Behnabian, Behzad; Mashhadi Hossainali, Masoud; Malekzadeh, Ahad
2018-02-01
The cross-validation technique is a popular method to assess and improve the quality of prediction by least squares collocation (LSC). We present a formula for direct estimation of the vector of cross-validation errors (CVEs) in LSC which is much faster than element-wise CVE computation. We show that a quadratic form of the CVEs follows a Chi-squared distribution. Furthermore, an a posteriori noise variance factor is derived from the quadratic form of CVEs. In order to detect blunders in the observations, the estimated standardized CVE is proposed as the test statistic, which can be applied when noise variances are known or unknown. We use LSC together with the methods proposed in this research for interpolation of crustal subsidence in the northern coast of the Gulf of Mexico. The results show that after detecting and removing outliers, the root mean square (RMS) of the CVEs and the estimated noise standard deviation are reduced by about 51 and 59%, respectively. In addition, the RMS of the LSC prediction error at data points and the RMS of the estimated observation noise are decreased by 39 and 67%, respectively. However, the RMS of the LSC prediction error on a regular grid of interpolation points covering the area is only reduced by about 4%, a consequence of the sparse distribution of data points in this case study. The influence of gross errors on LSC prediction results is also investigated by lower cutoff CVEs. After elimination of outliers, the RMS of this type of error is also reduced by 19.5% for a 5 km radius of vicinity. We propose a method using standardized CVEs for classification of the dataset into three groups with presumed different noise variances. The noise variance components for each of the groups are estimated using the restricted maximum-likelihood method via the Fisher scoring technique. Finally, LSC assessment measures are computed for the estimated heterogeneous noise variance model and compared with those of the homogeneous model. The advantage of the proposed method is the reduction in estimated noise levels for the groups with fewer noisy data points.
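The abstract does not reproduce the LSC-specific formula, but the analogous direct computation for ordinary least squares illustrates the idea of obtaining all cross-validation errors without refitting; the sketch below is that OLS analogue, not the paper's result.

```python
# Hedged sketch: direct (vectorized) leave-one-out cross-validation errors
# for ordinary least squares via the hat matrix, e_loo = e / (1 - h_ii).
# This is the OLS analogue of computing all CVEs at once rather than
# refitting n times; it is not the paper's LSC-specific formula.
import numpy as np

rng = np.random.default_rng(3)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 0.3, n)

H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix
resid = y - H @ y
cve = resid / (1.0 - np.diag(H))           # all leave-one-out errors at once

# brute-force check for the first point
idx = np.arange(1, n)
beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
print(cve[0], y[0] - X[0] @ beta)          # the two numbers agree
```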
Quantizing and sampling considerations in digital phased-locked loops
NASA Technical Reports Server (NTRS)
Hurst, G. T.; Gupta, S. C.
1974-01-01
The quantizer problem is first considered. The conditions under which the uniform white sequence model for the quantizer error is valid are established independent of the sampling rate. An equivalent spectral density is defined for the quantizer error resulting in an effective SNR value. This effective SNR may be used to determine quantized performance from infinitely fine quantized results. Attention is given to sampling rate considerations. Sampling rate characteristics of the digital phase-locked loop (DPLL) structure are investigated for the infinitely fine quantized system. The predicted phase error variance equation is examined as a function of the sampling rate. Simulation results are presented and a method is described which enables the minimum required sampling rate to be determined from the predicted phase error variance equations.
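A minimal sketch of the uniform white-noise quantizer model the abstract refers to: a step size Δ gives an error variance of Δ²/12 and hence an effective SNR. All values are illustrative; this is not the paper's DPLL analysis.

```python
# Hedged sketch: under the uniform white-noise model, a quantizer with step
# size delta adds noise of variance delta^2 / 12, giving an effective SNR
# for a signal of variance sigma_s^2. Illustrative only.
import numpy as np

def effective_snr_db(signal_var, delta):
    q_var = delta**2 / 12.0          # quantizer error variance
    return 10.0 * np.log10(signal_var / q_var)

rng = np.random.default_rng(4)
delta = 0.05
s = rng.normal(0.0, 1.0, 100_000)
sq = np.round(s / delta) * delta     # uniform quantizer
emp_q_var = np.var(sq - s)

print("model quantizer variance:", delta**2 / 12.0)
print("empirical quantizer variance:", emp_q_var)
print("effective SNR (dB):", effective_snr_db(np.var(s), delta))
```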
The Impact of Truth Surrogate Variance on Quality Assessment/Assurance in Wind Tunnel Testing
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2016-01-01
Minimum data volume requirements for wind tunnel testing are reviewed and shown to depend on error tolerance, response model complexity, random error variance in the measurement environment, and maximum acceptable levels of inference error risk. Distinctions are made between such related concepts as quality assurance and quality assessment in response surface modeling, as well as between precision and accuracy. Earlier research on the scaling of wind tunnel tests is extended to account for variance in the truth surrogates used at confirmation sites in the design space to validate proposed response models. A model adequacy metric is presented that represents the fraction of the design space within which model predictions can be expected to satisfy prescribed quality specifications. The impact of inference error on the assessment of response model residuals is reviewed. The number of sites where reasonably well-fitted response models actually predict inadequately is shown to be considerably less than the number of sites where residuals are out of tolerance. The significance of such inference error effects on common response model assessment strategies is examined.
Predictability Experiments With the Navy Operational Global Atmospheric Prediction System
NASA Astrophysics Data System (ADS)
Reynolds, C. A.; Gelaro, R.; Rosmond, T. E.
2003-12-01
There are several areas of research in numerical weather prediction and atmospheric predictability, such as targeted observations and ensemble perturbation generation, where it is desirable to combine information about the uncertainty of the initial state with information about potential rapid perturbation growth. Singular vectors (SVs) provide a framework to accomplish this task in a mathematically rigorous and computationally feasible manner. In this study, SVs are calculated using the tangent and adjoint models of the Navy Operational Global Atmospheric Prediction System (NOGAPS). The analysis error variance information produced by the NRL Atmospheric Variational Data Assimilation System is used as the initial-time SV norm. These VAR SVs are compared to SVs for which total energy is both the initial and final time norms (TE SVs). The incorporation of analysis error variance information has a significant impact on the structure and location of the SVs. This in turn has a significant impact on targeted observing applications. The utility and implications of such experiments in assessing the analysis error variance estimates will be explored. Computing support has been provided by the Department of Defense High Performance Computing Center at the Naval Oceanographic Office Major Shared Resource Center at Stennis, Mississippi.
Measurement System Characterization in the Presence of Measurement Errors
NASA Technical Reports Server (NTRS)
Commo, Sean A.
2012-01-01
In the calibration of a measurement system, data are collected in order to estimate a mathematical model between one or more factors of interest and a response. Ordinary least squares is a method employed to estimate the regression coefficients in the model. The method assumes that the factors are known without error; yet, it is implicitly known that the factors contain some uncertainty. In the literature, this uncertainty is known as measurement error. The measurement error affects both the estimates of the model coefficients and the prediction, or residual, errors. There are some methods, such as orthogonal least squares, that are employed in situations where measurement errors exist, but these methods do not directly incorporate the magnitude of the measurement errors. This research proposes a new method, known as modified least squares, that combines the principles of least squares with knowledge about the measurement errors. This knowledge is expressed in terms of the variance ratio - the ratio of response error variance to measurement error variance.
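The modified-least-squares formulas are not given in the abstract; as a hedged illustration of using a response-to-measurement error variance ratio, the sketch below uses the related, well-known Deming regression estimator (an assumption for illustration, not the paper's method).

```python
# Hedged sketch: the abstract does not give the modified-least-squares
# formulas, so this shows a related estimator (Deming regression) that
# likewise uses the ratio of response error variance to factor (measurement)
# error variance, here called lam.
import numpy as np

def deming_slope(x, y, lam):
    """Slope when var(y-error)/var(x-error) = lam."""
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    d = syy - lam * sxx
    return (d + np.sqrt(d**2 + 4.0 * lam * sxy**2)) / (2.0 * sxy)

rng = np.random.default_rng(5)
t = rng.uniform(0, 10, 500)                  # true factor values
x = t + rng.normal(0, 1.5, 500)              # factor measured with error
y = 2.0 * t + 1.0 + rng.normal(0, 0.5, 500)  # response with its own error
lam = (0.5**2) / (1.5**2)                    # response/measurement error variance ratio

b = deming_slope(x, y, lam)
a = np.mean(y) - b * np.mean(x)
print("Deming slope/intercept:", b, a)       # close to 2 and 1
print("OLS slope (attenuated):", np.polyfit(x, y, 1)[0])
```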
Sampling design optimisation for rainfall prediction using a non-stationary geostatistical model
NASA Astrophysics Data System (ADS)
Wadoux, Alexandre M. J.-C.; Brus, Dick J.; Rico-Ramirez, Miguel A.; Heuvelink, Gerard B. M.
2017-09-01
The accuracy of spatial predictions of rainfall by merging rain-gauge and radar data is partly determined by the sampling design of the rain-gauge network. Optimising the locations of the rain-gauges may increase the accuracy of the predictions. Existing spatial sampling design optimisation methods are based on minimisation of the spatially averaged prediction error variance under the assumption of intrinsic stationarity. Over the past years, substantial progress has been made to deal with non-stationary spatial processes in kriging. Various well-documented geostatistical models relax the assumption of stationarity in the mean, while recent studies show the importance of considering non-stationarity in the variance for environmental processes occurring in complex landscapes. We optimised the sampling locations of rain-gauges using an extension of the Kriging with External Drift (KED) model for prediction of rainfall fields. The model incorporates both non-stationarity in the mean and in the variance, which are modelled as functions of external covariates such as radar imagery, distance to radar station and radar beam blockage. Spatial predictions are made repeatedly over time, each time recalibrating the model. The space-time averaged KED variance was minimised by Spatial Simulated Annealing (SSA). The methodology was tested using a case study predicting daily rainfall in the north of England for a one-year period. Results show that (i) the proposed non-stationary variance model outperforms the stationary variance model, and (ii) a small but significant decrease of the rainfall prediction error variance is obtained with the optimised rain-gauge network. In particular, it pays off to place rain-gauges at locations where the radar imagery is inaccurate, while keeping the distribution over the study area sufficiently uniform.
NASA Astrophysics Data System (ADS)
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological model implementation process that contributes significantly to the uncertainty of hydrological predictions is the calibration phase, in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares, SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the structural deficiencies of the hydrological model. The biased calibrated parameters thus cause the model divergence phenomenon, in which the error variance of the (spatially and temporally) forecasted flows far exceeds the error variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, the result is a calibrated hydrological model that works well, but not for the right reasons. Moreover, an unsuitable error model yields an unreliable assessment of predictive uncertainty. Hence, with the aim of preventing all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and the error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. The hydrological model used is a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. The MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by obtaining the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly: if non-stationarity in error variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the application of BJI with a GA error model improves the robustness of the hydrological parameters (diminishing the model divergence phenomenon) and improves the reliability of the streamflow predictive distribution, with respect to the results of an unsuitable error model such as SLS. Finally, the most likely prediction in a validation period shows a similar performance for both the BJI+GA and SLS error models.
Poston, Brach; Van Gemmert, Arend W.A.; Sharma, Siddharth; Chakrabarti, Somesh; Zavaremi, Shahrzad H.; Stelmach, George
2013-01-01
The minimum variance theory proposes that motor commands are corrupted by signal-dependent noise and smooth trajectories with low noise levels are selected to minimize endpoint error and endpoint variability. The purpose of the study was to determine the contribution of trajectory smoothness to the endpoint accuracy and endpoint variability of rapid multi-joint arm movements. Young and older adults performed arm movements (4 blocks of 25 trials) as fast and as accurately as possible to a target with the right (dominant) arm. Endpoint accuracy and endpoint variability along with trajectory smoothness and error were quantified for each block of trials. Endpoint error and endpoint variance were greater in older adults compared with young adults, but decreased at a similar rate with practice for the two age groups. The greater endpoint error and endpoint variance exhibited by older adults were primarily due to impairments in movement extent control and not movement direction control. The normalized jerk was similar for the two age groups, but was not strongly associated with endpoint error or endpoint variance for either group. However, endpoint variance was strongly associated with endpoint error for both the young and older adults. Finally, trajectory error was similar for both groups and was weakly associated with endpoint error for the older adults. The findings are not consistent with the predictions of the minimum variance theory, but support and extend previous observations that movement trajectories and endpoints are planned independently. PMID:23584101
Distribution of kriging errors, the implications and how to communicate them
NASA Astrophysics Data System (ADS)
Li, Hong Yi; Milne, Alice; Webster, Richard
2016-04-01
Kriging in one form or another has become perhaps the most popular method for spatial prediction in environmental science. Each prediction is unbiased and of minimum variance, which itself is estimated. The kriging variances depend on the mathematical model chosen to describe the spatial variation; different models, however plausible, give rise to different minimized variances. Practitioners often compare models by so-called cross-validation before finally choosing the most appropriate for their kriging. One proceeds as follows. One removes a unit (a sampling point) from the whole set, kriges the value there and compares the kriged value with the value observed to obtain the deviation or error. One repeats the process for each and every point in turn and for all plausible models. One then computes the mean errors (MEs) and the mean of the squared errors (MSEs). Ideally a squared error should equal the corresponding kriging variance (σ²K), and so one is advised to choose the model for which on average the squared errors most nearly equal the kriging variances, i.e. the ratio MSDR = MSE/σ²K ≈ 1. Maximum likelihood estimation of models almost guarantees that the MSDR equals 1, and so the kriging variances are unbiased predictors of the squared error across the region. The method is based on the assumption that the errors have a normal distribution. The squared deviation ratio (SDR) should therefore be distributed as χ² with one degree of freedom, with a median of 0.455. We have found that often the median of the SDR (MedSDR) is less, in some instances much less, than 0.455 even though the mean of the SDR is close to 1. It seems that in these cases the distributions of the errors are leptokurtic, i.e. they have an excess of predictions close to the true values, excesses near the extremes and a dearth of predictions in between. In these cases the kriging variances are poor measures of the uncertainty at individual sites. The uncertainty is typically under-estimated for the extreme observations and compensated for by over-estimating it for other observations. Statisticians must tell users this when they present maps of predictions. We illustrate the situation with results from mapping salinity in land reclaimed from the Yangtze delta in the Gulf of Hangzhou, China. There the apparent electrical conductivity (ECa) of the topsoil was measured at 525 points in a field of 2.3 ha. The marginal distribution of the observations was strongly positively skewed, and so the observed ECa values were transformed to their logarithms to give an approximately symmetric distribution. That distribution was strongly platykurtic with short tails and no evident outliers. The logarithms were analysed as a mixed model of quadratic drift plus correlated random residuals with a spherical variogram. The kriged predictions deviated from their true values with an MSDR of 0.993, but with a MedSDR of 0.324. The coefficient of kurtosis of the deviations was 1.45, i.e. substantially larger than the 0 of a normal distribution. The reasons for this behaviour are being sought. The most likely explanation is that there are spatial outliers, i.e. points at which the observed values differ markedly from those at their closest neighbours.
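A minimal sketch of the diagnostic described: compute the squared-deviation ratios from cross-validation errors and kriging variances, then compare their mean (MSDR) and median (MedSDR) with 1 and with the χ²(1) median of about 0.455. The leptokurtic error mixture below is synthetic and illustrative.

```python
# Hedged sketch: given leave-one-out kriging errors and their kriging
# variances, compute the mean and median squared-deviation ratios (MSDR,
# MedSDR) and compare the median with the chi-squared(1) median ~0.455.
import numpy as np

rng = np.random.default_rng(6)
n = 500
kriging_var = rng.uniform(0.5, 1.5, n)       # sigma_K^2 at each point

# A leptokurtic (heavy-tailed mixture) error distribution mimics the
# behaviour described in the abstract: mean SDR near 1, median well below.
heavy = rng.random(n) < 0.1
errors = np.where(heavy,
                  rng.normal(0.0, 3.0 * np.sqrt(kriging_var)),
                  rng.normal(0.0, np.sqrt(kriging_var) / 3.0))

sdr = errors**2 / kriging_var
print("MSDR  :", sdr.mean())                 # close to 1 ...
print("MedSDR:", np.median(sdr))             # ... while the median is << 0.455
print("chi2(1) median reference: 0.455")
```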
NASA Astrophysics Data System (ADS)
Bukhari, W.; Hong, S.-M.
2015-01-01
Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR+, implements a gating function without pre-specifying a particular region of the patient’s breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR+ algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR+ implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR+ in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR+. The experimental results show that the EKF-GPR+ algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR+ reduces the patient-wise RMS error to 37%, 39% and 42% in percent ratios relative to no prediction for a duty cycle of 80% at lookahead lengths of 192 ms, 384 ms and 576 ms, respectively. The experiments also confirm that EKF-GPR+ controls the duty cycle with reasonable accuracy.
Assumption-free estimation of the genetic contribution to refractive error across childhood.
Guggenheim, Jeremy A; St Pourcain, Beate; McMahon, George; Timpson, Nicholas J; Evans, David M; Williams, Cathy
2015-01-01
Studies in relatives have generally yielded high heritability estimates for refractive error: twins 75-90%, families 15-70%. However, because related individuals often share a common environment, these estimates are inflated (via misallocation of unique/common environment variance). We calculated a lower-bound heritability estimate for refractive error free from such bias. Between the ages 7 and 15 years, participants in the Avon Longitudinal Study of Parents and Children (ALSPAC) underwent non-cycloplegic autorefraction at regular research clinics. At each age, an estimate of the variance in refractive error explained by single nucleotide polymorphism (SNP) genetic variants was calculated using genome-wide complex trait analysis (GCTA) using high-density genome-wide SNP genotype information (minimum N at each age=3,404). The variance in refractive error explained by the SNPs ("SNP heritability") was stable over childhood: Across age 7-15 years, SNP heritability averaged 0.28 (SE=0.08, p<0.001). The genetic correlation for refractive error between visits varied from 0.77 to 1.00 (all p<0.001) demonstrating that a common set of SNPs was responsible for the genetic contribution to refractive error across this period of childhood. Simulations suggested lack of cycloplegia during autorefraction led to a small underestimation of SNP heritability (adjusted SNP heritability=0.35; SE=0.09). To put these results in context, the variance in refractive error explained (or predicted) by the time participants spent outdoors was <0.005 and by the time spent reading was <0.01, based on a parental questionnaire completed when the child was aged 8-9 years old. Genetic variation captured by common SNPs explained approximately 35% of the variation in refractive error between unrelated subjects. This value sets an upper limit for predicting refractive error using existing SNP genotyping arrays, although higher-density genotyping in larger samples and inclusion of interaction effects is expected to raise this figure toward twin- and family-based heritability estimates. The same SNPs influenced refractive error across much of childhood. Notwithstanding the strong evidence of association between time outdoors and myopia, and time reading and myopia, less than 1% of the variance in myopia at age 15 was explained by crude measures of these two risk factors, indicating that their effects may be limited, at least when averaged over the whole population.
High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis
Daye, Z. John; Chen, Jinbo; Li, Hongzhe
2011-01-01
Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833
NASA Astrophysics Data System (ADS)
de Montera, L.; Mallet, C.; Barthès, L.; Golé, P.
2008-08-01
This paper shows how nonlinear models originally developed in the finance field can be used to predict rain attenuation level and volatility in Earth-to-Satellite links operating at the Extremely High Frequencies band (EHF, 20-50 GHz). A common approach to solving this problem is to consider that the prediction error corresponds only to scintillations, whose variance is assumed to be constant. Nevertheless, this assumption does not seem to be realistic because of the heteroscedasticity of error time series: the variance of the prediction error is found to be time-varying and has to be modeled. Since rain attenuation time series behave similarly to certain stocks or foreign exchange rates, a switching ARIMA/GARCH model was implemented. The originality of this model is that not only the attenuation level, but also the error conditional distribution are predicted. It allows an accurate upper-bound of the future attenuation to be estimated in real time that minimizes the cost of Fade Mitigation Techniques (FMT) and therefore enables the communication system to reach a high percentage of availability. The performance of the switching ARIMA/GARCH model was estimated using a measurement database of the Olympus satellite 20/30 GHz beacons and this model is shown to outperform significantly other existing models. The model also includes frequency scaling from the downlink frequency to the uplink frequency. The attenuation effects (gases, clouds and rain) are first separated with a neural network and then scaled using specific scaling factors. As to the resulting uplink prediction error, the error contribution of the frequency scaling step is shown to be larger than that of the downlink prediction, indicating that further study should focus on improving the accuracy of the scaling factor.
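As a hedged illustration of the variance-modelling component, the sketch below implements a plain GARCH(1,1) conditional-variance recursion and a one-sided error bound derived from it; the parameters are illustrative, and this is not the paper's switching ARIMA/GARCH model.

```python
# Hedged sketch: the GARCH(1,1) recursion for a time-varying error variance,
# sigma2[t] = omega + alpha*e[t-1]^2 + beta*sigma2[t-1]. Parameter values
# are illustrative, not those of the paper's switching ARIMA/GARCH.
import numpy as np

def garch11_variance(errors, omega=0.01, alpha=0.1, beta=0.85):
    sigma2 = np.empty_like(errors)
    sigma2[0] = omega / (1.0 - alpha - beta)      # unconditional variance
    for t in range(1, len(errors)):
        sigma2[t] = omega + alpha * errors[t - 1]**2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(7)
e = rng.normal(0.0, 0.3, 1000)                    # stand-in prediction errors
sigma2 = garch11_variance(e)
upper_bound = 1.64 * np.sqrt(sigma2)              # ~95% one-sided error bound
print("mean conditional variance:", sigma2.mean())
print("last upper bound:", upper_bound[-1])
```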
Data Analysis and Its Impact on Predicting Schedule & Cost Risk
2006-03-01
The constant variance of the error term is checked with a Breusch-Pagan test (Neter et al., 1996:239). Using Microsoft Excel, p-values of 0.225678 and 0.121211 are calculated and compared with an alpha of 0.05, in both cases supporting the assumption of constant variance.
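For reference, a minimal sketch of a Breusch-Pagan-type check for constant error variance, using the studentized (Koenker) form LM = n·R² from an auxiliary regression of squared residuals on the predictors; the data and setup are illustrative, not those of the thesis.

```python
# Hedged sketch of a Breusch-Pagan-type test for constant error variance,
# using the studentized (Koenker) form LM = n * R^2 from regressing squared
# residuals on the predictors; compare with a chi-squared critical value.
import numpy as np
from scipy import stats

def breusch_pagan(X, y):
    """X includes an intercept column; returns (LM statistic, p-value)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e2 = (y - X @ beta) ** 2
    g = np.linalg.lstsq(X, e2, rcond=None)[0]          # auxiliary regression
    fitted = X @ g
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    lm = len(y) * r2
    df = X.shape[1] - 1
    return lm, stats.chi2.sf(lm, df)

rng = np.random.default_rng(8)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(0, 0.2, n)   # homoscedastic errors
print(breusch_pagan(X, y))                              # large p-value expected
```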
Brandmaier, Andreas M.; von Oertzen, Timo; Ghisletta, Paolo; Lindenberger, Ulman; Hertzog, Christopher
2018-01-01
Latent Growth Curve Models (LGCM) have become a standard technique to model change over time. Prediction and explanation of inter-individual differences in change are major goals in lifespan research. The major determinants of statistical power to detect individual differences in change are the magnitude of true inter-individual differences in linear change (LGCM slope variance), design precision, alpha level, and sample size. Here, we show that design precision can be expressed as the inverse of effective error. Effective error is determined by instrument reliability and the temporal arrangement of measurement occasions. However, it also depends on another central LGCM component, the variance of the latent intercept and its covariance with the latent slope. We derive a new reliability index for LGCM slope variance—effective curve reliability (ECR)—by scaling slope variance against effective error. ECR is interpretable as a standardized effect size index. We demonstrate how effective error, ECR, and statistical power for a likelihood ratio test of zero slope variance formally relate to each other and how they function as indices of statistical power. We also provide a computational approach to derive ECR for arbitrary intercept-slope covariance. With practical use cases, we argue for the complementary utility of the proposed indices of a study's sensitivity to detect slope variance when making a priori longitudinal design decisions or communicating study designs. PMID:29755377
Schroeder, Scott R; Salomon, Meghan M; Galanter, William L; Schiff, Gordon D; Vaida, Allen J; Gaunt, Michael J; Bryson, Michelle L; Rash, Christine; Falck, Suzanne; Lambert, Bruce L
2017-01-01
Background Drug name confusion is a common type of medication error and a persistent threat to patient safety. In the USA, roughly one per thousand prescriptions results in the wrong drug being filled, and most of these errors involve drug names that look or sound alike. Prior to approval, drug names undergo a variety of tests to assess their potential for confusability, but none of these preapproval tests has been shown to predict real-world error rates. Objectives We conducted a study to assess the association between error rates in laboratory-based tests of drug name memory and perception and real-world drug name confusion error rates. Methods Eighty participants, comprising doctors, nurses, pharmacists, technicians and lay people, completed a battery of laboratory tests assessing visual perception, auditory perception and short-term memory of look-alike and sound-alike drug name pairs (eg, hydroxyzine/hydralazine). Results Laboratory test error rates (and other metrics) significantly predicted real-world error rates obtained from a large, outpatient pharmacy chain, with the best-fitting model accounting for 37% of the variance in real-world error rates. Cross-validation analyses confirmed these results, showing that the laboratory tests also predicted errors from a second pharmacy chain, with 45% of the variance being explained by the laboratory test data. Conclusions Across two distinct pharmacy chains, there is a strong and significant association between drug name confusion error rates observed in the real world and those observed in laboratory-based tests of memory and perception. Regulators and drug companies seeking a validated preapproval method for identifying confusing drug names ought to consider using these simple tests. By using a standard battery of memory and perception tests, it should be possible to reduce the number of confusing look-alike and sound-alike drug name pairs that reach the market, which will help protect patients from potentially harmful medication errors. PMID:27193033
Error-related brain activity predicts cocaine use after treatment at 3-month follow-up.
Marhe, Reshmi; van de Wetering, Ben J M; Franken, Ingmar H A
2013-04-15
Relapse after treatment is one of the most important problems in drug dependency. Several studies suggest that lack of cognitive control is one of the causes of relapse. In this study, a relative new electrophysiologic index of cognitive control, the error-related negativity, is investigated to examine its suitability as a predictor of relapse. The error-related negativity was measured in 57 cocaine-dependent patients during their first week in detoxification treatment. Data from 49 participants were used to predict cocaine use at 3-month follow-up. Cocaine use at follow-up was measured by means of self-reported days of cocaine use in the last month verified by urine screening. A multiple hierarchical regression model was used to examine the predictive value of the error-related negativity while controlling for addiction severity and self-reported craving in the week before treatment. The error-related negativity was the only significant predictor in the model and added 7.4% of explained variance to the control variables, resulting in a total of 33.4% explained variance in the prediction of days of cocaine use at follow-up. A reduced error-related negativity measured during the first week of treatment was associated with more days of cocaine use at 3-month follow-up. Moreover, the error-related negativity was a stronger predictor of recent cocaine use than addiction severity and craving. These results suggest that underactive error-related brain activity might help to identify patients who are at risk of relapse as early as in the first week of detoxification treatment. Copyright © 2013 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Estimating Model Prediction Error: Should You Treat Predictions as Fixed or Random?
NASA Technical Reports Server (NTRS)
Wallach, Daniel; Thorburn, Peter; Asseng, Senthold; Challinor, Andrew J.; Ewert, Frank; Jones, James W.; Rotter, Reimund; Ruane, Alexander
2016-01-01
Crop models are important tools for impact assessment of climate change, as well as for exploring management options under current climate. It is essential to evaluate the uncertainty associated with predictions of these models. We compare two criteria of prediction error: MSEP_fixed, which evaluates the mean squared error of prediction for a model with fixed structure, parameters and inputs, and MSEP_uncertain(X), which evaluates the mean squared error averaged over the distributions of model structure, inputs and parameters. Comparison of model outputs with data can be used to estimate the former. The latter has a squared bias term, which can be estimated using hindcasts, and a model variance term, which can be estimated from a simulation experiment. The separate contributions to MSEP_uncertain(X) can be estimated using a random effects ANOVA. It is argued that MSEP_uncertain(X) is the more informative uncertainty criterion, because it is specific to each prediction situation.
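A toy sketch (not a crop model) of how the two criteria could be estimated: MSEP_fixed from hindcasts of one fixed model, and MSEP_uncertain(X) approximated by a squared-bias term plus the prediction variance over an ensemble of uncertain parameters and inputs. The model, distributions and numbers are illustrative assumptions.

```python
# Hedged sketch (toy model, not a crop model): MSEP_fixed from hindcasts of
# one fixed model, versus MSEP_uncertain(X) approximated as squared bias plus
# the variance of predictions over an ensemble of parameter draws.
import numpy as np

rng = np.random.default_rng(9)
x_obs = rng.uniform(0, 10, 40)
y_obs = 2.0 * x_obs + rng.normal(0, 1.0, 40)      # "observations"

def model(x, slope):                              # stand-in simulation model
    return slope * x

# Fixed structure/parameters/inputs
y_fix = model(x_obs, 1.9)
msep_fixed = np.mean((y_obs - y_fix) ** 2)

# Ensemble over uncertain parameters (and, in general, structures and inputs)
slopes = rng.normal(1.9, 0.2, 500)
ens = np.array([model(x_obs, s) for s in slopes])     # shape (500, 40)
bias2 = np.mean((y_obs - ens.mean(axis=0)) ** 2)      # squared-bias term
model_var = np.mean(ens.var(axis=0))                  # model-variance term
print("MSEP_fixed       :", msep_fixed)
print("MSEP_uncertain(X):", bias2 + model_var)
```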
How many drinks did you have on September 11, 2001?
Perrine, M W Bud; Schroder, Kerstin E E
2005-07-01
This study tested the predictability of error in retrospective self-reports of alcohol consumption on September 11, 2001, among 80 Vermont light, medium and heavy drinkers. Subjects were 52 men and 28 women participating in daily self-reports of alcohol consumption for a total of 2 years, collected via interactive voice response technology (IVR). In addition, retrospective self-reports of alcohol consumption on September 11, 2001, were collected by telephone interview 4-5 days following the terrorist attacks. Retrospective error was calculated as the difference between the IVR self-report of drinking behavior on September 11 and the retrospective self-report collected by telephone interview. Retrospective error was analyzed as a function of gender and baseline drinking behavior during the 365 days preceding September 11, 2001 (termed "the baseline"). The intraclass correlation (ICC) between daily IVR and retrospective self-reports of alcohol consumption on September 11 was .80. Women provided, on average, more accurate self-reports (ICC = .96) than men (ICC = .72) but displayed more underreporting bias in retrospective responses. Amount and individual variability of alcohol consumption during the 1-year baseline explained, on average, 11% of the variance in overreporting (r = .33), 9% of the variance in underreporting (r = .30) and 25% of the variance in the overall magnitude of error (r = .50), with correlations up to .62 (r² = .38). The size and direction of error were clearly predictable from the amount and variation in drinking behavior during the 1-year baseline period. The results demonstrate the utility and detail of information that can be derived from daily IVR self-reports in the analysis of retrospective error.
Somarathna, P D S N; Minasny, Budiman; Malone, Brendan P; Stockmann, Uta; McBratney, Alex B
2018-08-01
Spatial modelling of environmental data commonly only considers spatial variability as the single source of uncertainty. In reality however, the measurement errors should also be accounted for. In recent years, infrared spectroscopy has been shown to offer low cost, yet invaluable information needed for digital soil mapping at meaningful spatial scales for land management. However, spectrally inferred soil carbon data are known to be less accurate compared to laboratory analysed measurements. This study establishes a methodology to filter out the measurement error variability by incorporating the measurement error variance in the spatial covariance structure of the model. The study was carried out in the Lower Hunter Valley, New South Wales, Australia where a combination of laboratory measured, and vis-NIR and MIR inferred topsoil and subsoil soil carbon data are available. We investigated the applicability of residual maximum likelihood (REML) and Markov Chain Monte Carlo (MCMC) simulation methods to generate parameters of the Matérn covariance function directly from the data in the presence of measurement error. The results revealed that the measurement error can be effectively filtered-out through the proposed technique. When the measurement error was filtered from the data, the prediction variance almost halved, which ultimately yielded a greater certainty in spatial predictions of soil carbon. Further, the MCMC technique was successfully used to define the posterior distribution of measurement error. This is an important outcome, as the MCMC technique can be used to estimate the measurement error if it is not explicitly quantified. Although this study dealt with soil carbon data, this method is amenable for filtering the measurement error of any kind of continuous spatial environmental data. Copyright © 2018 Elsevier B.V. All rights reserved.
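A hedged sketch of the general idea of filtering a known measurement-error variance in spatial prediction: add it to the data-data covariance diagonal but not to the covariances with the prediction location. The exponential (Matérn ν = 0.5) covariance and all parameter values are assumptions for illustration; this is not the paper's REML/MCMC estimation of the Matérn parameters.

```python
# Hedged sketch: filtering a known measurement-error variance in kriging by
# adding it only to the data-data covariance diagonal (not to the prediction
# covariances). An exponential model (Matern with nu = 0.5) is used for
# simplicity; all parameter values are illustrative.
import numpy as np

def exp_cov(d, sill=1.0, rang=5.0):
    return sill * np.exp(-d / rang)

rng = np.random.default_rng(10)
coords = rng.uniform(0, 20, (80, 2))
x0 = np.array([10.0, 10.0])                         # prediction location
me_var = 0.3                                        # measurement error variance

d_dd = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
d_d0 = np.linalg.norm(coords - x0, axis=1)

C = exp_cov(d_dd) + me_var * np.eye(len(coords))    # noisy data covariance
c0 = exp_cov(d_d0)                                  # covariance with the signal

z = rng.normal(0.0, 1.0, len(coords))               # stand-in observations
w = np.linalg.solve(C, c0)                          # simple-kriging weights
pred = w @ z
pred_var = exp_cov(0.0) - w @ c0                    # excludes measurement error
print("prediction:", pred, " prediction variance:", pred_var)
```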
A new statistic to express the uncertainty of kriging predictions for purposes of survey planning.
NASA Astrophysics Data System (ADS)
Lark, R. M.; Lapworth, D. J.
2014-05-01
It is well-known that one advantage of kriging for spatial prediction is that, given the random effects model, the prediction error variance can be computed a priori for alternative sampling designs. This allows one to compare sampling schemes, in particular sampling at different densities, and so to decide on one which meets requirements in terms of the uncertainty of the resulting predictions. However, the planning of sampling schemes must account not only for statistical considerations, but also for logistics and cost. This requires effective communication between statisticians, soil scientists and data users/sponsors such as managers, regulators or civil servants. In our experience the latter parties are not necessarily able to interpret the prediction error variance as a measure of uncertainty for decision making. In some contexts (particularly the solution of very specific problems at large cartographic scales, e.g. site remediation and precision farming) it is possible to translate uncertainty of predictions into a loss function directly comparable with the cost incurred in increasing precision. Often, however, sampling must be planned for more generic purposes (e.g. baseline or exploratory geochemical surveys). In this latter context the prediction error variance may be of limited value to a non-statistician who has to make a decision on sample intensity and associated cost. We propose an alternative criterion for these circumstances to aid communication between statisticians and data users about the uncertainty of geostatistical surveys based on different sampling intensities. The criterion is the consistency of estimates made from two non-coincident instantiations of a proposed sample design. We consider square sample grids in which one instantiation is offset from the second by half the grid spacing along the rows and along the columns. If a sample grid is coarse relative to the important scales of variation in the target property then the consistency of predictions from two instantiations is expected to be small, and can be increased by reducing the grid spacing. The measure of consistency is the correlation between estimates from the two instantiations of the sample grid, averaged over a grid cell. We call this the offset correlation; it can be calculated from the variogram. We propose that this measure is easier to grasp intuitively than the prediction error variance, and has the advantage of having an upper bound (1.0), which will aid its interpretation. This quality measure is illustrated for some hypothetical examples, considering both ordinary kriging and factorial kriging of the variable of interest. It is also illustrated using data on metal concentrations in the soil of north-east England.
Genomic Prediction Accounting for Residual Heteroskedasticity
Ou, Zhining; Tempelman, Robert J.; Steibel, Juan P.; Ernst, Catherine W.; Bates, Ronald O.; Bello, Nora M.
2015-01-01
Whole-genome prediction (WGP) models that use single-nucleotide polymorphism marker information to predict genetic merit of animals and plants typically assume homogeneous residual variance. However, variability is often heterogeneous across agricultural production systems and may subsequently bias WGP-based inferences. This study extends classical WGP models based on normality, heavy-tailed specifications and variable selection to explicitly account for environmentally-driven residual heteroskedasticity under a hierarchical Bayesian mixed-models framework. WGP models assuming homogeneous or heterogeneous residual variances were fitted to training data generated under simulation scenarios reflecting a gradient of increasing heteroskedasticity. Model fit was based on pseudo-Bayes factors and also on prediction accuracy of genomic breeding values computed on a validation data subset one generation removed from the simulated training dataset. Homogeneous vs. heterogeneous residual variance WGP models were also fitted to two quantitative traits, namely 45-min postmortem carcass temperature and loin muscle pH, recorded in a swine resource population dataset prescreened for high and mild residual heteroskedasticity, respectively. Fit of competing WGP models was compared using pseudo-Bayes factors. Predictive ability, defined as the correlation between predicted and observed phenotypes in validation sets of a five-fold cross-validation was also computed. Heteroskedastic error WGP models showed improved model fit and enhanced prediction accuracy compared to homoskedastic error WGP models although the magnitude of the improvement was small (less than two percentage points net gain in prediction accuracy). Nevertheless, accounting for residual heteroskedasticity did improve accuracy of selection, especially on individuals of extreme genetic merit. PMID:26564950
Modeling Errors in Daily Precipitation Measurements: Additive or Multiplicative?
NASA Technical Reports Server (NTRS)
Tian, Yudong; Huffman, George J.; Adler, Robert F.; Tang, Ling; Sapiano, Matthew; Maggioni, Viviana; Wu, Huan
2013-01-01
The definition and quantification of uncertainty depend on the error model used. For uncertainties in precipitation measurements, two types of error models have been widely adopted: the additive error model and the multiplicative error model. This leads to incompatible specifications of uncertainties and impedes intercomparison and application. In this letter, we assess the suitability of both models for satellite-based daily precipitation measurements in an effort to clarify the uncertainty representation. Three criteria were employed to evaluate the applicability of either model: (1) better separation of the systematic and random errors; (2) applicability to the large range of variability in daily precipitation; and (3) better predictive skills. It is found that the multiplicative error model is a much better choice under all three criteria. It extracted the systematic errors more cleanly, was more consistent with the large variability of precipitation measurements, and produced superior predictions of the error characteristics. The additive error model had several weaknesses, such as non-constant variance resulting from systematic errors leaking into random errors, and the lack of prediction capability. Therefore, the multiplicative error model is a better choice.
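A minimal sketch contrasting the two error models on synthetic data generated multiplicatively: the additive residuals show variance growing with rain rate, while the log-space (multiplicative-model) residuals have roughly constant variance. The distributions and parameters below are illustrative assumptions, not the paper's satellite data.

```python
# Hedged sketch: contrast an additive error model, Y = T + e, with a
# multiplicative one, Y = T * exp(e), which becomes additive after a log
# transform. The synthetic data are generated multiplicatively, so the
# additive residual spread grows with rain rate while the log-space
# residuals have roughly constant variance.
import numpy as np

rng = np.random.default_rng(11)
truth = rng.gamma(0.8, 5.0, size=2000)                   # stand-in daily rain rates
obs = truth * np.exp(rng.normal(0.0, 0.4, truth.size))   # multiplicative error

add_resid = obs - truth                       # additive-model residuals
mult_resid = np.log(obs) - np.log(truth)      # multiplicative-model residuals

light, heavy = truth < 2.0, truth > 10.0
print("additive resid std  (light vs heavy rain):",
      add_resid[light].std(), add_resid[heavy].std())
print("log-space resid std (light vs heavy rain):",
      mult_resid[light].std(), mult_resid[heavy].std())
```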
Zhang, Ji-Li; Liu, Bo-Fei; Di, Xue-Ying; Chu, Teng-Fei; Jin, Sen
2012-11-01
Taking fuel moisture content, fuel loading, and fuel bed depth as controlling factors, field fuel beds of Mongolian oak leaves in the Maoershan region of Northeast China were simulated, and a total of one hundred experimental burnings were conducted in the laboratory under no-wind and zero-slope conditions. The effects of fuel moisture content, fuel loading, and fuel bed depth on flame length and flame residence time were analyzed, and multivariate linear prediction models were constructed. The results indicated that fuel moisture content had a significant negative linear correlation with flame length, but was only weakly correlated with flame residence time. Both fuel loading and fuel bed depth were significantly positively correlated with flame length and its residence time. The interactions of fuel bed depth with fuel moisture content and fuel loading had significant effects on flame length, while the interactions of fuel moisture content with fuel loading and fuel bed depth significantly affected flame residence time. The prediction model for flame length performed better, explaining 83.3% of the variance with a mean absolute error of 7.8 cm and a mean relative error of 16.2%, whereas the prediction model for flame residence time was weaker, explaining only 54% of the variance with a mean absolute error of 9.2 s and a mean relative error of 18.6%.
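As a generic illustration of the kind of multivariate linear prediction model and error summary reported above, the sketch below fits flame length to the three controlling factors on made-up data and reports R², mean absolute error, and mean relative error; the coefficients and units are placeholders, not the fitted model from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
moisture = rng.uniform(0.05, 0.35, n)    # fuel moisture content (fraction)
loading = rng.uniform(0.4, 1.6, n)       # fuel loading (kg/m^2)
depth = rng.uniform(2.0, 8.0, n)         # fuel bed depth (cm)
# Hypothetical "observed" flame length (cm): negative moisture effect, positive loading/depth
flame = 30 - 60 * moisture + 25 * loading + 6 * depth + rng.normal(0, 8, n)

X = np.column_stack([np.ones(n), moisture, loading, depth])
beta, *_ = np.linalg.lstsq(X, flame, rcond=None)
pred = X @ beta

r2 = 1 - np.sum((flame - pred) ** 2) / np.sum((flame - flame.mean()) ** 2)
mae = np.mean(np.abs(flame - pred))                 # mean absolute error (cm)
mre = np.mean(np.abs(flame - pred) / flame)         # mean relative error
print("R^2 = %.3f, MAE = %.1f cm, MRE = %.1f%%" % (r2, mae, 100 * mre))
```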
Discordance between net analyte signal theory and practical multivariate calibration.
Brown, Christopher D
2004-08-01
Lorber's concept of net analyte signal is reviewed in the context of classical and inverse least-squares approaches to multivariate calibration. It is shown that, in the presence of device measurement error, the classical and inverse calibration procedures have radically different theoretical prediction objectives, and the assertion that the popular inverse least-squares procedures (including partial least squares, principal components regression) approximate Lorber's net analyte signal vector in the limit is disproved. Exact theoretical expressions for the prediction error bias, variance, and mean-squared error are given under general measurement error conditions, which reinforce the very discrepant behavior between these two predictive approaches, and Lorber's net analyte signal theory. Implications for multivariate figures of merit and numerous recently proposed preprocessing treatments involving orthogonal projections are also discussed.
Research on Improved Depth Belief Network-Based Prediction of Cardiovascular Diseases
Zhang, Hongpo
2018-01-01
Quantitative analysis and prediction can help to reduce the risk of cardiovascular disease, but quantitative prediction based on traditional models has low accuracy, and predictions from shallow neural networks show high variance. In this paper, a cardiovascular disease prediction model based on an improved deep belief network (DBN) is proposed. The network depth is determined adaptively from the reconstruction error, and unsupervised training is combined with supervised optimization, which preserves prediction accuracy while improving stability. Thirty experiments were performed independently on the Statlog (Heart) and Heart Disease Database data sets in the UCI repository. Experimental results showed that the mean prediction accuracy was 91.26% and 89.78%, respectively, and the variance of prediction accuracy was 5.78 and 4.46, respectively. PMID:29854369
Steen Magnussen; Ronald E. McRoberts; Erkki O. Tomppo
2009-01-01
New model-based estimators of the uncertainty of pixel-level and areal k-nearest neighbour (knn) predictions of attribute Y from remotely-sensed ancillary data X are presented. Non-parametric functions predict Y from scalar 'Single Index Model' transformations of X. Variance functions generated...
Schroeder, Scott R; Salomon, Meghan M; Galanter, William L; Schiff, Gordon D; Vaida, Allen J; Gaunt, Michael J; Bryson, Michelle L; Rash, Christine; Falck, Suzanne; Lambert, Bruce L
2017-05-01
Drug name confusion is a common type of medication error and a persistent threat to patient safety. In the USA, roughly one per thousand prescriptions results in the wrong drug being filled, and most of these errors involve drug names that look or sound alike. Prior to approval, drug names undergo a variety of tests to assess their potential for confusability, but none of these preapproval tests has been shown to predict real-world error rates. We conducted a study to assess the association between error rates in laboratory-based tests of drug name memory and perception and real-world drug name confusion error rates. Eighty participants, comprising doctors, nurses, pharmacists, technicians and lay people, completed a battery of laboratory tests assessing visual perception, auditory perception and short-term memory of look-alike and sound-alike drug name pairs (eg, hydroxyzine/hydralazine). Laboratory test error rates (and other metrics) significantly predicted real-world error rates obtained from a large, outpatient pharmacy chain, with the best-fitting model accounting for 37% of the variance in real-world error rates. Cross-validation analyses confirmed these results, showing that the laboratory tests also predicted errors from a second pharmacy chain, with 45% of the variance being explained by the laboratory test data. Across two distinct pharmacy chains, there is a strong and significant association between drug name confusion error rates observed in the real world and those observed in laboratory-based tests of memory and perception. Regulators and drug companies seeking a validated preapproval method for identifying confusing drug names ought to consider using these simple tests. By using a standard battery of memory and perception tests, it should be possible to reduce the number of confusing look-alike and sound-alike drug name pairs that reach the market, which will help protect patients from potentially harmful medication errors. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Weighted linear regression using D2H and D2 as the independent variables
Hans T. Schreuder; Michael S. Williams
1998-01-01
Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared (D2H = diameter squared times height, or D...
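A hedged sketch of the error structure named above: volume regressed on D2H with residual variance assumed proportional to (D2H)², which corresponds to weighting each observation by 1/(D2H)². The data and coefficients are invented; statsmodels is used purely for convenience.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
D = rng.uniform(15, 60, n)                  # diameter at breast height (cm)
H = rng.uniform(10, 35, n)                  # total height (m)
d2h = D**2 * H
# Hypothetical volumes with error standard deviation proportional to D^2*H
vol = 0.00004 * d2h + rng.normal(0, 0.00003 * d2h)

X = sm.add_constant(d2h)
ols = sm.OLS(vol, X).fit()
wls = sm.WLS(vol, X, weights=1.0 / d2h**2).fit()   # Var(error) proportional to (D^2*H)^2

print("OLS slope: %.6f (se %.6f)" % (ols.params[1], ols.bse[1]))
print("WLS slope: %.6f (se %.6f)" % (wls.params[1], wls.bse[1]))
```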
Fisher, Moria E; Huang, Felix C; Wright, Zachary A; Patton, James L
2014-01-01
Manipulation of error feedback has been of great interest to recent studies in motor control and rehabilitation. Typically, motor adaptation is shown as a change in performance with a single scalar metric for each trial, yet such an approach might overlook details about how error evolves through the movement. We believe that statistical distributions of movement error through the extent of the trajectory can reveal unique patterns of adaptation and offer clues to how the motor system processes information about error. This paper describes different possible ordinate domains, focusing on representations in time and state-space, used to quantify reaching errors. We hypothesized that the domain with the lowest amount of variability would lead to a predictive model of reaching error with the highest accuracy. Here we showed that errors represented in the time domain demonstrate the least variance and yield the most accurate predictive model of reaching errors. These predictive models will give rise to more specialized methods of robotic feedback and improve previous techniques of error augmentation.
Attentional effects on orientation judgements are dependent on memory consolidation processes.
Haskell, Christie; Anderson, Britt
2016-11-01
Are the effects of memory and attention on perception synergistic, antagonistic, or independent? Tested separately, memory and attention have been shown to affect the accuracy of orientation judgements. When multiple stimuli are presented sequentially versus simultaneously, error variance is reduced. When a target is validly cued, precision is increased. What if they are manipulated together? We combined memory and attention manipulations in an orientation judgement task to answer this question. Two circular gratings were presented sequentially or simultaneously. On some trials a brief luminance cue preceded the stimuli. Participants were cued to report the orientation of one of the two gratings by rotating a response grating. We replicated the finding that error variance is reduced on sequential trials. Critically, we found interacting effects of memory and attention. Valid cueing reduced the median, absolute error only when two stimuli appeared together and improved it to the level of performance on uncued sequential trials, whereas invalid cueing always increased error. This effect was not mediated by cue predictiveness; however, predictive cues reduced the standard deviation of the error distribution, whereas nonpredictive cues reduced "guessing". Our results suggest that, when the demand on memory is greater than a single stimulus, attention is a bottom-up process that prioritizes stimuli for consolidation. Thus attention and memory are synergistic.
Inventory implications of using sampling variances in estimation of growth model coefficients
Albert R. Stage; William R. Wykoff
2000-01-01
Variables based on stand densities or stocking have sampling errors that depend on the relation of tree size to plot size and on the spatial structure of the population. Ignoring the sampling errors of such variables, which include most measures of competition used in both distance-dependent and distance-independent growth models, can bias the predictions obtained from...
Small Area Variance Estimation for the Siuslaw NF in Oregon and Some Results
S. Lin; D. Boes; H.T. Schreuder
2006-01-01
The results of a small area prediction study for the Siuslaw National Forest in Oregon are presented. Predictions were made for total basal area, number of trees and mortality per ha on a 0.85 mile grid using data on a 1.7 mile grid and additional ancillary information from TM. A reliable method of estimating prediction errors for individual plot predictions called the...
Estimation of lipids and lean mass of migrating sandpipers
Skagen, Susan K.; Knopf, Fritz L.; Cade, Brian S.
1993-01-01
Estimation of lean mass and lipid levels in birds involves the derivation of predictive equations that relate morphological measurements and, more recently, total body electrical conductivity (TOBEC) indices to known lean and lipid masses. Using cross-validation techniques, we evaluated the ability of several published and new predictive equations to estimate lean and lipid mass of Semipalmated Sandpipers (Calidris pusilla) and White-rumped Sandpipers (C. fuscicollis). We also tested ideas of Morton et al. (1991), who stated that current statistical approaches to TOBEC methodology misrepresent precision in estimating body fat. Three published interspecific equations using TOBEC indices predicted lean and lipid masses of our sample of birds with average errors of 8-28% and 53-155%, respectively. A new two-species equation relating lean mass and TOBEC indices revealed average errors of 4.6% and 23.2% in predicting lean and lipid mass, respectively. New intraspecific equations that estimate lipid mass directly from body mass, morphological measurements, and TOBEC indices yielded about a 13% error in lipid estimates. Body mass and morphological measurements explained a substantial portion of the variance (about 90%) in fat mass of both species. Addition of TOBEC indices improved the predictive model more for the smaller than for the larger sandpiper. TOBEC indices explained an additional 7.8% and 2.6% of the variance in fat mass and reduced the minimum breadth of prediction intervals by 0.95 g (32%) and 0.39 g (13%) for Semipalmated and White-rumped Sandpipers, respectively. The breadth of prediction intervals for models used to predict fat levels of individual birds must be considered when interpreting the resultant lipid estimates.
Hedging Your Bets by Learning Reward Correlations in the Human Brain
Wunderlich, Klaus; Symmonds, Mkael; Bossaerts, Peter; Dolan, Raymond J.
2011-01-01
Human subjects are proficient at tracking the mean and variance of rewards and updating these via prediction errors. Here, we addressed whether humans can also learn about higher-order relationships between distinct environmental outcomes, a defining ecological feature of contexts where multiple sources of rewards are available. By manipulating the degree to which distinct outcomes are correlated, we show that subjects implemented an explicit model-based strategy to learn the associated outcome correlations and were adept in using that information to dynamically adjust their choices in a task that required a minimization of outcome variance. Importantly, the experimentally generated outcome correlations were explicitly represented neuronally in right midinsula with a learning prediction error signal expressed in rostral anterior cingulate cortex. Thus, our data show that the human brain represents higher-order correlation structures between rewards, a core adaptive ability whose immediate benefit is optimized sampling. PMID:21943609
NASA Technical Reports Server (NTRS)
Nese, Jon M.; Dutton, John A.
1993-01-01
The predictability of the weather and climatic states of a low-order moist general circulation model is quantified using a dynamic systems approach, and the effect of incorporating a simple oceanic circulation on predictability is evaluated. The predictability and the structure of the model attractors are compared using Liapunov exponents, local divergence rates, and the correlation and Liapunov dimensions. It was found that the activation of oceanic circulation increases the average error doubling time of the atmosphere and the coupled ocean-atmosphere system by 10 percent and decreases the variance of the largest local divergence rate by 20 percent. When an oceanic circulation develops, the average predictability of annually averaged states is improved by 25 percent and the variance of the largest local divergence rate decreases by 25 percent.
Thermospheric mass density model error variance as a function of time scale
NASA Astrophysics Data System (ADS)
Emmert, J. T.; Sutton, E. K.
2017-12-01
In the increasingly crowded low-Earth orbit environment, accurate estimation of orbit prediction uncertainties is essential for collision avoidance. Poor characterization of such uncertainty can result in unnecessary and costly avoidance maneuvers (false positives) or disregard of a collision risk (false negatives). Atmospheric drag is a major source of orbit prediction uncertainty, and is particularly challenging to account for because it exerts a cumulative influence on orbital trajectories and is therefore not amenable to representation by a single uncertainty parameter. To address this challenge, we examine the variance of measured accelerometer-derived and orbit-derived mass densities with respect to predictions by thermospheric empirical models, using the data-minus-model variance as a proxy for model uncertainty. Our analysis focuses mainly on the power spectrum of the residuals, and we construct an empirical model of the variance as a function of time scale (from 1 hour to 10 years), altitude, and solar activity. We find that the power spectral density approximately follows a power-law process but with an enhancement near the 27-day solar rotation period. The residual variance increases monotonically with altitude between 250 and 550 km. There are two components to the variance dependence on solar activity: one component is 180 degrees out of phase (largest variance at solar minimum), and the other component lags 2 years behind solar maximum (largest variance in the descending phase of the solar cycle).
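A minimal sketch of the kind of residual spectral analysis described above: data-minus-model log-density residuals on an hourly grid, with their power spectral density estimated by Welch's method. The residual series here is synthetic red noise with an added 27-day modulation, not the authors' accelerometer- or orbit-derived data.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
n_hours = 24 * 365 * 3                        # three years of hourly residuals
t = np.arange(n_hours)

# Synthetic log-density residuals: AR(1) "red" noise plus a 27-day modulation
resid = np.zeros(n_hours)
for i in range(1, n_hours):
    resid[i] = 0.995 * resid[i - 1] + rng.normal(0, 0.01)
resid += 0.05 * np.sin(2 * np.pi * t / (27 * 24))

# Welch power spectral density; frequencies are in cycles per hour
f, psd = signal.welch(resid, fs=1.0, nperseg=8192)
period_days = 1.0 / f[1:] / 24.0              # drop the zero frequency
band = (period_days > 5) & (period_days < 60)
peak = period_days[band][np.argmax(psd[1:][band])]

print("total residual variance: %.4f" % resid.var())
print("peak spectral period between 5 and 60 days: %.1f days" % peak)
```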
Efficient Reduction and Analysis of Model Predictive Error
NASA Astrophysics Data System (ADS)
Doherty, J.
2006-12-01
Most groundwater models are calibrated against historical measurements of head and other system states before being used to make predictions in a real-world context. Through the calibration process, parameter values are estimated or refined such that the model is able to reproduce historical behaviour of the system at pertinent observation points reasonably well. Predictions made by the model are deemed to have greater integrity because of this. Unfortunately, predictive integrity is not as easy to achieve as many groundwater practitioners would like to think. The level of parameterisation detail estimable through the calibration process (especially where estimation takes place on the basis of heads alone) is strictly limited, even where full use is made of modern mathematical regularisation techniques such as those encapsulated in the PEST calibration package. (Use of these mechanisms allows more information to be extracted from a calibration dataset than is possible using simpler regularisation devices such as zones of piecewise constancy.) Where a prediction depends on aspects of parameterisation detail that are simply not inferable through the calibration process (which is often the case for predictions related to contaminant movement, and/or many aspects of groundwater/surface water interaction), then that prediction may be just as much in error as it would have been if the model had not been calibrated at all. Model predictive error arises from two sources. These are (a) the presence of measurement noise within the calibration dataset through which linear combinations of parameters spanning the "calibration solution space" are inferred, and (b) the sensitivity of the prediction to members of the "calibration null space" spanned by linear combinations of parameters which are not inferable through the calibration process. The magnitude of the former contribution depends on the level of measurement noise. The magnitude of the latter contribution (which often dominates the former) depends on the "innate variability" of hydraulic properties within the model domain. Knowledge of both of these is a prerequisite for characterisation of the magnitude of possible model predictive error. Unfortunately, in most cases, such knowledge is incomplete and subjective. Nevertheless, useful analysis of model predictive error can still take place. The present paper briefly discusses the means by which mathematical regularisation can be employed in the model calibration process in order to extract as much information as possible on hydraulic property heterogeneity prevailing within the model domain, thereby reducing predictive error to the lowest that can be achieved on the basis of that dataset. It then demonstrates the means by which predictive error variance can be quantified based on information supplied by the regularised inversion process. Both linear and nonlinear predictive error variance analysis is demonstrated using a number of real-world and synthetic examples.
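A hedged linear sketch of the two-term predictive error variance described above, using truncated SVD as a stand-in for the regularised inversion: the null-space term reflects parameter variability not informed by calibration, and the solution-space term reflects measurement noise propagated through the inverse. The Jacobian, prediction sensitivity vector, truncation level, and variances are all invented.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_par = 40, 120                   # fewer observations than parameters, as in calibration
J = rng.normal(size=(n_obs, n_par))      # Jacobian of observations with respect to parameters
y = rng.normal(size=n_par)               # sensitivity of the prediction to each parameter
sigma_p, sigma_eps = 1.0, 0.1            # prior parameter variability, measurement noise

U, s, Vt = np.linalg.svd(J, full_matrices=True)
k = 20                                   # truncation level defining the calibration solution space
V1, V2 = Vt[:k].T, Vt[k:].T              # solution-space / null-space parameter directions
s1 = s[:k]

# Null-space term: part of the prediction sensitivity not informed by calibration
var_null = sigma_p**2 * np.sum((V2.T @ y) ** 2)
# Solution-space term: measurement noise propagated through the truncated inverse
var_noise = sigma_eps**2 * np.sum(((V1.T @ y) / s1) ** 2)

print("predictive error variance: null-space %.3f + noise %.3f = %.3f"
      % (var_null, var_noise, var_null + var_noise))
```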
Preliminary genomic predictions of feed saved for 1.4 million Holsteins
USDA-ARS?s Scientific Manuscript database
Genomic predictions of transmitting ability (GPTAs) for residual feed intake (RFI) were computed using data from 4,621 42-day and 202 28-day feed intake trials of 3,947 U.S. Holsteins born 1999-2013 in 9 research herds. The 28-day records had 8.5% larger error variance than 42-day records and receiv...
Performance of chromatographic systems to model soil-water sorption.
Hidalgo-Rodríguez, Marta; Fuguet, Elisabet; Ràfols, Clara; Rosés, Martí
2012-08-24
A systematic approach for evaluating the goodness of chromatographic systems to model the sorption of neutral organic compounds by soil from water is presented in this work. It is based on the examination of the three sources of error that determine the overall variance obtained when soil-water partition coefficients are correlated against chromatographic retention factors: the variance of the soil-water sorption data, the variance of the chromatographic data, and the variance attributed to the dissimilarity between the two systems. These contributions of variance are easily predicted through the characterization of the systems by the solvation parameter model. According to this method, several chromatographic systems besides the reference octanol-water partition system have been selected to test their performance in the emulation of soil-water sorption. The results from the experimental correlations agree with the predicted variances. The high-performance liquid chromatography system based on an immobilized artificial membrane and the micellar electrokinetic chromatography systems of sodium dodecylsulfate and sodium taurocholate provide the most precise correlation models. They have shown to predict well soil-water sorption coefficients of several tested herbicides. Octanol-water partitions and high-performance liquid chromatography measurements using C18 columns are less suited for the estimation of soil-water partition coefficients. Copyright © 2012 Elsevier B.V. All rights reserved.
Wonnapinij, Passorn; Chinnery, Patrick F.; Samuels, David C.
2010-01-01
In cases of inherited pathogenic mitochondrial DNA (mtDNA) mutations, a mother and her offspring generally have large and seemingly random differences in the amount of mutated mtDNA that they carry. Comparisons of measured mtDNA mutation level variance values have become an important issue in determining the mechanisms that cause these large random shifts in mutation level. These variance measurements have been made with samples of quite modest size, which should be a source of concern because higher-order statistics, such as variance, are poorly estimated from small sample sizes. We have developed an analysis of the standard error of variance from a sample of size n, and we have defined error bars for variance measurements based on this standard error. We calculate variance error bars for several published sets of measurements of mtDNA mutation level variance and show how the addition of the error bars alters the interpretation of these experimental results. We compare variance measurements from human clinical data and from mouse models and show that the mutation level variance is clearly higher in the human data than it is in the mouse models at both the primary oocyte and offspring stages of inheritance. We discuss how the standard error of variance can be used in the design of experiments measuring mtDNA mutation level variance. Our results show that variance measurements based on fewer than 20 measurements are generally unreliable and ideally more than 50 measurements are required to reliably compare variances with less than a 2-fold difference. PMID:20362273
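A hedged sketch of variance error bars of the kind described above. The paper's exact derivation is not reproduced; the sketch uses the standard large-sample result Var(s²) ≈ (m4 − ((n−3)/(n−1))·s⁴)/n, which under normality reduces to SE(s²) ≈ s²·sqrt(2/(n−1)).

```python
import numpy as np

def variance_with_error_bar(x):
    """Sample variance and an approximate standard error of that variance.

    Uses Var(s^2) ~= (m4 - (n - 3) / (n - 1) * s^4) / n, where m4 is the sample
    fourth central moment; under normality this reduces to s^2 * sqrt(2 / (n - 1)).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)
    m4 = np.mean((x - x.mean()) ** 4)
    se = np.sqrt(max((m4 - (n - 3.0) / (n - 1.0) * s2**2) / n, 0.0))
    return s2, se

rng = np.random.default_rng(5)
for n in (10, 20, 50, 200):                        # small samples give unreliable variances
    mutation_levels = rng.normal(0.5, 0.15, n)     # hypothetical heteroplasmy measurements
    s2, se = variance_with_error_bar(mutation_levels)
    print("n = %3d   variance = %.4f +/- %.4f" % (n, s2, se))
```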
Criterion Predictability: Identifying Differences Between r-Squares
ERIC Educational Resources Information Center
Malgady, Robert G.
1976-01-01
An analysis of variance procedure for testing differences in r-squared, the coefficient of determination, across independent samples is proposed and briefly discussed. The principal advantage of the procedure is to minimize Type I error for follow-up tests of pairwise differences. (Author/JKS)
Spectral Analysis of Forecast Error Investigated with an Observing System Simulation Experiment
NASA Technical Reports Server (NTRS)
Prive, N. C.; Errico, Ronald M.
2015-01-01
The spectra of analysis and forecast error are examined using the observing system simulation experiment (OSSE) framework developed at the National Aeronautics and Space Administration Global Modeling and Assimilation Office (NASA GMAO). A global numerical weather prediction model, the Goddard Earth Observing System version 5 (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation, is cycled for two months with once-daily forecasts to 336 hours to generate a control case. Verification of forecast errors using the Nature Run as truth is compared with verification of forecast errors using self-analysis; significant underestimation of forecast errors is seen using self-analysis verification for up to 48 hours. Likewise, self-analysis verification significantly overestimates the error growth rates of the early forecast, as well as mischaracterizing the spatial scales at which the strongest growth occurs. The Nature Run-verified error variances exhibit a complicated progression of growth, particularly for low wave number errors. In a second experiment, cycling of the model and data assimilation over the same period is repeated, but using synthetic observations with different explicitly added observation errors having the same error variances as the control experiment, thus creating a different realization of the control. The forecast errors of the two experiments become more correlated during the early forecast period, with correlations increasing for up to 72 hours before beginning to decrease.
NASA Astrophysics Data System (ADS)
De Felice, Matteo; Petitta, Marcello; Ruti, Paolo
2014-05-01
The diffusion of photovoltaics is growing steadily in Europe, with installed capacity rising from almost 14 GWp in 2011 to 21.5 GWp in 2012 [1]. Accurate forecasts are needed for planning and operational purposes, together with the ability to model and predict solar variability at different time-scales. This study examines the predictability of daily surface solar radiation by comparing ECMWF operational forecasts with CM-SAF satellite measurements over the Meteosat (MSG) full-disk domain. The operational forecasts used are the IFS system up to 10 days and the System4 seasonal forecast up to three months. Forecasts are analysed in terms of the average and variance of errors, with error maps and averages over specific domains shown as a function of prediction lead time. In all cases, forecasts are compared with predictions obtained using persistence and state-of-the-art time-series models. We observe a wide range of errors, with forecast performance strongly affected by orography and season. Errors are lower over southern Italy and Spain, remaining consistently under 10% in some areas up to ten days ahead during summer (JJA). Finally, we conclude the study with some insight into how to "translate" errors in solar radiation into errors in solar power production, using available production data from solar power plants. [1] EurObserver, "Baromètre Photovoltaïque", Le journal des énergies renouvables, April 2012.
Harmsen, Wouter J; Ribbers, Gerard M; Slaman, Jorrit; Heijenbrok-Kal, Majanka H; Khajeh, Ladbon; van Kooten, Fop; Neggers, Sebastiaan J C M M; van den Berg-Emons, Rita J
2017-05-01
Peak oxygen uptake (VO2peak) established during progressive cardiopulmonary exercise testing (CPET) is the "gold standard" for cardiorespiratory fitness. However, CPET measurements may be limited in patients with aneurysmal subarachnoid hemorrhage (a-SAH) by disease-related complaints, such as cardiovascular health risks or anxiety. Furthermore, CPET with gas-exchange analyses requires specialized knowledge and infrastructure with limited availability in most rehabilitation facilities. The aim was to determine whether an easy-to-administer six-minute walk test (6MWT) is a valid clinical alternative to progressive CPET for predicting VO2peak in individuals with a-SAH. Twenty-seven patients performed the 6MWT and CPET with gas-exchange analyses on a cycle ergometer. Univariate and multivariate regression models were constructed to investigate the predictability of VO2peak from the six-minute walk distance (6MWD). Univariate regression showed that the 6MWD was strongly related to VO2peak (r = 0.75, p < 0.001), with an explained variance of 56% and a prediction error of 4.12 ml/kg/min, representing 18% of mean VO2peak. Adding age and sex to an extended multivariate regression model improved this relationship (r = 0.82, p < 0.001), with an explained variance of 67% and a prediction error of 3.67 ml/kg/min, corresponding to 16% of mean VO2peak. The 6MWT is an easy-to-administer submaximal exercise test that can be selected to estimate cardiorespiratory fitness at an aggregated level, in groups of patients with a-SAH, which may help to evaluate interventions in a clinical or research setting. However, the relatively large prediction error does not allow for an accurate prediction in individual patients.
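A hedged sketch of the univariate and extended regression models described above (6MWD alone, then 6MWD plus age and sex), fitted to invented data; the study's actual coefficients and errors are not reproduced.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 27
six_mwd = rng.normal(450, 90, n)                     # six-minute walk distance (m)
age = rng.normal(55, 10, n)
sex = rng.integers(0, 2, n)                          # 0 = female, 1 = male
vo2peak = 5 + 0.035 * six_mwd - 0.08 * age + 2.5 * sex + rng.normal(0, 3, n)

uni = sm.OLS(vo2peak, sm.add_constant(six_mwd)).fit()
multi = sm.OLS(vo2peak, sm.add_constant(np.column_stack([six_mwd, age, sex]))).fit()

for name, model in [("6MWD only", uni), ("6MWD + age + sex", multi)]:
    rmse = np.sqrt(np.mean(model.resid**2))          # in-sample prediction error (ml/kg/min)
    print("%-17s R^2 = %.2f, prediction error = %.2f ml/kg/min"
          % (name, model.rsquared, rmse))
```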
Bouvet, J-M; Makouanzi, G; Cros, D; Vigneron, Ph
2016-01-01
Hybrids are broadly used in plant breeding and accurate estimation of variance components is crucial for optimizing genetic gain. Genome-wide information may be used to explore models designed to assess the extent of additive and non-additive variance and test their prediction accuracy for the genomic selection. Ten linear mixed models, involving pedigree- and marker-based relationship matrices among parents, were developed to estimate additive (A), dominance (D) and epistatic (AA, AD and DD) effects. Five complementary models, involving the gametic phase to estimate marker-based relationships among hybrid progenies, were developed to assess the same effects. The models were compared using tree height and 3303 single-nucleotide polymorphism markers from 1130 cloned individuals obtained via controlled crosses of 13 Eucalyptus urophylla females with 9 Eucalyptus grandis males. Akaike information criterion (AIC), variance ratios, asymptotic correlation matrices of estimates, goodness-of-fit, prediction accuracy and mean square error (MSE) were used for the comparisons. The variance components and variance ratios differed according to the model. Models with a parent marker-based relationship matrix performed better than those that were pedigree-based, that is, an absence of singularities, lower AIC, higher goodness-of-fit and accuracy and smaller MSE. However, AD and DD variances were estimated with high s.es. Using the same criteria, progeny gametic phase-based models performed better in fitting the observations and predicting genetic values. However, DD variance could not be separated from the dominance variance and null estimates were obtained for AA and AD effects. This study highlighted the advantages of progeny models using genome-wide information. PMID:26328760
Measurement error in epidemiologic studies of air pollution based on land-use regression models.
Basagaña, Xavier; Aguilera, Inmaculada; Rivera, Marcela; Agis, David; Foraster, Maria; Marrugat, Jaume; Elosua, Roberto; Künzli, Nino
2013-10-15
Land-use regression (LUR) models are increasingly used to estimate air pollution exposure in epidemiologic studies. These models use air pollution measurements taken at a small set of locations and modeling based on geographical covariates for which data are available at all study participant locations. The process of LUR model development commonly includes a variable selection procedure. When LUR model predictions are used as explanatory variables in a model for a health outcome, measurement error can lead to bias of the regression coefficients and to inflation of their variance. In previous studies dealing with spatial predictions of air pollution, bias was shown to be small while most of the effect of measurement error was on the variance. In this study, we show that in realistic cases where LUR models are applied to health data, bias in health-effect estimates can be substantial. This bias depends on the number of air pollution measurement sites, the number of available predictors for model selection, and the amount of explainable variability in the true exposure. These results should be taken into account when interpreting health effects from studies that used LUR models.
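A compact, hypothetical Monte Carlo in the spirit of the scenario above: a LUR-type model is trained on a small number of monitoring sites with data-driven predictor selection, its predictions are used as the exposure in a linear health model, and the resulting health-effect estimate is compared with the true coefficient. Site counts, effect sizes, and predictor structure are all invented.

```python
import numpy as np

rng = np.random.default_rng(7)
beta_true, n_sites, n_subjects, n_sims, n_cand = 1.0, 25, 2000, 200, 20
gamma = np.zeros(n_cand)
gamma[:3] = [0.8, 0.5, 0.3]                      # only 3 of 20 candidate predictors matter
est = []
for _ in range(n_sims):
    Z_sites = rng.normal(size=(n_sites, n_cand))
    Z_subj = rng.normal(size=(n_subjects, n_cand))
    x_sites = Z_sites @ gamma + rng.normal(0, 1.0, n_sites)    # measured at monitors
    x_subj = Z_subj @ gamma + rng.normal(0, 1.0, n_subjects)   # true personal exposure

    # Data-driven variable selection: keep the 5 candidates most correlated with x_sites
    corr = np.abs([np.corrcoef(Z_sites[:, j], x_sites)[0, 1] for j in range(n_cand)])
    keep = np.argsort(corr)[-5:]

    # Fit the LUR-type model at the monitors, predict exposure at subject locations
    G = np.column_stack([np.ones(n_sites), Z_sites[:, keep]])
    coef, *_ = np.linalg.lstsq(G, x_sites, rcond=None)
    x_pred = np.column_stack([np.ones(n_subjects), Z_subj[:, keep]]) @ coef

    # Health model: the outcome depends on the *true* exposure
    y = beta_true * x_subj + rng.normal(0, 2.0, n_subjects)
    est.append(np.polyfit(x_pred, y, 1)[0])

print("mean health-effect estimate: %.3f (true beta = %.1f)" % (np.mean(est), beta_true))
```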
Vaskinn, Anja; Andersson, Stein; Østefjells, Tiril; Andreassen, Ole A; Sundet, Kjetil
2018-06-05
Theory of mind (ToM) can be divided into cognitive and affective ToM, and a distinction can be made between overmentalizing and undermentalizing errors. Research has shown that ToM in schizophrenia is associated with non-social and social cognition, and with clinical symptoms. In this study, we investigate cognitive and clinical predictors of different ToM processes. Ninety-one individuals with schizophrenia participated. ToM was measured with the Movie for the Assessment of Social Cognition (MASC) yielding six scores (total ToM, cognitive ToM, affective ToM, overmentalizing errors, undermentalizing errors and no mentalizing errors). Neurocognition was indexed by a composite score based on the non-social cognitive tests in the MATRICS Consensus Cognitive Battery (MCCB). Emotion perception was measured with Emotion in Biological Motion (EmoBio), a point-light walker task. Clinical symptoms were assessed with the Positive and Negative Syndrome Scale (PANSS). Seventy-one healthy control (HC) participants completed the MASC. Individuals with schizophrenia showed large impairments compared to HC for all MASC scores, except overmentalizing errors. Hierarchical regression analyses with the six different MASC scores as dependent variables revealed that MCCB was a significant predictor of all MASC scores, explaining 8-18% of the variance. EmoBio increased the explained variance significantly, to 17-28%, except for overmentalizing errors. PANSS excited symptoms increased explained variance for total ToM, affective ToM and no mentalizing errors. Both social and non-social cognition were significant predictors of ToM. Overmentalizing was only predicted by non-social cognition. Excited symptoms contributed to overall and affective ToM, and to no mentalizing errors. Copyright © 2018 Elsevier Inc. All rights reserved.
Understanding seasonal variability of uncertainty in hydrological prediction
NASA Astrophysics Data System (ADS)
Li, M.; Wang, Q. J.
2012-04-01
Understanding uncertainty in hydrological prediction can be highly valuable for improving the reliability of streamflow prediction. In this study, a monthly water balance model, WAPABA, is combined with a Bayesian joint probability approach and several error models to investigate the seasonal dependency of the prediction error structure. A seasonally invariant error model, analogous to traditional time series analysis, uses constant parameters for the model error and does not account for seasonal variation. In contrast, a seasonally variant error model uses a different set of parameters for bias, variance and autocorrelation for each individual calendar month. Potential connections among model parameters from similar months are not considered within the seasonally variant model, which could result in over-fitting and over-parameterization. A hierarchical error model further applies distributional restrictions on the model parameters within a Bayesian hierarchical framework. An iterative algorithm is implemented to expedite the maximum a posteriori (MAP) estimation of the hierarchical error model. The three error models are applied to forecasting streamflow at a catchment in southeast Australia in a cross-validation analysis. This study also presents a number of statistical measures and graphical tools to compare the predictive skills of the different error models. From probability integral transform histograms and other diagnostic graphs, the hierarchical error model conforms better to reliability than the seasonally invariant error model. The hierarchical error model also generally provides the most accurate mean prediction in terms of the Nash-Sutcliffe model efficiency coefficient and the best probabilistic prediction in terms of the continuous ranked probability score (CRPS). The parameters of the seasonally variant error model are very sensitive to each cross-validation fold, while the hierarchical error model produces much more robust and reliable parameters. Furthermore, the results of the hierarchical error model show that most of the model parameters are not seasonally variant, except for the error bias. The seasonally variant error model is likely to use more parameters than necessary to maximize the posterior likelihood. This flexibility and robustness indicates that the hierarchical error model has great potential for future streamflow predictions.
Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram
2018-05-01
DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
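A minimal sketch contrasting OLS, WLS, and quantile regression for an age predictor whose errors grow with age, in the spirit of the comparison above; the methylation-like predictor and the variance model are invented, not the study's markers or weights.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 300
age = rng.uniform(18, 80, n)
meth = 0.2 + 0.008 * age + rng.normal(0, 0.01 + 0.0015 * age)   # heteroscedastic marker
X = sm.add_constant(meth)

ols = sm.OLS(age, X).fit()
# WLS: weight each training sample by an (assumed known) age-dependent error variance
w = 1.0 / (5.0 + 0.2 * age) ** 2
wls = sm.WLS(age, X, weights=w).fit()
# Quantile regression: 5th and 95th percentile lines give age-dependent prediction intervals
q05 = sm.QuantReg(age, X).fit(q=0.05)
q95 = sm.QuantReg(age, X).fit(q=0.95)

x_new = sm.add_constant(np.array([0.35, 0.55, 0.75]), has_constant="add")
print("point predictions (OLS):", np.round(ols.predict(x_new), 1))
print("point predictions (WLS):", np.round(wls.predict(x_new), 1))
print("lower bounds   (q=0.05):", np.round(q05.predict(x_new), 1))
print("upper bounds   (q=0.95):", np.round(q95.predict(x_new), 1))
```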
A Note on Some Characteristics and Correlates of the Meier Art Test of Aesthetic Perception.
ERIC Educational Resources Information Center
Stallings, William M.; Anderson, Frances E.
The reliability and the predictive and concurrent validity of the MATAP were investigated with the implicit goal of improving the prediction of course grades in the College of Fine and Applied Arts. It was found that reliability and validity coefficients were low, and it was suggested that the scoring system was a source of error variance. (MS)
Chan, Kelvin K W; Xie, Feng; Willan, Andrew R; Pullenayegum, Eleanor M
2017-04-01
Parameter uncertainty in value sets of multiattribute utility-based instruments (MAUIs) has received little attention previously; ignoring it creates false precision and leads to underestimation of the uncertainty of the results of cost-effectiveness analyses. The aim of this study is to examine the use of multiple imputation as a method to account for this uncertainty in MAUI scoring algorithms. We fitted a Bayesian model with random effects for respondents and health states to the data from the original US EQ-5D-3L valuation study, thereby estimating the uncertainty in the EQ-5D-3L scoring algorithm. We applied these results to EQ-5D-3L data from the Commonwealth Fund (CWF) Survey for Sick Adults (n = 3958), comparing the standard error of the estimated mean utility in the CWF population using the predictive distribution from the Bayesian mixed-effect model (i.e., incorporating parameter uncertainty in the value set) with the standard error of the estimated mean utilities based on multiple imputation and the standard error using the conventional approach of using the MAUI (i.e., ignoring uncertainty in the value set). The mean utility in the CWF population based on the predictive distribution of the Bayesian model was 0.827 with a standard error (SE) of 0.011. When utilities were derived using the conventional approach, the estimated mean utility was 0.827 with an SE of 0.003, which is only 25% of the SE based on the full predictive distribution of the mixed-effect model. Using multiple imputation with 20 imputed sets, the mean utility was 0.828 with an SE of 0.011, which is similar to the SE based on the full predictive distribution. Ignoring uncertainty of the predicted health utilities derived from MAUIs could lead to substantial underestimation of the variance of mean utilities. Multiple imputation corrects for this underestimation so that the results of cost-effectiveness analyses using MAUIs can report the correct degree of uncertainty.
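A hedged sketch of the multiple-imputation combination step implied above (Rubin's rules): each imputed value set yields one estimate of the mean utility, and the pooled standard error combines within- and between-imputation variance. The utilities and the value-set perturbation are simulated placeholders, not the EQ-5D-3L posterior.

```python
import numpy as np

rng = np.random.default_rng(9)
n_respondents, m = 3958, 20
# Fixed respondents; only the value set applied to their responses changes across imputations
base = np.clip(rng.normal(0.83, 0.18, n_respondents), -0.1, 1.0)

estimates, within_vars = [], []
for _ in range(m):
    # One plausible value set per imputation, mimicked here by a small respondent-level
    # perturbation plus a common shift (the shift drives the between-imputation variance)
    utilities = base + rng.normal(0, 0.02, n_respondents) + rng.normal(0, 0.008)
    estimates.append(utilities.mean())
    within_vars.append(utilities.var(ddof=1) / n_respondents)   # squared SE of the mean

qbar = np.mean(estimates)                      # pooled mean utility
w = np.mean(within_vars)                       # within-imputation variance
b = np.var(estimates, ddof=1)                  # between-imputation variance
total_se = np.sqrt(w + (1 + 1 / m) * b)        # Rubin's rules
print("pooled mean utility = %.3f, SE = %.4f (conventional SE = %.4f)"
      % (qbar, total_se, np.sqrt(w)))
```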
Technical Note: Introduction of variance component analysis to setup error analysis in radiotherapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matsuo, Yukinori, E-mail: ymatsuo@kuhp.kyoto-u.ac.
Purpose: The purpose of this technical note is to introduce variance component analysis to the estimation of systematic and random components in setup error of radiotherapy. Methods: Balanced data according to the one-factor random effect model were assumed. Results: Analysis-of-variance (ANOVA)-based computation was applied to estimate the values and their confidence intervals (CIs) for systematic and random errors and the population mean of setup errors. The conventional method overestimates systematic error, especially in hypofractionated settings. The CI for systematic error becomes much wider than that for random error. The ANOVA-based estimation can be extended to a multifactor model considering multiple causes of setup errors (e.g., interpatient, interfraction, and intrafraction). Conclusions: Variance component analysis may lead to novel applications to setup error analysis in radiotherapy.
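A hedged sketch of the ANOVA-based estimation described in the note, for balanced data under a one-factor (patient) random-effects model: the within-patient mean square estimates the random component and the between-patient mean square, corrected for it, the systematic component. The simulated shifts and fraction counts are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(10)
p, k = 30, 8                                     # patients, fractions per patient (balanced)
sigma_sys, sigma_rand = 2.0, 3.0                 # "true" systematic and random SDs (mm)
mu = rng.normal(0.0, sigma_sys, p)               # patient-specific systematic setup error
shifts = mu[:, None] + rng.normal(0.0, sigma_rand, (p, k))     # daily setup errors (mm)

grand = shifts.mean()
patient_means = shifts.mean(axis=1)
msb = k * np.sum((patient_means - grand) ** 2) / (p - 1)              # between-patient mean square
msw = np.sum((shifts - patient_means[:, None]) ** 2) / (p * (k - 1))  # within-patient mean square

sigma_rand_hat = np.sqrt(msw)
sigma_sys_hat = np.sqrt(max((msb - msw) / k, 0.0))
# The conventional estimate, np.std(patient_means, ddof=1), also contains a random/k
# component and therefore tends to overestimate the systematic error for small k.
print("estimated systematic SD = %.2f mm, random SD = %.2f mm"
      % (sigma_sys_hat, sigma_rand_hat))
```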
Low-dimensional Representation of Error Covariance
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan
2000-01-01
Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.
Bohmanova, J; Miglior, F; Jamrozik, J; Misztal, I; Sullivan, P G
2008-09-01
A random regression model with both random and fixed regressions fitted by Legendre polynomials of order 4 was compared with 3 alternative models fitting linear splines with 4, 5, or 6 knots. The effects common for all models were a herd-test-date effect, fixed regressions on days in milk (DIM) nested within region-age-season of calving class, and random regressions for additive genetic and permanent environmental effects. Data were test-day milk, fat and protein yields, and SCS recorded from 5 to 365 DIM during the first 3 lactations of Canadian Holstein cows. A random sample of 50 herds consisting of 96,756 test-day records was generated to estimate variance components within a Bayesian framework via Gibbs sampling. Two sets of genetic evaluations were subsequently carried out to investigate performance of the 4 models. Models were compared by graphical inspection of variance functions, goodness of fit, error of prediction of breeding values, and stability of estimated breeding values. Models with splines gave lower estimates of variances at extremes of lactations than the model with Legendre polynomials. Differences among models in goodness of fit measured by percentages of squared bias, correlations between predicted and observed records, and residual variances were small. The deviance information criterion favored the spline model with 6 knots. Smaller error of prediction and higher stability of estimated breeding values were achieved by using spline models with 5 and 6 knots compared with the model with Legendre polynomials. In general, the spline model with 6 knots had the best overall performance based upon the considered model comparison criteria.
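A hedged sketch of the two covariance-function bases being compared above: a Legendre polynomial basis of order 4 on days in milk (DIM) rescaled to [-1, 1], and a linear spline (hat-function) basis for a chosen set of knots. The knot positions are invented, not the study's.

```python
import numpy as np

dim = np.arange(5, 366)                                   # days in milk
x = 2 * (dim - dim.min()) / (dim.max() - dim.min()) - 1   # rescale DIM to [-1, 1]

# Legendre basis of order 4: columns are P0..P4 evaluated at the rescaled DIM
legendre_basis = np.polynomial.legendre.legvander(x, 4)

def linear_spline_basis(t, knots):
    """Hat-function (linear spline) basis: one column per knot."""
    B = np.zeros((t.size, len(knots)))
    for j in range(len(knots)):
        e = np.zeros(len(knots))
        e[j] = 1.0
        B[:, j] = np.interp(t, knots, e)
    return B

spline_basis = linear_spline_basis(dim, knots=[5, 50, 120, 220, 300, 365])   # 6 knots

print("Legendre basis shape:", legendre_basis.shape)    # (361, 5)
print("spline basis shape:  ", spline_basis.shape)      # (361, 6)
print("spline columns sum to 1 at every DIM:", np.allclose(spline_basis.sum(axis=1), 1.0))
```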
Fleming, Kevin K; Bandy, Carole L; Kimble, Matthew O
2010-01-01
The decision to shoot a gun engages executive control processes that can be biased by cultural stereotypes and perceived threat. The neural locus of the decision to shoot is likely to be found in the anterior cingulate cortex (ACC), where cognition and affect converge. Male military cadets at Norwich University (N=37) performed a weapon identification task in which they made rapid decisions to shoot when images of guns appeared briefly on a computer screen. Reaction times, error rates, and electroencephalogram (EEG) activity were recorded. Cadets reacted more quickly and accurately when guns were primed by images of Middle-Eastern males wearing traditional clothing. However, cadets also made more false positive errors when tools were primed by these images. Error-related negativity (ERN) was measured for each response. Deeper ERNs were found in the medial-frontal cortex following false positive responses. Cadets who made fewer errors also produced deeper ERNs, indicating stronger executive control. Pupil size was used to measure autonomic arousal related to perceived threat. Images of Middle-Eastern males in traditional clothing produced larger pupil sizes. An image of Osama bin Laden induced the largest pupil size, as would be predicted for the exemplar of Middle East terrorism. Cadets who showed greater increases in pupil size also made more false positive errors. Regression analyses were performed to evaluate predictions based on current models of perceived threat, stereotype activation, and cognitive control. Measures of pupil size (perceived threat) and ERN (cognitive control) explained significant proportions of the variance in false positive errors to Middle-Eastern males in traditional clothing, while measures of reaction time, signal detection response bias, and stimulus discriminability explained most of the remaining variance.
Nazemi, S Majid; Kalajahi, S Mehrdad Hosseini; Cooper, David M L; Kontulainen, Saija A; Holdsworth, David W; Masri, Bassam A; Wilson, David R; Johnston, James D
2017-07-05
Previously, a finite element (FE) model of the proximal tibia was developed and validated against experimentally measured local subchondral stiffness. This model indicated modest predictions of stiffness (R2 = 0.77, normalized root mean squared error (RMSE%) = 16.6%). Trabecular bone, though, was modeled with isotropic material properties despite its orthotropic anisotropy. The objective of this study was to identify the anisotropic FE modeling approach which best predicted (with largest explained variance and least amount of error) local subchondral bone stiffness at the proximal tibia. Local stiffness was measured at the subchondral surface of 13 medial/lateral tibial compartments using in situ macro indentation testing. An FE model of each specimen was generated assuming uniform anisotropy with 14 different combinations of cortical- and tibial-specific density-modulus relationships taken from the literature. Two FE models of each specimen were also generated which accounted for the spatial variation of trabecular bone anisotropy directly from clinical CT images using a grey-level structure tensor and Cowin's fabric-elasticity equations. Stiffness was calculated using FE and compared to measured stiffness in terms of R2 and RMSE%. The uniform anisotropic FE model explained 53-74% of the measured stiffness variance, with RMSE% ranging from 12.4 to 245.3%. The models which accounted for spatial variation of trabecular bone anisotropy predicted 76-79% of the variance in stiffness with RMSE% being 11.2-11.5%. Of the 16 evaluated finite element models in this study, the combination of Snyder and Schneider (for cortical bone) and Cowin's fabric-elasticity equations (for trabecular bone) best predicted local subchondral bone stiffness. Copyright © 2017 Elsevier Ltd. All rights reserved.
Lin, P.-S.; Chiou, B.; Abrahamson, N.; Walling, M.; Lee, C.-T.; Cheng, C.-T.
2011-01-01
In this study, we quantify the reduction in the standard deviation for empirical ground-motion prediction models by removing the ergodic assumption. We partition the modeling error (residual) into five components, three of which represent the repeatable source-location-specific, site-specific, and path-specific deviations from the population mean. A variance estimation procedure for these error components is developed for use with a set of recordings from earthquakes not heavily clustered in space. With most source locations and propagation paths sampled only once, we opt to exploit the spatial correlation of residuals to estimate the variances associated with the path-specific and the source-location-specific deviations. The estimation procedure is applied to ground-motion amplitudes from 64 shallow earthquakes in Taiwan recorded at 285 sites with at least 10 recordings per site. The estimated variance components are used to quantify the reduction in aleatory variability that can be used in hazard analysis for a single site and for a single path. For peak ground acceleration and spectral accelerations at periods of 0.1, 0.3, 0.5, 1.0, and 3.0 s, we find that the single-site standard deviations are 9%-14% smaller than the total standard deviation, whereas the single-path standard deviations are 39%-47% smaller.
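A toy illustration, far simpler than the five-component partition above, of how removing repeatable event and site terms from ground-motion residuals shrinks the remaining standard deviation; the synthetic variance components and sample sizes are arbitrary, and the paper's spatial-correlation machinery is not reproduced.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)
n_events, n_sites, recs_per_site = 64, 285, 12
site = np.repeat(np.arange(n_sites), recs_per_site)
event = rng.integers(0, n_events, site.size)

tau, phi_s2s, phi_rem = 0.35, 0.40, 0.45       # event, site-to-site, remaining SDs (ln units)
resid = (rng.normal(0, tau, n_events)[event]
         + rng.normal(0, phi_s2s, n_sites)[site]
         + rng.normal(0, phi_rem, site.size))

df = pd.DataFrame({"event": event, "site": site, "resid": resid})
total_sd = df["resid"].std()

# Remove repeatable event terms, then the repeatable site term at each station
df["within_event"] = df["resid"] - df.groupby("event")["resid"].transform("mean")
df["remaining"] = df["within_event"] - df.groupby("site")["within_event"].transform("mean")

remaining_sd = df["remaining"].std()
print("total SD = %.2f" % total_sd)
print("SD after removing repeatable event and site terms = %.2f (%.0f%% smaller)"
      % (remaining_sd, 100 * (1 - remaining_sd / total_sd)))
```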
[Theory, method and application of method R on estimation of (co)variance components].
Liu, Wen-Zhong
2004-07-01
Theory, method and application of Method R for the estimation of (co)variance components were reviewed in order to promote appropriate use of the method. Estimation requires R values, which are regressions of predicted random effects calculated from the complete dataset on predicted random effects calculated from random subsets of the same data. By using a multivariate iteration algorithm based on a transformation matrix, combined with the preconditioned conjugate gradient method to solve the mixed model equations, the computational efficiency of Method R is much improved. Method R is computationally inexpensive, and the sampling errors and approximate credible intervals of the estimates can be obtained. Disadvantages of Method R include a larger sampling variance than other methods for the same data, and biased estimates in small datasets. As an alternative method, Method R can be used in larger datasets. It is necessary to study its theoretical properties and broaden its application range further.
NASA Astrophysics Data System (ADS)
Gokkaya, Kemal
The use of satellite and airborne remote sensing data to predict foliar macronutrients and pigments for a boreal mixedwood forest composed of black and white spruce, balsam fir, northern white cedar, white birch, and trembling aspen was investigated. Specifically, imaging spectroscopy (IS) and light detection and ranging (LiDAR) are used to model the foliar N:P ratio, macronutrients (N, P, K, Ca, Mg) and chlorophyll. Measurement of both foliar macronutrients and foliar chlorophyll provides critical information about plant physiological and nutritional status, stress, as well as ecosystem processes such as carbon (C) exchange (photosynthesis and net primary production), decomposition and nutrient cycling. Results show that airborne and spaceborne IS data explained approximately 70% of the variance in the canopy N:P ratio with prediction errors of less than 8% in two consecutive years. LiDAR models explained more than 50% of the variance in the canopy N:P ratio with similar prediction errors. Predictive models using spaceborne Hyperion IS data were developed with adjusted R2 values of 0.73, 0.72, 0.62, 0.25, and 0.67 for N, P, K, Ca and Mg, respectively. The LiDAR model explained 80% of the variance in canopy Ca concentration with an RMSE of less than 10%, suggesting strong correlations between forest height and Ca. Two IS derivative indices emerged as good predictors of chlorophyll across time and space. When the models of these two indices with the same parameters as generated from Hyperion data were applied to other years' data for chlorophyll concentration prediction, they could explain 71, 63 and 6% and 61, 54 and 8% of the variation in chlorophyll concentration in 2002, 2004 and 2008, respectively, with prediction errors ranging from 11.7% to 14.6%. Results demonstrate that the N:P ratio, N, P, K, Mg and chlorophyll can be modeled by spaceborne IS data and Ca can only be predicted by LiDAR data in the canopy of this forest. The ability to model the N:P ratio and macronutrients using spaceborne Hyperion data demonstrates the potential for mapping them at the canopy scale across larger geographic areas and being able to integrate them in future studies of ecosystem processes.
Comparison of structural and least-squares lines for estimating geologic relations
Williams, G.P.; Troutman, B.M.
1990-01-01
Two different goals in fitting straight lines to data are to estimate a "true" linear relation (physical law) and to predict values of the dependent variable with the smallest possible error. Regarding the first goal, a Monte Carlo study indicated that the structural-analysis (SA) method of fitting straight lines to data is superior to the ordinary least-squares (OLS) method for estimating "true" straight-line relations. The number of data points, the slope and intercept of the true relation, and the variances of the errors associated with the independent (X) and dependent (Y) variables influence the degree of agreement. For example, differences between the two line-fitting methods decrease as error in X becomes small relative to error in Y. Regarding the second goal, predicting the dependent variable, OLS is better than SA. Again, the difference diminishes as X takes on less error relative to Y. With respect to estimation of slope and intercept and prediction of Y, agreement between Monte Carlo results and large-sample theory was very good for sample sizes of 100, and fair to good for sample sizes of 20. The procedures and error measures are illustrated with two geologic examples. © 1990 International Association for Mathematical Geology.
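The contrast between the two fitting goals can be illustrated with a small Monte Carlo sketch. The structural-analysis line is implemented here as a Deming-type fit that assumes a known error-variance ratio; the slope, intercept, error levels, and sample sizes are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo sketch comparing OLS with a structural-analysis (Deming-type)
# fit when both X and Y carry measurement error.
true_slope, true_intercept = 2.0, 1.0
sigma_x, sigma_y = 0.5, 0.5          # error SDs on X and Y (assumed)
lam = sigma_y**2 / sigma_x**2        # error-variance ratio assumed known

def fit_lines(n=100):
    x_true = rng.uniform(0, 10, n)
    x = x_true + rng.normal(0, sigma_x, n)
    y = true_intercept + true_slope * x_true + rng.normal(0, sigma_y, n)

    # Ordinary least squares
    b_ols, a_ols = np.polyfit(x, y, 1)

    # Structural-analysis (Deming) slope with known lambda
    sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y)[0, 1]
    b_sa = ((syy - lam * sxx) +
            np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    a_sa = y.mean() - b_sa * x.mean()
    return (b_ols, a_ols), (b_sa, a_sa)

ols, sa = zip(*(fit_lines() for _ in range(2000)))
print("mean OLS slope:", np.mean([b for b, _ in ols]))   # attenuated toward zero
print("mean SA  slope:", np.mean([b for b, _ in sa]))    # close to the true slope
```

Running the sketch shows the familiar attenuation of the OLS slope under error in X, while the structural fit recovers the "true" relation on average.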
Luoma, Pekka; Natschläger, Thomas; Malli, Birgit; Pawliczek, Marcin; Brandstetter, Markus
2018-05-12
A model recalibration method based on additive Partial Least Squares (PLS) regression is generalized for multi-adjustment scenarios of independent variance sources (referred to as additive PLS - aPLS). aPLS allows for effortless model readjustment under changing measurement conditions and the combination of independent variance sources with the initial model by means of additive modelling. We demonstrate these distinguishing features on two NIR spectroscopic case studies. In case study 1, aPLS was used as a readjustment method for an emerging offset. The achieved RMS error of prediction (1.91 a.u.) was at a similar level to that before the offset occurred (2.11 a.u.). In case study 2, a calibration combining different variance sources was conducted. The achieved performance was sufficient, with an absolute error better than 0.8% of the mean concentration, thereby compensating for the negative effects of two independent variance sources. The presented results show the applicability of the aPLS approach. The main advantages of the method are that the original model stays unadjusted and that the modelling is conducted on concrete changes in the spectra, thus supporting efficient (in most cases straightforward) modelling. Additionally, the method is put into context with existing machine learning algorithms. Copyright © 2018 Elsevier B.V. All rights reserved.
Ercanli, İlker; Kahriman, Aydın
2015-03-01
We assessed the effect of stand structural diversity, including the Shannon, improved Shannon, Simpson, McIntosh, Margalef, and Berger-Parker indices, on stand aboveground biomass (AGB) and developed statistical prediction models for the stand AGB values, including stand structural diversity indices and some stand attributes. The AGB prediction model including only stand attributes accounted for 85% of the total variance in AGB (R²) with an Akaike's information criterion (AIC) of 807.2407, Bayesian information criterion (BIC) of 809.5397, Schwarz Bayesian criterion (SBC) of 818.0426, and root mean square error (RMSE) of 38.529 Mg. After inclusion of the stand structural diversity into the model structure, considerable improvement was observed in statistical accuracy, including 97.5% of the total variance in AGB, with an AIC of 614.1819, BIC of 617.1242, SBC of 633.0853, and RMSE of 15.8153 Mg. The predictive fitting results indicate that some indices describing the stand structural diversity can be employed as significant independent variables to predict the AGB production of the Scotch pine stand. Further, including the stand diversity indices in the AGB prediction model with the stand attributes provided important predictive contributions in estimating the total variance in AGB.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Newman, Jennifer F.; Clifton, Andrew
Currently, cup anemometers on meteorological towers are used to measure wind speeds and turbulence intensity to make decisions about wind turbine class and site suitability; however, as modern turbine hub heights increase and wind energy expands to complex and remote sites, it becomes more difficult and costly to install meteorological towers at potential sites. As a result, remote-sensing devices (e.g., lidars) are now commonly used by wind farm managers and researchers to estimate the flow field at heights spanned by a turbine. Although lidars can accurately estimate mean wind speeds and wind directions, there is still a large amount of uncertainty surrounding the measurement of turbulence using these devices. Errors in lidar turbulence estimates are caused by a variety of factors, including instrument noise, volume averaging, and variance contamination, in which the magnitude of these factors is highly dependent on measurement height and atmospheric stability. As turbulence has a large impact on wind power production, errors in turbulence measurements will translate into errors in wind power prediction. The impact of using lidars rather than cup anemometers for wind power prediction must be understood if lidars are to be considered a viable alternative to cup anemometers. In this poster, the sensitivity of power prediction error to typical lidar turbulence measurement errors is assessed. Turbulence estimates from a vertically profiling WINDCUBE v2 lidar are compared to high-resolution sonic anemometer measurements at field sites in Oklahoma and Colorado to determine the degree of lidar turbulence error that can be expected under different atmospheric conditions. These errors are then incorporated into a power prediction model to estimate the sensitivity of power prediction error to turbulence measurement error. Power prediction models, including the standard binning method and a random forest method, were developed using data from the aeroelastic simulator FAST for a 1.5 MW turbine. The impact of lidar turbulence error on the predicted power from these different models is examined to determine the degree of turbulence measurement accuracy needed for accurate power prediction.
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include model-fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation-average net energy intake (NEI) on lactation-average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept, using fortnightly repeated measures for the variables. This method splits the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercepts and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation-average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation-average model, or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation-average model may therefore reflect model-fitting or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
NASA Astrophysics Data System (ADS)
Chang, Guobin; Xu, Tianhe; Yao, Yifei; Wang, Qianxin
2018-01-01
In order to incorporate the time smoothness of ionospheric delay to aid cycle slip detection, an adaptive Kalman filter is developed based on variance component estimation. The correlations between measurements at neighboring epochs are fully considered in developing a filtering algorithm for colored measurement noise. Within this filtering framework, epoch-differenced ionospheric delays are predicted. Using this prediction, potential cycle slips are repaired for triple-frequency signals of global navigation satellite systems. Cycle slips are repaired in a stepwise manner, i.e., first for two extra-wide-lane combinations and then for the third frequency. In the estimation for the third frequency, a stochastic model is followed in which the correlations between the ionospheric delay prediction errors and the errors in the epoch-differenced phase measurements are considered. The implementation details of the proposed method are tabulated. A real BeiDou Navigation Satellite System data set is used to check the performance of the proposed method. Most cycle slips, whether trivial or nontrivial, can be estimated as float values with satisfactorily high accuracy, and their integer values can hence be correctly obtained by simple rounding. To be more specific, all manually introduced nontrivial cycle slips are correctly repaired.
Response Monitoring and Adjustment: Differential Relations with Psychopathic Traits
Bresin, Konrad; Finy, M. Sima; Sprague, Jenessa; Verona, Edelyn
2014-01-01
Studies on the relation between psychopathy and cognitive functioning often show mixed results, partially because different factors of psychopathy have not been considered fully. Based on previous research, we predicted divergent results based on a two-factor model of psychopathy (interpersonal-affective traits and impulsive-antisocial traits). Specifically, we predicted that the unique variance of interpersonal-affective traits would be related to increased monitoring (i.e., error-related negativity) and adjusting to errors (i.e., post-error slowing), whereas impulsive-antisocial traits would be related to reductions in these processes. Three studies using a diverse selection of assessment tools, samples, and methods are presented to identify response monitoring correlates of the two main factors of psychopathy. In Studies 1 (undergraduates), 2 (adolescents), and 3 (offenders), interpersonal-affective traits were related to increased adjustment following errors and, in Study 3, to enhanced monitoring of errors. Impulsive-antisocial traits were not consistently related to error adjustment across the studies, although these traits were related to a deficient monitoring of errors in Study 3. The results may help explain previous mixed findings and advance implications for etiological models of psychopathy. PMID:24933282
NASA Astrophysics Data System (ADS)
El-Diasty, M.; El-Rabbany, A.; Pagiatakis, S.
2007-11-01
We examine the effect of varying temperature points on MEMS inertial sensors' noise models using Allan variance and least-squares spectral analysis (LSSA). Allan variance is a method of representing root-mean-square random drift error as a function of averaging time. LSSA is an alternative to the classical Fourier methods and has been applied successfully by a number of researchers in the study of the noise characteristics of experimental series. Static data sets are collected at different temperature points using two MEMS-based IMUs, namely the MotionPakII and the Crossbow AHRS300CC. The performance of the two MEMS inertial sensors is predicted from the Allan variance estimation results at different temperature points, and the LSSA is used to study the noise characteristics and define the sensors' stochastic model parameters. It is shown that the stochastic characteristics of MEMS-based inertial sensors can be identified using Allan variance estimation and LSSA, and that the sensors' stochastic model parameters are temperature dependent. Also, a Kaiser-window FIR low-pass filter is used to investigate the effect of the de-noising stage on the stochastic model. It is shown that the stochastic model is also dependent on the chosen cut-off frequency.
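Allan variance itself is straightforward to compute from a static record. The sketch below is a minimal non-overlapping implementation under the usual definition (cluster averages differenced at each averaging time); the sampling rate, record length, and noise parameters are hypothetical.

```python
import numpy as np

def allan_deviation(data, fs, taus):
    """Non-overlapping Allan deviation sketch.

    data : 1-D array of sensor output (e.g., gyro rate), sampled at fs [Hz]
    taus : iterable of averaging times [s]
    """
    data = np.asarray(data, dtype=float)
    adevs = []
    for tau in taus:
        m = int(round(tau * fs))                  # samples per cluster
        n_clusters = data.size // m
        if n_clusters < 2:
            adevs.append(np.nan)
            continue
        clusters = data[:n_clusters * m].reshape(n_clusters, m).mean(axis=1)
        avar = 0.5 * np.mean(np.diff(clusters) ** 2)
        adevs.append(np.sqrt(avar))
    return np.array(adevs)

# Hypothetical example: white noise plus a slow drift, 100 Hz for one hour.
rng = np.random.default_rng(0)
fs = 100.0
t = np.arange(0, 3600, 1 / fs)
signal = rng.normal(0, 0.05, t.size) + 1e-5 * t   # units arbitrary
taus = np.logspace(-1, 2, 20)
print(allan_deviation(signal, fs, taus))
```

Plotting the returned deviations against tau on log-log axes gives the familiar Allan curve from which random walk, bias instability, and drift terms are read off.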
Spatio-temporal error growth in the multi-scale Lorenz'96 model
NASA Astrophysics Data System (ADS)
Herrera, S.; Fernández, J.; Rodríguez, M. A.; Gutiérrez, J. M.
2010-07-01
The influence of multiple spatio-temporal scales on the error growth and predictability of atmospheric flows is analyzed throughout the paper. To this aim, we consider the two-scale Lorenz'96 model and study the interplay of the slow and fast variables on the error growth dynamics. It is shown that when the coupling between slow and fast variables is weak the slow variables dominate the evolution of fluctuations whereas in the case of strong coupling the fast variables impose a non-trivial complex error growth pattern on the slow variables with two different regimes, before and after saturation of fast variables. This complex behavior is analyzed using the recently introduced Mean-Variance Logarithmic (MVL) diagram.
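For readers who want to reproduce this kind of error-growth experiment, a compact sketch of the two-scale Lorenz'96 system is given below. The parameter values (K, J, F, h, c, b), integration settings, and perturbation size are typical textbook choices rather than those used in the paper.

```python
import numpy as np

K, J = 36, 10          # number of slow (X) and fast-per-slow (Y) variables
F, h, c, b = 10.0, 1.0, 10.0, 10.0

def tendencies(state):
    X, Y = state[:K], state[K:].reshape(K, J)
    dX = (np.roll(X, 1) * (np.roll(X, -1) - np.roll(X, 2))
          - X + F - (h * c / b) * Y.sum(axis=1))
    Yf = Y.reshape(-1)   # flatten so neighbours wrap around the whole ring
    dY = (c * b * np.roll(Yf, -1) * (np.roll(Yf, 1) - np.roll(Yf, -2))
          - c * Yf + (h * c / b) * np.repeat(X, J))
    return np.concatenate([dX, dY])

def rk4(state, dt):
    k1 = tendencies(state)
    k2 = tendencies(state + 0.5 * dt * k1)
    k3 = tendencies(state + 0.5 * dt * k2)
    k4 = tendencies(state + dt * k3)
    return state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

dt = 0.001
rng = np.random.default_rng(0)
truth = rng.normal(0, 1, K + K * J)
for _ in range(2000):                    # spin up onto the attractor
    truth = rk4(truth, dt)

perturbed = truth + 1e-8 * rng.normal(size=truth.size)
errors = []
for _ in range(5000):
    truth, perturbed = rk4(truth, dt), rk4(perturbed, dt)
    errors.append(np.sqrt(np.mean((truth[:K] - perturbed[:K]) ** 2)))
print(errors[::500])                     # error growth in the slow variables
```

Varying the coupling constant h in this sketch reproduces qualitatively the weak- versus strong-coupling regimes discussed in the abstract.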
On the Likely Utility of Hybrid Weights Optimized for Variances in Hybrid Error Covariance Models
NASA Astrophysics Data System (ADS)
Satterfield, E.; Hodyss, D.; Kuhl, D.; Bishop, C. H.
2017-12-01
Because of imperfections in ensemble data assimilation schemes, one cannot assume that the ensemble covariance is equal to the true error covariance of a forecast. Previous work demonstrated how information about the distribution of true error variances given an ensemble sample variance can be revealed from an archive of (observation-minus-forecast, ensemble-variance) data pairs. Here, we derive a simple and intuitively compelling formula to obtain the mean of this distribution of true error variances given an ensemble sample variance from (observation-minus-forecast, ensemble-variance) data pairs produced by a single run of a data assimilation system. This formula takes the form of a Hybrid weighted average of the climatological forecast error variance and the ensemble sample variance. Here, we test the extent to which these readily obtainable weights can be used to rapidly optimize the covariance weights used in Hybrid data assimilation systems that employ weighted averages of static covariance models and flow-dependent ensemble based covariance models. Univariate data assimilation and multi-variate cycling ensemble data assimilation are considered. In both cases, it is found that our computationally efficient formula gives Hybrid weights that closely approximate the optimal weights found through the simple but computationally expensive process of testing every plausible combination of weights.
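The idea of recovering a hybrid weight from (observation-minus-forecast, ensemble-variance) pairs can be sketched with synthetic data. In the toy example below, observation error is ignored for clarity and all distributional choices are assumptions; the point is only that regressing squared innovations on the ensemble sample variance yields a weighted average of the static (climatological) and flow-dependent variances.

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch: estimate the weight w that maps an ensemble sample variance s2 to
# the conditional mean of the true error variance,
#   E[sigma2_true | s2] ~ w * s2 + (1 - w) * sigma2_clim.
n = 50_000
sigma2_true = rng.gamma(shape=4.0, scale=0.25, size=n)   # true error variances
sigma2_clim = sigma2_true.mean()                         # climatological variance

# Ensemble sample variance: chi-squared distributed about the truth (M members).
M = 10
s2 = sigma2_true * rng.chisquare(M - 1, size=n) / (M - 1)

# Innovations (obs minus forecast), drawn with the true error variance.
d2 = rng.normal(0.0, np.sqrt(sigma2_true)) ** 2

# The least-squares slope of d2 on s2 is the weight on the flow-dependent part;
# its complement is the weight on the static variance.
w = np.cov(d2, s2)[0, 1] / np.var(s2)
print(f"hybrid weights: flow-dependent = {w:.2f}, static = {1 - w:.2f}")
```

In a real system the squared innovations would first be corrected for observation error variance, but the weighted-average structure of the estimate is the same.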
On the internal target model in a tracking task
NASA Technical Reports Server (NTRS)
Caglayan, A. K.; Baron, S.
1981-01-01
An optimal control model for predicting an operator's dynamic responses and errors in target tracking is summarized. The model, which predicts asymmetry in the tracking data, is dependent on target maneuvers and trajectories. The gunner's perception, decision making, control, and estimates of target position and velocity related to crossover intervals are discussed. The model provides estimates of means, standard deviations, and variances for the variables investigated and for operator estimates of future target positions and velocities.
Post-Modeling Histogram Matching of Maps Produced Using Regression Trees
Andrew J. Lister; Tonya W. Lister
2006-01-01
Spatial predictive models often use statistical techniques that in some way rely on averaging of values. Estimates from linear modeling are known to be susceptible to truncation of variance when the independent (predictor) variables are measured with error. A straightforward post-processing technique (histogram matching) for attempting to mitigate this effect is...
Improved uncertainty quantification in nondestructive assay for nonproliferation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burr, Tom; Croft, Stephen; Jarman, Ken
2016-12-01
This paper illustrates methods to improve uncertainty quantification (UQ) for non-destructive assay (NDA) measurements used in nuclear nonproliferation. First, it is shown that current bottom-up UQ applied to calibration data is not always adequate, for three main reasons: (1) Because there are errors in both the predictors and the response, calibration involves a ratio of random quantities, and calibration data sets in NDA usually consist of only a modest number of samples (3–10); therefore, asymptotic approximations involving quantities needed for UQ such as means and variances are often not sufficiently accurate; (2) Common practice overlooks that calibration implies a partitioning of total error into random and systematic error; and (3) In many NDA applications, test items exhibit non-negligible departures in physical properties from calibration items, so model-based adjustments are used, but item-specific bias remains in some data. Therefore, improved bottom-up UQ using calibration data should predict the typical magnitude of item-specific bias, and the suggestion is to do so by including sources of item-specific bias in synthetic calibration data that is generated using a combination of modeling and real calibration data. Second, for measurements of the same nuclear material item by both the facility operator and international inspectors, current empirical (top-down) UQ is described for estimating operator and inspector systematic and random error variance components. A Bayesian alternative is introduced that easily accommodates constraints on variance components, and is more robust than current top-down methods to the underlying measurement error distributions.
NASA Technical Reports Server (NTRS)
Menard, Richard; Chang, Lang-Ping
1998-01-01
A Kalman filter system designed for the assimilation of limb-sounding observations of stratospheric chemical tracers, which has four tunable covariance parameters, was developed in Part I (Menard et al. 1998). The assimilation results of CH4 observations from the Cryogenic Limb Array Etalon Spectrometer (CLAES) and the Halogen Occultation Experiment (HALOE) on board the Upper Atmosphere Research Satellite are described in this paper. A robust χ² criterion, which provides a statistical validation of the forecast and observational error covariances, was used to estimate the tunable variance parameters of the system. In particular, an estimate of the model error variance was obtained. The effect of model error on the forecast error variance became critical after only three days of assimilation of CLAES observations, although it took 14 days of forecast to double the initial error variance. We further found that the model error due to numerical discretization, as arising in the standard Kalman filter algorithm, is comparable in size to the physical model error due to wind and transport modeling errors together. Separate assimilations of CLAES and HALOE observations were compared to validate the state estimate away from the observed locations. A wave-breaking event that took place several thousands of kilometers away from the HALOE observation locations was well captured by the Kalman filter due to highly anisotropic forecast error correlations. The forecast error correlation in the assimilation of the CLAES observations was found to have a structure similar to that in pure forecast mode except for smaller length scales. Finally, we have conducted an analysis of the variance and correlation dynamics to determine their relative importance in chemical tracer assimilation problems. Results show that the optimality of a tracer assimilation system depends, for the most part, on having flow-dependent error correlations rather than on evolving the error variance.
Prediction of flow duration curves for ungauged basins
NASA Astrophysics Data System (ADS)
Atieh, Maya; Taylor, Graham; M. A. Sattar, Ahmed; Gharabaghi, Bahram
2017-02-01
This study presents novel models for the prediction of flow duration curves (FDCs) at ungauged basins using artificial neural networks (ANN) and gene expression programming (GEP), trained and tested using historical flow records from 171 unregulated and 89 regulated basins across North America. For the 89 regulated basins, FDCs were generated both before and after flow regulation. Topographic, climatic, and land use characteristics are used to develop relationships between these basin characteristics and the FDC statistical distribution parameters: mean (m) and variance (ν). The two main hypotheses, that flow regulation has a negligible effect on the mean (m) while it affects the variance (ν), were confirmed. The novel GEP model that predicts the mean (GEP-m) performed very well, with high R2 (0.9) and D (0.95) values and a low RAE value of 0.25. A simple regression model that predicts the variance (REG-v) was developed as a function of the mean (m) and a flow regulation index (R). The measured performance and uncertainty analysis indicated that ANN-m was the best performing model, with R2 (0.97), RAE (0.21), D (0.93), and the lowest 95% confidence prediction error interval (+0.22 to +3.49). Both GEP and ANN models were most sensitive to drainage area, followed by mean annual precipitation, apportionment entropy disorder index, and shape factor.
Reducing hydrologic model uncertainty in monthly streamflow predictions using multimodel combination
NASA Astrophysics Data System (ADS)
Li, Weihua; Sankarasubramanian, A.
2012-12-01
Model errors are inevitable in any prediction exercise. One approach that is currently gaining attention for reducing model errors is combining multiple models to develop improved predictions. The rationale behind this approach primarily lies in the premise that optimal weights can be derived for each model so that the resulting multimodel predictions are improved. A new dynamic approach (MM-1) to combine multiple hydrological models by evaluating their performance/skill contingent on the predictor state is proposed. We combine two hydrological models, the "abcd" model and the variable infiltration capacity (VIC) model, to develop multimodel streamflow predictions. To quantify precisely under what conditions the multimodel combination results in improved predictions, we compare the multimodel scheme MM-1 with an optimal model combination scheme (MM-O) by employing them in predicting the streamflow generated from a known hydrologic model ("abcd" model or VIC model) with heteroscedastic error variance, as well as from a hydrologic model that exhibits a different structure than that of the candidate models (i.e., "abcd" model or VIC model). Results from the study show that streamflow estimated from single models performed better than multimodels under almost no measurement error. However, under increased measurement errors and model structural misspecification, both multimodel schemes (MM-1 and MM-O) consistently performed better than the single-model prediction. Overall, MM-1 performs better than MM-O in predicting the monthly flow values as well as in predicting extreme monthly flows. Comparison of the weights obtained from each candidate model reveals that as measurement errors increase, MM-1 assigns weights equally across all the models, whereas MM-O always assigns higher weights to the best-performing candidate model in the calibration period. Applying the multimodel algorithms for predicting streamflows over four different sites revealed that MM-1 performs better than all single models and the optimal model combination scheme, MM-O, in predicting the monthly flows as well as the flows during wetter months.
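A simplified version of state-contingent model combination can be sketched as follows: weights proportional to the inverse mean-squared error of each candidate model are estimated within bins of a predictor variable. The two "models", the binning rule, and all numbers below are synthetic stand-ins, not the MM-1 algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(7)

def state_contingent_weights(pred1, pred2, obs, state, bins):
    """Weight on model 1 within each predictor-state bin (inverse-MSE weighting)."""
    idx = np.digitize(state, bins)
    w1 = np.ones(len(bins) + 1) * 0.5
    for b in range(len(bins) + 1):
        sel = idx == b
        if sel.sum() < 5:
            continue
        mse1 = np.mean((pred1[sel] - obs[sel]) ** 2)
        mse2 = np.mean((pred2[sel] - obs[sel]) ** 2)
        w1[b] = (1 / mse1) / (1 / mse1 + 1 / mse2)
    return w1

def combine(pred1, pred2, state, bins, w1):
    idx = np.digitize(state, bins)
    return w1[idx] * pred1 + (1 - w1[idx]) * pred2

# Synthetic calibration data: model 1 is better in dry states, model 2 in wet.
n = 2000
precip = rng.gamma(2.0, 50.0, n)
obs = 0.6 * precip + rng.normal(0, 5, n)
model1 = obs + rng.normal(0, 3 + 0.05 * precip, n)
model2 = obs + rng.normal(0, 12 - 0.05 * np.clip(precip, 0, 200), n)

bins = np.quantile(precip, [0.25, 0.5, 0.75])
w1 = state_contingent_weights(model1, model2, obs, precip, bins)
combined = combine(model1, model2, precip, bins, w1)
print("RMSE model1  :", np.sqrt(np.mean((model1 - obs) ** 2)))
print("RMSE model2  :", np.sqrt(np.mean((model2 - obs) ** 2)))
print("RMSE combined:", np.sqrt(np.mean((combined - obs) ** 2)))
```

The combined prediction beats both single models whenever neither candidate dominates across all predictor states, which is the situation the abstract describes under measurement error and structural misspecification.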
Biased interpretation and memory in children with varying levels of spider fear.
Klein, Anke M; Titulaer, Geraldine; Simons, Carlijn; Allart, Esther; de Gier, Erwin; Bögels, Susan M; Becker, Eni S; Rinck, Mike
2014-01-01
This study investigated multiple cognitive biases in children simultaneously, to investigate whether spider-fearful children display an interpretation bias, a recall bias, and source monitoring errors, and whether these biases are specific for spider-related materials. Furthermore, the independent ability of these biases to predict spider fear was investigated. A total of 121 children filled out the Spider Anxiety and Disgust Screening for Children (SADS-C), and they performed an interpretation task, a memory task, and a Behavioural Assessment Test (BAT). As expected, a specific interpretation bias was found: Spider-fearful children showed more negative interpretations of ambiguous spider-related scenarios, but not of other scenarios. We also found specific source monitoring errors: Spider-fearful children made more fear-related source monitoring errors for the spider-related scenarios, but not for the other scenarios. Only limited support was found for a recall bias. Finally, interpretation bias, recall bias, and source monitoring errors predicted unique variance components of spider fear.
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.; Marino, J. T., Jr.
1974-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. Bit error probabilities for non-optimum threshold detection systems were also investigated.
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.
1975-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. The bit error probabilities for non-optimum threshold detection systems were also investigated.
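The threshold optimization described in these two records can be sketched numerically: average the Poisson miss probability over a log-normal intensity, add the background false-alarm probability, and scan the count threshold. The signal, background, and scintillation parameters below are hypothetical.

```python
import numpy as np
from scipy import stats

Ks, Kb = 40.0, 5.0        # mean signal and background counts per slot (assumed)
sigma_chi2 = 0.2          # log-amplitude variance of the scintillation (assumed)

def bit_error_probability(threshold, n_mc=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # Normalized intensity I = exp(2*chi), chosen so that E[I] = 1.
    chi = rng.normal(-sigma_chi2, np.sqrt(sigma_chi2), n_mc)
    intensity = np.exp(2.0 * chi)
    # "Mark" (on): a miss occurs if the count falls below the threshold,
    # averaged over the scintillation distribution.
    p_miss = stats.poisson.cdf(threshold - 1, Ks * intensity + Kb).mean()
    # "Space" (off): a false alarm occurs if the background count reaches it.
    p_false = stats.poisson.sf(threshold - 1, Kb)
    return 0.5 * (p_miss + p_false)

thresholds = np.arange(5, 40)
pe = np.array([bit_error_probability(t) for t in thresholds])
best = thresholds[np.argmin(pe)]
print(f"optimum threshold ~ {best}, P(error) ~ {pe.min():.2e}")
```

Repeating the scan over a grid of signal strengths and log-amplitude variances reproduces the kind of dependence from which semi-empirical threshold relationships can be fitted.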
Optical phase-locked loop (OPLL) for free-space laser communications with heterodyne detection
NASA Technical Reports Server (NTRS)
Win, Moe Z.; Chen, Chien-Chung; Scholtz, Robert A.
1991-01-01
Several advantages of coherent free-space optical communications are outlined. Theoretical analysis is formulated for an OPLL disturbed by shot noise, modulation noise, and frequency noise consisting of a white component, a 1/f component, and a 1/f-squared component. Each of the noise components is characterized by its associated power spectral density. It is shown that the effect of modulation depends only on the ratio of loop bandwidth and data rate, and is negligible for an OPLL with loop bandwidth smaller than one fourth the data rate. Total phase error variance as a function of loop bandwidth is displayed for several values of carrier signal to noise ratio. Optimal loop bandwidth is also calculated as a function of carrier signal to noise ratio. An OPLL experiment is performed, where it is shown that the measured phase error variance closely matches the theoretical predictions.
Comment on Hoffman and Rovine (2007): SPSS MIXED can estimate models with heterogeneous variances.
Weaver, Bruce; Black, Ryan A
2015-06-01
Hoffman and Rovine (Behavior Research Methods, 39:101-117, 2007) have provided a very nice overview of how multilevel models can be useful to experimental psychologists. They included two illustrative examples and provided both SAS and SPSS commands for estimating the models they reported. However, upon examining the SPSS syntax for the models reported in their Table 3, we found no syntax for models 2B and 3B, both of which have heterogeneous error variances. Instead, there is syntax that estimates similar models with homogeneous error variances and a comment stating that SPSS does not allow heterogeneous errors. But that is not correct. We provide SPSS MIXED commands to estimate models 2B and 3B with heterogeneous error variances and obtain results nearly identical to those reported by Hoffman and Rovine in their Table 3. Therefore, contrary to the comment in Hoffman and Rovine's syntax file, SPSS MIXED can estimate models with heterogeneous error variances.
Abdollahi-Arpanahi, Rostam; Morota, Gota; Valente, Bruno D; Kranis, Andreas; Rosa, Guilherme J M; Gianola, Daniel
2016-02-03
Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80% of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. The contribution of each class of genomic regions to dominance variance was also considered. Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The dominance genetic variation explained by each class of genomic regions was similar across classes and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
Application of adaptive Kalman filter in vehicle laser Doppler velocimetry
NASA Astrophysics Data System (ADS)
Fan, Zhe; Sun, Qiao; Du, Lei; Bai, Jie; Liu, Jingyun
2018-03-01
Due to variations in road conditions and the motion characteristics of the vehicle, large root-mean-square (rms) errors and outliers can occur. Application of a Kalman filter in laser Doppler velocimetry (LDV) is important for improving velocity measurement accuracy. In this paper, the state-space model is built using the current statistical model. A strategy containing two steps is adopted to make the filter adaptive and robust. First, the acceleration variance is adaptively adjusted using the difference between the predicted and measured observations. Second, outliers are identified and the measurement noise variance is adjusted according to the orthogonality property of the innovation, reducing the impact of outliers. Laboratory rotating-table experiments show that the adaptive Kalman filter greatly reduces the rms error from 0.59 cm/s to 0.22 cm/s and eliminates all the outliers. Road experiments compared against a microwave radar show that the rms error of the LDV is 0.0218 m/s, which demonstrates that adaptive Kalman filtering is suitable for vehicle speed signal processing.
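A stripped-down sketch of an innovation-adaptive Kalman filter is given below. It uses a constant-velocity state model and a simple inflation rule for the measurement-noise variance when an innovation looks like an outlier, which is a simplification of the current-statistical-model and orthogonality-based scheme described in the abstract; all parameter values are hypothetical.

```python
import numpy as np

def adaptive_kf(z, dt=0.01, q=0.5, r0=0.05**2, gate=3.0):
    """Scalar-velocity adaptive Kalman filter (state: [velocity, acceleration])."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    x = np.array([z[0], 0.0])
    P = np.eye(2)
    out = []
    for zk in z:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Innovation (the measurement observes the first state component)
        nu = zk - x[0]
        S = P[0, 0] + r0
        # Outlier handling: inflate R when the innovation is implausibly large
        r = r0 if abs(nu) <= gate * np.sqrt(S) else r0 * nu**2 / S
        S = P[0, 0] + r
        K = P[:, 0] / S                  # Kalman gain
        x = x + K * nu
        P = P - np.outer(K, P[0, :])
        out.append(x[0])
    return np.array(out)

# Hypothetical test signal: smooth speed profile with noise and a few outliers.
rng = np.random.default_rng(0)
t = np.arange(0, 10, 0.01)
truth = 10 + 2 * np.sin(0.5 * t)
meas = truth + rng.normal(0, 0.06, t.size)
meas[::137] += 3.0                       # injected outliers
est = adaptive_kf(meas)
print("raw rms error     :", np.sqrt(np.mean((meas - truth) ** 2)))
print("filtered rms error:", np.sqrt(np.mean((est - truth) ** 2)))
```

The inflation rule downweights flagged measurements instead of discarding them, so the filter degrades gracefully when the outlier test occasionally misfires.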
NASA Technical Reports Server (NTRS)
Koster, Randal D.; Walker, Gregory K.; Mahanama, Sarith P.; Reichle, Rolf H.
2013-01-01
Offline simulations over the conterminous United States (CONUS) with a land surface model are used to address two issues relevant to the forecasting of large-scale seasonal streamflow: (i) the extent to which errors in soil moisture initialization degrade streamflow forecasts, and (ii) the extent to which a realistic increase in the spatial resolution of forecasted precipitation would improve streamflow forecasts. The addition of error to a soil moisture initialization field is found to lead to a nearly proportional reduction in streamflow forecast skill. The linearity of the response allows the determination of a lower bound for the increase in streamflow forecast skill achievable through improved soil moisture estimation, e.g., through satellite-based soil moisture measurements. An increase in the resolution of precipitation is found to have an impact on large-scale streamflow forecasts only when evaporation variance is significant relative to the precipitation variance. This condition is met only in the western half of the CONUS domain. Taken together, the two studies demonstrate the utility of a continental-scale land surface modeling system as a tool for addressing the science of hydrological prediction.
Modeling the subfilter scalar variance for large eddy simulation in forced isotropic turbulence
NASA Astrophysics Data System (ADS)
Cheminet, Adam; Blanquart, Guillaume
2011-11-01
Static and dynamic models for the subfilter scalar variance in homogeneous isotropic turbulence are investigated using direct numerical simulations (DNS) of a linearly forced passive scalar field. First, we introduce a new scalar forcing technique conditioned only on the scalar field, which allows the fluctuating scalar field to reach a statistically stationary state. Statistical properties, including second and third statistical moments, spectra, and probability density functions of the scalar field, have been analyzed. Using this technique, we performed constant-density and variable-density DNS of scalar mixing in isotropic turbulence. The results are used in an a priori study of scalar variance models. Emphasis is placed on further studying the dynamic model introduced by G. Balarac, H. Pitsch and V. Raman [Phys. Fluids 20, (2008)]. Scalar variance models based on Bedford and Yeo's expansion are accurate for small filter widths, but errors arise in the inertial subrange. Results suggest that a constant coefficient computed from an assumed Kolmogorov spectrum is often sufficient to predict the subfilter scalar variance.
Age-related variation in genetic control of height growth in Douglas-fir.
Namkoong, G; Usanis, R A; Silen, R R
1972-01-01
The development of genetic variances in height growth of Douglas-fir over a 53-year period is analyzed and found to fall into three periods. In the juvenile period, variances in environmental error increase logarithmically, genetic variance within populations exists at moderate levels, and variance among populations is low but increasing. In the early reproductive period, the response to environmental sources of error variance is restricted, genetic variance within populations disappears, and populational differences strongly emerge but do not increase as expected. In the later period, environmental error again increases rapidly, but genetic variance within populations does not reappear and population differences are maintained at about the same level as established in the early reproductive period. The change between the juvenile and early reproductive periods is perhaps associated with the onset of ecological dominance and significant allocations of energy to reproduction.
Evaluating and improving the representation of heteroscedastic errors in hydrological models
NASA Astrophysics Data System (ADS)
McInerney, D. J.; Thyer, M. A.; Kavetski, D.; Kuczera, G. A.
2013-12-01
Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic predictions. In particular, residual errors of hydrological models are often heteroscedastic, with large errors associated with high rainfall and runoff events. Recent studies have shown that using a weighted least squares (WLS) approach - where the magnitude of the residuals is assumed to be linearly proportional to the magnitude of the flow - captures some of this heteroscedasticity. In this study we explore a range of Bayesian approaches for improving the representation of heteroscedasticity in residual errors. We compare several improved formulations of the WLS approach, the well-known Box-Cox transformation and the more recent log-sinh transformation. Our results confirm that these approaches are able to stabilize the residual error variance, and that it is possible to improve the representation of heteroscedasticity compared with the linear WLS approach. We also find generally good performance of the Box-Cox and log-sinh transformations, although as indicated in earlier publications, the Box-Cox transform sometimes produces unrealistically large prediction limits. Our work explores the trade-offs between these different uncertainty characterization approaches, investigates how their performance varies across diverse catchments and models, and recommends practical approaches suitable for large-scale applications.
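The three error representations compared in studies of this kind can be sketched as simple residual-standardization schemes. The parameter values and synthetic flows below are illustrative only; a real application would infer the transformation parameters jointly with the hydrological model.

```python
import numpy as np

def wls_std(q_sim, a=0.1, b=0.2):
    """Weighted least squares: residual SD grows linearly with simulated flow."""
    return a + b * q_sim

def boxcox(q, lam=0.2):
    """Box-Cox transform; residuals are assumed homoscedastic in z-space."""
    return (q ** lam - 1.0) / lam

def log_sinh(q, a=1.0, b=0.1):
    """Log-sinh transform; also aims at a constant residual variance."""
    return np.log(np.sinh(a + b * q)) / b

# Hypothetical simulated and observed flows with multiplicative-like errors.
rng = np.random.default_rng(2)
q_sim = rng.gamma(2.0, 20.0, 1000)
q_obs = q_sim * np.exp(rng.normal(0, 0.25, q_sim.size))

# Standardized residuals under each scheme; a good scheme yields residuals
# whose spread no longer depends on the flow magnitude.
eta_wls = (q_obs - q_sim) / wls_std(q_sim)
eta_bc  = boxcox(q_obs) - boxcox(q_sim)
eta_ls  = log_sinh(q_obs) - log_sinh(q_sim)

for name, eta in [("WLS", eta_wls), ("Box-Cox", eta_bc), ("log-sinh", eta_ls)]:
    lo = eta[q_sim < np.median(q_sim)].std()
    hi = eta[q_sim >= np.median(q_sim)].std()
    print(f"{name:8s} residual SD: low flows {lo:.2f}, high flows {hi:.2f}")
```

Comparing the low-flow and high-flow residual spreads is the simplest diagnostic of whether a chosen scheme has actually stabilized the error variance.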
Evaluation of three lidar scanning strategies for turbulence measurements
NASA Astrophysics Data System (ADS)
Newman, J. F.; Klein, P. M.; Wharton, S.; Sathe, A.; Bonin, T. A.; Chilson, P. B.; Muschinski, A.
2015-11-01
Several errors occur when a traditional Doppler-beam swinging (DBS) or velocity-azimuth display (VAD) strategy is used to measure turbulence with a lidar. To mitigate some of these errors, a scanning strategy was recently developed which employs six beam positions to independently estimate the u, v, and w velocity variances and covariances. In order to assess the ability of these different scanning techniques to measure turbulence, a Halo scanning lidar, WindCube v2 pulsed lidar and ZephIR continuous wave lidar were deployed at field sites in Oklahoma and Colorado with collocated sonic anemometers. Results indicate that the six-beam strategy mitigates some of the errors caused by VAD and DBS scans, but the strategy is strongly affected by errors in the variance measured at the different beam positions. The ZephIR and WindCube lidars overestimated horizontal variance values by over 60 % under unstable conditions as a result of variance contamination, where additional variance components contaminate the true value of the variance. A correction method was developed for the WindCube lidar that uses variance calculated from the vertical beam position to reduce variance contamination in the u and v variance components. The correction method reduced WindCube variance estimates by over 20 % at both the Oklahoma and Colorado sites under unstable conditions, when variance contamination is largest. This correction method can be easily applied to other lidars that contain a vertical beam position and is a promising method for accurately estimating turbulence with commercially available lidars.
Evaluation of three lidar scanning strategies for turbulence measurements
NASA Astrophysics Data System (ADS)
Newman, Jennifer F.; Klein, Petra M.; Wharton, Sonia; Sathe, Ameya; Bonin, Timothy A.; Chilson, Phillip B.; Muschinski, Andreas
2016-05-01
Several errors occur when a traditional Doppler beam swinging (DBS) or velocity-azimuth display (VAD) strategy is used to measure turbulence with a lidar. To mitigate some of these errors, a scanning strategy was recently developed which employs six beam positions to independently estimate the u, v, and w velocity variances and covariances. In order to assess the ability of these different scanning techniques to measure turbulence, a Halo scanning lidar, WindCube v2 pulsed lidar, and ZephIR continuous wave lidar were deployed at field sites in Oklahoma and Colorado with collocated sonic anemometers. Results indicate that the six-beam strategy mitigates some of the errors caused by VAD and DBS scans, but the strategy is strongly affected by errors in the variance measured at the different beam positions. The ZephIR and WindCube lidars overestimated horizontal variance values by over 60 % under unstable conditions as a result of variance contamination, where additional variance components contaminate the true value of the variance. A correction method was developed for the WindCube lidar that uses variance calculated from the vertical beam position to reduce variance contamination in the u and v variance components. The correction method reduced WindCube variance estimates by over 20 % at both the Oklahoma and Colorado sites under unstable conditions, when variance contamination is largest. This correction method can be easily applied to other lidars that contain a vertical beam position and is a promising method for accurately estimating turbulence with commercially available lidars.
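The core of the six-beam idea is a small linear system relating radial-velocity variances to the velocity covariance matrix. The sketch below assumes a 45-degree elevation angle with five azimuths plus a vertical beam, which may differ from the exact configuration used in the paper, and uses a synthetic covariance matrix to show the inversion.

```python
import numpy as np

# The radial-velocity variance along beam i is n_i' Sigma n_i, where n_i is the
# beam unit vector and Sigma the velocity covariance matrix. With six suitably
# chosen beams, the six unknowns (u'u', v'v', w'w', u'v', u'w', v'w') can be
# solved for directly.
elev = np.deg2rad(45.0)
azimuths = np.deg2rad([0.0, 72.0, 144.0, 216.0, 288.0])

beams = [np.array([np.cos(elev) * np.sin(az),
                   np.cos(elev) * np.cos(az),
                   np.sin(elev)]) for az in azimuths]
beams.append(np.array([0.0, 0.0, 1.0]))           # vertical beam

def design_row(n):
    nx, ny, nz = n
    return [nx**2, ny**2, nz**2, 2*nx*ny, 2*nx*nz, 2*ny*nz]

A = np.array([design_row(n) for n in beams])      # 6 x 6 geometry matrix

# Synthetic "true" covariance matrix (units m^2 s^-2, hypothetical values).
Sigma = np.array([[ 1.00,  0.10, -0.15],
                  [ 0.10,  0.80, -0.05],
                  [-0.15, -0.05,  0.30]])

# Radial-velocity variances that the lidar would measure along each beam.
radial_var = np.array([n @ Sigma @ n for n in beams])

# Invert the linear system to recover the six covariance components.
uu, vv, ww, uv, uw, vw = np.linalg.solve(A, radial_var)
print("recovered u'u', v'v', w'w':", uu, vv, ww)
```

In practice the measured radial variances also carry instrument noise and volume-averaging effects, which is why the retrieved components remain sensitive to per-beam variance errors, as the abstract notes.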
Shi, Yun; Xu, Peiliang; Peng, Junhuan; Shi, Chuang; Liu, Jingnan
2014-01-01
Modern observation technology has verified that measurement errors can be proportional to the true values of measurements, as in GPS and VLBI baselines and LiDAR. Observational models of this type are called multiplicative error models. This paper extends the work of Xu and Shimada, published in 2000, on multiplicative error models to the analytical error analysis of quantities of practical interest and to estimates of the variance of unit weight. We analytically derive the variance-covariance matrices of the three least squares (LS) adjustments, the adjusted measurements and the corrections of measurements in multiplicative error models. For quality evaluation, we construct five estimators for the variance of unit weight in association with the three LS adjustment methods. Although LiDAR measurements are contaminated with multiplicative random errors, LiDAR-based digital elevation models (DEM) have been constructed as if they were of additive random errors. We simulate a model landslide, which is assumed to be surveyed with LiDAR, and investigate the effect of LiDAR-type multiplicative error measurements on DEM construction and its effect on the estimate of landslide mass volume from the constructed DEM. PMID:24434880
Non-additive genetic variation in growth, carcass and fertility traits of beef cattle.
Bolormaa, Sunduimijid; Pryce, Jennie E; Zhang, Yuandan; Reverter, Antonio; Barendse, William; Hayes, Ben J; Goddard, Michael E
2015-04-02
A better understanding of non-additive variance could lead to increased knowledge on the genetic control and physiology of quantitative traits, and to improved prediction of the genetic value and phenotype of individuals. Genome-wide panels of single nucleotide polymorphisms (SNPs) have been mainly used to map additive effects for quantitative traits, but they can also be used to investigate non-additive effects. We estimated dominance and epistatic effects of SNPs on various traits in beef cattle and the variance explained by dominance, and quantified the increase in accuracy of phenotype prediction by including dominance deviations in its estimation. Genotype data (729 068 real or imputed SNPs) and phenotypes on up to 16 traits of 10 191 individuals from Bos taurus, Bos indicus and composite breeds were used. A genome-wide association study was performed by fitting the additive and dominance effects of single SNPs. The dominance variance was estimated by fitting a dominance relationship matrix constructed from the 729 068 SNPs. The accuracy of predicted phenotypic values was evaluated by best linear unbiased prediction using the additive and dominance relationship matrices. Epistatic interactions (additive × additive) were tested between each of the 28 SNPs that are known to have additive effects on multiple traits, and each of the other remaining 729 067 SNPs. The number of significant dominance effects was greater than expected by chance and most of them were in the direction that is presumed to increase fitness and in the opposite direction to inbreeding depression. Estimates of dominance variance explained by SNPs varied widely between traits, but had large standard errors. The median dominance variance across the 16 traits was equal to 5% of the phenotypic variance. Including a dominance deviation in the prediction did not significantly increase its accuracy for any of the phenotypes. The number of additive × additive epistatic effects that were statistically significant was greater than expected by chance. Significant dominance and epistatic effects occur for growth, carcass and fertility traits in beef cattle but they are difficult to estimate precisely and including them in phenotype prediction does not increase its accuracy.
Smooth empirical Bayes estimation of observation error variances in linear systems
NASA Technical Reports Server (NTRS)
Martz, H. F., Jr.; Lian, M. W.
1972-01-01
A smooth empirical Bayes estimator was developed for estimating the unknown random scale component of each of a set of observation error variances. It is shown that the estimator possesses a smaller average squared error loss than other estimators for a discrete time linear system.
García-González, Miguel A; Fernández-Chimeno, Mireya; Ramos-Castro, Juan
2009-02-01
An analysis of the errors due to the finite resolution of RR time series in the estimation of the approximate entropy (ApEn) is described. The quantification errors in the discrete RR time series produce considerable errors in the ApEn estimation (bias and variance) when the signal variability or the sampling frequency is low. Similar errors can be found in indices related to the quantification of recurrence plots. An easy way to calculate a figure of merit [the signal to resolution of the neighborhood ratio (SRN)] is proposed in order to predict when the bias in the indices could be high. When SRN is close to an integer value n, the bias is higher than when near n - 1/2 or n + 1/2. Moreover, if SRN is close to an integer value, the lower this value, the greater the bias is.
Ménard, Richard; Deshaies-Jacques, Martin; Gasset, Nicolas
2016-09-01
An objective analysis is one of the main components of data assimilation. By combining observations with the output of a predictive model we combine the best features of each source of information: the complete spatial and temporal coverage provided by models, with a close representation of the truth provided by observations. The process of combining observations with a model output is called an analysis. Producing an analysis requires knowledge of the observation and model error variances, as well as the spatial correlation of the model error. This paper is devoted to the development of methods for estimating these error variances and the characteristic length-scale of the model error correlation for operational use in the Canadian objective analysis system. We first argue in favor of using compactly supported correlation functions, and then introduce three estimation methods: the Hollingsworth-Lönnberg (HL) method in local and global form, the maximum likelihood method (ML), and the χ² diagnostic method. We perform one-dimensional (1D) simulation studies where the error variance and true correlation length are known, and perform an estimation of both error variances and correlation length where both are non-uniform. We show that a local version of the HL method can accurately capture the error variances and correlation length at each observation site, provided that the spatial variability is not too strong. However, the operational objective analysis requires only a single, globally valid correlation length. We examine whether any statistic of the local HL correlation lengths could be a useful estimate, or whether other global estimation methods such as the global HL, ML, or χ² methods should be used. We found, in both the 1D simulation and using real data, that the ML method is able to capture physically significant aspects of the correlation length, while most other estimates give unphysical and larger length-scale values. This paper describes a proposed improvement of the objective analysis of surface pollutants at Environment and Climate Change Canada (formerly known as Environment Canada). Objective analyses are essentially surface maps of air pollutants that are obtained by combining observations with an air quality model output, and are thought to provide a complete and more accurate representation of the air quality. The highlight of this study is an analysis of methods to estimate the model (or background) error correlation length-scale. The error statistics are an important and critical component of the analysis scheme.
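A minimal global Hollingsworth-Lönnberg fit can be sketched as follows: innovation covariances are computed for all station pairs, a correlation model is fitted to the non-zero separations, and the observation error variance is read off as the excess of the zero-separation variance over the fitted intercept. The exponential model and all numbers below are stand-ins, not the compactly supported function or data used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def background_cov(r, sigma2_b, L):
    return sigma2_b * np.exp(-r / L)              # simple exponential stand-in

def hl_fit(innov, xy):
    """innov: (n_times, n_stations) innovations;  xy: (n_stations, 2) coords."""
    d = innov - innov.mean(axis=0)
    C = d.T @ d / d.shape[0]                      # station-pair covariances
    r = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

    iu = np.triu_indices_from(C, k=1)             # off-diagonal pairs only
    popt, _ = curve_fit(background_cov, r[iu], C[iu],
                        p0=[C[iu].max(), np.median(r[iu])])
    sigma2_b, L = popt
    sigma2_o = np.mean(np.diag(C)) - sigma2_b     # nugget = obs error variance
    return sigma2_b, sigma2_o, L

# Synthetic test with known truth.
rng = np.random.default_rng(5)
n_sta, n_t = 60, 2000
xy = rng.uniform(0, 1000, (n_sta, 2))             # km
r = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
B = 4.0 * np.exp(-r / 150.0)                      # true background covariance
bg = rng.multivariate_normal(np.zeros(n_sta), B, size=n_t)
obs_err = rng.normal(0, np.sqrt(2.0), (n_t, n_sta))
sigma2_b, sigma2_o, L = hl_fit(bg + obs_err, xy)
print(f"background var ~ {sigma2_b:.2f}, obs var ~ {sigma2_o:.2f}, L ~ {L:.0f} km")
```

A local variant would repeat the same fit using only the pairs involving one station at a time, which is the distinction between the local and global HL forms discussed above.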
An approach to the analysis of performance of quasi-optimum digital phase-locked loops.
NASA Technical Reports Server (NTRS)
Polk, D. R.; Gupta, S. C.
1973-01-01
An approach to the analysis of performance of quasi-optimum digital phase-locked loops (DPLL's) is presented. An expression for the characteristic function of the prior error in the state estimate is derived, and from this expression an infinite dimensional equation for the prior error variance is obtained. The prior error-variance equation is a function of the communication system model and the DPLL gain and is independent of the method used to derive the DPLL gain. Two approximations are discussed for reducing the prior error-variance equation to finite dimension. The effectiveness of one approximation in analyzing DPLL performance is studied.
Experimental cosmic statistics - I. Variance
NASA Astrophysics Data System (ADS)
Colombi, Stéphane; Szapudi, István; Jenkins, Adrian; Colberg, Jörg
2000-04-01
Counts-in-cells are measured in the τCDM Virgo Hubble Volume simulation. This large N-body experiment has 10⁹ particles in a cubic box of size 2000 h⁻¹ Mpc. The unprecedented combination of size and resolution allows, for the first time, a realistic numerical analysis of the cosmic errors and cosmic correlations of statistics related to counts-in-cells measurements, such as the probability distribution function P_N itself, its factorial moments F_k and the related cumulants ψ and S_N. These statistics are extracted from the whole simulation cube, as well as from 4096 subcubes of size 125 h⁻¹ Mpc, each representing a virtual random realization of the local universe. The measurements and their scatter over the subvolumes are compared to the theoretical predictions of Colombi, Bouchet & Schaeffer for P_0, and of Szapudi & Colombi and Szapudi, Colombi & Bernardeau for the factorial moments and the cumulants. The general behaviour of experimental variance and cross-correlations as functions of scale and order is well described by theoretical predictions, with a few per cent accuracy in the weakly non-linear regime for the cosmic error on factorial moments. On highly non-linear scales, however, all variants of the hierarchical model used by SC and SCB to describe clustering appear to become increasingly approximate, which leads to a slight overestimation of the error, by about a factor of two in the worst case. Because of the needed supplementary perturbative approach, the theory is less accurate for non-linear estimators, such as cumulants, than for factorial moments. The cosmic bias is evaluated as well, and, in agreement with SCB, is found to be insignificant compared with the cosmic variance in all regimes investigated. While higher order statistics were previously evaluated in several simulations, this work presents textbook quality measurements of S_N, 3 ≤ N ≤ 10, in an unprecedented dynamic range of 0.05 ≲ ψ ≲ 50. In the weakly non-linear regime the results confirm previous findings and agree remarkably well with perturbation theory predictions including the one-loop corrections based on spherical collapse by Fosalba & Gaztañaga.
Knopman, Debra S.; Voss, Clifford I.
1987-01-01
The spatial and temporal variability of sensitivities has a significant impact on parameter estimation and sampling design for studies of solute transport in porous media. Physical insight into the behavior of sensitivities is offered through an analysis of analytically derived sensitivities for the one-dimensional form of the advection-dispersion equation. When parameters are estimated in regression models of one-dimensional transport, the spatial and temporal variability in sensitivities influences variance and covariance of parameter estimates. Several principles account for the observed influence of sensitivities on parameter uncertainty. (1) Information about a physical parameter may be most accurately gained at points in space and time with a high sensitivity to the parameter. (2) As the distance of observation points from the upstream boundary increases, maximum sensitivity to velocity during passage of the solute front increases and the consequent estimate of velocity tends to have lower variance. (3) The frequency of sampling must be “in phase” with the S shape of the dispersion sensitivity curve to yield the most information on dispersion. (4) The sensitivity to the dispersion coefficient is usually at least an order of magnitude less than the sensitivity to velocity. (5) The assumed probability distribution of random error in observations of solute concentration determines the form of the sensitivities. (6) If variance in random error in observations is large, trends in sensitivities of observation points may be obscured by noise and thus have limited value in predicting variance in parameter estimates among designs. (7) Designs that minimize the variance of one parameter may not necessarily minimize the variance of other parameters. (8) The time and space interval over which an observation point is sensitive to a given parameter depends on the actual values of the parameters in the underlying physical system.
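To make the space-time behaviour of such sensitivities concrete, the short Python sketch below evaluates dC/dv and dC/dD by central finite differences on the leading term of the Ogata-Banks step-input solution; the parameter values, observation points, and use of finite differences (rather than closed-form derivatives) are illustrative assumptions, not the authors' setup.

import numpy as np
from scipy.special import erfc

def conc(x, t, v, D, C0=1.0):
    """Leading term of the Ogata-Banks solution for a step input at x = 0."""
    return 0.5 * C0 * erfc((x - v * t) / (2.0 * np.sqrt(D * t)))

def sensitivity(f, x, t, v, D, which, rel=1e-4):
    """Central-difference sensitivity dC/dp for p = 'v' or 'D'."""
    h = rel * (v if which == "v" else D)
    if which == "v":
        up, dn = f(x, t, v + h, D), f(x, t, v - h, D)
    else:
        up, dn = f(x, t, v, D + h), f(x, t, v, D - h)
    return (up - dn) / (2.0 * h)

x_obs = np.array([10.0, 50.0, 100.0])   # observation points [m], increasing distance from boundary
t = np.linspace(1.0, 200.0, 400)        # observation times [d]
v, D = 1.0, 0.5                         # illustrative velocity [m/d] and dispersion [m^2/d]
for xi in x_obs:
    sv = sensitivity(conc, xi, t, v, D, "v")
    sD = sensitivity(conc, xi, t, v, D, "D")
    # Sensitivity to D is typically much smaller than sensitivity to v (cf. principle 4 above).
    print(f"x = {xi:5.1f} m: max |dC/dv| = {np.abs(sv).max():.3f}, max |dC/dD| = {np.abs(sD).max():.3f}")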
NASA Technical Reports Server (NTRS)
Deloach, Richard; Obara, Clifford J.; Goodman, Wesley L.
2012-01-01
This paper documents a check standard wind tunnel test conducted in the Langley 0.3-Meter Transonic Cryogenic Tunnel (0.3M TCT) that was designed and analyzed using the Modern Design of Experiments (MDOE). The test was designed to partition the unexplained variance of typical wind tunnel data samples into two constituent components, one attributable to ordinary random error, and one attributable to systematic error induced by covariate effects. Covariate effects in wind tunnel testing are discussed, with examples. The impact of systematic (non-random) unexplained variance on the statistical independence of sequential measurements is reviewed. The corresponding correlation among experimental errors is discussed, as is the impact of such correlation on experimental results generally. The specific experiment documented herein was organized as a formal test for the presence of unexplained variance in representative samples of wind tunnel data, in order to quantify the frequency with which such systematic error was detected, and its magnitude relative to ordinary random error. Levels of systematic and random error reported here are representative of those quantified in other facilities, as cited in the references.
Updating the Standard Spatial Observer for Contrast Detection
NASA Technical Reports Server (NTRS)
Ahumada, Albert J.; Watson, Andrew B.
2011-01-01
Watson and Ahumada (2005) constructed a Standard Spatial Observer (SSO) model for foveal luminance contrast signal detection based on the ModelFest data (Watson, 1999). Here we propose two changes to the model, dropping the oblique effect from the CSF and using the cone density data of Curcio et al. (1990) to estimate the variation of sensitivity with eccentricity. Dropping the complex images, and using medians to exclude outlier data points, the SSO model now accounts for essentially all the predictable variance in the data, with an RMS prediction error of only 0.67 dB.
Statistical modelling of thermal annealing of fission tracks in apatite
NASA Astrophysics Data System (ADS)
Laslett, G. M.; Galbraith, R. F.
1996-12-01
We develop an improved methodology for modelling the relationship between mean track length, temperature, and time in fission track annealing experiments. We consider "fanning Arrhenius" models, in which contours of constant mean length on an Arrhenius plot are straight lines meeting at a common point. Features of our approach are explicit use of subject matter knowledge, treating mean length as the response variable, modelling of the mean-variance relationship with two components of variance, improved modelling of the control sample, and using information from experiments in which no tracks are seen. This approach overcomes several weaknesses in previous models and provides a robust six parameter model that is widely applicable. Estimation is via direct maximum likelihood which can be implemented using a standard numerical optimisation package. Because the model is highly nonlinear, some reparameterisations are needed to achieve stable estimation and calculation of precisions. Experience suggests that precisions are more convincingly estimated from profile log-likelihood functions than from the information matrix. We apply our method to the B-5 and Sr fluorapatite data of Crowley et al. (1991) and obtain well-fitting models in both cases. For the B-5 fluorapatite, our model exhibits less fanning than that of Crowley et al. (1991), although fitted mean values above 12 μm are fairly similar. However, predictions can be different, particularly for heavy annealing at geological time scales, where our model is less retentive. In addition, the refined error structure of our model results in tighter prediction errors, and has components of error that are easier to verify or modify. For the Sr fluorapatite, our fitted model for mean lengths does not differ greatly from that of Crowley et al. (1991), but our error structure is quite different.
Non-Gaussian Distribution of DNA Barcode Extension In Nanochannels Using High-throughput Imaging
NASA Astrophysics Data System (ADS)
Sheats, Julian; Reinhart, Wesley; Reifenberger, Jeff; Gupta, Damini; Muralidhar, Abhiram; Cao, Han; Dorfman, Kevin
2015-03-01
We present experimental data for the extension of internal segments of highly confined DNA using a high-throughput experimental setup. Barcode-labeled E. coli genomic DNA molecules were imaged at a high areal density in square nanochannels with sizes ranging from 40 nm to 51 nm in width. Over 25,000 molecules were used to obtain more than 1,000,000 measurements for genomic distances between 2,500 bp and 100,000 bp. The distribution of extensions has positive excess kurtosis and is skewed to the left due to weak backfolding in the channel. As a result, the two Odijk theories for the chain extension and variance bracket the experimental data. We also compared the data to predictions of a harmonic approximation for the confinement free energy and showed that it produces a substantial error in the variance. These results suggest an inherent error associated with any statistical analysis of barcoded DNA that relies on harmonic models for chain extension.
Stochastic models for inferring genetic regulation from microarray gene expression data.
Tian, Tianhai
2010-03-01
Microarray expression profiles are inherently noisy and many different sources of variation exist in microarray experiments. It is still a significant challenge to develop stochastic models that represent the noise in microarray expression profiles, which has a profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of the stochastic models and the parameters of an error model describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in terms of the hybridization intensity, and the order of the monomial depends on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also established a general method to develop stochastic models from experimental information. 2009 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Thomas, Philipp; Straube, Arthur V.; Grima, Ramon
2011-11-01
It is commonly believed that, whenever timescale separation holds, the predictions of reduced chemical master equations obtained using the stochastic quasi-steady-state approximation are in very good agreement with the predictions of the full master equations. We use the linear noise approximation to obtain a simple formula for the relative error between the predictions of the two master equations for the Michaelis-Menten reaction with substrate input. The reduced approach is predicted to overestimate the variance of the substrate concentration fluctuations by as much as 30%. The theoretical results are validated by stochastic simulations using experimental parameter values for enzymes involved in proteolysis, gluconeogenesis, and fermentation.
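The comparison described above can be reproduced in spirit with a short stochastic simulation. The sketch below runs Gillespie's algorithm on the full Michaelis-Menten network with substrate input and on the QSSA-reduced birth-death model, and compares the stationary substrate variance from each; all rate constants, copy numbers and run lengths are illustrative choices, not the parameter values used in the paper.

# A rough sketch of the full-vs-reduced comparison using Gillespie's SSA.
import numpy as np

rng = np.random.default_rng(1)

def ssa(x0, stoich, propensities, t_end, t_burn):
    """Generic Gillespie SSA; returns time-averaged mean and variance of species 0."""
    t, x = 0.0, np.array(x0, dtype=float)
    samples = []
    while t < t_end:
        a = propensities(x)
        a0 = a.sum()
        if a0 <= 0.0:
            break
        tau = rng.exponential(1.0 / a0)
        t += tau
        if t > t_burn:
            samples.append((x[0], tau))          # current state weighted by its holding time
        j = rng.choice(len(a), p=a / a0)
        x += stoich[j]
    vals = np.array([s[0] for s in samples])
    w = np.array([s[1] for s in samples])
    mean = np.average(vals, weights=w)
    return mean, np.average((vals - mean) ** 2, weights=w)

k_in, k1, km1, k2, E_tot = 5.0, 1.0, 4.0, 1.0, 10.0   # illustrative rate constants
Km = (km1 + k2) / k1

# Full model: species (S, E, C); reactions: input, binding, unbinding, catalysis.
stoich_full = np.array([[+1, 0, 0], [-1, -1, +1], [+1, +1, -1], [0, +1, -1]])
prop_full = lambda x: np.array([k_in, k1 * x[0] * x[1], km1 * x[2], k2 * x[2]])

# Reduced (QSSA) model: species (S,); input plus Michaelis-Menten degradation.
stoich_red = np.array([[+1], [-1]])
prop_red = lambda x: np.array([k_in, k2 * E_tot * x[0] / (Km + x[0])])

print("full model   (mean, var of S):", ssa([0, E_tot, 0], stoich_full, prop_full, 1000.0, 100.0))
print("reduced model(mean, var of S):", ssa([0], stoich_red, prop_red, 1000.0, 100.0))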
Three tests and three corrections: Comment on Koen and Yonelinas (2010)
Jang, Yoonhee; Mickes, Laura; Wixted, John T.
2012-01-01
The slope of the z-transformed receiver-operating characteristic (zROC) in recognition memory experiments is usually less than 1, which has long been interpreted to mean that the variance of the target distribution is greater than the variance of the lure distribution. The greater variance of the target distribution could arise because the different items on a list receive different increments in memory strength during study (the “encoding variability” hypothesis). In a test of that interpretation, J. Koen and A. Yonelinas (2010, K&Y) attempted to further increase encoding variability to see if it would further decrease the slope of the zROC. To do so, they presented items on a list for two different durations and then mixed the weak and strong targets together. After performing three tests on the mixed-strength data, K&Y concluded that encoding variability does not explain why the slope of the zROC is typically less than one. However, we show that their tests have no bearing on the encoding variability account. Instead, they bear on the mixture-UVSD model that corresponds to their experimental design. On the surface, the results reported by K&Y appear to be inconsistent with the predictions of the mixture-UVSD model (though they were taken to be inconsistent with the predictions of the encoding variability hypothesis). However, all three of the tests they performed contained errors. When those errors are corrected, the same three tests show that their data support, rather than contradict, the mixture-UVSD model (but they still have no bearing on the encoding variability hypothesis). PMID:22390323
Nazemi, S Majid; Amini, Morteza; Kontulainen, Saija A; Milner, Jaques S; Holdsworth, David W; Masri, Bassam A; Wilson, David R; Johnston, James D
2017-01-01
Quantitative computed tomography-based subject-specific finite element modeling has potential to clarify the role of subchondral bone alterations in knee osteoarthritis initiation, progression, and pain. However, it is unclear what density-modulus equation(s) should be applied to subchondral cortical and subchondral trabecular bone when constructing finite element models of the tibia. Using a novel approach applying neural networks, optimization, and back-calculation against in situ experimental testing results, the objective of this study was to identify subchondral-specific equations that optimized finite element predictions of local structural stiffness at the proximal tibial subchondral surface. Thirteen proximal tibial compartments were imaged via quantitative computed tomography. Imaged bone mineral density was converted to elastic moduli using multiple density-modulus equations (93 total variations) and then mapped to corresponding finite element models. For each variation, the root mean squared error was calculated between the finite element prediction and the in situ measured stiffness at 47 indentation sites. The resulting errors were used to train an artificial neural network, which provided an unlimited number of model variations, with corresponding error, for predicting stiffness at the subchondral bone surface. Nelder-Mead optimization was used to identify optimum density-modulus equations for predicting stiffness. Finite element modeling predicted 81% of experimental stiffness variance (with 10.5% error) using optimized equations for subchondral cortical and trabecular bone differentiated at a density of 0.5 g/cm³. In comparison with published density-modulus relationships, the optimized equations offered improved predictions of local subchondral structural stiffness. Further research is needed with anisotropy inclusion, a smaller voxel size, and de-blurring algorithms to improve predictions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Impact of source collinearity in simulated PM2.5 data on the PMF receptor model solution
NASA Astrophysics Data System (ADS)
Habre, Rima; Coull, Brent; Koutrakis, Petros
2011-12-01
Positive Matrix Factorization (PMF) is a factor analytic model used to identify particle sources and to estimate their contributions to PM2.5 concentrations observed at receptor sites. Collinearity in source contributions due to meteorological conditions introduces uncertainty in the PMF solution. We simulated datasets of speciated PM2.5 concentrations associated with three ambient particle sources: "Motor Vehicle" (MV), "Sodium Chloride" (NaCl), and "Sulfur" (S), and we varied the correlation structure between their mass contributions to simulate collinearity. We analyzed the datasets in PMF using the ME-2 multilinear engine. The Pearson correlation coefficients between the simulated and PMF-predicted source contributions and profiles are denoted by "G correlation" and "F correlation", respectively. In sensitivity analyses, we examined how the means or variances of the source contributions affected the stability of the PMF solution with collinearity. The % errors in predicting the average source contributions were 23, 80 and 23% for MV, NaCl, and S, respectively. On average, the NaCl contribution was overestimated, while MV and S contributions were underestimated. The ability of PMF to predict the contributions and profiles of the three sources deteriorated significantly as collinearity in their contributions increased. When the mean of NaCl or variance of NaCl and MV source contributions was increased, the deterioration in G correlation with increasing collinearity became less significant, and the ability of PMF to predict the NaCl and MV loading profiles improved. When the three factor profiles were simulated to share more elements, the decrease in G and F correlations became non-significant. Our findings agree with previous simulation studies reporting that correlated sources are predicted with higher error and bias. Consequently, the power to detect significant concentration-response estimates in health effect analyses weakens.
Evaluation of TRMM Ground-Validation Radar-Rain Errors Using Rain Gauge Measurements
NASA Technical Reports Server (NTRS)
Wang, Jianxin; Wolff, David B.
2009-01-01
Ground-validation (GV) radar-rain products are often utilized for validation of the Tropical Rainfall Measuring Mission (TRMM) space-based rain estimates, and hence, quantitative evaluation of the GV radar-rain product error characteristics is vital. This study uses quality-controlled gauge data to compare with TRMM GV radar rain rates in an effort to provide such error characteristics. The results show that significant differences of concurrent radar-gauge rain rates exist at various time scales ranging from 5 min to 1 day, despite lower overall long-term bias. However, the differences between the radar area-averaged rain rates and gauge point rain rates cannot be explained as due to radar error only. The error variance separation method is adapted to partition the variance of radar-gauge differences into the gauge area-point error variance and the radar rain estimation error variance. The results provide relatively reliable quantitative uncertainty evaluation of TRMM GV radar rain estimates at various time scales, and are helpful to better understand the differences between measured radar and gauge rain rates. It is envisaged that this study will contribute to better utilization of GV radar rain products to validate versatile space-based rain estimates from TRMM, as well as the proposed Global Precipitation Measurement, and other satellites.
Influence of outliers on accuracy estimation in genomic prediction in plant breeding.
Estaghvirou, Sidi Boubacar Ould; Ogutu, Joseph O; Piepho, Hans-Peter
2014-10-01
Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased. Copyright © 2014 Ould Estaghvirou et al.
Gurdak, Jason J.; Qi, Sharon L.; Geisler, Michael L.
2009-01-01
The U.S. Geological Survey Raster Error Propagation Tool (REPTool) is a custom tool for use with the Environmental System Research Institute (ESRI) ArcGIS Desktop application to estimate error propagation and prediction uncertainty in raster processing operations and geospatial modeling. REPTool is designed to introduce concepts of error and uncertainty in geospatial data and modeling and provide users of ArcGIS Desktop a geoprocessing tool and methodology to consider how error affects geospatial model output. Similar to other geoprocessing tools available in ArcGIS Desktop, REPTool can be run from a dialog window, from the ArcMap command line, or from a Python script. REPTool consists of public-domain, Python-based packages that implement Latin Hypercube Sampling within a probabilistic framework to track error propagation in geospatial models and quantitatively estimate the uncertainty of the model output. Users may specify error for each input raster or model coefficient represented in the geospatial model. The error for the input rasters may be specified as either spatially invariant or spatially variable across the spatial domain. Users may specify model output as a distribution of uncertainty for each raster cell. REPTool uses the Relative Variance Contribution method to quantify the relative error contribution from the two primary components in the geospatial model - errors in the model input data and coefficients of the model variables. REPTool is appropriate for many types of geospatial processing operations, modeling applications, and related research questions, including applications that consider spatially invariant or spatially variable error in geospatial data.
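As a rough illustration of the Latin-hypercube error-propagation workflow described above (this is a generic Python sketch, not REPTool itself; the two-raster linear model, error standard deviations, and coefficient values are invented for the example), per-cell output variance and relative variance contributions can be estimated as follows.

# Simplified LHS error propagation for a stand-in geospatial model.
import numpy as np
from scipy.stats import qmc, norm

rng = np.random.default_rng(2)
nrow, ncol, n_real = 50, 50, 200
r1 = rng.uniform(0.0, 10.0, (nrow, ncol))        # input raster 1 (arbitrary units)
r2 = rng.uniform(0.0, 5.0, (nrow, ncol))         # input raster 2
sd1, sd2, coef = 1.0, 0.5, 2.0                   # assumed error SDs and model coefficient

model = lambda a, b: coef * a + 0.3 * b          # stand-in geospatial model

def propagate(perturb1, perturb2):
    """Run the model over LHS realizations of the input errors; return per-cell variance."""
    lhs = qmc.LatinHypercube(d=2, seed=3).random(n_real)
    eps = norm.ppf(lhs)                          # two standard-normal LHS columns
    out = np.empty((n_real, nrow, ncol))
    for k in range(n_real):
        # Errors here are spatially invariant: one draw per realization for each raster.
        e1 = sd1 * eps[k, 0] if perturb1 else 0.0
        e2 = sd2 * eps[k, 1] if perturb2 else 0.0
        out[k] = model(r1 + e1, r2 + e2)
    return out.var(axis=0)

var_total = propagate(True, True)
var_r1 = propagate(True, False)                  # relative variance contribution of raster 1
var_r2 = propagate(False, True)                  # relative variance contribution of raster 2
print("mean output variance:", var_total.mean())
print("share from raster 1 :", (var_r1 / var_total).mean())
print("share from raster 2 :", (var_r2 / var_total).mean())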
The balanced mind: the variability of task-unrelated thoughts predicts error monitoring
Allen, Micah; Smallwood, Jonathan; Christensen, Joanna; Gramm, Daniel; Rasmussen, Beinta; Jensen, Christian Gaden; Roepstorff, Andreas; Lutz, Antoine
2013-01-01
Self-generated thoughts unrelated to ongoing activities, also known as “mind-wandering,” make up a substantial portion of our daily lives. Reports of such task-unrelated thoughts (TUTs) predict both poor performance on demanding cognitive tasks and blood-oxygen-level-dependent (BOLD) activity in the default mode network (DMN). However, recent findings suggest that TUTs and the DMN can also facilitate metacognitive abilities and related behaviors. To further understand these relationships, we examined the influence of subjective intensity, ruminative quality, and variability of mind-wandering on response inhibition and monitoring, using the Error Awareness Task (EAT). We expected to replicate links between TUT and reduced inhibition, and explored whether variance in TUT would predict improved error monitoring, reflecting a capacity to balance between internal and external cognition. By analyzing BOLD responses to subjective probes and the EAT, we dissociated contributions of the DMN, executive, and salience networks to task performance. While both response inhibition and online TUT ratings modulated BOLD activity in the medial prefrontal cortex (mPFC) of the DMN, the former recruited a more dorsal area, implying functional segregation. We further found that individual differences in mean TUTs strongly predicted EAT stop accuracy, while TUT variability specifically predicted levels of error awareness. Interestingly, we also observed co-activation of salience and default mode regions during error awareness, supporting a link between monitoring and TUTs. Altogether our results suggest that although TUT is detrimental to task performance, fluctuations in attention between self-generated and external task-related thought are a characteristic of individuals with greater metacognitive monitoring capacity. Achieving a balance between internally and externally oriented thought may thus aid individuals in optimizing their task performance. PMID:24223545
Hill, Mary C.
2010-01-01
Doherty and Hunt (2009) present important ideas for first-order, second-moment sensitivity analysis, but five issues are discussed in this comment. First, considering the composite scaled sensitivity (CSS) jointly with parameter correlation coefficients (PCC) in a CSS/PCC analysis addresses the difficulties with CSS mentioned in the introduction. Second, their new parameter identifiability statistic is actually likely to do a poor job of evaluating parameter identifiability in common situations. The statistic instead performs the very useful role of showing how model parameters are included in the estimated singular value decomposition (SVD) parameters; its close relation to CSS is shown. Third, the idea from p. 125 that a suitable truncation point for SVD parameters can be identified using the prediction variance is challenged using results from Moore and Doherty (2005). Fourth, the relative error reduction statistic of Doherty and Hunt is shown to belong to an emerging set of statistics here named perturbed calculated variance statistics. Finally, the perturbed calculated variance statistics OPR and PPR mentioned on p. 121 are shown to explicitly include the parameter null-space component of uncertainty. Indeed, OPR and PPR results that account for null-space uncertainty have appeared in the literature since 2000.
Testing physical models for dipolar asymmetry with CMB polarization
NASA Astrophysics Data System (ADS)
Contreras, D.; Zibin, J. P.; Scott, D.; Banday, A. J.; Górski, K. M.
2017-12-01
The cosmic microwave background (CMB) temperature anisotropies exhibit a large-scale dipolar power asymmetry. To determine whether this is due to a real, physical modulation or is simply a large statistical fluctuation requires the measurement of new modes. Here we forecast how well CMB polarization data from Planck and future experiments will be able to confirm or constrain physical models for modulation. Fitting several such models to the Planck temperature data allows us to provide predictions for polarization asymmetry. While for some models and parameters Planck polarization will decrease error bars on the modulation amplitude by only a small percentage, we show, importantly, that cosmic-variance-limited (and in some cases even Planck) polarization data can decrease the errors by considerably better than the expectation of √2 based on simple ℓ-space arguments. We project that if the primordial fluctuations are truly modulated (with parameters as indicated by Planck temperature data) then Planck will be able to make a 2σ detection of the modulation model with 20%-75% probability, increasing to 45%-99% when cosmic-variance-limited polarization is considered. We stress that these results are quite model dependent. Cosmic variance in temperature is important: combining statistically isotropic polarization with temperature data will spuriously increase the significance of the temperature signal with 30% probability for Planck.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how best to address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as those from biased regression techniques such as PCR and PLS.
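A condensed Monte Carlo sketch in the spirit of this comparison is given below (the sample size, correlation level, error variance, and single-component PCR/PLS choices are illustrative, not the paper's experimental design).

# OLS vs PCR vs PLS under strong collinearity between two explanatory variables.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(4)
n, rho, sigma, beta = 25, 0.95, 1.0, np.array([1.0, 0.5])   # small sample, strong collinearity
cov = np.array([[1.0, rho], [rho, 1.0]])
n_rep = 500

err = {"OLS": [], "PCR": [], "PLS": []}
for _ in range(n_rep):
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = X @ beta + sigma * rng.standard_normal(n)
    Xt = rng.multivariate_normal(np.zeros(2), cov, size=1000)   # test set
    yt = Xt @ beta                                              # noise-free test targets
    models = {
        "OLS": LinearRegression(),
        "PCR": make_pipeline(PCA(n_components=1), LinearRegression()),
        "PLS": PLSRegression(n_components=1),
    }
    for name, m in models.items():
        m.fit(X, y)
        err[name].append(np.mean((np.ravel(m.predict(Xt)) - yt) ** 2))

for name, e in err.items():
    print(f"{name}: mean prediction MSE = {np.mean(e):.3f}")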
NASA Astrophysics Data System (ADS)
Berthet, Lionel; Marty, Renaud; Bourgin, François; Viatgé, Julie; Piotte, Olivier; Perrin, Charles
2017-04-01
An increasing number of operational flood forecasting centres assess the predictive uncertainty associated with their forecasts and communicate it to end users. This information can match the end users' needs (i.e. prove useful for efficient crisis management) only if it is reliable: reliability is therefore a key quality of operational flood forecasts. In 2015, the French flood forecasting national and regional services (Vigicrues network; www.vigicrues.gouv.fr) implemented a framework to compute quantitative discharge and water level forecasts and to assess the predictive uncertainty. Among the possible technical options to achieve this goal, a statistical analysis of the past forecasting errors of deterministic models was selected (QUOIQUE method, Bourgin, 2014). It is a data-based and non-parametric approach built on as few assumptions as possible about the mathematical structure of the forecasting error. In particular, a very simple assumption is made regarding the predictive uncertainty distributions for large events outside the range of the calibration data: the multiplicative error distribution is assumed to be constant, whatever the magnitude of the flood. Indeed, the predictive distributions may not be reliable in extrapolation. However, estimating the predictive uncertainty for these rare events is crucial when major floods are of concern. In order to improve forecast reliability for major floods, we attempt to combine the operational strength of the empirical statistical analysis with a simple error model. Since the heteroscedasticity of forecast errors can considerably weaken the predictive reliability for large floods, this error model is based on the log-sinh transformation, which has been shown to significantly reduce the heteroscedasticity of the transformed error in a simulation context, even for flood peaks (Wang et al., 2012). Exploratory tests on some operational forecasts issued during the recent floods experienced in France (major spring floods in June 2016 on the Loire river tributaries and flash floods in fall 2016) will be shown and discussed. References: Bourgin, F. (2014). How to assess the predictive uncertainty in hydrological modelling? An exploratory work on a large sample of watersheds, AgroParisTech. Wang, Q. J., Shrestha, D. L., Robertson, D. E. and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, W05514, doi:10.1029/2011WR010973.
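The log-sinh transformation referred to above is simple to apply; the sketch below implements the transform and its inverse and checks, on synthetic heteroscedastic flow errors, how the spread of transformed errors compares across flow magnitudes. The parameter values a and b and the synthetic data are illustrative assumptions (in practice a and b are fitted to data).

# Log-sinh transformation of Wang et al. (2012) applied to synthetic discharge data.
import numpy as np

def log_sinh(y, a, b):
    """z = (1/b) * ln(sinh(a + b*y)); variance-stabilizing for many flow series."""
    return np.log(np.sinh(a + b * y)) / b

def inv_log_sinh(z, a, b):
    """Back-transform: y = (arcsinh(exp(b*z)) - a) / b."""
    return (np.arcsinh(np.exp(b * z)) - a) / b

rng = np.random.default_rng(5)
flow = rng.gamma(shape=2.0, scale=50.0, size=5000)          # synthetic discharge [m3/s]
obs = flow * np.exp(0.2 * rng.standard_normal(flow.size))   # multiplicative (heteroscedastic) error

a, b = 0.01, 0.002                                          # illustrative transform parameters
assert np.allclose(inv_log_sinh(log_sinh(flow, a, b), a, b), flow)   # round-trip check

err_raw = obs - flow
err_trans = log_sinh(obs, a, b) - log_sinh(flow, a, b)

# Heteroscedasticity check: spread of errors for small vs large flows.
small, large = flow < np.median(flow), flow >= np.median(flow)
print("raw error SD   (small, large flows):", err_raw[small].std(), err_raw[large].std())
print("trans. error SD(small, large flows):", err_trans[small].std(), err_trans[large].std())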
Mauya, Ernest William; Hansen, Endre Hofstad; Gobakken, Terje; Bollandsås, Ole Martin; Malimbwi, Rogers Ernest; Næsset, Erik
2015-12-01
Airborne laser scanning (ALS) has recently emerged as a promising tool to acquire auxiliary information for improving aboveground biomass (AGB) estimation in sample-based forest inventories. Under design-based and model-assisted inferential frameworks, the estimation relies on a model that relates the auxiliary ALS metrics to AGB estimated on ground plots. The size of the field plots has been identified as one source of model uncertainty because of the so-called boundary effects, which increase with decreasing plot size. Recent research in tropical forests has aimed to quantify the boundary effects on model prediction accuracy, but evidence of the consequences for the final AGB estimates is lacking. In this study we analyzed the effect of field plot size on model prediction accuracy and its implication when used in a model-assisted inferential framework. The results showed that the prediction accuracy of the model improved as the plot size increased. The adjusted R² increased from 0.35 to 0.74, while the relative root mean square error decreased from 63.6 to 29.2%. Indicators of boundary effects were identified and confirmed to have significant effects on the model residuals. Variance estimates of model-assisted mean AGB relative to the corresponding variance estimates of pure field-based AGB decreased with increasing plot size in the range from 200 to 3000 m². The ratio of the field-based variance to the model-assisted variance ranged from 1.7 to 7.7. This study showed that the relative improvement in precision of AGB estimation when increasing field-plot size was greater for an ALS-assisted inventory than for a pure field-based inventory.
Genome-Assisted Prediction of Quantitative Traits Using the R Package sommer.
Covarrubias-Pazaran, Giovanny
2016-01-01
Most traits of agronomic importance are quantitative in nature, and genetic markers have been used for decades to dissect such traits. Recently, genomic selection has earned attention as next generation sequencing technologies became feasible for major and minor crops. Mixed models have become a key tool for fitting genomic selection models, but most current genomic selection software can only include a single variance component other than the error, making hybrid prediction using additive, dominance and epistatic effects unfeasible for species displaying heterotic effects. Moreover, Likelihood-based software for fitting mixed models with multiple random effects that allows the user to specify the variance-covariance structure of random effects has not been fully exploited. A new open-source R package called sommer is presented to facilitate the use of mixed models for genomic selection and hybrid prediction purposes using more than one variance component and allowing specification of covariance structures. The use of sommer for genomic prediction is demonstrated through several examples using maize and wheat genotypic and phenotypic data. At its core, the program contains three algorithms for estimating variance components: Average information (AI), Expectation-Maximization (EM) and Efficient Mixed Model Association (EMMA). Kernels for calculating the additive, dominance and epistatic relationship matrices are included, along with other useful functions for genomic analysis. Results from sommer were comparable to other software, but the analysis was faster than Bayesian counterparts in the magnitude of hours to days. In addition, ability to deal with missing data, combined with greater flexibility and speed than other REML-based software was achieved by putting together some of the most efficient algorithms to fit models in a gentle environment such as R.
Doubková, Marcela; Van Dijk, Albert I.J.M.; Sabel, Daniel; Wagner, Wolfgang; Blöschl, Günter
2012-01-01
The Sentinel-1 will carry onboard a C-band radar instrument that will map the European continent once every four days and the global land surface at least once every twelve days with finest 5 × 20 m spatial resolution. The high temporal sampling rate and operational configuration make Sentinel-1 of interest for operational soil moisture monitoring. Currently, updated soil moisture data are made available at 1 km spatial resolution as a demonstration service using Global Mode (GM) measurements from the Advanced Synthetic Aperture Radar (ASAR) onboard ENVISAT. The service demonstrates the potential of the C-band observations to monitor variations in soil moisture. Importantly, a retrieval error estimate is also available; these are needed to assimilate observations into models. The retrieval error is estimated by propagating sensor errors through the retrieval model. In this work, the existing ASAR GM retrieval error product is evaluated using independent top soil moisture estimates produced by the grid-based landscape hydrological model (AWRA-L) developed within the Australian Water Resources Assessment system (AWRA). The ASAR GM retrieval error estimate, an assumed prior AWRA-L error estimate and the variance in the respective datasets were used to spatially predict the root mean square error (RMSE) and the Pearson's correlation coefficient R between the two datasets. These were compared with the RMSE calculated directly from the two datasets. The predicted and computed RMSE showed a very high level of agreement in spatial patterns as well as good quantitative agreement; the RMSE was predicted within accuracy of 4% of saturated soil moisture over 89% of the Australian land mass. Predicted and calculated R maps corresponded within accuracy of 10% over 61% of the continent. The strong correspondence between the predicted and calculated RMSE and R builds confidence in the retrieval error model and derived ASAR GM error estimates. The ASAR GM and Sentinel-1 have the same basic physical measurement characteristics, and therefore very similar retrieval error estimation method can be applied. Because of the expected improvements in radiometric resolution of the Sentinel-1 backscatter measurements, soil moisture estimation errors can be expected to be an order of magnitude less than those for ASAR GM. This opens the possibility for operationally available medium resolution soil moisture estimates with very well-specified errors that can be assimilated into hydrological or crop yield models, with potentially large benefits for land-atmosphere fluxes, crop growth, and water balance monitoring and modelling. PMID:23483015
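The kind of prediction described above can be illustrated with standard error-propagation identities for two products that observe the same signal with independent, mutually uncorrelated, zero-mean errors; the sketch below uses invented variance values, not the actual ASAR GM or AWRA-L error estimates.

# Predicting RMSE and Pearson R between two products from assumed error variances.
import numpy as np

sig_signal = 0.06      # SD of the common soil-moisture signal [m3/m3] (assumed)
sig_err_asar = 0.04    # assumed retrieval error SD of product 1
sig_err_model = 0.03   # assumed error SD of product 2

rmse_pred = np.sqrt(sig_err_asar**2 + sig_err_model**2)
r_pred = sig_signal**2 / np.sqrt((sig_signal**2 + sig_err_asar**2) *
                                 (sig_signal**2 + sig_err_model**2))
print(f"predicted RMSE between products: {rmse_pred:.3f} m3/m3")
print(f"predicted correlation R        : {r_pred:.2f}")

# Quick Monte Carlo check of the identities.
rng = np.random.default_rng(6)
s = sig_signal * rng.standard_normal(100_000)
d1 = s + sig_err_asar * rng.standard_normal(s.size)
d2 = s + sig_err_model * rng.standard_normal(s.size)
print("MC RMSE:", np.sqrt(np.mean((d1 - d2) ** 2)).round(3),
      " MC R:", np.corrcoef(d1, d2)[0, 1].round(2))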
Gorgey, Ashraf S; Dolbow, David R; Gater, David R
2012-07-01
To establish and validate prediction equations by using body weight to predict legs, trunk, and whole-body fat-free mass (FFM) in men with chronic complete spinal cord injury (SCI). Cross-sectional design. Research setting in a large medical center. Individuals with SCI (N=63) divided into prediction (n=42) and cross-validation (n=21) groups. Not applicable. Whole-body FFM and regional FFM were determined by using dual-energy x-ray absorptiometry. Body weight was measured by using a wheelchair weighing scale after subtracting the weight of the chair. Body weight predicted legs FFM (legs FFM = .09 × body weight + 6.1; R² = .25, standard error of the estimate [SEE] = 3.1 kg, P<.01), trunk FFM (trunk FFM = .21 × body weight + 8.6; R² = .56, SEE = 3.6 kg, P<.0001), and whole-body FFM (whole-body FFM = .288 × body weight + 26.3; R² = .53, SEE = 5.3 kg, P<.0001). The whole-body FFM_predicted (FFM predicted from the derived equations) shared 86% of the variance in whole-body FFM_measured (FFM measured using dual-energy x-ray absorptiometry scan) (R² = .86, SEE = 1.8 kg, P<.0001), 69% of trunk FFM_measured, and 66% of legs FFM_measured. The trunk FFM_predicted shared 69% of the variance in trunk FFM_measured (R² = .69, SEE = 2.7 kg, P<.0001), and legs FFM_predicted shared 67% of the variance in legs FFM_measured (R² = .67, SEE = 2.8 kg, P<.0001). Values of FFM did not differ between the prediction and validation groups. Body weight can be used to predict whole-body FFM and regional FFM. The predicted whole-body FFM improved the prediction of trunk FFM and legs FFM. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
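For convenience, the reported equations can be transcribed directly as simple helper functions. This is only a transcription sketch: it assumes body weight is expressed in kilograms (the abstract does not state the unit explicitly, although the SEE values are reported in kg), and the coefficients and SEEs are copied from the text above.

def legs_ffm(body_weight_kg):
    """Legs FFM [kg]; reported R^2 = .25, SEE = 3.1 kg."""
    return 0.09 * body_weight_kg + 6.1

def trunk_ffm(body_weight_kg):
    """Trunk FFM [kg]; reported R^2 = .56, SEE = 3.6 kg."""
    return 0.21 * body_weight_kg + 8.6

def whole_body_ffm(body_weight_kg):
    """Whole-body FFM [kg]; reported R^2 = .53, SEE = 5.3 kg."""
    return 0.288 * body_weight_kg + 26.3

w = 75.0   # example body weight [kg]
print(legs_ffm(w), trunk_ffm(w), whole_body_ffm(w))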
Nakling, Jakob; Buhaug, Harald; Backe, Bjorn
2005-10-01
In a large unselected population of normal spontaneous pregnancies, to estimate the biologic variation of the interval from the first day of the last menstrual period to the start of pregnancy, and the biologic variation of gestational length to delivery; and to estimate the random error of routine ultrasound assessment of gestational age in the mid-second trimester. Cohort study of 11,238 singleton pregnancies, with spontaneous onset of labour and a reliable last menstrual period. The day of delivery was predicted with two independent methods: according to the rule of Nägele and based on ultrasound examination in gestational weeks 17-19. For both methods, the mean difference between the observed and predicted day of delivery was calculated. The variances of the differences were combined to estimate the variances of the two partitions of pregnancy. The biologic variation of the time from the last menstrual period to the start of pregnancy was estimated to be 7.0 days (standard deviation), and the standard deviation of the time to spontaneous delivery was estimated to be 12.4 days. The estimate of the standard deviation of the random error of ultrasound-assessed foetal age was 5.2 days. Even when the last menstrual period is reliable, the biologic variation of the time from the last menstrual period to the real start of pregnancy is substantial, and must be taken into account. Reliable information about the first day of the last menstrual period is not equivalent to reliable information about the start of pregnancy.
Numerical prediction of a draft tube flow taking into account uncertain inlet conditions
NASA Astrophysics Data System (ADS)
Brugiere, O.; Balarac, G.; Corre, C.; Metais, O.; Flores, E.; Pleroy
2012-11-01
The swirling turbulent flow in a hydroturbine draft tube is computed with a non-intrusive uncertainty quantification (UQ) method coupled to Reynolds-Averaged Navier-Stokes (RANS) modelling in order to take into account, in the numerical prediction, the physical uncertainties existing in the inlet flow conditions. The proposed approach yields not only mean velocity fields to be compared with measured profiles, as is customary in Computational Fluid Dynamics (CFD) practice, but also the variance of these quantities, from which error bars can be deduced for the computed profiles, thus making the comparison between experiment and computation more meaningful.
Investigating Predictors of Spelling Ability for Adults with Low Literacy Skills
Talwar, Amani; Cote, Nicole Gilbert; Binder, Katherine S.
2014-01-01
This study examined whether the spelling abilities of adults with low literacy skills could be predicted by their phonological, orthographic, and morphological awareness. Sixty Adult Basic Education (ABE) students completed several literacy tasks. It was predicted that scores on phonological and orthographic tasks would explain variance in spelling scores, whereas scores on morphological tasks may not. Scores on all phonological tasks and on one orthographic task emerged as significant predictors of spelling scores. Additionally, error analyses revealed a limited influence of morphological knowledge in spelling attempts. Implications for ABE instruction are discussed. PMID:25364644
Computational substrates of norms and their violations during social exchange.
Xiang, Ting; Lohrenz, Terry; Montague, P Read
2013-01-16
Social norms in humans constrain individual behaviors to establish shared expectations within a social group. Previous work has probed social norm violations and the feelings that such violations engender; however, a computational rendering of the underlying neural and emotional responses has been lacking. We probed norm violations using a two-party, repeated fairness game (ultimatum game) where proposers offer a split of a monetary resource to a responder who either accepts or rejects the offer. Using a norm-training paradigm where subject groups are preadapted to either high or low offers, we demonstrate that unpredictable shifts in expected offers create a difference in rejection rates exhibited by the two responder groups for otherwise identical offers. We constructed an ideal observer model that identified neural correlates of norm prediction errors in the ventral striatum and anterior insula, regions that also showed strong responses to variance-prediction errors generated by the same model. Subjective feelings about offers correlated with these norm prediction errors, and the two signals displayed overlapping, but not identical, neural correlates in striatum, insula, and medial orbitofrontal cortex. These results provide evidence for the hypothesis that responses in anterior insula can encode information about social norm violations that correlate with changes in overt behavior (changes in rejection rates). Together, these results demonstrate that the brain regions involved in reward prediction and risk prediction are also recruited in signaling social norm violations.
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
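A condensed sketch of the quasi-analytic idea is given below: with a flat prior on the regression coefficients, they can be integrated out analytically, leaving a one-dimensional posterior for the model error variance that is evaluated on a grid. The design matrix, sampling error covariance, and data are synthetic placeholders, not the regional skew data of the paper.

# Quasi-analytic posterior of the model error variance in a GLS regional regression.
import numpy as np

rng = np.random.default_rng(7)
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])      # intercept + one basin characteristic
V = np.diag(rng.uniform(0.05, 0.20, n))                       # known sampling error covariance
beta_true, sig2_true = np.array([0.3, -0.5]), 0.04
y = X @ beta_true + rng.multivariate_normal(np.zeros(n), V + sig2_true * np.eye(n))

def log_marginal(sig2):
    """log p(y | sigma_delta^2) with beta integrated out under a flat prior."""
    Lam = V + sig2 * np.eye(n)
    Li = np.linalg.inv(Lam)
    XtLiX = X.T @ Li @ X
    beta_hat = np.linalg.solve(XtLiX, X.T @ Li @ y)            # GLS estimate for this sigma^2
    r = y - X @ beta_hat
    _, logdet_Lam = np.linalg.slogdet(Lam)
    _, logdet_XtLiX = np.linalg.slogdet(XtLiX)
    return -0.5 * (logdet_Lam + logdet_XtLiX + r @ Li @ r)

grid = np.linspace(1e-4, 0.3, 300)                            # grid over model error variance
logp = np.array([log_marginal(s) for s in grid])              # flat prior on sigma_delta^2 assumed
post = np.exp(logp - logp.max())
post /= post.sum()                                            # uniform grid, so simple normalization
print("posterior mean of model error variance:", float(np.sum(grid * post)))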
Lee, Yoojin; Callaghan, Martina F; Nagy, Zoltan
2017-01-01
In magnetic resonance imaging, precise measurement of the longitudinal relaxation time (T1) is crucial to acquire useful information that is applicable to numerous clinical and neuroscience applications. In this work, we investigated the precision of the T1 relaxation time as measured using the variable flip angle method, with emphasis on the noise propagated from radiofrequency transmit field ([Formula: see text]) measurements. The analytical solution for T1 precision was derived by standard error propagation methods incorporating the noise from the three input sources: two spoiled gradient echo (SPGR) images and a [Formula: see text] map. Repeated in vivo experiments were performed to estimate the total variance in the T1 maps, and we compared these experimentally obtained values with the theoretical predictions to validate the established theoretical framework. Both the analytical and experimental results showed that variance in the [Formula: see text] map propagated noise levels into the T1 maps comparable to those from either of the two SPGR images. Improving the precision of the [Formula: see text] measurements significantly reduced the variance in the estimated T1 map. The variance estimated from the repeatedly measured in vivo T1 maps agreed well with the theoretically calculated variance in the T1 estimates, thus validating the analytical framework for realistic in vivo experiments. We concluded that for T1 mapping experiments, the error propagated from the [Formula: see text] map must be considered. Optimizing the SPGR signals while neglecting to improve the precision of the [Formula: see text] map may result in grossly overestimating the precision of the estimated T1 values.
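To make the propagation question concrete, the Monte Carlo sketch below pushes assumed SPGR and transmit-field noise through a two-point variable-flip-angle T1 estimator. The DESPOT1-style linearized estimator, sequence parameters, and noise standard deviations are illustrative assumptions and are not taken from the paper.

# Monte Carlo propagation of SPGR and B1 noise into a two-point VFA T1 estimate.
import numpy as np

rng = np.random.default_rng(8)
TR, T1_true, M0, B1_true = 20e-3, 1.0, 1.0, 1.05     # s, s, a.u., relative transmit field
alphas = np.deg2rad([4.0, 18.0])                      # nominal flip angles

def spgr(alpha, T1, M0=M0):
    E1 = np.exp(-TR / T1)
    return M0 * np.sin(alpha) * (1 - E1) / (1 - E1 * np.cos(alpha))

def t1_vfa(S, alpha_eff):
    """Two-point linear VFA estimate: S/sin = E1*(S/tan) + const."""
    y, x = S / np.sin(alpha_eff), S / np.tan(alpha_eff)
    E1 = (y[1] - y[0]) / (x[1] - x[0])
    return -TR / np.log(E1)

S_clean = spgr(B1_true * alphas, T1_true)             # noise-free signals at the true flip angles
sd_S, sd_B1, n_mc = 0.002, 0.02, 20000                # assumed noise SDs

def mc_t1(noise_S, noise_B1):
    est = np.empty(n_mc)
    for k in range(n_mc):
        S = S_clean + (sd_S * rng.standard_normal(2) if noise_S else 0.0)
        B1 = B1_true + (sd_B1 * rng.standard_normal() if noise_B1 else 0.0)
        est[k] = t1_vfa(S, B1 * alphas)               # flip angles corrected with the noisy B1 map
    return est.std()

print("T1 SD from SPGR noise only :", mc_t1(True, False))
print("T1 SD from B1 noise only   :", mc_t1(False, True))
print("T1 SD from both            :", mc_t1(True, True))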
Toward Joint Hypothesis-Tests Seismic Event Screening Analysis: Ms|mb and Event Depth
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Dale; Selby, Neil
2012-08-14
Well-established theory can be used to combine single-phenomenology hypothesis tests into a multi-phenomenology event-screening hypothesis test (Fisher's and Tippett's tests). The standard error commonly used in the Ms:mb event-screening hypothesis test is not fully consistent with its physical basis. An improved standard error gives better agreement with the physical basis, correctly partitions the error to include model error as a component of variance, and correctly reduces the station noise variance through network averaging. For the 2009 DPRK test, the commonly used standard error 'rejects' H0 even with a better scaling slope (β = 1, Selby et al.), whereas the improved standard error 'fails to reject' H0.
Revised techniques for estimating peak discharges from channel width in Montana
Parrett, Charles; Hull, J.A.; Omang, R.J.
1987-01-01
This study was conducted to develop new estimating equations based on channel width and the updated flood frequency curves of previous investigations. Simple regression equations for estimating peak discharges with recurrence intervals of 2, 5, 10, 25, 50, and 100 years were developed for seven regions in Montana. The standard errors of estimate for the equations that use active channel width as the independent variable ranged from 30% to 87%. The standard errors of estimate for the equations that use bankfull width as the independent variable ranged from 34% to 92%. The smallest standard errors generally occurred in the prediction equations for the 2-yr, 5-yr, and 10-yr floods, and the largest standard errors occurred in the prediction equations for the 100-yr flood. The equations that use active channel width and the equations that use bankfull width were determined to be about equally reliable in five regions. In the West Region, the equations that use bankfull width were slightly more reliable than those based on active channel width, whereas in the East-Central Region the equations that use active channel width were slightly more reliable than those based on bankfull width. Compared with similar equations previously developed, the standard errors of estimate for the new equations are substantially smaller in three regions and substantially larger in two regions. Limitations on the use of the estimating equations include: (1) The equations are based on stable conditions of channel geometry and prevailing water and sediment discharge; (2) The measurement of channel width requires a site visit, preferably by a person with experience in the method, and involves appreciable measurement errors; (3) The reliability of results from the equations for channel widths beyond the range of definition is unknown. In spite of these limitations, the estimating equations derived in this study are considered to be as reliable as estimating equations based on basin and climatic variables. Because the two types of estimating equations are independent, results from each can be weighted in inverse proportion to their variances and averaged; the weighted average estimate has a variance less than either individual estimate. (Author's abstract)
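The inverse-variance weighting mentioned in the closing sentences can be written down in a few lines; the sketch below uses invented discharge estimates and standard errors purely to illustrate the rule.

# Combining two independent discharge estimates with inverse-variance weights.
import numpy as np

def inverse_variance_average(estimates, variances):
    w = 1.0 / np.asarray(variances, dtype=float)
    combined = np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)
    combined_var = 1.0 / np.sum(w)          # never larger than the smallest input variance
    return combined, combined_var

# e.g. a channel-width estimate and a basin-characteristics estimate of Q100 (hypothetical numbers)
q_width, var_width = 850.0, (0.60 * 850.0) ** 2     # ~60% standard error
q_basin, var_basin = 1100.0, (0.45 * 1100.0) ** 2   # ~45% standard error
q_hat, var_hat = inverse_variance_average([q_width, q_basin], [var_width, var_basin])
print(f"weighted Q100 = {q_hat:.0f}, standard error = {np.sqrt(var_hat):.0f}")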
Within-Tunnel Variations in Pressure Data for Three Transonic Wind Tunnels
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2014-01-01
This paper compares the results of pressure measurements made on the same test article with the same test matrix in three transonic wind tunnels. A comparison is presented of the unexplained variance associated with polar replicates acquired in each tunnel. The impact of a significant component of systematic (not random) unexplained variance is reviewed, and the results of analyses of variance are presented to assess the degree of significant systematic error in these representative wind tunnel tests. Total uncertainty estimates are reported for 140 samples of pressure data, quantifying the effects of within-polar random errors and between-polar systematic bias errors.
Confidence limits for data mining models of options prices
NASA Astrophysics Data System (ADS)
Healy, J. V.; Dixon, M.; Read, B. J.; Cai, F. F.
2004-12-01
Non-parametric methods such as artificial neural nets can successfully model prices of financial options, outperforming the Black-Scholes analytic model (Eur. Phys. J. B 27 (2002) 219). However, the accuracy of such approaches is usually expressed only by a global fitting/error measure. This paper describes a robust method for determining prediction intervals for models derived by non-linear regression. We have demonstrated it by application to a standard synthetic example (29th Annual Conference of the IEEE Industrial Electronics Society, Special Session on Intelligent Systems, pp. 1926-1931). The method is used here to obtain prediction intervals for option prices using market data for LIFFE “ESX” FTSE 100 index options ( http://www.liffe.com/liffedata/contracts/month_onmonth.xls). We avoid special neural net architectures and use standard regression procedures to determine local error bars. The method is appropriate for target data with non-constant variance (or volatility).
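The sketch below illustrates one generic way to obtain such local error bars for heteroscedastic targets: fit a mean model, then fit a second regression to the squared residuals as a local variance estimate. It is not the authors' exact procedure, and the synthetic option-like data, the choice of gradient-boosted regressors, and the noise model are all assumptions made for illustration.

# Local error bars from a second regression fitted to squared residuals.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(9)
n = 4000
moneyness = rng.uniform(0.8, 1.2, n)                       # S/K
tau = rng.uniform(0.05, 1.0, n)                            # time to expiry [yr]
X = np.column_stack([moneyness, tau])
true_price = np.maximum(moneyness - 1.0, 0.0) + 0.4 * np.sqrt(tau) * moneyness
noise_sd = 0.02 + 0.08 * tau                               # heteroscedastic noise
y = true_price + noise_sd * rng.standard_normal(n)

mean_model = GradientBoostingRegressor(random_state=0).fit(X, y)
resid2 = (y - mean_model.predict(X)) ** 2
var_model = GradientBoostingRegressor(random_state=0).fit(X, resid2)   # local variance estimate

X_new = np.array([[1.0, 0.1], [1.0, 0.9]])                 # short- vs long-dated at-the-money
pred = mean_model.predict(X_new)
local_sd = np.sqrt(np.clip(var_model.predict(X_new), 0.0, None))
for p, s, row in zip(pred, local_sd, X_new):
    print(f"tau={row[1]:.1f}: price ~ {p:.3f} +/- {1.96 * s:.3f} (approx. 95% interval)")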
Development and Initial Validation of the Multicultural Personality Inventory (MPI).
Ponterotto, Joseph G; Fietzer, Alexander W; Fingerhut, Esther C; Woerner, Scott; Stack, Lauren; Magaldi-Dopman, Danielle; Rust, Jonathan; Nakao, Gen; Tsai, Yu-Ting; Black, Natasha; Alba, Renaldo; Desai, Miraj; Frazier, Chantel; LaRue, Alyse; Liao, Pei-Wen
2014-01-01
Two studies summarize the development and initial validation of the Multicultural Personality Inventory (MPI). In Study 1, the 115-item prototype MPI was administered to 415 university students, where exploratory factor analysis resulted in a 70-item, 7-factor model. In Study 2, the 70-item MPI and theoretically related companion instruments were administered to a multisite sample of 576 university students. Confirmatory factor analysis found the 7-factor structure to be a relatively good fit to the data (Comparative Fit Index = .954; root mean square error of approximation = .057), and MPI factors predicted variance in criterion variables above and beyond the variance accounted for by broad personality traits (i.e., the Big Five). Study limitations and directions for further validation research are specified.
The predictive consequences of parameterization
NASA Astrophysics Data System (ADS)
White, J.; Hughes, J. D.; Doherty, J. E.
2013-12-01
In numerical groundwater modeling, parameterization is the process of selecting the aspects of a computer model that will be allowed to vary during history matching. This selection process is dependent on professional judgment and is, therefore, inherently subjective. Ideally, a robust parameterization should be commensurate with the spatial and temporal resolution of the model and should include all uncertain aspects of the model. Limited computing resources typically require reducing the number of adjustable parameters so that only a subset of the uncertain model aspects are treated as estimable parameters; the remaining aspects are treated as fixed parameters during history matching. We use linear subspace theory to develop expressions for the predictive error incurred by fixing parameters. The predictive error is comprised of two terms. The first term arises directly from the sensitivity of a prediction to fixed parameters. The second term arises from prediction-sensitive adjustable parameters that are forced to compensate for fixed parameters during history matching. The compensation is accompanied by inappropriate adjustment of otherwise uninformed, null-space parameter components. Unwarranted adjustment of null-space components away from prior maximum likelihood values may produce bias if a prediction is sensitive to those components. The potential for subjective parameterization choices to corrupt predictions is examined using a synthetic model. Several strategies are evaluated, including use of piecewise constant zones, use of pilot points with Tikhonov regularization and use of the Karhunen-Loeve transformation. The best choice of parameterization (as defined by minimum error variance) is strongly dependent on the types of predictions to be made by the model.
Hierarchical Bayesian Model Averaging for Chance Constrained Remediation Designs
NASA Astrophysics Data System (ADS)
Chitsazan, N.; Tsai, F. T.
2012-12-01
Groundwater remediation designs rely heavily on simulation models, which are subject to various sources of uncertainty in their predictions. To develop a robust remediation design, it is crucial to understand the effect of these uncertainty sources. In this research, we introduce a hierarchical Bayesian model averaging (HBMA) framework to segregate and prioritize sources of uncertainty in a multi-layer frame, where each layer targets a source of uncertainty. The HBMA framework provides insight into uncertainty priorities and propagation. In addition, HBMA allows evaluating model weights at different hierarchy levels and assessing the relative importance of models in each level. To account for uncertainty, we employ chance-constrained (CC) programming for stochastic remediation design. Chance-constrained programming has traditionally been implemented to account for parameter uncertainty. Recently, many studies have suggested that model structure uncertainty is not negligible compared to parameter uncertainty. Using chance-constrained programming along with HBMA can provide a rigorous tool for groundwater remediation design under uncertainty. In this research, the HBMA-CC approach was applied to a remediation design in a synthetic aquifer. The design was to develop a scavenger well approach to mitigate saltwater intrusion toward production wells. HBMA was employed to assess uncertainties from model structure, parameter estimation and kriging interpolation. An improved harmony search optimization method was used to find the optimal location of the scavenger well. We evaluated prediction variances of chloride concentration at the production wells through the HBMA framework. The results showed that choosing the single best model may lead to a significant error in evaluating prediction variances for two reasons. First, considering only the single best model ignores variances that stem from uncertainty in the model structure. Second, considering the best model when its weight is not dominant may underestimate or overestimate prediction variances by ignoring other plausible propositions. Chance constraints allow developing a remediation design with a desirable reliability. However, when only the single best model is considered, the calculated reliability will differ from the desired reliability. We calculated the reliability of the design for the models at different levels of HBMA. The results showed that, moving toward the top layers of HBMA, the calculated reliability converges to the chosen reliability. We employed chance-constrained optimization along with the HBMA framework to find the optimal location and pumpage for the scavenger well. The results showed that, using models at different levels of the HBMA framework, the optimal location of the scavenger well remained the same, but the optimal extraction rate was altered. Thus, we concluded that the optimal pumping rate was sensitive to the prediction variance. The prediction variance also changed with the extraction rate: a very high extraction rate causes the prediction variances of chloride concentration at the production wells to approach zero regardless of which HBMA model is used.
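A minimal sketch of why the single best model can misstate prediction variance: in Bayesian model averaging the total variance adds a between-model term to the weighted within-model variances. The weights, means, and variances below are hypothetical placeholders, not values from the study.

```python
# Minimal sketch of Bayesian model averaging for a prediction: the BMA
# variance includes a between-model term that a single "best" model ignores.
import numpy as np

weights = np.array([0.45, 0.30, 0.25])       # posterior model weights (hypothetical)
means = np.array([120.0, 150.0, 90.0])       # each model's predicted chloride concentration
variances = np.array([80.0, 60.0, 100.0])    # each model's own prediction variance

bma_mean = np.sum(weights * means)
within = np.sum(weights * variances)                  # weighted within-model variance
between = np.sum(weights * (means - bma_mean) ** 2)   # spread among model predictions
bma_var = within + between

print(f"BMA mean = {bma_mean:.1f}, BMA variance = {bma_var:.1f}")
print(f"best single model variance = {variances[np.argmax(weights)]:.1f} "
      f"(ignores between-model term = {between:.1f})")
```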
Dynamic shaping of dopamine signals during probabilistic Pavlovian conditioning.
Hart, Andrew S; Clark, Jeremy J; Phillips, Paul E M
2015-01-01
Cue- and reward-evoked phasic dopamine activity during Pavlovian and operant conditioning paradigms is well correlated with reward-prediction errors from formal reinforcement learning models, which feature teaching signals in the form of discrepancies between actual and expected reward outcomes. Additionally, in learning tasks where conditioned cues probabilistically predict rewards, dopamine neurons show sustained cue-evoked responses that are correlated with the variance of reward and are maximal to cues predicting rewards with a probability of 0.5. Therefore, it has been suggested that sustained dopamine activity after cue presentation encodes the uncertainty of impending reward delivery. In the current study we examined the acquisition and maintenance of these neural correlates using fast-scan cyclic voltammetry in rats implanted with carbon fiber electrodes in the nucleus accumbens core during probabilistic Pavlovian conditioning. The advantage of this technique is that we can sample from the same animal and recording location throughout learning with single trial resolution. We report that dopamine release in the nucleus accumbens core contains correlates of both expected value and variance. A quantitative analysis of these signals throughout learning, and during the ongoing updating process after learning in probabilistic conditions, demonstrates that these correlates are dynamically encoded during these phases. Peak CS-evoked responses are correlated with expected value and predominate during early learning while a variance-correlated sustained CS signal develops during the post-asymptotic updating phase. Copyright © 2014 Elsevier Inc. All rights reserved.
A Study on Multi-Scale Background Error Covariances in 3D-Var Data Assimilation
NASA Astrophysics Data System (ADS)
Zhang, Xubin; Tan, Zhe-Min
2017-04-01
The construction of background error covariances is a key component of three-dimensional variational data assimilation. In numerical weather prediction, background errors exist at different scales and interact with one another, but the influence of these errors and their interactions cannot be represented in background error covariance statistics estimated by the leading methods. It is therefore necessary to construct background error covariances that account for multi-scale interactions among errors. Using the NMC method, this article first estimates the background error covariances at given model-resolution scales. Information about errors whose scales are larger and smaller than the given ones is then introduced, using different nesting techniques, to estimate the corresponding covariances. Comparison of the three background error covariance statistics, each influenced by error information at different scales, reveals that the background error variances increase, particularly at large scales and higher levels, when larger-scale error information is introduced through the lateral boundary condition provided by a lower-resolution model. On the other hand, the variances decrease at medium scales at higher levels, while showing slight improvement at lower levels in the nested domain, especially at medium and small scales, when smaller-scale error information is introduced by nesting a higher-resolution model. In addition, the introduction of larger- (smaller-) scale error information leads to larger (smaller) horizontal and vertical correlation scales of background errors. Considering the multivariate correlations, the Ekman coupling increases (decreases) when larger- (smaller-) scale error information is included, whereas the geostrophic coupling in the free atmosphere weakens in both situations. The three covariances obtained above are each used in a data assimilation and model forecast system, and analysis-forecast cycles are conducted for a period of one month. Comparison of both analyses and forecasts from this system shows that the trends in analysis increments with different scales of error information introduced are consistent with the trends in the variances and correlations of background errors. In particular, introduction of smaller-scale errors leads to larger-amplitude analysis increments for winds at medium scales at the heights of both the high- and low-level jets, and analysis increments for temperature and humidity are greater at the corresponding scales at middle and upper levels under this circumstance. These analysis increments improve the intensity of the jet-convection system, which comprises jets at different levels and the coupling between them associated with latent heat release, and these changes in the analyses contribute to better forecasts of winds and temperature in the corresponding areas. When smaller-scale errors are included, analysis increments for humidity are enhanced significantly at large scales at lower levels, moistening the southern part of the analyses. This humidification helps correct the dry bias there and ultimately improves the forecast skill for humidity. Moreover, inclusion of larger- (smaller-) scale errors is beneficial for the forecast quality of heavy (light) precipitation at large (small) scales, owing to the amplification (diminution) of intensity and area in the precipitation forecasts, but tends to overestimate (underestimate) light (heavy) precipitation.
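A hedged sketch of the NMC idea referred to above: approximate the background error covariance from differences between pairs of forecasts valid at the same time. Array shapes, the forecast lead times, and any rescaling factor are assumptions, not the article's configuration.

```python
# Sketch of the NMC method: estimate a background error covariance B from
# differences between pairs of forecasts valid at the same time (e.g. 48 h
# minus 24 h forecasts). The data here are random stand-ins.
import numpy as np

rng = np.random.default_rng(2)
n_state, n_pairs = 50, 200
f48 = rng.normal(size=(n_pairs, n_state))                     # stand-in 48 h forecasts
f24 = f48 + rng.normal(scale=0.5, size=(n_pairs, n_state))    # 24 h forecasts, same valid times

diffs = f48 - f24                          # forecast differences
diffs -= diffs.mean(axis=0)                # remove the mean difference
B = (diffs.T @ diffs) / (n_pairs - 1)      # sample covariance; often rescaled in practice

variances = np.diag(B)                     # background error variances
correlations = B / np.sqrt(np.outer(variances, variances))
off_diag = ~np.eye(n_state, dtype=bool)
print("mean background error variance:", variances.mean())
print("mean |off-diagonal correlation|:", np.mean(np.abs(correlations[off_diag])))
```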
Determining Optimal Location and Numbers of Sample Transects for Characterization of UXO Sites
DOE Office of Scientific and Technical Information (OSTI.GOV)
BILISOLY, ROGER L.; MCKENNA, SEAN A.
2003-01-01
Previous work on sample design has focused on constructing designs for samples taken at point locations. Significantly less work has been done on sample design for data collected along transects. A review of approaches to point and transect sampling design shows that transects can be considered as a sequential set of point samples. Any two sampling designs can be compared by using each one to predict the value of the quantity being measured on a fixed reference grid. The quality of a design is quantified in two ways: computing either the sum or the product of the eigenvalues of the variance matrix of the prediction error. An important aspect of this analysis is that the reduction of the mean prediction error variance (MPEV) can be calculated for any proposed sample design, including one with straight and/or meandering transects, prior to taking those samples. This reduction in variance can be used as a "stopping rule" to determine when enough transect sampling has been completed on the site. Two approaches for the optimization of the transect locations are presented. The first minimizes the sum of the eigenvalues of the predictive error, and the second minimizes the product of these eigenvalues. Simulated annealing is used to identify transect locations that meet either of these objectives. This algorithm is applied to a hypothetical site to determine the optimal locations of two iterations of meandering transects given a previously existing straight transect. The MPEV calculation is also used on both a hypothetical site and on data collected at the Isleta Pueblo to evaluate its potential as a stopping rule. Results show that three or four rounds of systematic sampling with straight parallel transects covering 30 percent or less of the site can reduce the initial MPEV by as much as 90 percent. The amount of reduction in MPEV can be used as a stopping rule, but the relationship between MPEV and the results of excavation versus no-further-action decisions is site specific and cannot be calculated prior to the sampling. It may be advantageous to use the reduction in MPEV as a stopping rule for systematic sampling across the site, which can then be followed by focused sampling in areas identified as having UXO during the systematic sampling. The techniques presented here provide answers to the questions of "Where to sample?" and "When to stop?" and are capable of running in near real time to support iterative site characterization campaigns.
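The two design criteria and the MPEV stopping rule can be sketched as follows, assuming a prediction error covariance matrix is available before and after adding a transect; the matrices here are synthetic stand-ins.

```python
# Sketch of the design criteria described above: given prediction error
# covariance matrices before and after adding a transect, compare the sum
# and product (log-determinant) of eigenvalues and the MPEV reduction.
import numpy as np

rng = np.random.default_rng(3)
n = 30
A = rng.normal(size=(n, n))
cov_before = A @ A.T / n + np.eye(n)      # synthetic prediction error covariance
cov_after = 0.4 * cov_before              # pretend a new transect shrinks it

def criteria(cov):
    eig = np.linalg.eigvalsh(cov)
    return eig.sum(), np.sum(np.log(eig))  # sum of eigenvalues, log of their product

sum_b, logdet_b = criteria(cov_before)
sum_a, logdet_a = criteria(cov_after)
mpev_before, mpev_after = np.trace(cov_before) / n, np.trace(cov_after) / n

print(f"MPEV reduction: {100 * (1 - mpev_after / mpev_before):.1f}%")
print(f"eigenvalue sum: {sum_b:.1f} -> {sum_a:.1f}, log-det: {logdet_b:.1f} -> {logdet_a:.1f}")
```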
Optical PAyload for Lasercomm Science (OPALS) link validation
NASA Technical Reports Server (NTRS)
Biswas, Abhijit; Oaida, Bogdan V.; Andrews, Kenneth S.; Kovalik, Joseph M.; Abrahamson, Matthew J.; Wright, Malcolm W.
2015-01-01
Recently several day and nighttime links under diverse atmospheric conditions were completed using the Optical Payload for Lasercomm Science (OPALS) flight system on-board the International Space Station (ISS). In this paper we compare measured optical power and its variance at either end of the link with predictions that include atmospheric propagation models. For the 976 nm laser beacon transmitted from the ground to the ISS, the predicted mean irradiance of tens of microwatts per square meter close to zenith, and its decrease with range and increased air mass, show good agreement with measurements. The irradiance fluctuations sampled at 100 Hz also follow the expected increase in scintillation with air mass, representative of atmospheric coherence lengths at zenith at 500 nm in the 3-8 cm range. The predicted downlink power of hundreds of nanowatts was also reconciled within the uncertainty of the atmospheric losses. Expected link performance, with uncoded bit-error rates less than 1E-4 required for the Reed-Solomon code to correct errors for video, text and file transmission, was verified. The results of predicted and measured powers and fluctuations suggest the need for further study and refinement.
Ben Natan, Merav; Sharon, Ira; Mahajna, Marlen; Mahajna, Sara
2017-11-01
Medication errors are common among nursing students; nonetheless, these errors are often underreported. The aims of this study were to examine factors related to nursing students' intention to report medication errors, using the Theory of Planned Behavior, and to examine whether the theory is useful in predicting students' intention to report errors. This study has a descriptive cross-sectional design. The study population was recruited at a university and a large nursing school in central and northern Israel. A convenience sample of 250 nursing students took part in the study. The students completed a self-report questionnaire based on the Theory of Planned Behavior. The findings indicate that students' intention to report medication errors was high. The Theory of Planned Behavior constructs explained 38% of the variance in students' intention to report medication errors. The constructs of behavioral beliefs, subjective norms, and perceived behavioral control were found to affect this intention, with behavioral beliefs being the most significant factor. The findings also reveal that students' fear of the reaction of superiors and colleagues to disclosure of the error may impede them from reporting it. Understanding factors related to reporting medication errors is crucial to designing interventions that foster error reporting. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kang, Le; Chen, Weijie; Petrick, Nicholas A.; Gallas, Brandon D.
2014-01-01
The area under the receiver operating characteristic (ROC) curve (AUC) is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of AUC, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. PMID:25399736
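A minimal sketch of the z-score comparison, assuming the two C indices, their variances, and their covariance have already been obtained from the U-statistics based estimator; the numerical values are hypothetical.

```python
# Sketch of the z-score test described above: compare two correlated C-index
# estimates given their variances and covariance. Numbers are hypothetical.
import math
from scipy.stats import norm

c1, c2 = 0.72, 0.68
var1, var2, cov12 = 4.0e-4, 5.0e-4, 1.5e-4

z = (c1 - c2) / math.sqrt(var1 + var2 - 2.0 * cov12)
p_value = 2.0 * (1.0 - norm.cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```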
Gutreuter, S.; Boogaard, M.A.
2007-01-01
Predictors of the percentile lethal/effective concentration/dose are commonly used measures of efficacy and toxicity. Typically such quantal-response predictors (e.g., the exposure required to kill 50% of some population) are estimated from simple bioassays wherein organisms are exposed to a gradient of several concentrations of a single agent. The toxicity of an agent may be influenced by auxiliary covariates, however, and more complicated experimental designs may introduce multiple variance components. Prediction methods lag behind such cases. A conventional two-stage approach consists of multiple bivariate predictions of, say, median lethal concentration followed by regression of those predictions on the auxiliary covariates. We propose a more effective and parsimonious class of generalized nonlinear mixed-effects models for prediction of lethal/effective dose/concentration from auxiliary covariates. We demonstrate examples using data from a study regarding the effects of pH and additions of variable quantities of 2′,5′-dichloro-4′-nitrosalicylanilide (niclosamide) on the toxicity of 3-trifluoromethyl-4-nitrophenol to larval sea lamprey (Petromyzon marinus). The new models yielded unbiased predictions and root-mean-squared errors (RMSEs) of prediction for the exposure required to kill 50 and 99.9% of some population that were 29 to 82% smaller, respectively, than those from the conventional two-stage procedure. The model class is flexible and easily implemented using commonly available software. © 2007 SETAC.
Ngendahimana, David K.; Fagerholm, Cara L.; Sun, Jiayang; Bruckman, Laura S.
2017-01-01
Accelerated weathering exposures were performed on poly(ethylene-terephthalate) (PET) films. Longitudinal multi-level predictive models as a function of PET grades and exposure types were developed for the change in yellowness index (YI) and haze (%). Exposures with similar change in YI were modeled using a linear fixed-effects modeling approach. Due to the complex nature of haze formation, measurement uncertainty, and the differences in the samples’ responses, the change in haze (%) depended on individual samples’ responses and a linear mixed-effects modeling approach was used. When compared to fixed-effects models, the addition of random effects in the haze formation models significantly increased the variance explained. For both modeling approaches, diagnostic plots confirmed independence and homogeneity with normally distributed residual errors. Predictive R2 values for true prediction error and predictive power of the models demonstrated that the models were not subject to over-fitting. These models enable prediction under pre-defined exposure conditions for a given exposure time (or photo-dosage in case of UV light exposure). PET degradation under cyclic exposures combining UV light and condensing humidity is caused by photolytic and hydrolytic mechanisms causing yellowing and haze formation. Quantitative knowledge of these degradation pathways enable cross-correlation of these lab-based exposures with real-world conditions for service life prediction. PMID:28498875
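One way to reproduce the fixed- versus mixed-effects contrast described above is with statsmodels, fitting an ordinary least-squares model and a linear mixed-effects model with a random intercept per sample. The simulated data frame and column names are hypothetical, not the study's variables.

```python
# Sketch of the fixed- vs mixed-effects comparison: OLS ignores per-sample
# differences, while the mixed model adds a random intercept per sample.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
samples = np.repeat(np.arange(12), 8)                     # 12 PET samples, 8 exposure steps
time = np.tile(np.arange(8), 12)
sample_offset = rng.normal(scale=0.5, size=12)[samples]   # per-sample random intercept
haze = 1.0 + 0.3 * time + sample_offset + rng.normal(scale=0.2, size=samples.size)
df = pd.DataFrame({"haze": haze, "time": time, "sample": samples})

fixed = smf.ols("haze ~ time", data=df).fit()                            # fixed effects only
mixed = smf.mixedlm("haze ~ time", data=df, groups=df["sample"]).fit()   # + random intercepts

print("fixed-effects slope:", round(fixed.params["time"], 3))
print("mixed-effects slope:", round(mixed.params["time"], 3),
      " random-intercept variance:", round(mixed.cov_re.iloc[0, 0], 3))
```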
Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien
2013-01-01
An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The ANFIS and ANN models were compared in terms of six statistical indices calculated by comparing their prediction results with actual data: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R2). Graphical plots were also used for model comparison. The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions.
Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien
2013-01-01
Background An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. Methods The ANFIS and ANN models were compared in terms of six statistical indices calculated by comparing their prediction results with actual data: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R2). Graphical plots were also used for model comparison. Conclusions The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. PMID:23705023
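A sketch of the six comparison indices, using definitions that are common in the predictive-microbiology literature; the exact formulas used in the paper may differ, and the observed and predicted growth rates below are hypothetical.

```python
# Common definitions of the six model-comparison indices named above.
import numpy as np

def model_fit_indices(observed, predicted):
    observed, predicted = np.asarray(observed, float), np.asarray(predicted, float)
    resid = observed - predicted
    rmse = np.sqrt(np.mean(resid ** 2))
    mape = 100.0 * np.mean(np.abs(resid / observed))
    sep = 100.0 * rmse / np.mean(observed)          # standard error of prediction (%)
    log_ratio = np.log10(predicted / observed)
    bf = 10.0 ** np.mean(log_ratio)                 # bias factor
    af = 10.0 ** np.mean(np.abs(log_ratio))         # accuracy factor
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((observed - observed.mean()) ** 2)
    return {"MAPE": mape, "RMSE": rmse, "SEP": sep, "Bf": bf, "Af": af, "R2": r2}

obs = [0.21, 0.35, 0.48, 0.30, 0.55]     # hypothetical growth rates (1/h)
pred = [0.23, 0.33, 0.50, 0.27, 0.57]
print(model_fit_indices(obs, pred))
```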
Evaluation of wave runup predictions from numerical and parametric models
Stockdon, Hilary F.; Thompson, David M.; Plant, Nathaniel G.; Long, Joseph W.
2014-01-01
Wave runup during storms is a primary driver of coastal evolution, including shoreline and dune erosion and barrier island overwash. Runup and its components, setup and swash, can be predicted from a parameterized model that was developed by comparing runup observations to offshore wave height, wave period, and local beach slope. Because observations during extreme storms are often unavailable, a numerical model is used to simulate the storm-driven runup to compare to the parameterized model and then develop an approach to improve the accuracy of the parameterization. Numerically simulated and parameterized runup were compared to observations to evaluate model accuracies. The analysis demonstrated that setup was accurately predicted by both the parameterized model and numerical simulations. Infragravity swash heights were most accurately predicted by the parameterized model. The numerical model suffered from bias and gain errors that depended on whether a one-dimensional or two-dimensional spatial domain was used. Nonetheless, all of the predictions were significantly correlated to the observations, implying that the systematic errors can be corrected. The numerical simulations did not resolve the incident-band swash motions, as expected, and the parameterized model performed best at predicting incident-band swash heights. An assimilated prediction using a weighted average of the parameterized model and the numerical simulations resulted in a reduction in prediction error variance. Finally, the numerical simulations were extended to include storm conditions that have not been previously observed. These results indicated that the parameterized predictions of setup may need modification for extreme conditions; numerical simulations can be used to extend the validity of the parameterized predictions of infragravity swash; and numerical simulations systematically underpredict incident swash, which is relatively unimportant under extreme conditions.
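One plausible way to form such an assimilated prediction is inverse-error-variance weighting of the two estimates, assuming independent errors; the weights used in the study may differ, and the numbers below are hypothetical.

```python
# Sketch of blending the parameterized model and the numerical simulation
# with inverse-error-variance weights (assumes independent errors).

runup_param, var_param = 2.1, 0.09    # parameterized runup estimate (m) and error variance
runup_num, var_num = 2.6, 0.25        # numerically simulated runup (m) and error variance

w_param = (1.0 / var_param) / (1.0 / var_param + 1.0 / var_num)
w_num = 1.0 - w_param
runup_blend = w_param * runup_param + w_num * runup_num
var_blend = 1.0 / (1.0 / var_param + 1.0 / var_num)   # variance of the weighted estimate

print(f"blended runup = {runup_blend:.2f} m, error variance = {var_blend:.3f} "
      f"(vs {min(var_param, var_num):.3f} for the better single estimate)")
```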
Geostatistical modeling of riparian forest microclimate and its implications for sampling
Eskelson, B.N.I.; Anderson, P.D.; Hagar, J.C.; Temesgen, H.
2011-01-01
Predictive models of microclimate under various site conditions in forested headwater stream - riparian areas are poorly developed, and sampling designs for characterizing underlying riparian microclimate gradients are sparse. We used riparian microclimate data collected at eight headwater streams in the Oregon Coast Range to compare ordinary kriging (OK), universal kriging (UK), and kriging with external drift (KED) for point prediction of mean maximum air temperature (Tair). Several topographic and forest structure characteristics were considered as site-specific parameters. Height above stream and distance to stream were the most important covariates in the KED models, which outperformed OK and UK in terms of root mean square error. Sample patterns were optimized based on the kriging variance and the weighted means of shortest distance criterion using the simulated annealing algorithm. The optimized sample patterns outperformed systematic sample patterns in terms of mean kriging variance mainly for small sample sizes. These findings suggest methods for increasing efficiency of microclimate monitoring in riparian areas.
NASA Astrophysics Data System (ADS)
Correia, Carlos M.; Bond, Charlotte Z.; Sauvage, Jean-François; Fusco, Thierry; Conan, Rodolphe; Wizinowich, Peter L.
2017-10-01
We build on a long-standing tradition in astronomical adaptive optics (AO) of specifying performance metrics and error budgets using linear systems modeling in the spatial-frequency domain. Our goal is to provide a comprehensive tool for the calculation of error budgets in terms of residual temporally filtered phase power spectral densities and variances. In addition, the fast simulation of AO-corrected point spread functions (PSFs) provided by this method can be used as inputs for simulations of science observations with next-generation instruments and telescopes, in particular to predict post-coronagraphic contrast improvements for planet finder systems. We extend the previous results and propose the synthesis of a distributed Kalman filter to mitigate both aniso-servo-lag and aliasing errors whilst minimizing the overall residual variance. We discuss applications to (i) analytic AO-corrected PSF modeling in the spatial-frequency domain, (ii) post-coronagraphic contrast enhancement, (iii) filter optimization for real-time wavefront reconstruction, and (iv) PSF reconstruction from system telemetry. Under perfect knowledge of wind velocities, we show that ~60 nm rms error reduction can be achieved with the distributed Kalman filter embodying anti-aliasing reconstructors on 10 m class high-order AO systems, leading to contrast improvement factors of up to three orders of magnitude at a few λ/D separations (~1-5 λ/D) for a 0 magnitude star and reaching close to one order of magnitude for a 12 magnitude star.
Reliability of a Longitudinal Sequence of Scale Ratings
ERIC Educational Resources Information Center
Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony
2009-01-01
Reliability captures the influence of error on a measurement and, in the classical setting, is defined as one minus the ratio of the error variance to the total variance. Laenen, Alonso, and Molenberghs ("Psychometrika" 73:443-448, 2007) proposed an axiomatic definition of reliability and introduced the R[subscript T] coefficient, a measure of…
Impact of Measurement Error on Statistical Power: Review of an Old Paradox.
ERIC Educational Resources Information Center
Williams, Richard H.; And Others
1995-01-01
The paradox that a Student t-test based on pretest-posttest differences can attain its greatest power when the difference score reliability is zero was explained by demonstrating that power is not a mathematical function of reliability unless either true score variance or error score variance is constant. (SLD)
Grogger, P; Sacher, C; Weber, S; Millesi, G; Seemann, R
2018-04-10
Deviations in measuring dentofacial components on a lateral X-ray represent a major hurdle in the subsequent treatment of dysgnathic patients. In a retrospective study, we investigated the most prevalent source of error in the following commonly used cephalometric measurements: the angles Sella-Nasion-Point A (SNA), Sella-Nasion-Point B (SNB) and Point A-Nasion-Point B (ANB); the Wits appraisal; the anteroposterior dysplasia indicator (APDI); and the overbite depth indicator (ODI). Preoperative lateral radiographic images of patients with dentofacial deformities were collected and the landmarks digitally traced by three independent raters. Cephalometric analysis was automatically performed based on 1116 tracings. Error analysis identified the x-coordinate of Point A as the prevalent source of error in all investigated measurements except SNB, in which it is not incorporated. In SNB, the y-coordinate of Nasion dominated the error variance. SNB showed the lowest inter-rater variation. In addition, our observations confirmed previous studies showing that landmark identification variance follows characteristic error envelopes, based on the largest number of tracings analysed to date. Variance orthogonal to defining planes was relevant, while variance parallel to planes was not. Taking these findings into account, orthognathic surgeons as well as orthodontists should be able to perform cephalometry more accurately and accomplish better therapeutic results. Copyright © 2018 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
Mapping from disease-specific measures to health-state utility values in individuals with migraine.
Gillard, Patrick J; Devine, Beth; Varon, Sepideh F; Liu, Lei; Sullivan, Sean D
2012-05-01
The objective of this study was to develop empirical algorithms that estimate health-state utility values from disease-specific quality-of-life scores in individuals with migraine. Data from a cross-sectional, multicountry study were used. Individuals with episodic and chronic migraine were randomly assigned to training or validation samples. Spearman's correlation coefficients between paired EuroQol five-dimensional (EQ-5D) questionnaire utility values and both Headache Impact Test (HIT-6) scores and Migraine-Specific Quality-of-Life Questionnaire version 2.1 (MSQ) domain scores (role restrictive, role preventive, and emotional function) were examined. Regression models were constructed to estimate EQ-5D questionnaire utility values from the HIT-6 score or the MSQ domain scores. Preferred algorithms were confirmed in the validation samples. In episodic migraine, the preferred HIT-6 and MSQ algorithms explained 22% and 25% of the variance (R2) in the training samples, respectively, and had similar prediction errors (root mean square errors of 0.30). In chronic migraine, the preferred HIT-6 and MSQ algorithms explained 36% and 45% of the variance in the training samples, respectively, and had similar prediction errors (root mean square errors 0.31 and 0.29). In episodic and chronic migraine, no statistically significant differences were observed between the mean observed and the mean estimated EQ-5D questionnaire utility values for the preferred HIT-6 and MSQ algorithms in the validation samples. The relationship between the EQ-5D questionnaire and the HIT-6 or the MSQ is adequate to use regression equations to estimate EQ-5D questionnaire utility values. The preferred HIT-6 and MSQ algorithms will be useful in estimating health-state utilities in migraine trials in which no preference-based measure is present. Copyright © 2012 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
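The mapping approach can be sketched as an ordinary least-squares regression of EQ-5D utilities on HIT-6 scores fitted in a training sample; the simulated data and resulting coefficients are hypothetical and are not the published algorithms.

```python
# Sketch of the mapping idea: regress EQ-5D utilities on HIT-6 scores and
# apply the fitted equation to new scores. Data and coefficients are made up.
import numpy as np

rng = np.random.default_rng(5)
hit6 = rng.uniform(44, 78, 300)                        # HIT-6 total scores
eq5d = 1.1 - 0.009 * hit6 + rng.normal(0, 0.15, 300)   # simulated utilities

X = np.column_stack([np.ones_like(hit6), hit6])
beta, *_ = np.linalg.lstsq(X, eq5d, rcond=None)        # OLS fit: intercept and slope
pred = X @ beta
rmse = np.sqrt(np.mean((eq5d - pred) ** 2))
r2 = 1 - np.sum((eq5d - pred) ** 2) / np.sum((eq5d - eq5d.mean()) ** 2)
print(f"utility = {beta[0]:.3f} + ({beta[1]:.4f}) * HIT-6, R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```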
Xu, Chonggang; Gertner, George
2013-01-01
Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. To date, FAST analysis has mainly been confined to estimating partial variances contributed by the main effects of model parameters and does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to the variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements. PMID:24143037
Xu, Chonggang; Gertner, George
2011-01-01
Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. To date, FAST analysis has mainly been confined to estimating partial variances contributed by the main effects of model parameters and does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to the variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements.
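For orientation, the quantity FAST estimates for a main effect is the first-order sensitivity index S_i = Var(E[Y|X_i]) / Var(Y). The sketch below approximates it by simple random sampling and binning rather than by the FAST periodic-sampling scheme itself; the toy model is an assumption.

```python
# Estimate first-order sensitivity indices by binning random samples
# (a simple stand-in for the quantity FAST targets, not the FAST algorithm).
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
x1, x2, x3 = rng.uniform(-1, 1, (3, n))
y = 4.0 * x1 + 1.0 * x2 + 0.25 * x1 * x3        # toy model with an interaction term

def first_order_index(x, y, bins=50):
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    cond_means = np.array([y[idx == b].mean() for b in range(bins)])
    return cond_means.var() / y.var()            # Var(E[Y|X_i]) / Var(Y)

for name, x in [("x1", x1), ("x2", x2), ("x3", x3)]:
    print(name, round(first_order_index(x, y), 3))
```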
Hinton-Bayre, Anton D
2011-02-01
There is an ongoing debate over the preferred method(s) for determining the reliable change (RC) in individual scores over time. In the present paper, specificity comparisons of several classic and contemporary RC models were made using a real data set. This included a more detailed review of a new RC model recently proposed in this journal, which used the within-subjects standard deviation (WSD) as the error term. It was suggested that the RC(WSD) was more sensitive to change and theoretically superior. The current paper demonstrated that even in the presence of mean practice effects, false-positive rates were comparable across models when reliability was good and initial and retest variances were equivalent. However, when variances differed, discrepancies in classification across models became evident. Notably, the RC using the WSD provided unacceptably high false-positive rates in this setting. It was considered that the WSD was never intended for measuring change in this manner. The WSD actually combines systematic and error variance. The systematic variance comes from measurable between-treatment differences, commonly referred to as practice effect. It was further demonstrated that removal of the systematic variance and appropriate modification of the residual error term for the purpose of testing individual change yielded an error term already published and criticized in the literature. A consensus on the RC approach is needed. To that end, further comparison of models under varied conditions is encouraged.
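The contrast between a practice-corrected error term and the WSD can be sketched as follows; the specific RC formulations compared in the paper vary, and all scores and group statistics below are simulated.

```python
# Sketch: a practice-corrected reliable change (RC) index, plus a check that
# the within-subjects SD (WSD) absorbs systematic practice variance as well
# as error variance, which is the criticism summarized above.
import numpy as np

rng = np.random.default_rng(7)
n = 200
true_score = rng.normal(100, 10, n)
baseline = true_score + rng.normal(0, 4, n)          # measurement error SD = 4
retest = true_score + 3.0 + rng.normal(0, 4, n)      # mean practice effect of +3 points

sd1 = baseline.std(ddof=1)
r12 = np.corrcoef(baseline, retest)[0, 1]
se_diff = np.sqrt(2) * sd1 * np.sqrt(1 - r12)        # classical SE of the difference
practice = (retest - baseline).mean()

x1, x2 = 95.0, 104.0                                 # one individual's two scores
rc_classic = (x2 - x1 - practice) / se_diff
print(f"practice-corrected RC = {rc_classic:.2f}")

# WSD pools each subject's two scores, so the practice effect inflates it
# relative to an error-only standard error of measurement (SEM).
wsd = np.sqrt(np.mean((retest - baseline) ** 2) / 2)
sem = se_diff / np.sqrt(2)
print(f"WSD = {wsd:.2f} vs error-only SEM = {sem:.2f}")
```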
Sensitivity study on durability variables of marine concrete structures
NASA Astrophysics Data System (ADS)
Zhou, Xin'gang; Li, Kefei
2013-06-01
In order to study the influence of parameters on the durability of marine concrete structures, a parameter sensitivity analysis was performed. Using Fick's second law of diffusion and the deterministic sensitivity analysis (DSA) method, the sensitivity factors of the apparent surface chloride content, the apparent chloride diffusion coefficient and its time-dependent attenuation factor were analyzed. The results show that the design variables differ in their impact on concrete durability: the sensitivity factors of the chloride diffusion coefficient and its time-dependent attenuation factor were higher than the others, so a relatively small error in these two quantities induces a larger error in concrete durability design and life prediction. Using probabilistic sensitivity analysis (PSA), the influence of the mean value and variance of the concrete durability design variables on the durability failure probability was studied. The results provide quantitative measures of the importance of concrete durability design and life prediction variables. It was concluded that the chloride diffusion coefficient and its time-dependent attenuation factor have the greatest influence on the reliability of marine concrete structural durability. In the durability design and life prediction of marine concrete structures, it is therefore very important to reduce the measurement and statistical error of the durability design variables.
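A hedged sketch of the underlying durability model and a deterministic sensitivity factor: chloride ingress from the error-function solution of Fick's second law with a time-dependent diffusion coefficient, and sensitivities estimated by small finite-difference perturbations. Parameter values and the exact parameterization are assumptions, not those of the paper.

```python
# Chloride ingress from Fick's second law with a time-dependent diffusion
# coefficient D(t) = D_ref * (t_ref / t)^alpha; deterministic sensitivity
# factors from 1% finite-difference perturbations. Values are hypothetical.
import math

def chloride(x, t, cs, d_ref, alpha, t_ref=28 / 365.0):
    d_app = d_ref * (t_ref / t) ** alpha                 # apparent diffusivity at time t
    return cs * (1.0 - math.erf(x / (2.0 * math.sqrt(d_app * t))))

x, t = 0.05, 50.0                       # cover depth (m), exposure time (years)
cs, d_ref, alpha = 0.6, 1.0e-4, 0.4     # surface content, D_ref (m^2/yr), attenuation factor

base = chloride(x, t, cs, d_ref, alpha)
perturbed = {"Cs": (cs * 1.01, d_ref, alpha),
             "D_ref": (cs, d_ref * 1.01, alpha),
             "alpha": (cs, d_ref, alpha * 1.01)}
for name, args in perturbed.items():
    sensitivity = ((chloride(x, t, *args) - base) / base) / 0.01
    print(f"sensitivity factor of {name}: {sensitivity:.3f}")
```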
Model determination in a case of heterogeneity of variance using sampling techniques.
Varona, L; Moreno, C; Garcia-Cortes, L A; Altarriba, J
1997-01-12
A sampling-based model determination procedure is described for a case of heterogeneity of variance. The procedure makes use of the predictive distribution of each observation given the rest of the data and the structure of the assumed model. The computation of these predictive distributions is carried out using a Gibbs sampling procedure. The final criterion for comparing models is the mean square error between the expectations of the predictive distributions and the real data. The procedure was applied to a data set of weight at 210 days in the Spanish Pirenaica beef cattle breed. Three proposed models were compared: (a) Single Trait Animal Model; (b) Heterogeneous Variance Animal Model; and (c) Multiple Trait Animal Model. After applying the procedure, the best-fitting model was the Heterogeneous Variance Animal Model. This result is probably due to a compromise between the complexity of the model and the amount of available information. The estimated heritabilities under the preferred model were 0.489 ± 0.076 for males and 0.331 ± 0.082 for females. Summary (translated from the Spanish abstract): Model comparison in a case of heterogeneity of variance using sampling methods. A model comparison method based on sampling techniques is described for a case of heterogeneity of variance between sexes. The procedure uses the predictive distributions of each observation given the rest of the data and the structure of the model. The criterion for comparing models is the mean square error between the expectation of the predictive distributions and the real data. The procedure was applied to data on weight at 210 days in the Pirenaica beef cattle breed. Three possible models were proposed: (a) a single-trait animal model; (b) an animal model with heterogeneous variances; (c) a multiple-trait animal model. The best-fitting model was the animal model with heterogeneous variances. This result is probably due to a compromise between the complexity of the model and the amount of available data. The heritabilities estimated under the preferred model were 0.489 ± 0.076 in males and 0.331 ± 0.082 in females. 1997 Blackwell Verlag GmbH.
Statistics of the radiated field of a space-to-earth microwave power transfer system
NASA Technical Reports Server (NTRS)
Stevens, G. H.; Leininger, G.
1976-01-01
Statistics such as the average power density pattern, the variance of the power density pattern and the variance of the beam pointing error are related to hardware parameters such as transmitter rms phase error and rms amplitude error. A limitation on the spectral width of the phase reference for phase control was also established. A 1 km diameter transmitter appears feasible provided the total rms insertion phase errors of the phase control modules do not exceed 10 deg, amplitude errors do not exceed 10% rms, and the phase reference spectral width does not exceed approximately 3 kHz. With these conditions the expected radiation pattern is virtually the same as the error-free pattern, and the rms beam pointing error would be insignificant (approximately 10 meters).
The effect of deviance predictability on mismatch negativity in schizophrenia patients.
Horacek, Magdalena; Kärgel, Christian; Scherbaum, Norbert; Müller, Bernhard W
2016-03-23
Mismatch negativity (MMN) is an electrophysiological index of prediction error processing and has recently been considered an endophenotype marker in schizophrenia. While the prediction error is a core concept in MMN generation, predictability of deviance occurrence has rarely been assessed in MMN research and in schizophrenia patients. We investigated the MMN to 12% temporally predictable or unpredictable duration-decrement deviant stimuli in two runs in 29 healthy controls and 31 schizophrenia patients. We analyzed MMN amplitudes and latencies and their associations with clinical symptoms at electrode Fz. With a stimulus onset asynchrony of 500 ms, a deviant occurred every 4 s in the regular, predictable condition, while it varied randomly in the unpredictable condition. In the random condition we found diminished MMN amplitudes in patients, which normalized in the regular deviance condition, resulting in an analysis-of-variance main effect of predictability and a predictability × group interaction. Deviance predictability did not affect the MMN of control subjects, and we found no relevant results with regard to MMN latencies. Our results indicate that MMN amplitudes in patients normalize to the level of the control subjects in the case of a temporally fixed regular deviant. In schizophrenia patients the detection of deviance is basically intact. However, the temporal uncertainty of deviance occurrence may be of substantial relevance to the highly replicated MMN deficit in schizophrenia patients. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Variance analysis of forecasted streamflow maxima in a wet temperate climate
NASA Astrophysics Data System (ADS)
Al Aamery, Nabil; Fox, James F.; Snyder, Mark; Chandramouli, Chandra V.
2018-05-01
Coupling global climate models, hydrologic models and extreme value analysis provides a method to forecast streamflow maxima, however the elusive variance structure of the results hinders confidence in application. Directly correcting the bias of forecasts using the relative change between forecast and control simulations has been shown to marginalize hydrologic uncertainty, reduce model bias, and remove systematic variance when predicting mean monthly and mean annual streamflow, prompting our investigation for streamflow maxima. We assess the variance structure of streamflow maxima using realizations of emission scenario, global climate model type and project phase, downscaling methods, bias correction, extreme value methods, and hydrologic model inputs and parameterization. Results show that the relative change of streamflow maxima was not dependent on systematic variance from the annual maxima versus peak over threshold method applied, although we stress that researchers must strictly adhere to rules from extreme value theory when applying the peak over threshold method. Regardless of which method is applied, extreme value model fitting does add variance to the projection, and the variance is an increasing function of the return period. Unlike the relative change of mean streamflow, results show that the variance of the maxima's relative change was dependent on all climate model factors tested as well as hydrologic model inputs and calibration. Ensemble projections forecast an increase of streamflow maxima for 2050 with pronounced forecast standard error, including an increase of +30(±21), +38(±34) and +51(±85)% for 2, 20 and 100 year streamflow events for the wet temperate region studied. The variance of maxima projections was dominated by climate model factors and extreme value analyses.
RFI in hybrid loops - Simulation and experimental results.
NASA Technical Reports Server (NTRS)
Ziemer, R. E.; Nelson, D. R.; Raghavan, H. R.
1972-01-01
A digital simulation of an imperfect second-order hybrid phase-locked loop (HPLL) operating in radio frequency interference (RFI) is described. Its performance is characterized in terms of phase error variance and phase error probability density function (PDF). Monte-Carlo simulation is used to show that the HPLL can be superior to the conventional phase-locked loops in RFI backgrounds when minimum phase error variance is the goodness criterion. Similar experimentally obtained data are given in support of the simulation data.
Standard errors in forest area
Joseph McCollum
2002-01-01
I trace the development of standard error equations for forest area, beginning with the theory behind double sampling and the variance of a product. The discussion shifts to the particular problem of forest area - at which time the theory becomes relevant. There are subtle difficulties in figuring out which variance of a product equation should be used. The equations...
Evaluation of JGM 2 geopotential errors from Geosat, TOPEX/Poseidon and ERS-1 crossover altimetry
NASA Astrophysics Data System (ADS)
Wagner, C. A.; Klokocník, J.; Tai, C. K.
1995-08-01
World-ocean distribution of the crossover altimetry data from Geosat, TOPEX/Poseidon (T/P) and the ERS 1 missions has provided strong independent evidence that NASA's/CSR's JGM 2 geopotential model (70 x 70 in spherical harmonics) yields accurate radial ephemerides for these satellites. In testing the sea height crossover differences found from altimetry and JGM 2 orbits for these satellites, we have used the sea height differences themselves (of ascending minus descending passes averaged at each location over many exact repeat cycles) and the Lumped Latitude Coefficients (LLC) derived from them. For Geosat we find the geopotential-induced LLC errors (exclusive of non-gravitational and initial state discrepancies) mostly below 6 cm, for TOPEX the corresponding errors are usually below 2 cm, and for ERS 1 (35-day cycle) they are generally below 5 cm. In addition, we have found that these observations agree well overall with predictions of accuracy derived from the JGM 2 variance-covariance matrix; the corresponding projected LLC errors for Geosat, T/P, and ERS 1 are usually between 1 and 4 cm, 1 - 2 cm, and 1 - 4 cm, respectively (they depend on the filtering of long-periodic perturbations and on the order of the LLC). This agreement is especially impressive for ERS 1 since no data of any kind from this mission was used in forming JGM 2. The observed crossover differences for Geosat, T/P and ERS 1 are 8, 3, and 11 cm (rms), respectively. These observations also agree well with predictions of accuracy derived from the JGM 2 variance-covariance matrix; the corresponding projected crossover errors for Geosat and T/P are 8 cm and 2.3 cm, respectively. The precision of our mean difference observations is about 3 cm for Geosat (approx. 24,000 observations), 1.5 cm for T/P (approx. 6,000 observations) and 5 cm for ERS 1 (approx. 44,000 observations). Thus, these "global" independent data should provide a valuable new source for improving geopotential models. Our results show the need for further correction of the low order JGM 2 geopotential as well as certain resonant orders for all 3 satellites.
An analytic technique for statistically modeling random atomic clock errors in estimation
NASA Technical Reports Server (NTRS)
Fell, P. J.
1981-01-01
Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System and baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first order Markov processes is used to approximate the power spectral density obtained from the Allan variance.
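As background for the modeling described above, the Allan variance of fractional-frequency data can be computed as follows; the simulated white frequency noise is only a stand-in for a real oscillator record, and the paper's Markov-process approximation of the power spectral density is not reproduced here.

```python
# Non-overlapping Allan variance of fractional-frequency data, the clock
# stability statistic the modeling above starts from.
import numpy as np

def allan_variance(y, m):
    """Allan variance at averaging time tau = m * tau0 for frequency data y."""
    n_blocks = y.size // m
    block_means = y[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
    return 0.5 * np.mean(np.diff(block_means) ** 2)

rng = np.random.default_rng(8)
tau0 = 1.0                                   # sampling interval, s
y = 1e-11 * rng.normal(size=100_000)         # simulated white frequency noise
for m in (1, 10, 100, 1000):
    sigma_y = np.sqrt(allan_variance(y, m))
    print(f"tau = {m * tau0:7.0f} s   sigma_y = {sigma_y:.2e}")
```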
Errors in radial velocity variance from Doppler wind lidar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, H.; Barthelmie, R. J.; Doubrawa, P.
A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.
Errors in radial velocity variance from Doppler wind lidar
Wang, H.; Barthelmie, R. J.; Doubrawa, P.; ...
2016-08-29
A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.
NASA Astrophysics Data System (ADS)
Gill, Jatinder; Singh, Jagdev
2018-07-01
In this work, an experimental investigation is carried out with an R134a and LPG refrigerant mixture to characterize the mass flow rate through straight and helical coil adiabatic capillary tubes in a vapor compression refrigeration system. Various experiments were conducted under steady-state conditions by changing capillary tube length, inner diameter, coil diameter and degree of subcooling. The results showed that the mass flow rate through the helical coil capillary tube was about 5-16% lower than through the straight capillary tube. Dimensionless correlation and Artificial Neural Network (ANN) models were developed to predict the mass flow rate. The dimensionless correlation and ANN model predictions agreed well with the experimental results, yielding absolute fractions of variance of 0.961 and 0.988, root mean square errors of 0.489 and 0.275, and mean absolute percentage errors of 4.75% and 2.31%, respectively. The results suggest that the ANN model provides better statistical predictions than the dimensionless correlation model.
On the impact of relatedness on SNP association analysis.
Gross, Arnd; Tönjes, Anke; Scholz, Markus
2017-12-06
When testing for SNP (single nucleotide polymorphism) associations in related individuals, observations are not independent. Simple linear regression assuming independent normally distributed residuals results in an increased type I error, and the power of the test is also affected in a more complicated manner. Inflation of type I error is often successfully corrected by genomic control. However, this reduces the power of the test when relatedness is of concern. In the present paper, we derive explicit formulae to investigate how heritability and strength of relatedness contribute to variance inflation of the effect estimate of the linear model. Further, we study the consequences of variance inflation on hypothesis testing and compare the results with those of genomic control correction. We apply the developed theory to the publicly available HapMap trio data (N=129), the Sorbs (a self-contained population with N=977 characterised by a cryptic relatedness structure) and synthetic family studies with different sample sizes (ranging from N=129 to N=999) and different degrees of relatedness. We derive explicit and easy-to-apply approximation formulae to estimate the impact of relatedness on the variance of the effect estimate of the linear regression model. Variance inflation increases with increasing heritability. Relatedness structure also impacts the degree of variance inflation, as shown for the example family structures. Variance inflation is smallest for HapMap trios, followed by a synthetic family study corresponding to the trio data but with larger sample size than HapMap. The next strongest inflation is observed for the Sorbs, and finally, for a synthetic family study with a more extreme relatedness structure but with a sample size similar to that of the Sorbs. Type I error increases rapidly with increasing inflation. However, for smaller significance levels, power increases with increasing inflation while the opposite holds for larger significance levels. When genomic control is applied, type I error is preserved while power decreases rapidly with increasing variance inflation. Stronger relatedness as well as higher heritability result in increased variance of the effect estimate of simple linear regression analysis. While type I error rates are generally inflated, the behaviour of power is more complex since power can be increased or reduced depending on relatedness and the heritability of the phenotype. Genomic control cannot be recommended to deal with inflation due to relatedness. Although it preserves type I error, the loss in power can be considerable. We provide a simple formula for estimating variance inflation given the relatedness structure and the heritability of a trait of interest. As a rule of thumb, variance inflation below 1.05 does not require correction and simple linear regression analysis is still appropriate.
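A sketch of the genomic control correction discussed above: estimate the inflation factor from the median association chi-square under the null and deflate the statistics. The simulated inflation level and significance threshold are arbitrary.

```python
# Genomic control: estimate lambda from the median chi-square of null SNP
# statistics and rescale, which restores the nominal type I error rate.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(9)
inflation = 1.15                                        # variance inflation due to relatedness
chisq = inflation * rng.chisquare(df=1, size=50_000)    # inflated null association statistics

lam = np.median(chisq) / chi2.ppf(0.5, df=1)            # chi2.ppf(0.5, 1) ~ 0.4549
chisq_gc = chisq / lam                                  # genomic-control corrected statistics

alpha = 5e-4
print(f"lambda_GC = {lam:.3f}")
print("type I error before GC:", np.mean(chi2.sf(chisq, df=1) < alpha))
print("type I error after  GC:", np.mean(chi2.sf(chisq_gc, df=1) < alpha))
```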
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erickson, Jason P.; Carlson, Deborah K.; Ortiz, Anne
Accurate location of seismic events is crucial for nuclear explosion monitoring. There are several sources of error in seismic location that must be taken into account to obtain high confidence results. Most location techniques account for uncertainties in the phase arrival times (measurement error) and the bias of the velocity model (model error), but they do not account for the uncertainty of the velocity model bias. By determining and incorporating this uncertainty in the location algorithm we seek to improve the accuracy of the calculated locations and uncertainty ellipses. In order to correct for deficiencies in the velocity model, it is necessary to apply station-specific corrections to the predicted arrival times. Both master event and multiple event location techniques assume that the station corrections are known perfectly, when in reality there is an uncertainty associated with these corrections. For multiple event location algorithms that calculate station corrections as part of the inversion, it is possible to determine the variance of the corrections. The variance can then be used to weight the arrivals associated with each station, thereby giving more influence to stations with consistent corrections. We have modified an existing multiple event location program (based on PMEL, Pavlis and Booker, 1983). We are exploring weighting arrivals with the inverse of the station correction standard deviation as well as using the conditional probability of the calculated station corrections. This is in addition to the weighting already given to the measurement and modeling error terms. We re-locate a group of mining explosions that occurred at Black Thunder, Wyoming, and compare the results to those generated without accounting for station correction uncertainty.
Verification of models for ballistic movement time and endpoint variability.
Lin, Ray F; Drury, Colin G
2013-01-01
A hand control movement is composed of several ballistic movements. The time required to perform a ballistic movement and its endpoint variability are two important properties in developing movement models. The purpose of this study was to test potential models for predicting these two properties. Twelve participants conducted ballistic movements of specific amplitudes using a drawing tablet. The measured movement time and endpoint variability data were then used to verify the models. The study was successful, with Hoffmann and Gan's movement time model (Hoffmann, 1981; Gan and Hoffmann, 1988) predicting more than 90.7% of the data variance for 84 individual measurements. A new, theoretically developed ballistic movement variability model proved to be better than Howarth, Beggs, and Bowden's (1971) model, predicting on average 84.8% of stopping-variable errors and 88.3% of aiming-variable errors. These two validated models will help build solid theoretical movement models and evaluate input devices. This article provides better models for predicting the end accuracy and movement time of ballistic movements, which are desirable in rapid aiming tasks such as keying in numbers on a smart phone. The models allow better design of aiming tasks, for example button sizes on mobile phones for different user populations.
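For illustration, the ballistic movement time model attributed to Gan and Hoffmann (1988) is often written as MT = a + b*sqrt(A), where A is the movement amplitude. The sketch below fits that form by least squares to synthetic data; the amplitudes, noise level, and coefficients are invented, not the study's measurements.

```python
# Hedged sketch: fitting a ballistic movement time model of the form
# MT = a + b*sqrt(A) (amplitude A) to synthetic data, not the study's data.
import numpy as np

rng = np.random.default_rng(2)
A = np.repeat(np.array([20, 40, 80, 160, 320], dtype=float), 12)  # amplitudes (mm), 12 reps each
mt = 80 + 9.0 * np.sqrt(A) + rng.normal(0, 10, A.size)            # simulated movement times (ms)

X = np.column_stack([np.ones_like(A), np.sqrt(A)])
coef, *_ = np.linalg.lstsq(X, mt, rcond=None)
pred = X @ coef
r2 = 1 - np.sum((mt - pred) ** 2) / np.sum((mt - mt.mean()) ** 2)
print(f"a = {coef[0]:.1f} ms, b = {coef[1]:.2f} ms/sqrt(mm), R^2 = {r2:.3f}")
```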
Gap-filling methods to impute eddy covariance flux data by preserving variance.
NASA Astrophysics Data System (ADS)
Kunwor, S.; Staudhammer, C. L.; Starr, G.; Loescher, H. W.
2015-12-01
To represent carbon dynamics, in terms of the exchange of CO2 between the terrestrial ecosystem and the atmosphere, eddy covariance (EC) data have been collected using eddy flux towers at sites across the globe for more than two decades. However, EC measurements are missing for various reasons: precipitation, routine maintenance, or lack of vertical turbulence. In order to obtain estimates of net ecosystem exchange of carbon dioxide (NEE) with high precision and accuracy, robust gap-filling methods to impute missing data are required. While the methods used so far have provided robust estimates of the mean value of NEE, little attention has been paid to preserving the variance structure embodied by the flux data. Preserving the variance of these data will provide unbiased and precise estimates of NEE over time, which mimic natural fluctuations. We used a non-linear regression approach with moving windows of different lengths (15, 30, and 60 days) to estimate non-linear regression parameters for one year of flux data from a longleaf pine site at the Joseph Jones Ecological Research Center. We used the Michaelis-Menten and Van't Hoff functions as our base. We assessed the potential physiological drivers of these parameters with linear models using micrometeorological predictors. We then used a parameter prediction approach to refine the non-linear gap-filling equations based on micrometeorological conditions. This provides an opportunity to incorporate additional variables, such as vapor pressure deficit (VPD) and volumetric water content (VWC), into the equations. Our preliminary results indicate that improvements in gap-filling can be gained with a 30-day moving window with additional micrometeorological predictors (as indicated by lower root mean square error (RMSE) of the predicted values of NEE). Our next steps are to use these parameter predictions from moving windows to gap-fill the data with and without incorporating potential driver variables of the parameters. Then, predictions from these methods and from 'traditional' gap-filling methods (using 12 fixed monthly windows) will be compared to assess the extent to which variance is preserved. Further, this method will be applied to impute artificially created gaps to analyze whether variance is preserved.
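As a concrete illustration of the parameter-fitting step, the sketch below fits a Michaelis-Menten (rectangular hyperbola) light-response function to synthetic daytime NEE data with scipy. The functional form and parameter names follow common gap-filling conventions and are assumptions, not necessarily the exact equations used at this site.

```python
# Hedged sketch: non-linear fit of a Michaelis-Menten light-response curve,
# NEE = -(alpha * PAR * Pmax) / (alpha * PAR + Pmax) + Rd,
# to synthetic flux data (a common gap-filling form, not the site's exact model).
import numpy as np
from scipy.optimize import curve_fit

def mm_nee(par, alpha, pmax, rd):
    return -(alpha * par * pmax) / (alpha * par + pmax) + rd

rng = np.random.default_rng(3)
par = rng.uniform(0, 2000, 300)                      # PAR (umol m-2 s-1)
nee = mm_nee(par, 0.05, 25.0, 3.0) + rng.normal(0, 1.5, par.size)

popt, pcov = curve_fit(mm_nee, par, nee, p0=[0.03, 20.0, 2.0])
perr = np.sqrt(np.diag(pcov))                        # 1-sigma parameter errors
print("alpha, Pmax, Rd:", popt.round(3))
print("std errors:", perr.round(3))
```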
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
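A minimal sketch of the robust (sandwich) variance idea for weighted meta-regression follows, with the usual inverse-variance weights treated as a working covariance. The study-level data, the between-study variance, and the covariate are synthetic, and this is not the Knapp and Hartung adjustment itself.

```python
# Hedged sketch: model-based vs. robust (sandwich) variances for the
# coefficients of a weighted meta-regression (synthetic study-level data).
import numpy as np

rng = np.random.default_rng(4)
k = 20                                            # number of studies
x = rng.uniform(0, 1, k)                          # study-level covariate
v = rng.uniform(0.02, 0.2, k)                     # within-study variances
tau2 = 0.05                                       # assumed between-study variance
theta = 0.2 + 0.5 * x + rng.normal(0, np.sqrt(v + tau2))

X = np.column_stack([np.ones(k), x])
W = np.diag(1.0 / (v + tau2))                     # estimated (working) weights
bread = np.linalg.inv(X.T @ W @ X)
beta = bread @ X.T @ W @ theta
resid = theta - X @ beta

var_model = bread                                 # model-based covariance
meat = X.T @ W @ np.diag(resid ** 2) @ W @ X
var_robust = bread @ meat @ bread                 # sandwich covariance
print("model-based SEs:", np.sqrt(np.diag(var_model)).round(3))
print("robust SEs:     ", np.sqrt(np.diag(var_robust)).round(3))
```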
Combining clinical variables to optimize prediction of antidepressant treatment outcomes.
Iniesta, Raquel; Malki, Karim; Maier, Wolfgang; Rietschel, Marcella; Mors, Ole; Hauser, Joanna; Henigsberg, Neven; Dernovsek, Mojca Zvezdana; Souery, Daniel; Stahl, Daniel; Dobson, Richard; Aitchison, Katherine J; Farmer, Anne; Lewis, Cathryn M; McGuffin, Peter; Uher, Rudolf
2016-07-01
The outcome of treatment with antidepressants varies markedly across people with the same diagnosis. A clinically significant prediction of outcomes could spare the frustration of a trial-and-error approach and improve the outcomes of major depressive disorder through individualized treatment selection. It is likely that a combination of multiple predictors is needed to achieve such prediction. We used elastic net regularized regression to optimize prediction of symptom improvement and remission during treatment with escitalopram or nortriptyline and to identify contributing predictors from a range of demographic and clinical variables in 793 adults with major depressive disorder. A combination of demographic and clinical variables, with strong contributions from symptoms of depressed mood, reduced interest, decreased activity, indecisiveness, pessimism and anxiety, significantly predicted treatment outcomes, explaining 5-10% of the variance in symptom improvement with escitalopram. Similar combinations of variables predicted remission with an area under the curve of 0.72, explaining approximately 15% of the variance (pseudo R(2)) in who achieves remission, with strong contributions from body mass index, appetite, the interest-activity symptom dimension and the anxious-somatizing depression subtype. Escitalopram-specific outcome prediction was more accurate than generic outcome prediction, and reached effect sizes that were near or above a previously established benchmark for clinical significance. Outcome prediction on the nortriptyline arm did not significantly differ from chance. These results suggest that easily obtained demographic and clinical variables can predict therapeutic response to escitalopram with clinically meaningful accuracy, suggesting a potential for individualized prescription of this antidepressant drug. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
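A minimal sketch of elastic net regularized regression for predicting a continuous treatment outcome from clinical variables follows, using scikit-learn. The features, data, and cross-validation setup are illustrative assumptions, not the study's pipeline.

```python
# Hedged sketch: elastic net regression predicting a continuous treatment
# outcome from demographic/clinical predictors (synthetic data).
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import r2_score

rng = np.random.default_rng(5)
n, p = 793, 30
X = rng.normal(size=(n, p))                       # standardized clinical variables
beta = np.zeros(p); beta[:6] = [0.3, -0.25, 0.2, 0.2, -0.15, 0.1]
y = X @ beta + rng.normal(0, 1.0, n)              # symptom improvement score

enet = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=10, random_state=0)
y_hat = cross_val_predict(enet, X, y, cv=10)      # out-of-sample predictions
print(f"cross-validated variance explained: {r2_score(y, y_hat):.2f}")
enet.fit(X, y)
print("non-zero predictors:", np.flatnonzero(enet.coef_))
```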
NASA Astrophysics Data System (ADS)
Suparman, Yusep; Folmer, Henk; Oud, Johan H. L.
2014-01-01
Omitted variables and measurement errors in explanatory variables frequently occur in hedonic price models. Ignoring these problems leads to biased estimators. In this paper, we develop a constrained autoregression-structural equation model (ASEM) to handle both types of problems. Standard panel data models to handle omitted variables bias are based on the assumption that the omitted variables are time-invariant. ASEM allows handling of both time-varying and time-invariant omitted variables by constrained autoregression. In the case of measurement error, standard approaches require additional external information which is usually difficult to obtain. ASEM exploits the fact that panel data are repeatedly measured, which allows the variance of a variable to be decomposed into the true variance and the variance due to measurement error. We apply ASEM to estimate a hedonic housing model for urban Indonesia. To gain insight into the consequences of measurement error and omitted variables, we compare the ASEM estimates with the outcomes of (1) a standard SEM, which does not account for omitted variables, (2) a constrained autoregression model, which does not account for measurement error, and (3) a fixed effects hedonic model, which ignores measurement error and time-varying omitted variables. The differences between the ASEM estimates and the outcomes of the three alternative approaches are substantial.
NASA Technical Reports Server (NTRS)
Alston, D. W.
1981-01-01
The objective of this research was to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the resulting statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to remove the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
Yu, Shaohui; Xiao, Xue; Ding, Hong; Xu, Ge; Li, Haixia; Liu, Jing
2017-08-05
Quantitative analysis is very difficult for the excitation-emission fluorescence spectroscopy of multi-component mixtures whose fluorescence peaks overlap severely. As an effective method for quantitative analysis, partial least squares can extract the latent variables from both the independent variables and the dependent variables, so it can model multiple correlations between variables. However, several factors usually affect the prediction results of partial least squares, such as noise and the distribution and number of samples in the calibration set. This work focuses on the problems in the calibration set mentioned above. Firstly, the outliers in the calibration set are removed by leave-one-out cross-validation. Then, according to two different prediction requirements, the EWPLS method and the VWPLS method are proposed. The independent and dependent variables are weighted in the EWPLS method by the maximum error of the recovery rate and in the VWPLS method by the maximum variance of the recovery rate. Three organic compounds with severely overlapping excitation-emission fluorescence spectra are selected for the experiments. The step adjustment parameter, the number of iterations and the number of samples in the calibration set are discussed. The results show that the EWPLS method and the VWPLS method are superior to the PLS method, especially when the calibration set contains few samples. Copyright © 2017 Elsevier B.V. All rights reserved.
Kang, Le; Chen, Weijie; Petrick, Nicholas A; Gallas, Brandon D
2015-02-20
The area under the receiver operating characteristic curve is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of the area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. Copyright © 2014 John Wiley & Sons, Ltd.
Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?
2017-01-01
Assessing the accuracy of predictive models is critical because predictive models have been increasingly used across various disciplines and predictive accuracy determines the quality of the resultant predictions. The Pearson product-moment correlation coefficient (r) and the coefficient of determination (r2) are among the most widely used measures for assessing predictive models for numerical data, although they are argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate the quantities used in the calculation of r and r2, and simulations were used to demonstrate the behaviour of r and r2 and to compare three accuracy measures under various scenarios. Relevant confusions about r and r2 have been clarified. The calculation of r and r2 is not based on the differences between the predicted and observed values. The existing error measures suffer various limitations and are unable to indicate accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe's efficiency (E1) is also an alternative accuracy measure. The r and r2 do not measure accuracy and are incorrect accuracy measures. The existing error measures suffer limitations. VEcv and E1 are recommended for assessing accuracy. The application of these accuracy measures would encourage accuracy-improved predictive models to be developed to generate predictions for evidence-informed decision-making. PMID:28837692
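A short sketch of the two recommended measures follows, using their commonly cited definitions (VEcv: variance explained by cross-validated predictions; E1: Legates and McCabe's efficiency based on absolute errors). Treat the exact formulas as assumptions drawn from the wider literature rather than from this abstract, and verify against the original paper before use.

```python
# Hedged sketch: VEcv and Legates-McCabe E1 computed from cross-validated
# predictions (definitions as commonly cited; verify against the original paper).
import numpy as np

def vecv(obs, pred):
    """Variance explained by cross-validated predictions, in percent."""
    return (1 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)) * 100

def e1(obs, pred):
    """Legates and McCabe's efficiency based on absolute errors."""
    return 1 - np.sum(np.abs(obs - pred)) / np.sum(np.abs(obs - obs.mean()))

rng = np.random.default_rng(6)
obs = rng.normal(10, 2, 100)
pred = obs + rng.normal(0, 1, 100)      # stand-in for cross-validated predictions
print(f"VEcv = {vecv(obs, pred):.1f}%, E1 = {e1(obs, pred):.2f}")
```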
NASA Astrophysics Data System (ADS)
Alexander, R. B.; Boyer, E. W.; Schwarz, G. E.; Smith, R. A.
2013-12-01
Estimating water and material stores and fluxes in watershed studies is frequently complicated by uncertainties in quantifying hydrological and biogeochemical effects of factors such as land use, soils, and climate. Although these process-related effects are commonly measured and modeled in separate catchments, researchers are especially challenged by their complexity across catchments and diverse environmental settings, leading to a poor understanding of how model parameters and prediction uncertainties vary spatially. To address these concerns, we illustrate the use of Bayesian hierarchical modeling techniques with a dynamic version of the spatially referenced watershed model SPARROW (SPAtially Referenced Regression On Watershed attributes). The dynamic SPARROW model is designed to predict streamflow and other water cycle components (e.g., evapotranspiration, soil and groundwater storage) for monthly varying hydrological regimes, using mechanistic functions, mass conservation constraints, and statistically estimated parameters. In this application, the model domain includes nearly 30,000 NHD (National Hydrography Dataset) stream reaches and their associated catchments in the Susquehanna River Basin. We report the results of our comparisons of alternative models of varying complexity, including models with different explanatory variables as well as hierarchical models that account for spatial and temporal variability in model parameters and variance (error) components. The model errors are evaluated for changes with season and catchment size and correlations in time and space. The hierarchical models consist of a two-tiered structure in which climate forcing parameters are modeled as random variables, conditioned on watershed properties. Quantification of spatial and temporal variations in the hydrological parameters and model uncertainties in this approach leads to more efficient (lower variance) and less biased model predictions throughout the river network. Moreover, predictions of water-balance components are reported according to probabilistic metrics (e.g., percentiles, prediction intervals) that include both parameter and model uncertainties. These improvements in predictions of streamflow dynamics can inform the development of more accurate predictions of spatial and temporal variations in biogeochemical stores and fluxes (e.g., nutrients and carbon) in watersheds.
1951-05-01
procedures to be of high accuracy. Ambiguity of subject responses due to overlap of entries on the record sheets was negligible. Handwriting ...experimental variables on reading errors was carried out by analysis of variance methods. For this purpose it was convenient to consider different classes...on any scale - an error of one numbered division. For this reason, the results of the analysis of variance of the /10's errors by dial types may
Sensitivity analysis of periodic errors in heterodyne interferometry
NASA Astrophysics Data System (ADS)
Ganguly, Vasishta; Kim, Nam Ho; Kim, Hyo Soo; Schmitz, Tony
2011-03-01
Periodic errors in heterodyne displacement measuring interferometry occur due to frequency mixing in the interferometer. These nonlinearities are typically characterized as first- and second-order periodic errors which cause a cyclical (non-cumulative) variation in the reported displacement about the true value. This study implements an existing analytical periodic error model in order to identify sensitivities of the first- and second-order periodic errors to the input parameters, including rotational misalignments of the polarizing beam splitter and mixing polarizer, non-orthogonality of the two laser frequencies, ellipticity in the polarizations of the two laser beams, and different transmission coefficients in the polarizing beam splitter. A local sensitivity analysis is first conducted to examine the sensitivities of the periodic errors with respect to each input parameter about the nominal input values. Next, a variance-based approach is used to study the global sensitivities of the periodic errors by calculating the Sobol' sensitivity indices using Monte Carlo simulation. The effect of variation in the input uncertainty on the computed sensitivity indices is examined. It is seen that the first-order periodic error is highly sensitive to non-orthogonality of the two linearly polarized laser frequencies, while the second-order error is most sensitive to the rotational misalignment between the laser beams and the polarizing beam splitter. A particle swarm optimization technique is finally used to predict the possible setup imperfections based on experimentally generated values for periodic errors.
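To illustrate the variance-based (Sobol') step at a generic level, the sketch below estimates first-order sensitivity indices for a toy periodic-error function with a standard pick-freeze Monte Carlo estimator. The toy function and parameter ranges are placeholders, not the paper's analytical periodic error model.

```python
# Hedged sketch: first-order Sobol' indices via a pick-freeze Monte Carlo
# estimator (Saltelli-style) for a *toy* periodic-error function.
import numpy as np

def toy_periodic_error(theta):
    # Placeholder model: error amplitude from beam-splitter misalignment,
    # polarizer misalignment and frequency non-orthogonality (radians).
    a, b, c = theta[:, 0], theta[:, 1], theta[:, 2]
    return np.sin(a) ** 2 + 0.5 * np.sin(b) * np.cos(c) + 0.1 * c ** 2

rng = np.random.default_rng(7)
n, d = 100_000, 3
A = rng.uniform(-0.05, 0.05, (n, d))
B = rng.uniform(-0.05, 0.05, (n, d))
fA, fB = toy_periodic_error(A), toy_periodic_error(B)
var = np.var(np.concatenate([fA, fB]))

for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                       # vary only input i between the two samples
    S1 = np.mean(fB * (toy_periodic_error(ABi) - fA)) / var
    print(f"first-order index S1[{i}] ~ {S1:.3f}")
```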
Modelling road accidents: An approach using structural time series
NASA Astrophysics Data System (ADS)
Junus, Noor Wahida Md; Ismail, Mohd Tahir
2014-09-01
In this paper, the trend of road accidents in Malaysia from 2001 to 2012 was modelled using a structural time series approach. The structural time series model was identified using a stepwise method, and the residuals of each model were tested. The best-fitted model was chosen based on the smallest Akaike Information Criterion (AIC) and prediction error variance. In order to check the quality of the model, a data validation procedure was performed by predicting the monthly number of road accidents for the year 2012. Results indicate that the best specification of the structural time series model to represent road accidents is the local level with a seasonal model.
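A minimal sketch of fitting and comparing structural time series specifications by AIC with statsmodels follows. The simulated monthly series and the candidate specifications are illustrative, not the Malaysian accident data.

```python
# Hedged sketch: comparing structural time series specifications by AIC
# (local level vs. local level + seasonal) on a simulated monthly series.
import numpy as np
from statsmodels.tsa.statespace.structural import UnobservedComponents

rng = np.random.default_rng(8)
n = 144                                               # 12 years of monthly data
level = np.cumsum(rng.normal(0, 2, n)) + 500
season = 30 * np.sin(2 * np.pi * np.arange(n) / 12)
y = level + season + rng.normal(0, 10, n)

specs = {
    "local level": dict(level="local level"),
    "local level + seasonal": dict(level="local level", seasonal=12),
}
fits = {name: UnobservedComponents(y, **kw).fit(disp=False) for name, kw in specs.items()}
for name, res in fits.items():
    print(f"{name:>24s}: AIC = {res.aic:.1f}")

best = min(fits.values(), key=lambda r: r.aic)
print("12-month forecast:", best.forecast(12).round(1))
```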
The development of a Kalman filter clock predictor
NASA Technical Reports Server (NTRS)
Davis, John A.; Greenhall, Charles A.; Boudjemaa, Redoane
2005-01-01
A Kalman filter based clock predictor is developed, and its performance evaluated using both simulated and real data. The clock predictor is shown to possess a near-optimal Prediction Error Variance (PEV) when the underlying noise consists of one of the power law noise processes commonly encountered in time and frequency measurements. The predictor's performance in the presence of multiple noise processes is also examined. The relationship between the PEV obtained in the presence of multiple noise processes and those obtained for the individual component noise processes is examined. Comparisons are made with a simple linear clock predictor. The clock predictor is used to predict future values of the time offset between pairs of NPL's active hydrogen masers.
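A minimal sketch of a two-state (time offset and frequency offset) Kalman filter used to predict a clock's future offset together with its prediction error variance follows. The noise intensities, measurement interval, and state model are generic assumptions, not NPL's configuration or the paper's predictor.

```python
# Hedged sketch: two-state clock Kalman filter (time offset, frequency offset)
# with white FM and random-walk FM process noise; predicts the offset one
# interval ahead along with its prediction error variance (PEV). Illustrative values.
import numpy as np

tau = 3600.0                                  # measurement/prediction interval (s)
q1, q2 = 1e-22, 1e-32                         # white FM and RW FM noise intensities (assumed)
r = (5e-10) ** 2                              # measurement noise variance (assumed)

F = np.array([[1.0, tau], [0.0, 1.0]])
Q = np.array([[q1 * tau + q2 * tau**3 / 3, q2 * tau**2 / 2],
              [q2 * tau**2 / 2,            q2 * tau]])
H = np.array([[1.0, 0.0]])

x = np.zeros(2)                               # state estimate: [time offset, frequency offset]
P = np.eye(2) * 1e-15
true = np.zeros(2)
rng = np.random.default_rng(9)

for _ in range(500):
    # simulate truth and a noisy phase (time offset) measurement
    true = F @ true + rng.multivariate_normal(np.zeros(2), Q)
    z = true[0] + rng.normal(0, np.sqrt(r))
    # predict
    x, P = F @ x, F @ P @ F.T + Q
    # update
    S = H @ P @ H.T + r
    K = P @ H.T / S
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P

x_pred, P_pred = F @ x, F @ P @ F.T + Q       # one-step-ahead prediction
print("predicted offset (s):", x_pred[0], " PEV (s^2):", P_pred[0, 0])
```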
ERIC Educational Resources Information Center
Brockmann, Frank
2011-01-01
State testing programs today are more extensive than ever, and their results are required to serve more purposes and high-stakes decisions than one might have imagined. Assessment results are used to hold schools, districts, and states accountable for student performance and to help guide a multitude of important decisions. This report describes…
Statistically Self-Consistent and Accurate Errors for SuperDARN Data
NASA Astrophysics Data System (ADS)
Reimer, A. S.; Hussey, G. C.; McWilliams, K. A.
2018-01-01
The Super Dual Auroral Radar Network (SuperDARN)-fitted data products (e.g., spectral width and velocity) are produced using weighted least squares fitting. We present a new First-Principles Fitting Methodology (FPFM) that utilizes the first-principles approach of Reimer et al. (2016) to estimate the variance of the real and imaginary components of the mean autocorrelation function (ACF) lags. SuperDARN ACFs fitted by the FPFM do not rely on ad hoc or empirical criteria. Currently, the weighting used to fit the ACF lags is derived from ad hoc estimates of the ACF lag variance. Additionally, an overcautious lag filtering criterion is used that sometimes discards data that contain useful information. In low signal-to-noise (SNR) and/or low signal-to-clutter regimes, the ad hoc variance and empirical criterion lead to underestimated errors for the fitted parameters because the relative contributions of signal, noise, and clutter to the ACF variance are not taken into consideration. The FPFM variance expressions include contributions of signal, noise, and clutter. The clutter is estimated using the maximal power-based self-clutter estimator derived by Reimer and Hussey (2015). The FPFM was successfully implemented and tested using synthetic ACFs generated with the radar data simulator of Ribeiro, Ponomarenko, et al. (2013). The fitted parameters and the fitted-parameter errors produced by the FPFM are compared with those from the current SuperDARN fitting software, FITACF. Using self-consistent statistical analysis, the FPFM produces reliable and trustworthy quantitative measures of the errors of the fitted parameters. For an SNR in excess of 3 dB and velocity error below 100 m/s, the FPFM produces 52% more data points than FITACF.
Assessing Multivariate Constraints to Evolution across Ten Long-Term Avian Studies
Teplitsky, Celine; Tarka, Maja; Møller, Anders P.; Nakagawa, Shinichi; Balbontín, Javier; Burke, Terry A.; Doutrelant, Claire; Gregoire, Arnaud; Hansson, Bengt; Hasselquist, Dennis; Gustafsson, Lars; de Lope, Florentino; Marzal, Alfonso; Mills, James A.; Wheelwright, Nathaniel T.; Yarrall, John W.; Charmantier, Anne
2014-01-01
Background In a rapidly changing world, it is of fundamental importance to understand processes constraining or facilitating adaptation through microevolution. As different traits of an organism covary, genetic correlations are expected to affect evolutionary trajectories. However, only limited empirical data are available. Methodology/Principal Findings We investigate the extent to which multivariate constraints affect the rate of adaptation, focusing on four morphological traits often shown to harbour large amounts of genetic variance and considered to be subject to limited evolutionary constraints. Our data set includes unique long-term data for seven bird species and a total of 10 populations. We estimate population-specific matrices of genetic correlations and multivariate selection coefficients to predict evolutionary responses to selection. Using Bayesian methods that facilitate the propagation of errors in estimates, we compare (1) the rate of adaptation based on predicted response to selection when including genetic correlations with predictions from models where these genetic correlations were set to zero and (2) the multivariate evolvability in the direction of current selection to the average evolvability in random directions of the phenotypic space. We show that genetic correlations on average decrease the predicted rate of adaptation by 28%. Multivariate evolvability in the direction of current selection was systematically lower than average evolvability in random directions of space. These significant reductions in the rate of adaptation and reduced evolvability were due to a general nonalignment of selection and genetic variance, notably orthogonality of directional selection with the size axis along which most (60%) of the genetic variance is found. Conclusions These results suggest that genetic correlations can impose significant constraints on the evolution of avian morphology in wild populations. This could have important impacts on evolutionary dynamics and hence population persistence in the face of rapid environmental change. PMID:24608111
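A minimal sketch of the two comparisons described above follows: predicted responses from the multivariate breeder's equation with and without genetic correlations, and evolvability in the direction of selection versus random directions. The G matrix and selection gradient are invented for illustration, not the study's estimates.

```python
# Hedged sketch: multivariate breeder's equation R = G @ beta, with and without
# genetic correlations, plus evolvability along the selection gradient vs. random
# directions (illustrative G matrix and selection gradient).
import numpy as np

G = np.array([[1.0, 0.6, 0.3, 0.2],      # additive genetic (co)variance matrix
              [0.6, 1.2, 0.4, 0.1],
              [0.3, 0.4, 0.8, 0.5],
              [0.2, 0.1, 0.5, 0.9]])
beta = np.array([0.3, -0.1, 0.2, 0.05])  # directional selection gradient

R_full = G @ beta                        # predicted response with correlations
R_nocov = np.diag(np.diag(G)) @ beta     # genetic correlations set to zero
print("response with / without genetic correlations:")
print(R_full.round(3), R_nocov.round(3))

def evolvability(G, b):
    b = b / np.linalg.norm(b)            # genetic variance along unit direction b
    return b @ G @ b

rng = np.random.default_rng(10)
rand_dirs = rng.normal(size=(10_000, 4))
e_rand = np.mean([evolvability(G, b) for b in rand_dirs])
print(f"evolvability along selection: {evolvability(G, beta):.3f}; "
      f"average over random directions: {e_rand:.3f}")
```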
Integrating models that depend on variable data
NASA Astrophysics Data System (ADS)
Banks, A. T.; Hill, M. C.
2016-12-01
Models of human-Earth systems are often developed with the goal of predicting the behavior of one or more dependent variables from multiple independent variables, processes, and parameters. Often dependent variable values range over many orders of magnitude, which complicates evaluation of the fit of the dependent variable values to observations. Many metrics and optimization methods have been proposed to address dependent variable variability, with little consensus being achieved. In this work, we evaluate two such methods: log transformation (based on the dependent variable being log-normally distributed with a constant variance) and error-based weighting (based on a multi-normal distribution with variances that tend to increase as the dependent variable value increases). Error-based weighting has the advantage of encouraging model users to carefully consider data errors, such as measurement and epistemic errors, while log transformations can be a black box for typical users. Placing the log transformation into the statistical perspective of error-based weighting has not formerly been considered, to the best of our knowledge. To make the evaluation as clear and reproducible as possible, we use multiple linear regression (MLR). Simulations are conducted with MATLAB. The example represents stream transport of nitrogen with up to eight independent variables. The single dependent variable in our example has values that range over 4 orders of magnitude. Results are applicable to any problem for which individual or multiple data types produce a large range of dependent variable values. For this problem, the log transformation produced good model fit, while some formulations of error-based weighting worked poorly. Results support previous suggestions that error-based weighting derived from a constant coefficient of variation overemphasizes low values and degrades model fit to high values. Applying larger weights to the high values is inconsistent with the log transformation. Greater consistency is obtained by imposing smaller (by up to a factor of 1/35) weights on the smaller dependent-variable values. From an error-based perspective, the small weights are consistent with large standard deviations. This work considers the consequences of these two common ways of addressing variable data.
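A minimal sketch contrasting the two treatments on synthetic data whose dependent variable spans several orders of magnitude follows: ordinary least squares on log-transformed values versus error-based weighting with weights derived from an assumed constant coefficient of variation. The variable names, error model, and which treatment comes out ahead are assumptions of the sketch, not results from the paper.

```python
# Hedged sketch: two treatments of a dependent variable spanning orders of
# magnitude: OLS on log(y) vs. error-based WLS with constant-CV weights.
# Synthetic data; which treatment fits better depends on the true error structure.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 200
X = sm.add_constant(rng.normal(size=(n, 3)))
true_beta = np.array([0.5, 1.2, -0.8, 0.4])
y = np.exp(X @ true_beta) * np.exp(rng.normal(0, 0.3, n))   # multiplicative error

# Treatment 1: log transformation with constant-variance OLS
fit_log = sm.OLS(np.log(y), X).fit()
pred_log = np.exp(fit_log.fittedvalues)

# Treatment 2: error-based weighting in the original scale, assuming a constant
# coefficient of variation (standard deviation proportional to the observed value)
cv = 0.3
fit_wls = sm.WLS(y, X, weights=1.0 / (cv * y) ** 2).fit()
pred_wls = fit_wls.fittedvalues

for name, pred in [("log-transform OLS", pred_log), ("constant-CV WLS", pred_wls)]:
    rel_rmse = np.sqrt(np.mean(((y - pred) / y) ** 2))
    print(f"{name:>18s}: relative RMSE = {rel_rmse:.2f}")
```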
Improved Event Location Uncertainty Estimates
2008-06-30
throughout this study. The data set consists of GT0-2 nuclear explosions from the SAIC Nuclear Explosion Database (www.rdss.info, Bahavar et al.)...errors: bias and variance. In this study, the SNR dependence of both the delay and the variance of reading errors of first-arriving P waves is analyzed and...ground truth and range of event size. For other datasets we turn to estimates based on double-differences between arrival times of station pairs
Estimation and Simulation of Slow Crack Growth Parameters from Constant Stress Rate Data
NASA Technical Reports Server (NTRS)
Salem, Jonathan A.; Weaver, Aaron S.
2003-01-01
Closed form, approximate functions for estimating the variances and degrees-of-freedom associated with the slow crack growth parameters n, D, B, and A(sup *) as measured using constant stress rate ('dynamic fatigue') testing were derived by using propagation of errors. Estimates made with the resulting functions and slow crack growth data for a sapphire window were compared to the results of Monte Carlo simulations. The functions for estimation of the variances of the parameters were derived both with and without logarithmic transformation of the initial slow crack growth equations. The transformation was performed to make the functions both more linear and more normal. Comparison of the Monte Carlo results and the closed form expressions derived with propagation of errors indicated that linearization is not required for good estimates of the variances of parameters n and D by the propagation of errors method. However, good estimates of the variances of the parameters B and A(sup *) could only be made when the starting slow crack growth equation was transformed and the coefficients of variation of the input parameters were not too large. This was partially a result of the skewed distributions of B and A(sup *). Parametric variation of the input parameters was used to determine an acceptable range for using the closed form approximate equations derived from propagation of errors.
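A minimal sketch of first-order propagation of errors (the delta method) for a generic function of fitted parameters follows, using a numerical central-difference gradient. The example function and parameter covariance are placeholders, not the slow-crack-growth expressions for n, D, B, or A*.

```python
# Hedged sketch: first-order propagation of errors (delta method),
# var(g) ~ grad(g)' Cov grad(g), with a numerical gradient.
# The function g and the covariance matrix are illustrative placeholders.
import numpy as np

def g(theta):
    n, D = theta
    return D ** (2.0 / (n + 1))           # toy derived quantity of two fitted parameters

def delta_method_var(func, theta, cov, rel_step=1e-6):
    """First-order propagation of errors with a central-difference gradient."""
    theta = np.asarray(theta, dtype=float)
    grad = np.empty_like(theta)
    for i in range(theta.size):
        h = rel_step * max(abs(theta[i]), 1e-12)
        step = np.zeros_like(theta); step[i] = h
        grad[i] = (func(theta + step) - func(theta - step)) / (2 * h)
    return grad @ cov @ grad

theta_hat = np.array([20.0, 3.0e-5])      # fitted parameter estimates (illustrative)
cov_hat = np.array([[4.0, 1.0e-6],        # their estimated covariance (illustrative)
                    [1.0e-6, 1.0e-11]])
print(f"g = {g(theta_hat):.4g}, approximate variance = {delta_method_var(g, theta_hat, cov_hat):.3g}")
```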
On the performance of digital phase locked loops in the threshold region
NASA Technical Reports Server (NTRS)
Hurst, G. T.; Gupta, S. C.
1974-01-01
Extended Kalman filter algorithms are used to obtain a digital phase lock loop structure for demodulation of angle modulated signals. It is shown that the error variance equations obtained directly from this structure enable one to predict threshold if one retains higher frequency terms. This is in sharp contrast to the similar analysis of the analog phase lock loop, where the higher frequency terms are filtered out because of the low pass filter in the loop. Results are compared to actual simulation results and threshold region results obtained previously.
Frequency noise measurement of diode-pumped Nd:YAG ring lasers
NASA Technical Reports Server (NTRS)
Chen, Chien-Chung; Win, Moe Zaw
1990-01-01
The combined frequency noise spectrum of two model 120-01A nonplanar ring oscillator lasers was measured by first heterodyne detecting the IF signal and then measuring the IF frequency noise using an RF frequency discriminator. The results indicated the presence of a 1/f-squared noise component in the power-spectral density of the frequency fluctuations between 1 Hz and 1 kHz. After incorporating this 1/f-squared into the analysis of the optical phase tracking loop, the measured phase error variance closely matches the theoretical predictions.
NASA Astrophysics Data System (ADS)
Wang, Ting; Xiang, Jie; Fei, Jianfang; Wang, Yi; Liu, Chunxia; Li, Yuanxiang
2017-12-01
This paper presents an evaluation of the observational impacts on blended sea surface winds from a two-dimensional variational data assimilation (2D-Var) scheme. We begin by briefly introducing the analysis sensitivity with respect to observations in variational data assimilation systems and its relationship with the degrees of freedom for signal (DFS), and then the DFS concept is applied to the 2D-Var sea surface wind blending scheme. Two methods, a priori and a posteriori, are used to estimate the DFS of the zonal (u) and meridional (v) components of winds in the 2D-Var blending scheme. The a posteriori method obtains almost the same results as the a priori method. Because only by-products of the blending scheme are used for the a posteriori method, the computation time is reduced significantly. The magnitude of the DFS is critically related to the observational and background error statistics. Changing the observational and background error variances can affect the DFS value. Because the observation error variances are assumed to be uniform, the observational influence at each observation location is related to the background error variance, and observations located where the background error variances are larger have a larger influence. The average observational influence of u and v with respect to the analysis is about 40%, implying that the background influence with respect to the analysis is about 60%.
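A minimal sketch of the a priori DFS computation for a toy variational analysis follows: with gain K = B H' (H B H' + R)^(-1), the DFS is trace(H K), and each diagonal element of H K gives the influence of the corresponding observation. The B, R, and H matrices below are illustrative, not the blending scheme's statistics.

```python
# Hedged sketch: degrees of freedom for signal (DFS) in a toy variational analysis.
# DFS = trace(H K), K = B H^T (H B H^T + R)^(-1); matrices are illustrative.
import numpy as np

rng = np.random.default_rng(12)
n_state, n_obs = 50, 20

# Background error covariance with spatial correlation (Gaussian kernel)
x = np.arange(n_state)
B = 1.5 * np.exp(-((x[:, None] - x[None, :]) / 5.0) ** 2)

# Observation operator: each observation samples one grid point
idx = rng.choice(n_state, n_obs, replace=False)
H = np.zeros((n_obs, n_state)); H[np.arange(n_obs), idx] = 1.0

R = 0.5 * np.eye(n_obs)                       # uniform observation error variance
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)  # analysis gain

HK = H @ K
print(f"total DFS = {np.trace(HK):.2f} out of {n_obs} observations")
print("average observational influence:", round(np.trace(HK) / n_obs, 3))
```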
Precision of Four Acoustic Bone Measurement Devices
NASA Technical Reports Server (NTRS)
Miller, Christopher; Feiveson, Alan H.; Shackelford, Linda; Rianon, Nahida; LeBlanc, Adrian
2000-01-01
Though many studies have quantified the precision of various acoustic bone measurement devices, it is difficult to directly compare the results among the studies, because they used disparate subject pools, did not specify the estimation methodology, or did not use consistent definitions for various precision characteristics. In this study, we used a repeated measures design protocol to directly determine the precision characteristics of four acoustic bone measurement devices: the Mechanical Response Tissue Analyzer (MRTA), the UBA-575+, the SoundScan 2000 (S2000), and the Sahara Ultrasound Bone Analyzer. Ten men and ten women were scanned on all four devices by two different operators at five discrete time points: Week 1, Week 2, Week 3, Month 3 and Month 6. The percent coefficient of variation (%CV) and standardized coefficient of variation were computed for the following precision characteristics: interoperator effect, operator-subject interaction, short-term error variance, and long-term drift. The MRTA had high interoperator errors for its ulnar and tibial stiffness measures and a large long-term drift in its tibial stiffness measurement. The UBA-575+ exhibited large short-term error variances and long-term drift for all three of its measurements. The S2000's tibial speed of sound measurement showed a high short-term error variance and a significant operator-subject interaction but very good values (<1%) for the other precision characteristics. The Sahara seemed to have the best overall performance, but was hampered by a large %CV for short-term error variance in its broadband ultrasound attenuation measure.
Optimal Tuner Selection for Kalman Filter-Based Aircraft Engine Performance Estimation
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Garg, Sanjay
2010-01-01
A linear point design methodology for minimizing the error in on-line Kalman filter-based aircraft engine performance estimation applications is presented. This technique specifically addresses the underdetermined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine which seeks to minimize the theoretical mean-squared estimation error. This paper derives theoretical Kalman filter estimation error bias and variance values at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared to the conventional approach of tuner selection. Experimental simulation results are found to be in agreement with theoretical predictions. The new methodology is shown to yield a significant improvement in on-line engine performance estimation accuracy.
NASA Astrophysics Data System (ADS)
Rexer, Moritz; Hirt, Christian
2015-09-01
Classical degree variance models (such as Kaula's rule or the Tscherning-Rapp model) often rely on low-resolution gravity data and so are subject to extrapolation when used to describe the decay of the gravity field at short spatial scales. This paper presents a new degree variance model based on the recently published GGMplus near-global land areas 220 m resolution gravity maps (Geophys Res Lett 40(16):4279-4283, 2013). We investigate and use a 2D-DFT (discrete Fourier transform) approach to transform GGMplus gravity grids into degree variances. The method is described in detail and its approximation errors are studied using closed-loop experiments. Focus is placed on tiling, azimuth averaging, and windowing effects in the 2D-DFT method and on analytical fitting of degree variances. Approximation errors of the 2D-DFT procedure on the (spherical harmonic) degree variance are found to be at the 10-20 % level. The importance of the reference surface (sphere, ellipsoid or topography) of the gravity data for correct interpretation of degree variance spectra is highlighted. The effect of the underlying mass arrangement (spherical or ellipsoidal approximation) on the degree variances is found to be crucial at short spatial scales. A rule-of-thumb for transformation of spectra between spherical and ellipsoidal approximation is derived. Application of the 2D-DFT on GGMplus gravity maps yields a new degree variance model to degree 90,000. The model is supported by GRACE, GOCE, EGM2008 and forward-modelled gravity at 3 billion land points over all land areas within the SRTM data coverage and provides gravity signal variances at the surface of the topography. The model yields omission errors of 9 mGal for gravity (1.5 cm for geoid effects) at scales of 10 km, 4 mGal (1 mm) at 2-km scales, and 2 mGal (0.2 mm) at 1-km scales.
Palmprint Based Multidimensional Fuzzy Vault Scheme
Liu, Hailun; Sun, Dongmei; Xiong, Ke; Qiu, Zhengding
2014-01-01
Fuzzy vault scheme (FVS) is one of the most popular biometric cryptosystems for biometric template protection. However, the error correcting code (ECC) proposed in FVS is not appropriate for dealing with real-valued biometric intraclass variances. In this paper, we propose a multidimensional fuzzy vault scheme (MDFVS) in which a general subspace error-tolerant mechanism is designed and embedded into FVS to handle intraclass variances. Palmprint is one of the most important biometrics; to protect palmprint templates, a palmprint-based MDFVS implementation is also presented. Experimental results show that the proposed scheme not only can deal with intraclass variances effectively but also can maintain accuracy while enhancing security. PMID:24892094
Schädler, Marc R.; Warzybok, Anna; Kollmeier, Birger
2018-01-01
The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments in an objective way, independent from any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs and benefits in SRT with the German matrix sentence recognition test when using eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge about the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Possibly, additional individual factors should be considered for more accurate predictions with impaired hearing. A competing talker condition clearly showed one limitation of current ASR technology, as the empirical performance with SRTs lower than −20 dB could not be predicted. PMID:29692200
Chang, Chingching
2015-01-01
This article introduces an integrated inaccuracy typology to explore the prevalence of inaccurate news coverage of health research. This typology suggests that errors, omissions, and misinterpretations are three common types of inaccuracy; errors and omissions are objective, whereas misinterpretations are subjective. Objective inaccuracy involves errors and omissions in describing the background or substantive information about the research, such as how, when, where, and on whom research was conducted. Subjective inaccuracy entails misinterpretations resulting from a lack of expertise among journalists (e.g., misstating facts, errors in inferences, offering speculations as facts) or the media's interest in profits (e.g., overemphasis on unique findings, overgeneralizations of findings, shifting emphases). For this study, coders analyzed objective inaccuracy, while scientists rated subjective inaccuracy. The study then identifies what accounts for the variance in scientists' perceptions of inaccuracy in news articles citing their research. Objective and subjective inaccuracy offer significant predictors. Among the types of objective inaccuracy, omission of research methods is a significant factor, whereas among the types of subjective inaccuracy, errors in inferences, overemphasis on uniqueness, and overgeneralizations of findings are all significant predictors.
Combining forecast weights: Why and how?
NASA Astrophysics Data System (ADS)
Yin, Yip Chee; Kok-Haur, Ng; Hock-Eam, Lim
2012-09-01
This paper proposes a procedure called forecast weight averaging, which is a specific combination of forecast weights obtained from different methods of constructing forecast weights, for the purpose of improving the accuracy of pseudo out-of-sample forecasting. It is found that under certain specified conditions, forecast weight averaging can lower the mean squared forecast error obtained from model averaging. In addition, we show that in a linear and homoskedastic environment, this superior predictive ability of forecast weight averaging holds true irrespective of whether the coefficients are tested by the t statistic or the z statistic, provided the significance level is within the 10% range. By theoretical proofs and a simulation study, we show that model averaging methods such as variance model averaging, simple model averaging and standard error model averaging each produce a mean squared forecast error larger than that of forecast weight averaging. Finally, this result also holds true, marginally, when applied to empirical business and economic data sets: the Gross Domestic Product (GDP) growth rate, the Consumer Price Index (CPI) and the Average Lending Rate (ALR) of Malaysia.
Visentin, G; Penasa, M; Gottardo, P; Cassandro, M; De Marchi, M
2016-10-01
Milk minerals and coagulation properties are important for both consumers and processors, and they can aid in increasing the added value of milk. However, large-scale monitoring of these traits is hampered by expensive and time-consuming reference analyses. The objective of the present study was to develop prediction models for major mineral contents (Ca, K, Mg, Na, and P) and milk coagulation properties (MCP: rennet coagulation time, curd-firming time, and curd firmness) using mid-infrared spectroscopy. Individual milk samples (n=923) of Holstein-Friesian, Brown Swiss, Alpine Grey, and Simmental cows were collected from single-breed herds between January and December 2014. Reference analysis for the determination of both mineral contents and MCP was undertaken with standardized methods. For each milk sample, the mid-infrared spectrum in the range from 900 to 5,000 cm(-1) was stored. Prediction models were calibrated using partial least squares regression coupled with a wavenumber selection technique called uninformative variable elimination, to improve model accuracy, and validated both internally and externally. The average reduction of wavenumbers used in partial least squares regression was 80%, which was accompanied by an average increment of 20% of the explained variance in external validation. The proportion of explained variance in external validation was about 70% for P, K, Ca, and Mg, and it was lower (40%) for Na. Milk coagulation properties prediction models explained between 54% (rennet coagulation time) and 56% (curd-firming time) of the total variance in external validation. The ratio of the standard deviation of each trait to the respective root mean square error of prediction, which is an indicator of the predictive ability of an equation, suggested that the developed models might be effective for screening and collection of milk minerals and coagulation properties at the population level. Although the prediction equations were not accurate enough to be proposed for analytic purposes, mid-infrared spectroscopy predictions could be evaluated as phenotypic information to genetically improve milk minerals and MCP on a large scale. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
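A minimal sketch of calibrating a partial least squares prediction of a milk trait from mid-infrared spectra follows, with the number of latent variables chosen by cross-validated RMSE. The simulated spectra, the trait, and the component grid are placeholders, and no uninformative variable elimination step is included.

```python
# Hedged sketch: PLS calibration of a milk trait from mid-infrared spectra,
# selecting the number of latent variables by cross-validated RMSE
# (simulated spectra; no wavenumber-selection step included).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(13)
n, p = 400, 500                                  # samples x wavenumbers
X = np.cumsum(rng.normal(size=(n, p)), axis=1) / 10       # smooth, spectra-like curves
y = X[:, 100] - 0.5 * X[:, 300] + rng.normal(0, 0.5, n)   # synthetic trait (e.g., Ca)

best_rmse, best_k = np.inf, None
for k in range(1, 16):
    pred = cross_val_predict(PLSRegression(n_components=k), X, y, cv=10).ravel()
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    if rmse < best_rmse:
        best_rmse, best_k = rmse, k

rpd = y.std() / best_rmse                        # ratio of SD to prediction error
print(f"best #components = {best_k}, RMSEP = {best_rmse:.3f}, RPD = {rpd:.2f}")
```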
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid serious variance under-estimation by conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of cluster event times.
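A minimal sketch of the cluster-bootstrap idea in generic form follows: resample clusters with replacement, refit the estimator on each replicate, and take the standard deviation of the replicate estimates as the SE. The estimator here is a placeholder callable operating on a numeric array, not an actual Cox model fit.

```python
# Hedged sketch: cluster bootstrap standard errors for any estimator that
# maps a data subset to a coefficient vector (placeholder estimator below,
# not a Cox model fit).
import numpy as np

def cluster_bootstrap_se(data, cluster_ids, estimator, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)
    reps = []
    for _ in range(n_boot):
        chosen = rng.choice(clusters, size=clusters.size, replace=True)
        idx = np.concatenate([np.flatnonzero(cluster_ids == c) for c in chosen])
        reps.append(estimator(data[idx]))
    return np.std(np.asarray(reps), axis=0, ddof=1)

# Placeholder example: the "estimator" returns the mean of each column.
rng = np.random.default_rng(14)
cluster_ids = np.repeat(np.arange(30), 10)            # 30 clusters of 10 subjects
data = rng.normal(size=(300, 2)) + rng.normal(size=(30, 1)).repeat(10, axis=0)
se = cluster_bootstrap_se(data, cluster_ids, lambda d: d.mean(axis=0))
print("cluster-bootstrap SEs:", se.round(3))
```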
Improving lidar turbulence estimates for wind energy
NASA Astrophysics Data System (ADS)
Newman, J. F.; Clifton, A.; Churchfield, M. J.; Klein, P.
2016-09-01
Remote sensing devices (e.g., lidars) are quickly becoming a cost-effective and reliable alternative to meteorological towers for wind energy applications. Although lidars can measure mean wind speeds accurately, these devices measure different values of turbulence intensity (TI) than an instrument on a tower. In response to these issues, a lidar TI error reduction model was recently developed for commercially available lidars. The TI error model first applies physics-based corrections to the lidar measurements, then uses machine-learning techniques to further reduce errors in lidar TI estimates. The model was tested at two sites in the Southern Plains where vertically profiling lidars were collocated with meteorological towers. Results indicate that the model works well under stable conditions but cannot fully mitigate the effects of variance contamination under unstable conditions. To understand how variance contamination affects lidar TI estimates, a new set of equations was derived in previous work to characterize the actual variance measured by a lidar. Terms in these equations were quantified using a lidar simulator and modeled wind field, and the new equations were then implemented into the TI error model.
Texture metric that predicts target detection performance
NASA Astrophysics Data System (ADS)
Culpepper, Joanne B.
2015-12-01
Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics and are based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance of the nine military vehicles in complex natural scenes found in the Search_2 dataset are presented. Comparison is also made between four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
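As a rough illustration of the kind of co-occurrence "energy" statistic described above, the sketch below builds a gray-level co-occurrence matrix with NumPy and sums its squared probabilities. This is not the GLCE metrics of the study; the quantization to 8 levels, the single (0, 1) pixel offset, and the energy definition are illustrative assumptions.

```python
import numpy as np

def glcm(image, levels=8, offset=(0, 1)):
    """Gray-level co-occurrence matrix for one pixel offset (row, col)."""
    # Quantize the image to a small number of gray levels.
    q = np.floor(image.astype(float) / image.max() * (levels - 1)).astype(int)
    dr, dc = offset
    rows, cols = q.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(max(0, -dr), min(rows, rows - dr)):
        for c in range(max(0, -dc), min(cols, cols - dc)):
            counts[q[r, c], q[r + dr, c + dc]] += 1
    return counts / counts.sum()            # co-occurrence probabilities

def glcm_energy(image, **kwargs):
    p = glcm(image, **kwargs)
    return float(np.sum(p ** 2))            # "energy" = sum of squared probabilities

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    smooth = np.tile(np.linspace(0, 255, 64), (64, 1))   # low-clutter gradient scene
    noisy = rng.integers(0, 256, size=(64, 64))          # high-clutter random scene
    print(glcm_energy(smooth), glcm_energy(noisy))       # smoother scene gives higher energy
```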
USDA-ARS?s Scientific Manuscript database
We proposed a method to estimate the error variance among non-replicated genotypes, thus to estimate the genetic parameters by using replicated controls. We derived formulas to estimate sampling variances of the genetic parameters. Computer simulation indicated that the proposed methods of estimatin...
Online Estimation of Allan Variance Coefficients Based on a Neural-Extended Kalman Filter
Miao, Zhiyong; Shen, Feng; Xu, Dingjie; He, Kunpeng; Tian, Chunmiao
2015-01-01
As a noise analysis method for inertial sensors, the traditional Allan variance method requires the storage of a large amount of data and manual analysis for an Allan variance graph. Although the existing online estimation methods avoid the storage of data and the painful procedure of drawing slope lines for estimation, they require complex transformations and even cause errors during the modeling of dynamic Allan variance. To solve these problems, first, a new state-space model that directly models the stochastic errors to obtain a nonlinear state-space model was established for inertial sensors. Then, a neural-extended Kalman filter algorithm was used to estimate the Allan variance coefficients. The real noises of an ADIS16405 IMU and fiber optic gyro-sensors were analyzed by the proposed method and traditional methods. The experimental results show that the proposed method is more suitable to estimate the Allan variance coefficients than the traditional methods. Moreover, the proposed method effectively avoids the storage of data and can be easily implemented using an online processor. PMID:25625903
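For reference, the quantity being estimated above can also be computed offline with the classical (non-overlapping) Allan variance formula; the sketch below is illustrative and is not the neural-extended Kalman filter estimator of the abstract. The sampling rate, cluster times, and toy gyro signal are assumptions.

```python
import numpy as np

def allan_variance(rate, fs, taus):
    """Non-overlapping Allan variance of a rate signal sampled at fs (Hz)."""
    avar = []
    for tau in taus:
        m = int(round(tau * fs))                  # samples per cluster
        n_clusters = len(rate) // m
        if n_clusters < 2:
            avar.append(np.nan)
            continue
        clusters = rate[: n_clusters * m].reshape(n_clusters, m).mean(axis=1)
        diffs = np.diff(clusters)
        avar.append(0.5 * np.mean(diffs ** 2))    # AVAR(tau) = 1/2 <(y_{k+1} - y_k)^2>
    return np.array(avar)

if __name__ == "__main__":
    fs = 100.0
    t = np.arange(0, 3600, 1 / fs)
    rng = np.random.default_rng(1)
    # White noise (angle random walk) plus a constant bias, as a toy gyro signal.
    gyro = 0.01 + 0.5 * rng.standard_normal(t.size)
    taus = np.logspace(-1, 2, 10)
    print(np.sqrt(allan_variance(gyro, fs, taus)))  # Allan deviation; slope -1/2 for white noise
```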
Improving Genomic Prediction in Cassava Field Experiments by Accounting for Interplot Competition
Elias, Ani A.; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2018-01-01
Plants competing for available resources is an unavoidable phenomenon in a field. We conducted studies in cassava (Manihot esculenta Crantz) in order to understand the pattern of this competition. Taking into account the competitive ability of genotypes while selecting parents for breeding advancement or commercialization can be very useful. We assumed that competition could occur at two levels: (i) the genotypic level, which we call interclonal, and (ii) the plot level irrespective of the type of genotype, which we call interplot competition or competition error. Modification in incidence matrices was applied in order to relate neighboring genotype/plot to the performance of a target genotype/plot with respect to its competitive ability. This was added into a genomic selection (GS) model to simultaneously predict the direct and competitive ability of a genotype. Predictability of the models was tested through a 10-fold cross-validation method repeated five times. The best model was chosen as the one with the lowest prediction root mean squared error (pRMSE) compared to that of the base model having no competitive component. Results from our real data studies indicated that <10% increase in accuracy was achieved with GS-interclonal competition model, but this value reached up to 25% with a GS-competition error model. We also found that the competitive influence of a cassava clone is not just limited to the adjacent neighbors but spreads beyond them. Through simulations, we found that a 26% increase of accuracy in estimating trait genotypic effect can be achieved even in the presence of high competitive variance. PMID:29358232
Improving Genomic Prediction in Cassava Field Experiments by Accounting for Interplot Competition.
Elias, Ani A; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2018-03-02
Plants competing for available resources is an unavoidable phenomenon in a field. We conducted studies in cassava ( Manihot esculenta Crantz) in order to understand the pattern of this competition. Taking into account the competitive ability of genotypes while selecting parents for breeding advancement or commercialization can be very useful. We assumed that competition could occur at two levels: (i) the genotypic level, which we call interclonal, and (ii) the plot level irrespective of the type of genotype, which we call interplot competition or competition error. Modification in incidence matrices was applied in order to relate neighboring genotype/plot to the performance of a target genotype/plot with respect to its competitive ability. This was added into a genomic selection (GS) model to simultaneously predict the direct and competitive ability of a genotype. Predictability of the models was tested through a 10-fold cross-validation method repeated five times. The best model was chosen as the one with the lowest prediction root mean squared error (pRMSE) compared to that of the base model having no competitive component. Results from our real data studies indicated that <10% increase in accuracy was achieved with GS-interclonal competition model, but this value reached up to 25% with a GS-competition error model. We also found that the competitive influence of a cassava clone is not just limited to the adjacent neighbors but spreads beyond them. Through simulations, we found that a 26% increase of accuracy in estimating trait genotypic effect can be achieved even in the presence of high competitive variance. Copyright © 2018 Elias et al.
Bayesian inversions of a dynamic vegetation model in four European grassland sites
NASA Astrophysics Data System (ADS)
Minet, J.; Laloy, E.; Tychon, B.; François, L.
2015-01-01
Eddy covariance data from four European grassland sites are used to probabilistically invert the CARAIB dynamic vegetation model (DVM) with ten unknown parameters, using the DREAM(ZS) Markov chain Monte Carlo (MCMC) sampler. We compare model inversions considering both homoscedastic and heteroscedastic eddy covariance residual errors, with variances either fixed a priori or jointly inferred with the model parameters. Agreements between measured and simulated data during calibration are comparable with previous studies, with root-mean-square error (RMSE) of simulated daily gross primary productivity (GPP), ecosystem respiration (RECO) and evapotranspiration (ET) ranging from 1.73 to 2.19 g C m-2 day-1, 1.04 to 1.56 g C m-2 day-1, and 0.50 to 1.28 mm day-1, respectively. In validation, mismatches between measured and simulated data are larger, but still with Nash-Sutcliffe efficiency scores above 0.5 for three out of the four sites. Although measurement errors associated with eddy covariance data are known to be heteroscedastic, we showed that assuming a classical linear heteroscedastic model of the residual errors in the inversion does not fully remove heteroscedasticity. Since the employed heteroscedastic error model allows for larger deviations between simulated and measured data as the magnitude of the measured data increases, this error model expectedly leads to poorer data fitting compared to inversions considering a constant variance of the residual errors. Furthermore, sampling the residual error variances along with the model parameters results in model parameter posterior distributions that are overall similar to those obtained by fixing these variances beforehand, while slightly improving model performance. Despite the fact that the calibrated model is generally capable of fitting the data within measurement errors, systematic biases in the model simulations are observed; these are likely due to model inadequacies such as shortcomings in the photosynthesis modelling. Besides model behaviour, differences in model parameter posterior distributions among the four grassland sites are also investigated. It is shown that the marginal distributions of the specific leaf area and characteristic mortality time parameters can be explained by site-specific ecophysiological characteristics. Lastly, the possibility of finding a common set of parameters among the four experimental sites is discussed.
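The homoscedastic versus heteroscedastic residual-error choice discussed above amounts to different Gaussian log-likelihoods inside the MCMC sampler. The sketch below is a generic illustration, not the CARAIB/DREAM(ZS) implementation; the linear form sd_i = sigma0 + slope*|obs_i| and the synthetic flux data are assumptions.

```python
import numpy as np

def gaussian_loglik(obs, sim, sigma0, slope):
    """Heteroscedastic Gaussian log-likelihood with sd_i = sigma0 + slope * |obs_i|.

    Setting slope = 0 recovers the homoscedastic (constant-variance) case. sigma0 and
    slope can be fixed a priori or treated as extra parameters and sampled jointly with
    the model parameters by the MCMC sampler.
    """
    sd = sigma0 + slope * np.abs(obs)
    resid = obs - sim
    return float(np.sum(-0.5 * np.log(2 * np.pi * sd ** 2) - 0.5 * (resid / sd) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    obs = rng.uniform(0.5, 8.0, size=200)                     # toy daily GPP "measurements"
    sim = obs + (0.2 + 0.1 * obs) * rng.standard_normal(200)  # larger errors at larger fluxes
    print(gaussian_loglik(obs, sim, sigma0=0.2, slope=0.1))   # heteroscedastic error model
    print(gaussian_loglik(obs, sim, sigma0=1.0, slope=0.0))   # homoscedastic alternative
```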
Seasonal prediction of winter haze days in the north central North China Plain
NASA Astrophysics Data System (ADS)
Yin, Zhicong; Wang, Huijun
2016-11-01
Recently, the winter (December-February) haze pollution over the north central North China Plain (NCP) has become severe. By treating the year-to-year increment as the predictand, two new statistical schemes were established using multiple linear regression (MLR) and a generalized additive model (GAM). By analyzing the associated increment of atmospheric circulation, seven leading predictors were selected to predict the upcoming winter haze days over the NCP (WHDNCP). After cross validation, the root mean square error and explained variance of the MLR (GAM) prediction model were 3.39 (3.38) and 53% (54%), respectively. For the final predicted WHDNCP, both models could successfully capture the interannual and interdecadal trends and the extrema. Independent prediction tests for 2014 and 2015 also confirmed the good predictive skill of the new schemes. The prediction bias of the MLR (GAM) model in 2014 and 2015 was 0.09 (-0.07) and -3.33 (-1.01), respectively. Compared to the MLR model, the GAM model had a higher predictive skill in reproducing the rapid and continuous increase of WHDNCP after 2010.
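A minimal sketch of the increment-based MLR idea with cross-validated skill scores, using scikit-learn on synthetic data; the predictors, sample size, and coefficients are hypothetical stand-ins, not the seven circulation predictors of the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(3)
n_years, n_pred = 35, 7                      # seven leading predictors, as in the abstract
X = rng.standard_normal((n_years, n_pred))   # hypothetical circulation predictors (increments)
beta = rng.uniform(-2, 2, n_pred)
y = X @ beta + rng.standard_normal(n_years)  # year-to-year increment of winter haze days

# Cross-validated prediction of the increment; the predicted haze-day count for a given
# winter would then be last winter's observed value plus the predicted increment.
pred = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())
rmse = np.sqrt(np.mean((y - pred) ** 2))
explained = 1 - np.var(y - pred) / np.var(y)
print(f"cross-validated RMSE = {rmse:.2f}, explained variance = {explained:.0%}")
```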
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Delle Monache, L.; Alessandrini, S.
2016-12-01
Accuracy of weather forecasts in Northeast U.S. has become very important in recent years, given the serious and devastating effects of extreme weather events. Despite the use of evolved forecasting tools and techniques strengthened by increased super-computing resources, the weather forecasting systems still have their limitations in predicting extreme events. In this study, we examine the combination of analog ensemble and Bayesian regression techniques to improve the prediction of storms that have impacted NE U.S., mostly defined by the occurrence of high wind speeds (i.e. blizzards, winter storms, hurricanes and thunderstorms). The predicted wind speed, wind direction and temperature by two state-of-the-science atmospheric models (WRF and RAMS/ICLAMS) are combined using the mentioned techniques, exploring various ways that those variables influence the minimization of the prediction error (systematic and random). This study is focused on retrospective simulations of 146 storms that affected the NE U.S. in the period 2005-2016. In order to evaluate the techniques, leave-one-out cross validation procedure was implemented regarding 145 storms as the training dataset. The analog ensemble method selects a set of past observations that corresponded to the best analogs of the numerical weather prediction and provides a set of ensemble members of the selected observation dataset. The set of ensemble members can then be used in a deterministic or probabilistic way. In the Bayesian regression framework, optimal variances are estimated for the training partition by minimizing the root mean square error and are applied to the out-of-sample storm. The preliminary results indicate a significant improvement in the statistical metrics of 10-m wind speed for 146 storms using both techniques (20-30% bias and error reduction in all observation-model pairs). In this presentation, we discuss the various combinations of atmospheric predictors and techniques and illustrate how the long record of predicted storms is valuable in the improvement of wind speed prediction.
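The core of the analog-ensemble step described above is a nearest-neighbor search in predictor space: for a new forecast, find the past forecasts that resemble it most and use their verifying observations as ensemble members. The sketch below is a generic, hedged version of that idea; the standardized Euclidean distance, the choice of k, and the synthetic forecast archive are assumptions, not the authors' configuration.

```python
import numpy as np

def analog_ensemble(new_forecast, past_forecasts, past_observations, k=10):
    """Return the k verifying observations whose past forecasts best match new_forecast.

    Forecasts are vectors of predictors (e.g. wind speed, direction, temperature),
    standardized so that a Euclidean distance is meaningful.
    """
    mu = past_forecasts.mean(axis=0)
    sd = past_forecasts.std(axis=0)
    d = np.linalg.norm((past_forecasts - mu) / sd - (new_forecast - mu) / sd, axis=1)
    idx = np.argsort(d)[:k]
    return past_observations[idx]     # ensemble members (use the mean for a deterministic value)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    past_fc = rng.normal([10.0, 180.0, 275.0], [4.0, 60.0, 8.0], size=(145, 3))
    past_obs = past_fc[:, 0] + rng.standard_normal(145)     # observed 10-m wind speed
    members = analog_ensemble(np.array([14.0, 200.0, 270.0]), past_fc, past_obs, k=10)
    print(members.mean(), members.std())                    # deterministic value and spread
```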
Neural networks: further insights into error function, generalized weights and others
2016-01-01
The article is a continuation of a previous one, providing further insights into the structure of a neural network (NN). Key concepts of NNs, including the activation function, error function, learning rate and generalized weights, are introduced. NN topology can be visualized with the generic plot() function by passing a "nn" class object. Generalized weights assist interpretation of an NN model with respect to the independent effect of individual input variables. A large variance of generalized weights for a covariate indicates non-linearity of its independent effect. If the generalized weights of a covariate are approximately zero, the covariate is considered to have no effect on the outcome. Finally, prediction of new observations can be performed using the compute() function. Make sure that the feature variables passed to the compute() function are in the same order as in the training NN. PMID:27668220
Nonlinear problems in data-assimilation : Can synchronization help?
NASA Astrophysics Data System (ADS)
Tribbia, J. J.; Duane, G. S.
2009-12-01
Over the past several years, operational weather centers have initiated ensemble prediction and assimilation techniques to estimate the error covariance of forecasts in the short and the medium range. The ensemble techniques used are based on linear methods. This technique has been shown to be a useful indicator of skill in the linear range where forecast errors are small relative to climatological variance. While this advance has been impressive, there are still ad hoc aspects of its use in practice, such as the need for covariance inflation, which are troubling. Furthermore, to be of utility in the nonlinear range an ensemble assimilation and prediction method must be capable of giving probabilistic information for the situation where a probability density forecast becomes multi-modal. A prototypical, simple example of such a situation is the planetary-wave regime transition where the pdf is bimodal. Our recent research shows how the inconsistencies and extensions of linear methodology can be consistently treated using the paradigm of synchronization, which views the problems of assimilation and forecasting as that of optimizing the forecast model state with respect to the future evolution of the atmosphere.
Lognormal Kalman filter for assimilating phase space density data in the radiation belts
NASA Astrophysics Data System (ADS)
Kondrashov, D.; Ghil, M.; Shprits, Y.
2011-11-01
Data assimilation combines a physical model with sparse observations and has become an increasingly important tool for scientists and engineers in the design, operation, and use of satellites and other high-technology systems in the near-Earth space environment. Of particular importance is predicting fluxes of high-energy particles in the Van Allen radiation belts, since these fluxes can damage spaceborne platforms and instruments during strong geomagnetic storms. In transiting from a research setting to operational prediction of these fluxes, improved data assimilation is of the essence. The present study is motivated by the fact that phase space densities (PSDs) of high-energy electrons in the outer radiation belt—both simulated and observed—are subject to spatiotemporal variations that span several orders of magnitude. Standard data assimilation methods that are based on least squares minimization of normally distributed errors may not be adequate for handling the range of these variations. We propose herein a modification of Kalman filtering that uses a log-transformed, one-dimensional radial diffusion model for the PSDs and includes parameterized losses. The proposed methodology is first verified on model-simulated, synthetic data and then applied to actual satellite measurements. When the model errors are sufficiently smaller than the observational errors, our methodology can significantly improve analysis and prediction skill for the PSDs compared to those of the standard Kalman filter formulation. This improvement is documented by monitoring the variance of the innovation sequence.
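A scalar toy version of the log-transform idea: apply a standard Kalman analysis step to ln(PSD) rather than to the PSD itself, so that multiplicative, orders-of-magnitude variability becomes additive. This is a minimal sketch only; the radial diffusion model, loss parameterization, and the numbers used here are assumptions, not the paper's system.

```python
import numpy as np

def kalman_update_log(x_log, p, obs, r_log):
    """One scalar Kalman analysis step in log space.

    x_log, p : forecast of ln(PSD) and its error variance
    obs      : observed PSD (positive)
    r_log    : observation-error variance of ln(PSD)
    """
    y = np.log(obs)
    k = p / (p + r_log)                   # Kalman gain
    x_a = x_log + k * (y - x_log)         # analysis in log space
    p_a = (1 - k) * p
    return x_a, p_a

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    truth = 1e4
    x_log, p = np.log(1e2), 4.0           # poor first guess, two orders of magnitude low
    for _ in range(20):
        obs = truth * np.exp(0.5 * rng.standard_normal())   # multiplicative observation noise
        x_log, p = kalman_update_log(x_log, p, obs, r_log=0.25)
        p += 0.05                         # simple model-error inflation between updates
    print(np.exp(x_log))                  # analysis PSD, back in physical units
```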
Derivation of an analytic expression for the error associated with the noise reduction rating
NASA Astrophysics Data System (ADS)
Murphy, William J.
2005-04-01
Hearing protection devices are assessed using the Real Ear Attenuation at Threshold (REAT) measurement procedure for the purpose of estimating the amount of noise reduction provided when worn by a subject. The rating number provided on the protector label is a function of the mean and standard deviation of the REAT results achieved by the test subjects. If a group of subjects have a large variance, then it follows that the certainty of the rating should be correspondingly lower. No estimate of the error of a protector's rating is given by existing standards or regulations. Propagation of errors was applied to the Noise Reduction Rating to develop an analytic expression for the hearing protector rating error term. Comparison of the analytic expression for the error to the standard deviation estimated from Monte Carlo simulation of subject attenuations yielded a linear relationship across several protector types and assumptions for the variance of the attenuations.
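To illustrate the propagation-of-errors versus Monte Carlo comparison described above, the sketch below uses a deliberately simplified single-band rating (mean attenuation minus two standard deviations) rather than the full NRR computation; the rating formula, sample size, and noise parameters are assumptions.

```python
import numpy as np

def rating(attenuations):
    """Simplified single-band rating: mean attenuation minus two standard deviations.
    (The real NRR works across octave bands; this toy form only illustrates how a
    rating's uncertainty can be propagated.)"""
    return attenuations.mean() - 2.0 * attenuations.std(ddof=1)

def delta_method_se(sd, n):
    """Propagation-of-errors SE for (mean - 2*sd) with independent normal REAT data:
    Var(mean) = sd^2/n and Var(sd) ~= sd^2 / (2*(n - 1))."""
    return np.sqrt(sd ** 2 / n + 4.0 * sd ** 2 / (2.0 * (n - 1)))

if __name__ == "__main__":
    rng = np.random.default_rng(14)
    true_mean, true_sd, n = 30.0, 5.0, 20          # dB attenuation for 20 test subjects
    ratings = [rating(rng.normal(true_mean, true_sd, n)) for _ in range(20000)]
    print("Monte Carlo SE of rating :", round(float(np.std(ratings)), 3))
    print("Analytic (delta) SE      :", round(float(delta_method_se(true_sd, n)), 3))
```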
McGinitie, Teague M; Ebrahimi-Najafabadi, Heshmatollah; Harynuk, James J
2014-02-21
A new method for calibrating thermodynamic data to be used in the prediction of analyte retention times is presented. The method allows thermodynamic data collected on one column to be used in making predictions across columns of the same stationary phase but with varying geometries. This calibration is essential as slight variances in the column inner diameter and stationary phase film thickness between columns or as a column ages will adversely affect the accuracy of predictions. The calibration technique uses a Grob standard mixture along with a Nelder-Mead simplex algorithm and a previously developed model of GC retention times based on a three-parameter thermodynamic model to estimate both inner diameter and stationary phase film thickness. The calibration method is highly successful with the predicted retention times for a set of alkanes, ketones and alcohols having an average error of 1.6s across three columns. Copyright © 2014 Elsevier B.V. All rights reserved.
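The calibration step above is, in essence, a two-parameter Nelder-Mead fit of column geometry to the retention times of a standard mixture. The sketch below shows that structure with scipy.optimize; predict_retention is a hypothetical stand-in for the three-parameter thermodynamic retention model, and the standards, measured times, and scaling exponents are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def predict_retention(inner_diam_um, film_um, analytes):
    """Hypothetical stand-in for the thermodynamic retention-time model (seconds)."""
    base = np.array(list(analytes.values()))          # nominal retention times
    return base * (250.0 / inner_diam_um) ** 0.3 * (film_um / 0.25) ** 0.5

# Measured retention times of a Grob-type standard mixture on the column being calibrated.
standards = {"decane": 310.0, "octanol": 455.0, "undecane": 520.0, "2,6-DMP": 610.0}
measured = np.array([318.0, 462.0, 531.0, 622.0])

def cost(params):
    d, df = params
    return float(np.sum((predict_retention(d, df, standards) - measured) ** 2))

# Start from the nominal geometry (250 um i.d., 0.25 um film) and refine it.
res = minimize(cost, x0=[250.0, 0.25], method="Nelder-Mead")
print("calibrated i.d. (um), film (um):", res.x,
      " RMS error (s):", np.sqrt(res.fun / len(measured)))
```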
Austin, Peter C
2016-12-30
Propensity score methods are used to reduce the effects of observed confounding when using observational data to estimate the effects of treatments or exposures. A popular method of using the propensity score is inverse probability of treatment weighting (IPTW). When using this method, a weight is calculated for each subject that is equal to the inverse of the probability of receiving the treatment that was actually received. These weights are then incorporated into the analyses to minimize the effects of observed confounding. Previous research has found that these methods result in unbiased estimation when estimating the effect of treatment on survival outcomes. However, conventional methods of variance estimation were shown to result in biased estimates of standard error. In this study, we conducted an extensive set of Monte Carlo simulations to examine different methods of variance estimation when using a weighted Cox proportional hazards model to estimate the effect of treatment. We considered three variance estimation methods: (i) a naïve model-based variance estimator; (ii) a robust sandwich-type variance estimator; and (iii) a bootstrap variance estimator. We considered estimation of both the average treatment effect and the average treatment effect in the treated. We found that the use of a bootstrap estimator resulted in approximately correct estimates of standard errors and confidence intervals with the correct coverage rates. The other estimators resulted in biased estimates of standard errors and confidence intervals with incorrect coverage rates. Our simulations were informed by a case study examining the effect of statin prescribing on mortality. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
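A minimal sketch of the bootstrap variance-estimation structure for an IPTW analysis. For clarity it uses a toy continuous-outcome IPTW estimator rather than a weighted Cox model; the key point it illustrates is that subjects are resampled and the propensity model (and hence the weights) is re-fit inside every bootstrap replicate. Everything here is an illustrative assumption, not the authors' simulation code.

```python
import numpy as np

def iptw_estimate(x, treat, y):
    """Toy IPTW (ATE) estimate of a treatment effect on a continuous outcome.
    In the study above the inner fit would be a weighted Cox model instead; the
    surrounding bootstrap loop is identical."""
    # Crude logistic propensity model fit by a few Newton steps.
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(25):
        p = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (treat - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    p = 1 / (1 + np.exp(-X @ beta))
    w = treat / p + (1 - treat) / (1 - p)              # ATE weights
    return (np.average(y[treat == 1], weights=w[treat == 1])
            - np.average(y[treat == 0], weights=w[treat == 0]))

def bootstrap_se(x, treat, y, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)               # resample subjects with replacement
        reps.append(iptw_estimate(x[idx], treat[idx], y[idx]))
    return float(np.std(reps, ddof=1))                 # bootstrap standard error

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    n = 1000
    x = rng.standard_normal(n)                          # confounder
    treat = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)
    y = 0.5 * treat + x + rng.standard_normal(n)        # true effect 0.5
    print(iptw_estimate(x, treat, y), "+/-", bootstrap_se(x, treat, y))
```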
Environmental Influences on Well-Being: A Dyadic Latent Panel Analysis of Spousal Similarity
ERIC Educational Resources Information Center
Schimmack, Ulrich; Lucas, Richard E.
2010-01-01
This article uses dyadic latent panel analysis (DLPA) to examine environmental influences on well-being. DLPA requires longitudinal dyadic data. It decomposes the observed variance of both members of a dyad into a trait, state, and an error component. Furthermore, state variance is decomposed into initial and new state variance. Total observed…
Global Sensitivity Analysis and Parameter Calibration for an Ecosystem Carbon Model
NASA Astrophysics Data System (ADS)
Safta, C.; Ricciuto, D. M.; Sargsyan, K.; Najm, H. N.; Debusschere, B.; Thornton, P. E.
2013-12-01
We present uncertainty quantification results for a process-based ecosystem carbon model. The model employs 18 parameters and is driven by meteorological data corresponding to years 1992-2006 at the Harvard Forest site. Daily Net Ecosystem Exchange (NEE) observations were available to calibrate the model parameters and test the performance of the model. Posterior distributions show good predictive capabilities for the calibrated model. A global sensitivity analysis was first performed to determine the important model parameters based on their contribution to the variance of NEE. We then proceed to calibrate the model parameters in a Bayesian framework. The daily discrepancies between measured and predicted NEE values were modeled as independent and identically distributed Gaussians with prescribed daily variance according to the recorded instrument error. All model parameters were assumed to have uninformative priors with bounds set according to expert opinion. The global sensitivity results show that the rate of leaf fall (LEAFALL) is responsible for approximately 25% of the total variance in the average NEE for 1992-2005. A set of 4 other parameters, Nitrogen use efficiency (NUE), base rate for maintenance respiration (BR_MR), growth respiration fraction (RG_FRAC), and allocation to plant stem pool (ASTEM) contribute between 5% and 12% to the variance in average NEE, while the rest of the parameters have smaller contributions. The posterior distributions, sampled with a Markov Chain Monte Carlo algorithm, exhibit significant correlations between model parameters. However LEAFALL, the most important parameter for the average NEE, is not informed by the observational data, while less important parameters show significant updates between their prior and posterior densities. The Fisher information matrix values, indicating which parameters are most informed by the experimental observations, are examined to augment the comparison between the calibration and global sensitivity analysis results.
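The "contribution to the variance of NEE" language above refers to variance-based (first-order) sensitivity indices, S_i = Var(E[Y|X_i]) / Var(Y). The sketch below estimates them with a crude binning approach on a toy model; it is not the authors' method, and the four-parameter toy model is an assumption.

```python
import numpy as np

def first_order_indices(X, y, n_bins=20):
    """Crude estimate of first-order sensitivity indices S_i = Var(E[y|X_i]) / Var(y)
    by binning each parameter and averaging y within bins."""
    total_var = np.var(y)
    indices = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1))
        bins = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, n_bins - 1)
        cond_means = np.array([y[bins == b].mean() for b in range(n_bins)])
        weights = np.array([(bins == b).mean() for b in range(n_bins)])
        indices.append(np.sum(weights * (cond_means - y.mean()) ** 2) / total_var)
    return np.array(indices)

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    n = 20000
    X = rng.uniform(size=(n, 4))                   # four model parameters on [0, 1]
    # Toy "model": one dominant parameter, two moderate ones, one inert.
    y = 2.0 * X[:, 0] + 0.8 * X[:, 1] + 0.8 * X[:, 2] + 0.0 * X[:, 3] \
        + 0.3 * rng.standard_normal(n)
    print(first_order_indices(X, y).round(3))
```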
Risk factors for near-miss events and safety incidents in pediatric radiation therapy.
Baig, Nimrah; Wang, Jiangxia; Elnahal, Shereef; McNutt, Todd; Wright, Jean; DeWeese, Theodore; Terezakis, Stephanie
2018-05-01
Factors contributing to safety- or quality-related incidents (e.g. variances) in children are unknown. We identified clinical and RT treatment variables associated with risk for variances in a pediatric cohort. Using our institution's incident learning system, 81 patients age ≤21 years old who experienced variances were compared to 191 pediatric patients without variances. Clinical and RT treatment variables were evaluated as potential predictors for variances using univariate and multivariate analyses. Variances were primarily documentation errors (n = 46, 57%) and were most commonly detected during treatment planning (n = 14, 21%). Treatment planning errors constituted the majority (n = 16 out of 29, 55%) of near-misses and safety incidents (NMSI), which excludes workflow incidents. Therapists reported the majority of variances (n = 50, 62%). Physician cross-coverage (OR = 2.1, 95% CI = 1.04-4.38) and 3D conformal RT (OR = 2.3, 95% CI = 1.11-4.69) increased variance risk. Conversely, age >14 years (OR = 0.5, 95% CI = 0.28-0.88) and diagnosis of abdominal tumor (OR = 0.2, 95% CI = 0.04-0.59) decreased variance risk. Variances in children occurred in early treatment phases, but were detected at later workflow stages. Quality measures should be implemented during early treatment phases with a focus on younger children and those cared for by cross-covering physicians. Copyright © 2018 Elsevier B.V. All rights reserved.
Testing and extension of a sea lamprey feeding model
Cochran, Philip A.; Swink, William D.; Kinziger, Andrew P.
1999-01-01
A previous model of feeding by sea lamprey Petromyzon marinus predicted energy intake and growth by lampreys as a function of lamprey size, host size, and duration of feeding attachments, but it was applicable only to lampreys feeding at 10°C and it was tested against only a single small data set of limited scope. We extended the model to other temperatures and tested it against an extensive data set (more than 700 feeding bouts) accumulated during experiments with captive sea lampreys. Model predictions of instantaneous growth were highly correlated with observed growth, and a partitioning of mean squared error between model predictions and observed results showed that 88.5% of the variance was due to random variation rather than to systematic errors. However, deviations between observed and predicted values varied substantially, especially for short feeding bouts. Predicted and observed growth trajectories of individual lampreys during multiple feeding bouts during the summer tended to correspond closely, but predicted growth was generally much higher than observed growth late in the year. This suggests the possibility that large overwintering lampreys reduce their feeding rates while attached to hosts. Seasonal or size-related shifts in the fate of consumed energy may provide an alternative explanation. The lamprey feeding model offers great flexibility in assessing growth of captive lampreys within various experimental protocols (e.g., different host species or thermal regimes) because it controls for individual differences in feeding history.
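One common way to partition mean squared error into systematic and random parts (not necessarily the exact partitioning the authors used) is MSE = bias^2 + (sd_pred - sd_obs)^2 + 2*sd_pred*sd_obs*(1 - r), where the last, correlation-driven term is the random component. A minimal sketch on synthetic data:

```python
import numpy as np

def mse_decomposition(obs, pred):
    """Exact decomposition: MSE = bias^2 + (sd_p - sd_o)^2 + 2*sd_p*sd_o*(1 - r).

    The last term reflects random (unsystematic) scatter; the first two are
    systematic errors in level and amplitude. The three parts sum to the MSE."""
    bias = pred.mean() - obs.mean()
    sd_o, sd_p = obs.std(), pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    parts = {"bias^2": bias ** 2,
             "amplitude": (sd_p - sd_o) ** 2,
             "random": 2 * sd_p * sd_o * (1 - r)}
    parts["mse"] = np.mean((pred - obs) ** 2)
    return parts

if __name__ == "__main__":
    rng = np.random.default_rng(8)
    obs = rng.gamma(2.0, 0.01, size=700)                   # e.g. observed instantaneous growth
    pred = obs + 0.002 + 0.01 * rng.standard_normal(700)   # small bias, mostly random error
    d = mse_decomposition(obs, pred)
    print({k: round(v, 6) for k, v in d.items()},
          "random share:", round(d["random"] / d["mse"], 3))
```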
NASA Technical Reports Server (NTRS)
Osborne, William P.
1994-01-01
The use of 8 and 16 PSK TCM to support satellite communications in an effort to achieve more bandwidth efficiency in a power-limited channel has been proposed. This project addresses the problem of carrier phase jitter in an M-PSK receiver utilizing the high SNR approximation to the maximum a posteriori estimation of carrier phase. In particular, numerical solutions to the 8 and 16 PSK self-noise and phase detector gain in the carrier tracking loop are presented. The effect of changing SNR on the loop noise bandwidth is also discussed. These data are then used to compute the variance of phase error as a function of SNR. Simulation and hardware data are used to verify these calculations. The results show that there is a threshold in the variance of phase error versus SNR curves that is a strong function of SNR and a weak function of loop bandwidth. The M-PSK variance thresholds occur at SNRs in the range of practical interest for the use of 8 and 16-PSK TCM. This suggests that phase error variance is an important consideration in the design of these systems.
Genetic basis of between-individual and within-individual variance of docility.
Martin, J G A; Pirotta, E; Petelle, M B; Blumstein, D T
2017-04-01
Between-individual variation in phenotypes within a population is the basis of evolution. However, evolutionary and behavioural ecologists have mainly focused on estimating between-individual variance in mean trait and neglected variation in within-individual variance, or predictability of a trait. In fact, an important assumption of mixed-effects models used to estimate between-individual variance in mean traits is that within-individual residual variance (predictability) is identical across individuals. Individual heterogeneity in the predictability of behaviours is a potentially important effect but rarely estimated and accounted for. We used 11 389 measures of docility behaviour from 1576 yellow-bellied marmots (Marmota flaviventris) to estimate between-individual variation in both mean docility and its predictability. We then implemented a double hierarchical animal model to decompose the variances of both mean trait and predictability into their environmental and genetic components. We found that individuals differed both in their docility and in their predictability of docility with a negative phenotypic covariance. We also found significant genetic variance for both mean docility and its predictability but no genetic covariance between the two. This analysis is one of the first to estimate the genetic basis of both mean trait and within-individual variance in a wild population. Our results indicate that equal within-individual variance should not be assumed. We demonstrate the evolutionary importance of the variation in the predictability of docility and illustrate potential bias in models ignoring variation in predictability. We conclude that the variability in the predictability of a trait should not be ignored, and present a coherent approach for its quantification. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
NASA Astrophysics Data System (ADS)
Pan, X. G.; Wang, J. Q.; Zhou, H. Y.
2013-05-01
A variance component estimation (VCE) method based on a semi-parametric estimator with a data-depth weight matrix is proposed, because coupled system model errors and gross errors exist in the multi-source heterogeneous measurement data of combined space- and ground-based TT&C (Telemetry, Tracking and Command) systems. The uncertain model error is estimated with the semi-parametric estimator, and outliers are suppressed with the data-depth weight matrix. With the model error and outliers restrained, the VCE can be improved and used to estimate the weight matrix for observation data affected by uncertain model errors or outliers. A simulation experiment was carried out for the combined space and ground TT&C setting. The results show that the new VCE, based on model error compensation, can determine rational weights for the multi-source heterogeneous data and suppress outlying data.
2011-03-01
[Statistical output fragment: tests of the null hypothesis that the error variance of the dependent variable is equal across groups (design: Intercept + POP-UP); reported statistics include F(1, 22) = 1.179, p = .289; F(1, 22) = .000, p = .991; and F(1, 22) = 2.104, p = .161.] ... design also limited the number of intended treatments. The experimental design originally was supposed to test all three adverse events that threaten
Information Processing from Infancy to 11 Years: Continuities and Prediction of IQ
Rose, Susan A.; Feldman, Judith F.; Jankowski, Jeffery J.; Van Rossem, Ronan
2012-01-01
This study provides the first direct evidence of cognitive continuity for multiple specific information processing abilities from infancy and toddlerhood to pre-adolescence, and provides support for the view that infant abilities form the basis of later childhood abilities. Data from a large sample of children (N = 131) were obtained at five different time points (7, 12, 24, 36 months, and 11 years) for a large battery of tasks representing four cognitive domains (attention, processing speed, memory, and representational competence). Structural equation models of continuity were assessed for each domain, in which it was assumed that infant abilities → toddler abilities → 11-year abilities. Abilities at each age were represented by latent variables, which minimize task-specific variance and measurement error. The model for each domain fit the data. Moreover, abilities from the three age periods predicted global outcome, with infant, toddler, and contemporaneous 11-year measures, respectively, accounting for 12.3%, 18.5%, and 45.2% of the variance in 11-year IQ. These findings strengthen contentions that specific cognitive abilities that can be identified in infancy show long-term continuity and contribute importantly to later cognitive competence. PMID:23162179
Predictability of the Lagrangian Motion in the Upper Ocean
NASA Astrophysics Data System (ADS)
Piterbarg, L. I.; Griffa, A.; Griffa, A.; Mariano, A. J.; Ozgokmen, T. M.; Ryan, E. H.
2001-12-01
The complex non-linear dynamics of the upper ocean leads to chaotic behavior of drifter trajectories in the ocean. Our study is focused on estimating the predictability limit for the position of an individual Lagrangian particle or a particle cluster based on the knowledge of mean currents and observations of nearby particles (predictors). The Lagrangian prediction problem, besides being a fundamental scientific problem, is also of great importance for practical applications such as search and rescue operations and for modeling the spread of fish larvae. A stochastic multi-particle model for the Lagrangian motion has been rigorously formulated and is a generalization of the well known "random flight" model for a single particle. Our model is mathematically consistent and includes a few easily interpreted parameters, such as the Lagrangian velocity decorrelation time scale, the turbulent velocity variance, and the velocity decorrelation radius, that can be estimated from data. The top Lyapunov exponent for an isotropic version of the model is explicitly expressed as a function of these parameters enabling us to approximate the predictability limit to first order. Lagrangian prediction errors for two new prediction algorithms are evaluated against simple algorithms and each other and are used to test the predictability limits of the stochastic model for isotropic turbulence. The first algorithm is based on a Kalman filter and uses the developed stochastic model. Its implementation for drifter clusters in both the Tropical Pacific and Adriatic Sea, showed good prediction skill over a period of 1-2 weeks. The prediction error is primarily a function of the data density, defined as the number of predictors within a velocity decorrelation spatial scale from the particle to be predicted. The second algorithm is model independent and is based on spatial regression considerations. Preliminary results, based on simulated, as well as, real data, indicate that it performs better than the Kalman-based algorithm in strong shear flows. An important component of our research is the optimal predictor location problem; Where should floats be launched in order to minimize the Lagrangian prediction error? Preliminary Lagrangian sampling results for different flow scenarios will be presented.
Machine Learning Estimates of Natural Product Conformational Energies
Rupp, Matthias; Bauer, Matthias R.; Wilcken, Rainer; Lange, Andreas; Reutlinger, Michael; Boeckler, Frank M.; Schneider, Gisbert
2014-01-01
Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium Archangium gephyra as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures. PMID:24453952
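The "predictive variance" gating described above can be sketched with scikit-learn's Gaussian process regressor, whose predict method returns a standard deviation per query point; conformations with high predicted uncertainty would be routed to the expensive DFT calculation instead. The descriptors, kernel, and trust threshold below are assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(9)
# Toy stand-in for conformational descriptors (e.g. torsion angles) and DFT energies.
X_train = rng.uniform(-np.pi, np.pi, size=(60, 3))
y_train = np.sum(np.cos(X_train), axis=1) + 0.05 * rng.standard_normal(60)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-3),
                              normalize_y=True).fit(X_train, y_train)

X_new = rng.uniform(-np.pi, np.pi, size=(5, 3))
mean, std = gp.predict(X_new, return_std=True)
threshold = 0.2                      # kcal/mol-scale trust threshold (assumption)
for m, s in zip(mean, std):
    action = "accept ML estimate" if s < threshold else "fall back to DFT"
    print(f"E = {m:6.3f} +/- {s:5.3f}  ->  {action}")
```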
Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W; Müller, Klaus-Robert; Lemm, Steven
2013-01-01
Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation.
Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W.; Müller, Klaus-Robert; Lemm, Steven
2013-01-01
Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation. PMID:23844016
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x)—including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
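The "median of all possible pairwise slopes" estimator that REGRES builds on is the Theil-Sen estimator, sketched below in plain NumPy; the confidence limits and outlier-rejection options of REGRES are omitted, and scipy.stats.theilslopes offers an equivalent with confidence limits.

```python
import numpy as np

def median_pairwise_slope(x, y):
    """Nonparametric slope/intercept: median of all pairwise slopes (Theil-Sen)."""
    n = len(x)
    i, j = np.triu_indices(n, k=1)
    dx = x[j] - x[i]
    keep = dx != 0                                 # ignore ties in x
    slopes = (y[j] - y[i])[keep] / dx[keep]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return slope, intercept

if __name__ == "__main__":
    rng = np.random.default_rng(10)
    x = np.linspace(0, 10, 40) + 0.2 * rng.standard_normal(40)   # both variables noisy
    y = 1.5 * x + 2 + 0.2 * rng.standard_normal(40)
    y[3] += 15                                                    # one gross outlier
    print(median_pairwise_slope(x, y))       # close to (1.5, 2) despite the outlier
```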
ERIC Educational Resources Information Center
Abry, Tashia; Cash, Anne H.; Bradshaw, Catherine P.
2014-01-01
Generalizability theory (GT) offers a useful framework for estimating the reliability of a measure while accounting for multiple sources of error variance. The purpose of this study was to use GT to examine multiple sources of variance in and the reliability of school-level teacher and high school student behaviors as observed using the tool,…
ERIC Educational Resources Information Center
Lix, Lisa M.; And Others
1996-01-01
Meta-analytic techniques were used to summarize the statistical robustness literature on Type I error properties of alternatives to the one-way analysis of variance "F" test. The James (1951) and Welch (1951) tests performed best under violations of the variance homogeneity assumption, although their use is not always appropriate. (SLD)
NASA Astrophysics Data System (ADS)
Gao, Jing; Burt, James E.
2017-12-01
This study investigates the usefulness of a per-pixel bias-variance error decomposition (BVD) for understanding and improving spatially-explicit data-driven models of continuous variables in environmental remote sensing (ERS). BVD is a model evaluation method that originated in machine learning and has not been examined for ERS applications. Demonstrated with a showcase regression tree model mapping land imperviousness (0-100%) using Landsat images, our results showed that BVD can reveal sources of estimation errors, map how these sources vary across space, reveal the effects of various model characteristics on estimation accuracy, and enable in-depth comparison of different error metrics. Specifically, BVD bias maps can help analysts identify and delineate model spatial non-stationarity; BVD variance maps can indicate potential effects of ensemble methods (e.g. bagging), and inform efficient training sample allocation: training samples should capture the full complexity of the modeled process, and more samples should be allocated to regions with more complex underlying processes rather than regions covering larger areas. Through examining the relationships between model characteristics and their effects on estimation accuracy revealed by BVD for both absolute and squared errors (i.e. error is the absolute or the squared value of the difference between observation and estimate), we found that the two error metrics embody different diagnostic emphases, can lead to different conclusions about the same model, and may suggest different solutions for performance improvement. We emphasize BVD's strength in revealing the connection between model characteristics and estimation accuracy, as understanding this relationship empowers analysts to effectively steer performance through model adjustments.
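A toy bias-variance decomposition for a regression tree, computed from replicate training sets; the 1-D synthetic "imperviousness" signal, the bootstrap-style replication, and the tree depth stand in for the paper's per-pixel Landsat workflow and are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(11)

def true_signal(x):                        # stands in for "true" imperviousness (0-100%)
    return 50 + 40 * np.sin(x)

x_test = np.linspace(0, 2 * np.pi, 200)
preds = []
for _ in range(100):                       # replicate training sets
    x_tr = rng.uniform(0, 2 * np.pi, 300)
    y_tr = true_signal(x_tr) + 10 * rng.standard_normal(300)
    model = DecisionTreeRegressor(max_depth=3).fit(x_tr[:, None], y_tr)
    preds.append(model.predict(x_test[:, None]))
preds = np.array(preds)                    # replicates x test points

bias2 = (preds.mean(axis=0) - true_signal(x_test)) ** 2   # squared bias per "pixel"
var = preds.var(axis=0)                                    # variance per "pixel"
# Expected squared error (excluding irreducible noise) = bias^2 + variance at each point.
print("mean bias^2 =", bias2.mean().round(1), " mean variance =", var.mean().round(1))
```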
Moghtadaei, Motahareh; Hashemi Golpayegani, Mohammad Reza; Malekzadeh, Reza
2013-02-07
Identification of squamous dysplasia and esophageal squamous cell carcinoma (ESCC) is of great importance in the prevention of cancer incidence. Computer-aided algorithms can be very useful for identifying people with higher risks of squamous dysplasia and ESCC. Such methods can limit clinical screenings to people with higher risks. Different regression methods have been used to predict ESCC and dysplasia. In this paper, a Fuzzy Neural Network (FNN) model is selected for ESCC and dysplasia prediction. The inputs to the classifier are the risk factors. Since the relation between risk factors in the tumor system has a complex nonlinear behavior, in comparison to most ordinary data, the cost function of its model can have more local optima, so the need for global optimization methods is heightened. The proposed method in this paper is a Chaotic Optimization Algorithm (COA) combined with the common Error Back Propagation (EBP) local method. Since the model has many parameters, we use a strategy to reduce the dependency among parameters caused by the chaotic series generator; this dependency was not considered in previous COA methods. The algorithm is compared with a logistic regression model, among the most successful existing methods of ESCC and dysplasia prediction. The results show more precise prediction, with a lower mean and variance of error. Copyright © 2012 Elsevier Ltd. All rights reserved.
Experimental study on an FBG strain sensor
NASA Astrophysics Data System (ADS)
Liu, Hong-lin; Zhu, Zheng-wei; Zheng, Yong; Liu, Bang; Xiao, Feng
2018-01-01
Landslides and other geological disasters occur frequently and often cause high financial and humanitarian costs. Real-time, early-warning monitoring of landslides is therefore of great significance for reducing casualties and property losses. In this paper, taking advantage of the high initial precision and high sensitivity of fiber Bragg gratings (FBGs), an FBG strain sensor is designed by combining FBGs with an inclinometer. The sensor was regarded as a cantilever beam with one end fixed. According to the anisotropic material properties of the inclinometer, a theoretical formula relating the FBG wavelength to the deflection of the sensor was established using elastic mechanics principles. The accuracy of the established formula was verified through laboratory calibration testing and model slope monitoring experiments. The landslide displacement can be calculated with the established theoretical formula from the changes in FBG central wavelength obtained remotely by the demodulation instrument. Results showed that the maximum error at different heights was 9.09%; the average of the maximum error was 6.35%, and its corresponding variance was 2.12; the minimum error was 4.18%; the average of the minimum error was 5.99%, and its corresponding variance was 0.50. The maximum error between the theoretical and measured displacements decreases gradually, and the variance of the error also decreases gradually, indicating that the theoretical results become increasingly reliable. It also shows that the sensor and the theoretical formula established in this paper can be used for remote, real-time, high-precision, early-warning monitoring of the slope.
Evaluating causes of error in landmark-based data collection using scanners
Shearer, Brian M.; Cooke, Siobhán B.; Halenar, Lauren B.; Reber, Samantha L.; Plummer, Jeannette E.; Delson, Eric
2017-01-01
In this study, we assess the precision, accuracy, and repeatability of craniodental landmarks (Types I, II, and III, plus curves of semilandmarks) on a single macaque cranium digitally reconstructed with three different surface scanners and a microCT scanner. Nine researchers with varying degrees of osteological and geometric morphometric knowledge landmarked ten iterations of each scan (40 total) to test the effects of scan quality, researcher experience, and landmark type on levels of intra- and interobserver error. Two researchers additionally landmarked ten specimens from seven different macaque species using the same landmark protocol to test the effects of the previously listed variables relative to species-level morphological differences (i.e., observer variance versus real biological variance). Error rates within and among researchers by scan type were calculated to determine whether or not data collected by different individuals or on different digitally rendered crania are consistent enough to be used in a single dataset. Results indicate that scan type does not impact rate of intra- or interobserver error. Interobserver error is far greater than intraobserver error among all individuals, and is similar in variance to that found among different macaque species. Additionally, experience with osteology and morphometrics both positively contribute to precision in multiple landmarking sessions, even where less experienced researchers have been trained in point acquisition. Individual training increases precision (although not necessarily accuracy), and is highly recommended in any situation where multiple researchers will be collecting data for a single project. PMID:29099867
NASA Astrophysics Data System (ADS)
Hemmings, J. C. P.; Challenor, P. G.
2012-04-01
A wide variety of different plankton system models have been coupled with ocean circulation models, with the aim of understanding and predicting aspects of environmental change. However, an ability to make reliable inferences about real-world processes from the model behaviour demands a quantitative understanding of model error that remains elusive. Assessment of coupled model output is inhibited by relatively limited observing system coverage of biogeochemical components. Any direct assessment of the plankton model is further inhibited by uncertainty in the physical state. Furthermore, comparative evaluation of plankton models on the basis of their design is inhibited by the sensitivity of their dynamics to many adjustable parameters. Parameter uncertainty has been widely addressed by calibrating models at data-rich ocean sites. However, relatively little attention has been given to quantifying uncertainty in the physical fields required by the plankton models at these sites, and tendencies in the biogeochemical properties due to the effects of horizontal processes are often neglected. Here we use model twin experiments, in which synthetic data are assimilated to estimate a system's known "true" parameters, to investigate the impact of error in a plankton model's environmental input data. The experiments are supported by a new software tool, the Marine Model Optimization Testbed, designed for rigorous analysis of plankton models in a multi-site 1-D framework. Simulated errors are derived from statistical characterizations of the mixed layer depth, the horizontal flux divergence tendencies of the biogeochemical tracers and the initial state. Plausible patterns of uncertainty in these data are shown to produce strong temporal and spatial variability in the expected simulation error variance over an annual cycle, indicating variation in the significance attributable to individual model-data differences. An inverse scheme using ensemble-based estimates of the simulation error variance to allow for this environment error performs well compared with weighting schemes used in previous calibration studies, giving improved estimates of the known parameters. The efficacy of the new scheme in real-world applications will depend on the quality of statistical characterizations of the input data. Practical approaches towards developing reliable characterizations are discussed.
Ishwaran, Hemant; Lu, Min
2018-06-04
Random forests are a popular nonparametric tree ensemble procedure with broad applications to data analysis. While its widespread popularity stems from its prediction performance, an equally important feature is that it provides a fully nonparametric measure of variable importance (VIMP). A current limitation of VIMP, however, is that no systematic method exists for estimating its variance. As a solution, we propose a subsampling approach that can be used to estimate the variance of VIMP and for constructing confidence intervals. The method is general enough that it can be applied to many useful settings, including regression, classification, and survival problems. Using extensive simulations, we demonstrate the effectiveness of the subsampling estimator and in particular find that the delete-d jackknife variance estimator, a close cousin, is especially effective under low subsampling rates due to its bias correction properties. These 2 estimators are highly competitive when compared with the .164 bootstrap estimator, a modified bootstrap procedure designed to deal with ties in out-of-sample data. Most importantly, subsampling is computationally fast, thus making it especially attractive for big data settings. Copyright © 2018 John Wiley & Sons, Ltd.
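A hedged sketch of the subsampling idea for VIMP uncertainty: compute a permutation-based importance on repeated subsamples of size b and rescale the between-subsample variance by b/n. The permutation importance from scikit-learn stands in for VIMP, and the subsample size, replicate count, and rescaling convention are illustrative assumptions rather than the authors' randomForestSRC implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(12)
n, p = 600, 6
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] + X[:, 1] + 0.5 * rng.standard_normal(n)   # only two informative variables

def vimp(Xs, ys):
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xs, ys)
    return permutation_importance(rf, Xs, ys, n_repeats=5, random_state=0).importances_mean

b = int(n ** 0.7)                         # subsample size (low subsampling rate)
reps = []
for _ in range(25):
    idx = rng.choice(n, size=b, replace=False)   # subsampling: without replacement
    reps.append(vimp(X[idx], y[idx]))
reps = np.array(reps)

# Subsampling variance estimator: rescale the between-subsample variance by b/n.
vimp_var = (b / n) * reps.var(axis=0, ddof=1)
print("VIMP (full data):", vimp(X, y).round(3))
print("estimated standard errors:", np.sqrt(vimp_var).round(3))
```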
Rincent, R; Laloë, D; Nicolas, S; Altmann, T; Brunel, D; Revilla, P; Rodríguez, V M; Moreno-Gonzalez, J; Melchinger, A; Bauer, E; Schoen, C-C; Meyer, N; Giauffret, C; Bauland, C; Jamin, P; Laborde, J; Monod, H; Flament, P; Charcosset, A; Moreau, L
2012-10-01
Genomic selection refers to the use of genotypic information for predicting breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals constituting the calibration set. The size and the composition of this set are essential parameters affecting the prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria based on the diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix-best linear unbiased predictions model (RA-BLUP) were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNPs array. In the two panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimality for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request.
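The PEV and CD quantities that the criteria above are built from can be obtained from the inverse of the mixed-model coefficient matrix. The sketch below does this for a simplified GBLUP model with a single overall-mean fixed effect and a toy realized relationship matrix; it illustrates the quantities behind PEVmean/CDmean rather than the authors' calibration-set optimization algorithm, and the heritability, marker coding, and contrasts used are assumptions.

```python
import numpy as np

def pev_and_cd(A, phenotyped, h2=0.5, sigma2_p=1.0):
    """Prediction error variance and reliability (CD) of genomic breeding values.

    A          : (n x n) additive/realized relationship matrix for all candidates
    phenotyped : boolean mask of the individuals in the calibration set
    Model: y = 1*mu + u + e, with u ~ N(0, A*sigma2_g) and e ~ N(0, I*sigma2_e).
    """
    sigma2_g = h2 * sigma2_p
    sigma2_e = sigma2_p - sigma2_g
    lam = sigma2_e / sigma2_g
    n = A.shape[0]
    Z = np.eye(n)[phenotyped]                      # incidence of records on genotypes
    X = np.ones((Z.shape[0], 1))                   # overall mean
    # Mixed model equations coefficient matrix (in units of the residual variance).
    C = np.block([[X.T @ X, X.T @ Z],
                  [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(A)]])
    Cinv = np.linalg.inv(C)
    C22 = Cinv[1:, 1:]                             # block for the breeding values
    pev = np.diag(C22) * sigma2_e
    cd = 1 - pev / (sigma2_g * np.diag(A))         # reliability of each candidate's GEBV
    return pev, cd

if __name__ == "__main__":
    rng = np.random.default_rng(13)
    M = rng.integers(0, 3, size=(50, 400)) - 1.0   # toy marker matrix (coded -1/0/1)
    A = (M @ M.T) / M.shape[1] + 1e-3 * np.eye(50) # crude realized relationship matrix
    calib = np.zeros(50, dtype=bool)
    calib[:30] = True                              # first 30 lines form the calibration set
    pev, cd = pev_and_cd(A, calib)
    print("mean CD, calibration set :", cd[calib].mean().round(3))
    print("mean CD, validation set  :", cd[~calib].mean().round(3))
```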
Using state-space models to predict the abundance of juvenile and adult sea lice on Atlantic salmon.
Elghafghuf, Adel; Vanderstichel, Raphael; St-Hilaire, Sophie; Stryhn, Henrik
2018-04-11
Sea lice are marine parasites affecting salmon farms, and are considered one of the most costly pests of the salmon aquaculture industry. Infestations of sea lice on farms significantly increase opportunities for the parasite to spread in the surrounding ecosystem, making control of this pest a challenging issue for salmon producers. The complexity of controlling sea lice on salmon farms requires frequent monitoring of the abundance of different sea lice stages over time. Industry-based data sets of counts of lice are amenable to multivariate time-series data analyses. In this study, two sets of multivariate autoregressive state-space models were applied to Chilean sea lice data from six Atlantic salmon production cycles on five isolated farms (at least 20 km seaway distance away from other known active farms), to evaluate the utility of these models for predicting sea lice abundance over time on farms. The models were constructed with different parameter configurations, and the analysis demonstrated large heterogeneity between production cycles for the autoregressive parameter, the effects of chemotherapeutant bath treatments, and the process-error variance. A model allowing for different parameters across production cycles had the best fit and the smallest overall prediction errors. However, pooling information across cycles for the drift and observation error parameters did not substantially affect model performance, thus reducing the number of necessary parameters in the model. Bath treatments had strong but variable effects for reducing sea lice burdens, and these effects were stronger for adult lice than juvenile lice. Our multivariate state-space models were able to handle different sea lice stages and provide predictions for sea lice abundance with reasonable accuracy up to five weeks out. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Butcher, Peter A; Ivry, Richard B; Kuo, Sheng-Han; Rydz, David; Krakauer, John W; Taylor, Jordan A
2017-09-01
Individuals with damage to the cerebellum perform poorly in sensorimotor adaptation paradigms. This deficit has been attributed to impairment in sensory prediction error-based updating of an internal forward model, a form of implicit learning. These individuals can, however, successfully counter a perturbation when instructed with an explicit aiming strategy. This successful use of an instructed aiming strategy presents a paradox: In adaptation tasks, why do individuals with cerebellar damage not come up with an aiming solution on their own to compensate for their implicit learning deficit? To explore this question, we employed a variant of a visuomotor rotation task in which, before executing a movement on each trial, the participants verbally reported their intended aiming location. Compared with healthy control participants, participants with spinocerebellar ataxia displayed impairments in both implicit learning and aiming. This was observed when the visuomotor rotation was introduced abruptly ( experiment 1 ) or gradually ( experiment 2 ). This dual deficit does not appear to be related to the increased movement variance associated with ataxia: Healthy undergraduates showed little change in implicit learning or aiming when their movement feedback was artificially manipulated to produce similar levels of variability ( experiment 3 ). Taken together the results indicate that a consequence of cerebellar dysfunction is not only impaired sensory prediction error-based learning but also a difficulty in developing and/or maintaining an aiming solution in response to a visuomotor perturbation. We suggest that this dual deficit can be explained by the cerebellum forming part of a network that learns and maintains action-outcome associations across trials. NEW & NOTEWORTHY Individuals with cerebellar pathology are impaired in sensorimotor adaptation. This deficit has been attributed to an impairment in error-based learning, specifically, from a deficit in using sensory prediction errors to update an internal model. Here we show that these individuals also have difficulty in discovering an aiming solution to overcome their adaptation deficit, suggesting a new role for the cerebellum in sensorimotor adaptation tasks. Copyright © 2017 the American Physiological Society.
Prediction uncertainty of plume characteristics derived from a small number of measuring points
NASA Astrophysics Data System (ADS)
French, H. K.; van der Zee, S. E. A. T. M.; Leijnse, A.
A small number of measuring points may introduce a bias into the characterisation of flow and transport based on field experiments in the unsaturated zone. Simulation of pure advective transport of a Gaussian plume through a setup of 30 regularly placed measuring points revealed regular temporal fluctuations about the real spatial moments. An irregular setup predicted both irregular fluctuations and larger discrepancies from the real value. From these considerations, a regular setup is recommended. Spatial moments were sensitive to the plume size relative to the distance between individual measuring points. To reduce prediction errors of the variance, the distance between the measuring points should be less than twice the standard deviation of the examined plume. The total size of the setup should cover several standard deviations of the plume to avoid mass being lost from the monitored area. Numerical simulations of a dispersing plume (comparing calculations based on 9000 nodes with 30 measuring points) revealed that vertical and horizontal centres of mass were predicted well at all degrees of heterogeneity, and the same was the case for horizontal variances. Vertical variances were more susceptible to prediction errors, but estimates were of the same order of magnitude as the real values.
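As an illustration of the moment estimates discussed above, the sketch below computes the concentration-weighted centre of mass and spatial variance of a plume from a small set of measuring points. It is a minimal Python example with synthetic data; the function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def plume_moments(x, z, c):
    """Concentration-weighted spatial moments of a plume sampled at points (x, z).

    x, z : coordinates of the measuring points
    c    : concentrations observed at those points
    Returns the centre of mass and variance in each direction.
    """
    c = np.asarray(c, dtype=float)
    w = c / c.sum()                       # weights from the zeroth moment (total mass proxy)
    x_bar = np.sum(w * x)                 # first moments: centre of mass
    z_bar = np.sum(w * z)
    var_x = np.sum(w * (x - x_bar) ** 2)  # second central moments: plume spread
    var_z = np.sum(w * (z - z_bar) ** 2)
    return x_bar, z_bar, var_x, var_z

# Example: 30 regularly spaced points sampling a synthetic Gaussian plume
xg, zg = np.meshgrid(np.linspace(0, 5, 6), np.linspace(0, 4, 5))
conc = np.exp(-((xg - 2.5) ** 2 + (zg - 2.0) ** 2) / (2 * 0.8 ** 2))
print(plume_moments(xg.ravel(), zg.ravel(), conc.ravel()))
```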
NASA Astrophysics Data System (ADS)
Sahu, Neelesh Kumar; Andhare, Atul B.; Andhale, Sandip; Raju Abraham, Roja
2018-04-01
The present work deals with prediction of surface roughness using cutting parameters along with in-process measured cutting force and tool vibration (acceleration) during turning of Ti-6Al-4V with cubic boron nitride (CBN) inserts. A full factorial design is used for the design of experiments, with cutting speed, feed rate and depth of cut as design variables. A prediction model for surface roughness is developed using response surface methodology (RSM) with cutting speed, feed rate, depth of cut, resultant cutting force and acceleration as control variables. Analysis of variance (ANOVA) is performed to find the significant terms in the model; insignificant terms are removed through a statistical backward elimination approach. The effect of each control variable on surface roughness is also studied. A prediction correlation coefficient (R2 pred) of 99.4% shows that the model explains the experimental results well and behaves well even when factors are adjusted, added or eliminated. The model is validated with five fresh experiments using the measured force and acceleration values. The average absolute error between the RSM model and the experimentally measured surface roughness is found to be 10.2%. Additionally, an artificial neural network (ANN) model is also developed for prediction of surface roughness, and its predictions are compared with those of the modified regression model. Both the RSM model and the ANN (average absolute error 7.5%) predict roughness with more than 90% accuracy. The results show that including cutting force and vibration in the prediction of surface roughness gives better predictions than considering cutting parameters alone, and that the ANN gives better predictions than the RSM model.
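The second-order response-surface fit described above can be sketched as follows. This is a hedged, generic illustration with synthetic data and scikit-learn; the study's actual design matrix, units and model-reduction steps are not reproduced here.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Hypothetical design matrix: [speed, feed, depth of cut, resultant force, acceleration]
X = np.random.default_rng(1).uniform([60, 0.05, 0.2, 50, 0.5],
                                      [120, 0.20, 1.0, 300, 5.0], size=(27, 5))
Ra = 0.8 + 4.0 * X[:, 1] + 0.002 * X[:, 3] + 0.05 * X[:, 4]  # synthetic roughness response

# Second-order (quadratic plus interaction) response surface, as in RSM
rsm = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                    LinearRegression())
rsm.fit(X, Ra)
print("R^2 on training data:", rsm.score(X, Ra))
```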
Sampling design for spatially distributed hydrogeologic and environmental processes
Christakos, G.; Olea, R.A.
1992-01-01
A methodology for the design of sampling networks over space is proposed. The methodology is based on spatial random field representations of nonhomogeneous natural processes, and on optimal spatial estimation techniques. One of the most important results of random field theory for physical sciences is its rationalization of correlations in spatial variability of natural processes. This correlation is extremely important both for interpreting spatially distributed observations and for predictive performance. The extent of site sampling and the types of data to be collected will depend on the relationship of subsurface variability to predictive uncertainty. While hypothesis formulation and initial identification of spatial variability characteristics are based on scientific understanding (such as knowledge of the physics of the underlying phenomena, geological interpretations, intuition and experience), the support offered by field data is statistically modelled. This model is not limited by the geometric nature of sampling and covers a wide range of subsurface uncertainties. A factorization scheme of the sampling error variance is derived, which possesses certain attractive properties allowing significant savings in computations. By means of this scheme, a practical sampling design procedure providing suitable indices of the sampling error variance is established. These indices can be used by way of multiobjective decision criteria to obtain the best sampling strategy. Neither the actual implementation of the in-situ sampling nor the solution of the large spatial estimation systems of equations is necessary. The required values of the accuracy parameters involved in the network design are derived using reference charts (readily available for various combinations of data configurations and spatial variability parameters) and certain simple yet accurate analytical formulas. Insight is gained by applying the proposed sampling procedure to realistic examples related to sampling problems in two dimensions. © 1992.
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Anagnostou, E. N.; Hartman, B.; Kallos, G. B.
2015-12-01
Weather prediction accuracy has become very important for the Northeast U.S. given the devastating effects of extreme weather events in recent years. Weather forecasting systems are used to build strategies that prevent catastrophic losses for human lives and the environment. Concurrently, weather forecast tools and techniques have evolved with improved forecast skill as numerical prediction techniques are strengthened by increased super-computing resources. In this study, we examine the combination of two state-of-the-science atmospheric models (WRF and RAMS/ICLAMS) by utilizing a Bayesian regression approach to improve the prediction of extreme weather events for the Northeast U.S. The basic concept behind the Bayesian regression approach is to take advantage of the strengths of two atmospheric modeling systems and, similar to the multi-model ensemble approach, limit their weaknesses, which are related to systematic and random errors in the numerical prediction of physical processes. The first part of this study is focused on retrospective simulations of seventeen storms that affected the region in the period 2004-2013. Optimal variances are estimated by minimizing the root mean square error and are applied to out-of-sample weather events. The applicability and usefulness of this approach are demonstrated by conducting an error analysis based on in-situ observations from meteorological stations of the National Weather Service (NWS) for wind speed and wind direction, and on NCEP Stage IV data, a mosaicked regional multi-sensor precipitation product. The preliminary results indicate a significant improvement in the statistical metrics of the modeled-observed pairs for meteorological variables using various combinations of the sixteen events as predictors of the seventeenth. This presentation will illustrate the implemented methodology and the obtained results for wind speed, wind direction and precipitation, as well as set the research steps that will be followed in the future.
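A much-simplified sketch of the blending idea follows: weights for the two model forecasts are chosen to minimise RMSE on a set of training storms and then applied to a held-out event. This is not the authors' full Bayesian formulation; the data, variable names and the least-squares shortcut are illustrative assumptions.

```python
import numpy as np

def blend_weights(f_a, f_b, obs):
    """Least-squares weights for a two-model linear blend (sketch, not the
    authors' full Bayesian treatment). Minimises RMSE of w1*f1 + w2*f2 + b."""
    A = np.column_stack([f_a, f_b, np.ones_like(obs)])
    coef, *_ = np.linalg.lstsq(A, obs, rcond=None)
    return coef  # (w_a, w_b, bias)

# Train on 16 synthetic storms, apply to the held-out 17th (leave-one-out style)
rng = np.random.default_rng(2)
truth = rng.gamma(2.0, 4.0, size=(17, 200))               # e.g. wind speed at stations
model_a = truth + rng.normal(1.0, 2.0, truth.shape)       # systematic + random error
model_b = truth + rng.normal(-0.5, 2.5, truth.shape)
w = blend_weights(model_a[:16].ravel(), model_b[:16].ravel(), truth[:16].ravel())
blend = w[0] * model_a[16] + w[1] * model_b[16] + w[2]
print("RMSE model A:", np.sqrt(np.mean((model_a[16] - truth[16]) ** 2)),
      "blend:", np.sqrt(np.mean((blend - truth[16]) ** 2)))
```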
Jones, Andrew T; Biester, Thomas W; Buyske, Jo; Lewis, Frank R; Malangoni, Mark A
2014-01-01
Although designed as a low-stakes formative examination, the American Board of Surgery In-Training Examination (ABSITE) is often used in high-stakes decisions such as promotion, remediation, and retention owing to its perceived ability to predict the outcome of board certification. Because of the discrepancy between intent and use, the ability of ABSITE scores to predict passing the American Board of Surgery certification examinations was analyzed. All first-time American Board of Surgery qualifying examination (QE) examinees between 2006 and 2012 were reviewed. Examinees' postgraduate year (PGY) 1 and PGY5 ABSITE standard scores were linked to QE scores and pass/fail outcomes (n = 6912 and 6846, respectively) as well as first-time certifying examination (CE) pass/fail results (n = 1329). Linear and logistic regression analyses were performed to evaluate the utility of ABSITE scores to predict board certification scores and pass/fail outcomes. PGY1 ABSITE scores accounted for 22% of the variance in QE scores (p < 0.001). PGY5 scores were a slightly better predictor, accounting for 30% of QE score variance (p < 0.001). Analyses showed that selecting a PGY5 ABSITE score that maximized overall decision accuracy for predicting QE pass/fail outcomes (86% accuracy) resulted in 98% sensitivity, 13% specificity, a positive predictive value of 87%, and a negative predictive value of 57%. ABSITE scores were not predictive of success on the CE. ABSITE scores are a useful predictor of QE scores and outcomes but do not predict passing the CE. Although scoring well on the ABSITE is highly predictive of QE success, using low ABSITE scores to predict QE failure results in frequent decision errors. Program directors and other evaluators should use additional sources of information when making high-stakes decisions about resident performance. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
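The decision-accuracy figures quoted above (sensitivity, specificity, positive and negative predictive values) can be reproduced from a score cutoff as sketched below; the data are synthetic and the cutoff value is arbitrary, purely to show the computation.

```python
import numpy as np

def cutoff_metrics(scores, passed, cutoff):
    """Sensitivity, specificity, PPV and NPV when predicting a QE pass
    from an ABSITE score cutoff (illustrative; data are synthetic)."""
    pred_pass = scores >= cutoff
    tp = np.sum(pred_pass & passed)
    tn = np.sum(~pred_pass & ~passed)
    fp = np.sum(pred_pass & ~passed)
    fn = np.sum(~pred_pass & passed)
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
            "accuracy": (tp + tn) / len(scores)}

rng = np.random.default_rng(3)
absite = rng.normal(500, 100, 1000)                 # synthetic ABSITE standard scores
p_pass = 1 / (1 + np.exp(-(absite - 400) / 60))     # pass probability rises with score
qe_pass = rng.uniform(size=1000) < p_pass
print(cutoff_metrics(absite, qe_pass, cutoff=420))
```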
ERIC Educational Resources Information Center
Vardeman, Stephen B.; Wendelberger, Joanne R.
2005-01-01
There is a little-known but very simple generalization of the standard result that for uncorrelated random variables with common mean μ and variance σ², the expected value of the sample variance is σ². The generalization justifies the use of the usual standard error of the sample mean in possibly…
Deflation as a method of variance reduction for estimating the trace of a matrix inverse
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gambhir, Arjun Singh; Stathopoulos, Andreas; Orginos, Kostas
2017-04-06
Many fields require computing the trace of the inverse of a large, sparse matrix. The typical method used for such computations is the Hutchinson method, which is a Monte Carlo (MC) averaging over matrix quadratures. To improve its convergence, several variance reduction techniques have been proposed. In this paper, we study the effects of deflating the near-null singular value space. We make two main contributions. First, we analyze the variance of the Hutchinson method as a function of the deflated singular values and vectors. Although this provides good intuition in general, by assuming additionally that the singular vectors are random unitary matrices, we arrive at concise formulas for the deflated variance that include only the variance and mean of the singular values. We make the remarkable observation that deflation may increase variance for Hermitian matrices but not for non-Hermitian ones. This is a rare, if not unique, property where non-Hermitian matrices outperform Hermitian ones. The theory can be used as a model for predicting the benefits of deflation. Second, we use deflation in the context of a large-scale application of "disconnected diagrams" in Lattice QCD. On lattices, Hierarchical Probing (HP) has previously provided an order of magnitude of variance reduction over MC by removing "error" from neighboring nodes of increasing distance in the lattice. Although deflation used directly on MC yields a limited improvement of 30% in our problem, when combined with HP, deflation reduces variance by a factor of over 150 compared to MC. For this, we pre-computed the 1000 smallest singular values of an ill-conditioned matrix of size 25 million. Furthermore, using PRIMME and a domain-specific Algebraic Multigrid preconditioner, we perform one of the largest eigenvalue computations in Lattice QCD at a fraction of the cost of our trace computation.
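A minimal sketch of Hutchinson trace estimation with deflation follows, using a small dense symmetric matrix and eigen-deflation rather than the paper's large sparse Lattice-QCD matrices and singular-vector deflation; it only illustrates how the deflated subspace is handled exactly while the remainder is estimated stochastically.

```python
import numpy as np

def hutchinson_trace_inv(A, n_samples, rng, deflate_k=0):
    """Monte Carlo estimate of tr(A^{-1}) with optional eigen-deflation (sketch)."""
    n = A.shape[0]
    w, V = np.linalg.eigh(A)
    Vk = V[:, :deflate_k]                       # k smallest-eigenvalue vectors
    exact = np.sum(1.0 / w[:deflate_k])         # their trace contribution, computed exactly
    est = 0.0
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=n)     # Rademacher probe vector
        if deflate_k:
            z = z - Vk @ (Vk.T @ z)             # project out the deflated subspace
        est += z @ np.linalg.solve(A, z)
    return exact + est / n_samples

rng = np.random.default_rng(4)
M = rng.normal(size=(200, 200))
A = M @ M.T + 0.05 * np.eye(200)                # ill-conditioned SPD test matrix
print("deflated:", hutchinson_trace_inv(A, 50, rng, deflate_k=20),
      "plain:", hutchinson_trace_inv(A, 50, rng),
      "exact:", np.trace(np.linalg.inv(A)))
```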
Is Romantic Desire Predictable? Machine Learning Applied to Initial Romantic Attraction.
Joel, Samantha; Eastwick, Paul W; Finkel, Eli J
2017-10-01
Matchmaking companies and theoretical perspectives on close relationships suggest that initial attraction is, to some extent, a product of two people's self-reported traits and preferences. We used machine learning to test how well such measures predict people's overall tendencies to romantically desire other people (actor variance) and to be desired by other people (partner variance), as well as people's desire for specific partners above and beyond actor and partner variance (relationship variance). In two speed-dating studies, romantically unattached individuals completed more than 100 self-report measures about traits and preferences that past researchers have identified as being relevant to mate selection. Each participant met each opposite-sex participant attending a speed-dating event for a 4-min speed date. Random forests models predicted 4% to 18% of actor variance and 7% to 27% of partner variance; crucially, however, they were unable to predict relationship variance using any combination of traits and preferences reported before the dates. These results suggest that compatibility elements of human mating are challenging to predict before two people meet.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vidal-Codina, F., E-mail: fvidal@mit.edu; Nguyen, N.C., E-mail: cuongng@mit.edu; Giles, M.B., E-mail: mike.giles@maths.ox.ac.uk
We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.
Song, Lunar; Park, Byeonghwa; Oh, Kyeung Mi
2015-04-01
Serious medication errors continue to exist in hospitals, even though there is technology that could potentially eliminate them such as bar code medication administration. Little is known about the degree to which the culture of patient safety is associated with behavioral intention to use bar code medication administration. Based on the Technology Acceptance Model, this study evaluated the relationships among patient safety culture and perceived usefulness and perceived ease of use, and behavioral intention to use bar code medication administration technology among nurses in hospitals. Cross-sectional surveys with a convenience sample of 163 nurses using bar code medication administration were conducted. Feedback and communication about errors had a positive impact in predicting perceived usefulness (β=.26, P<.01) and perceived ease of use (β=.22, P<.05). In a multiple regression model predicting for behavioral intention, age had a negative impact (β=-.17, P<.05); however, teamwork within hospital units (β=.20, P<.05) and perceived usefulness (β=.35, P<.01) both had a positive impact on behavioral intention. The overall bar code medication administration behavioral intention model explained 24% (P<.001) of the variance. Identified factors influencing bar code medication administration behavioral intention can help inform hospitals to develop tailored interventions for RNs to reduce medication administration errors and increase patient safety by using this technology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Siyuan; Hwang, Youngdeok; Khabibrakhmanov, Ildar
With increasing penetration of solar and wind energy into the total energy supply mix, the pressing need for accurate energy forecasting has become well-recognized. Here we report the development of a machine-learning-based model blending approach for statistically combining multiple meteorological models to improve the accuracy of solar/wind power forecasts. Importantly, we demonstrate that in addition to the parameters to be predicted (such as solar irradiance and power), including additional atmospheric state parameters, which collectively define weather situations, as machine-learning input provides further enhanced accuracy for the blended result. Functional analysis of variance shows that the error of an individual model has substantial dependence on the weather situation. The machine-learning approach effectively reduces such situation-dependent error and thus produces more accurate results compared to conventional multi-model ensemble approaches based on simplistic equally or unequally weighted model averaging. Validation results over an extended period of time show over 30% improvement in solar irradiance/power forecast accuracy compared to forecasts based on the best individual model.
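The situation-dependent blending idea can be sketched as below: forecasts from two hypothetical models plus atmospheric state parameters are used as features for a machine-learning regressor, and the result is compared with an equally weighted ensemble mean. All data are synthetic and the regressor choice is an assumption, not the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
n = 2000
state = rng.normal(size=(n, 4))                     # e.g. cloud cover, humidity, ... (synthetic)
truth = 600 + 200 * state[:, 0] - 80 * state[:, 1] + rng.normal(0, 30, n)
model_a = truth + 40 * state[:, 2] + rng.normal(0, 50, n)   # situation-dependent errors
model_b = truth - 30 * state[:, 3] + rng.normal(0, 60, n)

# Feed both model forecasts plus the atmospheric state to the blender
X = np.column_stack([model_a, model_b, state])
blender = GradientBoostingRegressor().fit(X[:1500], truth[:1500])
pred = blender.predict(X[1500:])
naive = 0.5 * (model_a[1500:] + model_b[1500:])     # equally weighted ensemble mean
print("RMSE blend:", np.sqrt(np.mean((pred - truth[1500:]) ** 2)),
      "RMSE equal-weight:", np.sqrt(np.mean((naive - truth[1500:]) ** 2)))
```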
A Posteriori Correction of Forecast and Observation Error Variances
NASA Technical Reports Server (NTRS)
Rukhovets, Leonid
2005-01-01
The proposed method of total observation and forecast error variance correction is based on the assumption of a normal distribution of "observed-minus-forecast" residuals (O-F), where O is an observed value and F is usually a short-term model forecast. This assumption can be accepted for several types of observations (except humidity) which are not grossly in error. The degree of nearness to the normal distribution can be estimated by the skewness (lack of symmetry) a₃ = μ₃/σ³ and the kurtosis a₄ = μ₄/σ⁴ − 3, where μᵢ is the i-th order moment and σ is the standard deviation. It is well known that for the normal distribution a₃ = a₄ = 0.
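A short sketch of the normality check implied above, assuming an array of O-F residuals (synthetic here):

```python
import numpy as np
from scipy import stats

# Checking how close observed-minus-forecast (O-F) residuals are to a normal
# distribution via the a3 (skewness) and a4 (excess kurtosis) statistics.
rng = np.random.default_rng(6)
o_minus_f = rng.normal(0.0, 1.2, size=5000)        # synthetic O-F residuals

a3 = stats.skew(o_minus_f)                         # mu_3 / sigma^3
a4 = stats.kurtosis(o_minus_f, fisher=True)        # mu_4 / sigma^4 - 3
print(f"skewness a3 = {a3:.3f}, excess kurtosis a4 = {a4:.3f}")
# Values near zero support the normality assumption behind the variance correction.
```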
Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.
Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia
2017-06-01
Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effects of noise and bias error when using CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data at different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to quantitative analysis of methane gas spectra, and methane/toluene mixture gas spectra, as measured using FT-IR spectrometry and CLS, WLS, and SWLS. The standard error of prediction (SEP), bias of prediction (bias), and the residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In methane/toluene mixture gas analysis, a modification of the SWLS has been presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases, but not the lowest bias and RSS. The modified SWLS reduced the bias and showed a lower RSS than CLS, especially for small components.
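A minimal sketch of the SWLS idea follows: within a single linear fit of pure-component spectra to a measured spectrum, wavenumbers above an absorbance threshold receive inverse-variance (WLS) weights and those below it receive unit (CLS) weights. The spectra, noise model and threshold are synthetic assumptions, not the paper's data.

```python
import numpy as np

def swls_fit(K, A, noise_var, threshold):
    """Selective weighted least squares (sketch).

    K         : pure-component spectra, shape (n_wavenumbers, n_components)
    A         : measured absorbance spectrum, shape (n_wavenumbers,)
    noise_var : per-wavenumber noise variance of A
    threshold : absorbance level separating the CLS and WLS regimes
    High-absorbance (noise-dominated) wavenumbers get inverse-variance weights;
    low-absorbance (bias-dominated) wavenumbers get unit weights, as in CLS.
    """
    w = np.where(A >= threshold, 1.0 / noise_var, 1.0)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * K, sw * A, rcond=None)
    return coef   # estimated component concentrations

# Synthetic two-component example
rng = np.random.default_rng(7)
wn = np.linspace(0, 1, 400)
K = np.column_stack([np.exp(-(wn - 0.3) ** 2 / 0.002),
                     np.exp(-(wn - 0.7) ** 2 / 0.002)])
true_c = np.array([0.8, 0.4])
noise_var = 1e-4 * (1 + 10 * K @ true_c)                     # heteroscedastic noise
A = K @ true_c + rng.normal(0, np.sqrt(noise_var)) + 0.002   # small baseline bias
print(swls_fit(K, A, noise_var, threshold=0.05))
```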
Growth models and the expected distribution of fluctuating asymmetry
Graham, John H.; Shimizu, Kunio; Emlen, John M.; Freeman, D. Carl; Merkel, John
2003-01-01
Multiplicative error accounts for much of the size-scaling and leptokurtosis in fluctuating asymmetry. It arises when growth involves the addition of tissue to that which is already present. Such errors are lognormally distributed. The distribution of the difference between two lognormal variates is leptokurtic. If those two variates are correlated, then the asymmetry variance will scale with size. Inert tissues typically exhibit additive error and have a gamma distribution. Although their asymmetry variance does not exhibit size-scaling, the distribution of the difference between two gamma variates is nevertheless leptokurtic. Measurement error is also additive, but has a normal distribution. Thus, the measurement of fluctuating asymmetry may involve the mixing of additive and multiplicative error. When errors are multiplicative, we recommend computing log E(l) − log E(r), the difference between the logarithms of the expected values of left and right sides, even when size-scaling is not obvious. If l and r are lognormally distributed, and measurement error is nil, the resulting distribution will be normal, and multiplicative error will not confound size-related changes in asymmetry. When errors are additive, such a transformation to remove size-scaling is unnecessary. Nevertheless, the distribution of l − r may still be leptokurtic.
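The log-scale recommendation can be illustrated with a per-individual log difference (a common implementation of the idea, assumed here rather than taken from the paper):

```python
import numpy as np

# Fluctuating asymmetry under multiplicative error, computed on the log scale.
rng = np.random.default_rng(8)
size = rng.lognormal(mean=2.0, sigma=0.3, size=1000)          # individual size
left = size * rng.lognormal(0.0, 0.05, 1000)                  # multiplicative growth error
right = size * rng.lognormal(0.0, 0.05, 1000)

fa_raw = left - right                  # scales with size and is leptokurtic
fa_log = np.log(left) - np.log(right)  # approximately normal, size-independent
print("corr(|l - r|, size):", np.corrcoef(np.abs(fa_raw), size)[0, 1])
print("corr(|log l - log r|, size):", np.corrcoef(np.abs(fa_log), size)[0, 1])
```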
USDA-ARS's Scientific Manuscript database
If not properly accounted for, auto-correlated errors in observations can lead to inaccurate results in soil moisture data analysis and reanalysis. Here, we propose a more generalized form of the triple collocation algorithm (GTC) capable of decomposing the total error variance of remotely-sensed surf...
Robust LOD scores for variance component-based linkage analysis.
Blangero, J; Williams, J T; Almasy, L
2000-01-01
The variance component method is now widely used for linkage analysis of quantitative traits. Although this approach offers many advantages, the importance of the underlying assumption of multivariate normality of the trait distribution within pedigrees has not been studied extensively. Simulation studies have shown that traits with leptokurtic distributions yield linkage test statistics that exhibit excessive Type I error when analyzed naively. We derive analytical formulae relating the deviation from the expected asymptotic distribution of the lod score to the kurtosis and total heritability of the quantitative trait. A simple correction constant yields a robust lod score for any deviation from normality and for any pedigree structure, and effectively eliminates the problem of inflated Type I error due to misspecification of the underlying probability model in variance component-based linkage analysis.
Analytic score distributions for a spatially continuous tridirectional Monte Carlo transport problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Booth, T.E.
1996-01-01
The interpretation of the statistical error estimates produced by Monte Carlo transport codes is still somewhat of an art. Empirically, there are variance reduction techniques whose error estimates are almost always reliable, and there are variance reduction techniques whose error estimates are often unreliable. Unreliable error estimates usually result from inadequate large-score sampling from the score distribution's tail. Statisticians believe that more accurate confidence interval statements are possible if the general nature of the score distribution can be characterized. Here, the analytic score distribution for the exponential transform applied to a simple, spatially continuous Monte Carlo transport problem is provided. Anisotropic scattering and implicit capture are included in the theory. In large part, the analytic score distributions that are derived provide the basis for the ten new statistical quality checks in MCNP.
Wu, J; Awate, S P; Licht, D J; Clouchoux, C; du Plessis, A J; Avants, B B; Vossough, A; Gee, J C; Limperopoulos, C
2015-07-01
Traditional methods of dating a pregnancy based on history or sonographic assessment have a large variation in the third trimester. We aimed to assess the ability of various quantitative measures of brain cortical folding on MR imaging in determining fetal gestational age in the third trimester. We evaluated 8 different quantitative cortical folding measures to predict gestational age in 33 healthy fetuses by using T2-weighted fetal MR imaging. We compared the accuracy of the prediction of gestational age by these cortical folding measures with the accuracy of prediction by brain volume measurement and by a previously reported semiquantitative visual scale of brain maturity. Regression models were constructed, and measurement biases and variances were determined via a cross-validation procedure. The cortical folding measures are accurate in the estimation and prediction of gestational age (mean of the absolute error, 0.43 ± 0.45 weeks) and perform better than (P = .024) brain volume (mean of the absolute error, 0.72 ± 0.61 weeks) or sonography measures (SDs approximately 1.5 weeks, as reported in literature). Prediction accuracy is comparable with that of the semiquantitative visual assessment score (mean, 0.57 ± 0.41 weeks). Quantitative cortical folding measures such as global average curvedness can be an accurate and reliable estimator of gestational age and brain maturity for healthy fetuses in the third trimester and have the potential to be an indicator of brain-growth delays for at-risk fetuses and preterm neonates. © 2015 by American Journal of Neuroradiology.
Statistics of the epoch of reionization 21-cm signal - I. Power spectrum error-covariance
NASA Astrophysics Data System (ADS)
Mondal, Rajesh; Bharadwaj, Somnath; Majumdar, Suman
2016-02-01
The non-Gaussian nature of the epoch of reionization (EoR) 21-cm signal has a significant impact on the error variance of its power spectrum P(k). We have used a large ensemble of seminumerical simulations and an analytical model to estimate the effect of this non-Gaussianity on the entire error-covariance matrix C_ij. Our analytical model shows that C_ij has contributions from two sources. One is the usual variance for a Gaussian random field, which scales inversely with the number of modes that go into the estimation of P(k). The other is the trispectrum of the signal. Using the simulated 21-cm Signal Ensemble, an ensemble of the Randomized Signal, and Ensembles of Gaussian Random Ensembles, we have quantified the effect of the trispectrum on the error variance C_ii. We find that its relative contribution is comparable to or larger than that of the Gaussian term for the k range 0.3 ≤ k ≤ 1.0 Mpc^-1, and can be even ~200 times larger at k ~ 5 Mpc^-1. We also establish that the off-diagonal terms of C_ij have statistically significant non-zero values which arise purely from the trispectrum. This further signifies that the errors in different k modes are not independent. We find a strong correlation between the errors at large k values (≥ 0.5 Mpc^-1), and a weak correlation between the smallest and largest k values. There is also a small anticorrelation between the errors in the smallest and intermediate k values. These results are relevant for the k range that will be probed by the current and upcoming EoR 21-cm experiments.
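The two contributions described above are commonly written in the following form (a sketch of the standard expression; the notation V for the survey volume, N_i for the number of modes in bin i, and bar T for the bin-averaged trispectrum is assumed, not quoted from the paper):

```latex
C_{ij} = \langle \delta P(k_i)\, \delta P(k_j) \rangle
       = \frac{P^2(k_i)}{N_i}\,\delta_{ij} + \frac{\bar{T}(k_i, k_j)}{V}
```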
McManus, I C; Dewberry, Chris; Nicholson, Sandra; Dowell, Jonathan S; Woolf, Katherine; Potts, Henry W W
2013-11-14
Measures used for medical student selection should predict future performance during training. A problem for any selection study is that predictor-outcome correlations are known only in those who have been selected, whereas selectors need to know how measures would predict in the entire pool of applicants. That problem of interpretation can be solved by calculating construct-level predictive validity, an estimate of true predictor-outcome correlation across the range of applicant abilities. Construct-level predictive validities were calculated in six cohort studies of medical student selection and training (student entry, 1972 to 2009) for a range of predictors, including A-levels, General Certificates of Secondary Education (GCSEs)/O-levels, and aptitude tests (AH5 and UK Clinical Aptitude Test (UKCAT)). Outcomes included undergraduate basic medical science and finals assessments, as well as postgraduate measures of Membership of the Royal Colleges of Physicians of the United Kingdom (MRCP(UK)) performance and entry in the Specialist Register. Construct-level predictive validity was calculated with the method of Hunter, Schmidt and Le (2006), adapted to correct for right-censorship of examination results due to grade inflation. Meta-regression analyzed 57 separate predictor-outcome correlations (POCs) and construct-level predictive validities (CLPVs). Mean CLPVs are substantially higher (.450) than mean POCs (.171). Mean CLPVs for first-year examinations were high for A-levels (.809; CI: .501 to .935), and lower for GCSEs/O-levels (.332; CI: .024 to .583) and UKCAT (mean = .245; CI: .207 to .276). A-levels had higher CLPVs for all undergraduate and postgraduate assessments than did GCSEs/O-levels and intellectual aptitude tests. CLPVs of educational attainment measures decline somewhat during training, but continue to predict postgraduate performance. Intellectual aptitude tests have lower CLPVs than A-levels or GCSEs/O-levels. Educational attainment has strong CLPVs for undergraduate and postgraduate performance, accounting for perhaps 65% of true variance in first-year performance. Such CLPVs justify the use of educational attainment measures in selection, but also raise a key theoretical question concerning the remaining 35% of variance (even after measurement error, range restriction and right-censorship have been taken into account). Just as in astrophysics, 'dark matter' and 'dark energy' are posited to balance various theoretical equations, so medical student selection must also have its 'dark variance', whose nature is not yet properly characterized, but explains a third of the variation in performance during training. Some variance probably relates to factors which are unpredictable at selection, such as illness or other life events, but some is probably also associated with factors such as personality, motivation or study skills.
The Statistical Power of Planned Comparisons.
ERIC Educational Resources Information Center
Benton, Roberta L.
Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…
On the error in crop acreage estimation using satellite (LANDSAT) data
NASA Technical Reports Server (NTRS)
Chhikara, R. (Principal Investigator)
1983-01-01
The problem of crop acreage estimation using satellite data is discussed. Bias and variance of a crop proportion estimate in an area segment obtained from the classification of its multispectral sensor data are derived as functions of the means, variances, and covariance of error rates. The linear discriminant analysis and the class proportion estimation for the two class case are extended to include a third class of measurement units, where these units are mixed on ground. Special attention is given to the investigation of mislabeling in training samples and its effect on crop proportion estimation. It is shown that the bias and variance of the estimate of a specific crop acreage proportion increase as the disparity in mislabeling rates between two classes increases. Some interaction is shown to take place, causing the bias and the variance to decrease at first and then to increase, as the mixed unit class varies in size from 0 to 50 percent of the total area segment.
A two step Bayesian approach for genomic prediction of breeding values.
Shariati, Mohammad M; Sørensen, Peter; Janss, Luc
2012-05-21
In genomic models that assign an individual variance to each marker, the contribution of one marker to the posterior distribution of the marker variance is only one degree of freedom (df), which introduces many variance parameters with little information per variance parameter. A better alternative could be to form clusters of markers with similar effects, where markers in a cluster have a common variance. Therefore, the influence of each marker group of size p on the posterior distribution of the marker variances will be p df. The simulated data from the 15th QTL-MAS workshop were analyzed such that SNP markers were ranked based on their effects and markers with similar estimated effects were grouped together. In step 1, all markers with minor allele frequency more than 0.01 were included in a SNP-BLUP prediction model. In step 2, markers were ranked based on their estimated variance on the trait in step 1, and every 150 markers were assigned to one group with a common variance. In further analyses, subsets of 1500 and 450 markers with the largest effects in step 2 were kept in the prediction model. Grouping markers outperformed the SNP-BLUP model in terms of accuracy of predicted breeding values. However, the accuracies of predicted breeding values were lower than those from Bayesian methods with marker-specific variances. Grouping markers is less flexible than allowing each marker to have a specific marker variance but, by grouping, the power to estimate marker variances increases. Prior knowledge of the genetic architecture of the trait is necessary for clustering markers and appropriate prior parameterization.
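A minimal sketch of the step-2 grouping, assuming a vector of step-1 marker effect estimates (synthetic here) and the 150-marker group size mentioned in the abstract:

```python
import numpy as np

def group_markers(effects, group_size=150):
    """Rank markers by the size of their estimated effects (step-1 output) and
    assign consecutive blocks of `group_size` markers to a common variance class."""
    order = np.argsort(-np.abs(effects))             # largest estimated effects first
    groups = np.empty_like(order)
    groups[order] = np.arange(len(effects)) // group_size
    return groups                                     # group index per marker

rng = np.random.default_rng(9)
snp_effects = rng.normal(0, 0.01, size=9000)          # synthetic step-1 SNP-BLUP solutions
snp_effects[rng.choice(9000, 30, replace=False)] += rng.normal(0, 0.3, 30)  # a few large QTL
g = group_markers(snp_effects)
print("markers in group 0:", np.sum(g == 0), "number of groups:", g.max() + 1)
```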
New Methods for Estimating Seasonal Potential Climate Predictability
NASA Astrophysics Data System (ADS)
Feng, Xia
This study develops two new statistical approaches to assess the seasonal potential predictability of observed climate variables. One is the univariate analysis of covariance (ANOCOVA) model, a combination of an autoregressive (AR) model and analysis of variance (ANOVA). It has the advantage of taking into account the uncertainty of the estimated parameter due to sampling errors in the statistical test, which is often neglected in AR-based methods, and of accounting for daily autocorrelation, which is not considered in traditional ANOVA. In the ANOCOVA model, the seasonal signals arising from external forcing are tested for being identical across years, in order to assess whether any interannual variability that exists is potentially predictable. The bootstrap is an attractive alternative method that requires no hypothesized model and is available no matter how mathematically complicated the parameter estimator. This method builds up the empirical distribution of the interannual variance from resamplings drawn with replacement from the given sample, in which the only variability in seasonal means arises from weather noise. These two methods are applied to temperature and water-cycle components, including precipitation and evaporation, to measure the extent to which the interannual variance of seasonal means exceeds the unpredictable weather noise, and are compared with previous methods, including Leith-Shukla-Gutzler (LSG), Madden, and Katz. The potential predictability of temperature from the ANOCOVA model, the bootstrap, LSG and Madden exhibits a pronounced tropical-extratropical contrast, with much larger predictability in the tropics, dominated by El Niño/Southern Oscillation (ENSO), than in higher latitudes, where strong internal variability lowers predictability. The bootstrap tends to display the highest predictability of the four methods, ANOCOVA lies in the middle, while LSG and Madden appear to generate lower predictability. Seasonal precipitation from ANOCOVA, the bootstrap, and Katz, resembling that for temperature, is more predictable over the tropical regions and less predictable in the extratropics. The bootstrap and ANOCOVA are in good agreement with each other, both methods generating larger predictability than Katz. The seasonal predictability of evaporation over land bears considerable similarity to that of temperature using ANOCOVA, the bootstrap, LSG and Madden. The remote SST forcing and soil moisture reveal substantial seasonality in their relations with the potentially predictable seasonal signals. For selected regions, either SST or soil moisture or both show significant relationships with predictable signals, hence providing indirect insight into the slowly varying boundary processes involved and enabling useful seasonal climate prediction. A multivariate analysis of covariance (MANOCOVA) model is established to identify distinctive predictable patterns, which are uncorrelated with each other. Generally speaking, the seasonal predictability from the multivariate model is consistent with that from ANOCOVA. Besides unveiling the spatial variability of predictability, the MANOCOVA model also reveals the temporal variability of each predictable pattern, which could be linked to periodic oscillations.
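A stripped-down sketch of the bootstrap approach described above: the null distribution of the interannual variance of seasonal means is built by resampling daily values with replacement, so that only weather noise remains. Daily autocorrelation and block-resampling refinements are ignored here, and all data are synthetic.

```python
import numpy as np

def bootstrap_predictability(daily, n_boot=1000, seed=0):
    """Bootstrap test of seasonal potential predictability (sketch).

    daily : array of shape (n_years, n_days) for one season of a daily variable.
    Returns the observed interannual variance of seasonal means, the mean of the
    bootstrap null distribution, and a p-value for exceeding weather noise alone.
    """
    rng = np.random.default_rng(seed)
    n_years, n_days = daily.shape
    obs_var = np.var(daily.mean(axis=1), ddof=1)     # observed interannual variance
    pooled = daily.ravel()
    null = np.empty(n_boot)
    for b in range(n_boot):
        resampled = rng.choice(pooled, size=(n_years, n_days), replace=True)
        null[b] = np.var(resampled.mean(axis=1), ddof=1)
    p_value = np.mean(null >= obs_var)
    return obs_var, null.mean(), p_value

rng = np.random.default_rng(10)
signal = rng.normal(0, 0.8, size=(40, 1))            # slowly varying boundary forcing
noise = rng.normal(0, 3.0, size=(40, 90))            # daily weather noise
print(bootstrap_predictability(signal + noise))
```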
Power Measurement Errors on a Utility Aircraft
NASA Technical Reports Server (NTRS)
Bousman, William G.
2002-01-01
Extensive flight test data obtained from two recent performance tests of a UH 60A aircraft are reviewed. A power difference is calculated from the power balance equation and is used to examine power measurement errors. It is shown that the baseline measurement errors are highly non-Gaussian in their frequency distribution and are therefore influenced by additional, unquantified variables. Linear regression is used to examine the influence of other variables and it is shown that a substantial portion of the variance depends upon measurements of atmospheric parameters. Correcting for temperature dependence, although reducing the variance in the measurement errors, still leaves unquantified effects. Examination of the power difference over individual test runs indicates significant errors from drift, although it is unclear how these may be corrected. In an idealized case, where the drift is correctable, it is shown that the power measurement errors are significantly reduced and the error distribution is Gaussian. A new flight test program is recommended that will quantify the thermal environment for all torque measurements on the UH 60. Subsequently, the torque measurement systems will be recalibrated based on the measured thermal environment and a new power measurement assessment performed.
Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H
2015-11-30
We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.
Error in geometric morphometric data collection: Combining data from multiple sources.
Robinson, Chris; Terhune, Claire E
2017-09-01
This study compares two- and three-dimensional morphometric data to determine the extent to which intra- and interobserver and intermethod error influence the outcomes of statistical analyses. Data were collected five times for each method and observer on 14 anthropoid crania using calipers, a MicroScribe, and 3D models created from NextEngine and microCT scans. ANOVA models were used to examine variance in the linear data at the level of genus, species, specimen, observer, method, and trial. Three-dimensional data were analyzed using geometric morphometric methods; principal components analysis was employed to examine how trials of all specimens were distributed in morphospace and Procrustes distances among trials were calculated and used to generate UPGMA trees to explore whether all trials of the same individual grouped together regardless of observer or method. Most variance in the linear data was at the genus level, with greater variance at the observer than method levels. In the 3D data, interobserver and intermethod error were similar to intraspecific distances among Callicebus cupreus individuals, with interobserver error being higher than intermethod error. Generally, taxa separate well in morphospace, with different trials of the same specimen typically grouping together. However, trials of individuals in the same species overlapped substantially with one another. Researchers should be cautious when compiling data from multiple methods and/or observers, especially if analyses are focused on intraspecific variation or closely related species, as in these cases, patterns among individuals may be obscured by interobserver and intermethod error. Conducting interobserver and intermethod reliability assessments prior to the collection of data is recommended. © 2017 Wiley Periodicals, Inc.
Medium-range Performance of the Global NWP Model
NASA Astrophysics Data System (ADS)
Kim, J.; Jang, T.; Kim, J.; Kim, Y.
2017-12-01
The medium-range performance of the global numerical weather prediction (NWP) model in the Korea Meteorological Administration (KMA) is investigated. The performance is based on the prediction of the extratropical circulation. The mean square error is expressed as the sum of the spatial variance of the discrepancy between forecasts and observations and the square of the mean error (ME). Thus, it is important to investigate the ME effect in order to understand the model performance. The ME is expressed as the subtraction of an anomaly from the forecast difference against the real climatology. It is found that the global model suffers from a severe systematic ME in medium-range forecasts. The systematic ME is dominant in the entire troposphere in all months. Such an ME can explain at most 25% of the root mean square error. We also compare the extratropical ME distribution with that from other NWP centers. NWP models exhibit similar spatial ME structures to one another. It is found that the spatial ME pattern is highly correlated with that of the anomaly, implying that the ME varies with the seasons. For example, the correlation coefficient between ME and anomaly ranges from -0.51 to -0.85 across months. The pattern of the extratropical circulation also has a high correlation with the anomaly. The global model has trouble faithfully simulating extratropical cyclones and blocking events in medium-range forecasts. In particular, the model has a hard time simulating anomalous events in medium-range forecasts. If we choose an anomalous period for a test-bed experiment, we will suffer from a large error due to the anomaly.
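The decomposition referred to in the third sentence can be written as follows (F = forecast, O = observation, overbars denote spatial averages; notation assumed, not quoted from the abstract):

```latex
\mathrm{MSE} \;=\; \overline{(F-O)^2}
\;=\; \underbrace{\overline{\big[(F-O)-\overline{(F-O)}\big]^2}}_{\text{spatial variance of } F-O}
\;+\; \underbrace{\big(\overline{(F-O)}\big)^2}_{\mathrm{ME}^2}
```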
Designing Measurement Studies under Budget Constraints: Controlling Error of Measurement and Power.
ERIC Educational Resources Information Center
Marcoulides, George A.
1995-01-01
A methodology is presented for minimizing the mean error variance-covariance component in studies with resource constraints. The method is illustrated using a one-facet multivariate design. Extensions to other designs are discussed. (SLD)
On the Estimation of Errors in Sparse Bathymetric Geophysical Data Sets
NASA Astrophysics Data System (ADS)
Jakobsson, M.; Calder, B.; Mayer, L.; Armstrong, A.
2001-05-01
There is a growing demand in the geophysical community for better regional representations of the world ocean's bathymetry. However, given the vastness of the oceans and the relative limited coverage of even the most modern mapping systems, it is likely that many of the older data sets will remain part of our cumulative database for several more decades. Therefore, regional bathymetrical compilations that are based on a mixture of historic and contemporary data sets will have to remain the standard. This raises the problem of assembling bathymetric compilations and utilizing data sets not only with a heterogeneous cover but also with a wide range of accuracies. In combining these data to regularly spaced grids of bathymetric values, which the majority of numerical procedures in earth sciences require, we are often forced to use a complex interpolation scheme due to the sparseness and irregularity of the input data points. Consequently, we are faced with the difficult task of assessing the confidence that we can assign to the final grid product, a task that is not usually addressed in most bathymetric compilations. We approach the problem of assessing the confidence via a direct-simulation Monte Carlo method. We start with a small subset of data from the International Bathymetric Chart of the Arctic Ocean (IBCAO) grid model [Jakobsson et al., 2000]. This grid is compiled from a mixture of data sources ranging from single beam soundings with available metadata to spot soundings with no available metadata, to digitized contours; the test dataset shows examples of all of these types. From this database, we assign a priori error variances based on available meta-data, and when this is not available, based on a worst-case scenario in an essentially heuristic manner. We then generate a number of synthetic datasets by randomly perturbing the base data using normally distributed random variates, scaled according to the predicted error model. These datasets are then re-gridded using the same methodology as the original product, generating a set of plausible grid models of the regional bathymetry that we can use for standard error estimates. Finally, we repeat the entire random estimation process and analyze each run's standard error grids in order to examine sampling bias and variance in the predictions. The final products of the estimation are a collection of standard error grids, which we combine with the source data density in order to create a grid that contains information about the bathymetry model's reliability. Jakobsson, M., Cherkis, N., Woodward, J., Coakley, B., and Macnab, R., 2000, A new grid of Arctic bathymetry: A significant resource for scientists and mapmakers, EOS Transactions, American Geophysical Union, v. 81, no. 9, p. 89, 93, 96.
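A schematic version of the direct-simulation Monte Carlo procedure is sketched below, assuming per-sounding a priori standard deviations and a simple linear regridding; the real IBCAO workflow and error model are not reproduced.

```python
import numpy as np
from scipy.interpolate import griddata

def mc_grid_error(x, y, depth, sigma, grid_x, grid_y, n_real=100, seed=0):
    """Direct-simulation Monte Carlo for a bathymetric grid (sketch).

    Each realisation perturbs the soundings with N(0, sigma_i) noise from the
    a priori error model and regrids them; the spread across realisations gives
    a standard-error surface for the gridded bathymetry.
    """
    rng = np.random.default_rng(seed)
    grids = []
    for _ in range(n_real):
        perturbed = depth + rng.normal(0.0, sigma)
        grids.append(griddata((x, y), perturbed, (grid_x, grid_y), method="linear"))
    grids = np.array(grids)
    return grids.mean(axis=0), grids.std(axis=0, ddof=1)

rng = np.random.default_rng(11)
x, y = rng.uniform(0, 10, 300), rng.uniform(0, 10, 300)      # sounding positions
depth = -1000 - 50 * np.sin(x) * np.cos(y)                   # synthetic bathymetry
sigma = rng.uniform(2, 30, 300)                              # per-sounding error model
gx, gy = np.meshgrid(np.linspace(1, 9, 50), np.linspace(1, 9, 50))
mean_grid, stderr_grid = mc_grid_error(x, y, depth, sigma, gx, gy)
print("median standard error (m):", np.nanmedian(stderr_grid))
```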
Wang, Yunyun; Liu, Ye; Deng, Xinli; Cong, Yulong; Jiang, Xingyu
2016-12-15
Although conventional enzyme-linked immunosorbent assays (ELISA) and related assays have been widely applied for the diagnosis of diseases, many of them suffer from large error variance for monitoring the concentration of targets over time, and insufficient limit of detection (LOD) for assaying dilute targets. We herein report a readout mode of ELISA based on the binding between peptidic β-sheet structure and Congo Red. The formation of peptidic β-sheet structure is triggered by alkaline phosphatase (ALP). For the detection of P-Selectin, which is a crucial indicator for evaluating thrombus diseases in the clinic, the 'β-sheet and Congo Red' mode significantly decreases both the error variance and the LOD (from 9.7 ng/ml to 1.1 ng/ml) of detection, compared with commercial ELISA (an existing gold-standard method for detecting P-Selectin in the clinic). Considering the wide range of ALP-based antibodies for immunoassays, such a novel method could be applicable to the analysis of many types of targets. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Bazzazi, Abbas Aghajani; Esmaeili, Mohammad
2012-12-01
The adaptive neuro-fuzzy inference system (ANFIS) is a powerful model for solving complex problems. Since ANFIS can handle nonlinear problems and easily achieves input-output mapping, it is well suited to prediction problems. Backbreak is one of the undesirable effects of blasting operations, causing instability in mine walls, falling of machinery, improper fragmentation and reduced drilling efficiency. In this paper, ANFIS was applied to predict backbreak in the Sangan iron mine of Iran. The performance of the model was assessed through the root mean squared error (RMSE), the variance accounted for (VAF) and the correlation coefficient (
The variance needed to accurately describe jump height from vertical ground reaction force data.
Richter, Chris; McGuinness, Kevin; O'Connor, Noel E; Moran, Kieran
2014-12-01
In functional principal component analysis (fPCA) a threshold is chosen to define the number of retained principal components, which corresponds to the amount of preserved information. A variety of thresholds have been used in previous studies and the chosen threshold is often not evaluated. The aim of this study is to identify the optimal threshold that preserves the information needed to describe a jump height accurately utilizing vertical ground reaction force (vGRF) curves. To find an optimal threshold, a neural network was used to predict jump height from vGRF curve measures generated using different fPCA thresholds. The findings indicate that a threshold from 99% to 99.9% (6-11 principal components) is optimal for describing jump height, as these thresholds generated significantly lower jump height prediction errors than other thresholds.
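A generic sketch of the threshold question, using ordinary PCA on synthetic curves in place of the study's fPCA of vGRF waveforms, shows how the number of retained components depends on the chosen explained-variance threshold:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in "vGRF" waveforms: a scaled base shape plus noise per trial
rng = np.random.default_rng(12)
t = np.linspace(0, 1, 200)
curves = np.array([(20 + 3 * rng.normal()) * np.sin(np.pi * t) ** 2
                   + rng.normal(0, 0.5, t.size) for _ in range(60)])

pca = PCA().fit(curves)
cum = np.cumsum(pca.explained_variance_ratio_)
for thr in (0.95, 0.99, 0.999):
    # number of components needed to preserve at least `thr` of the variance
    print(f"threshold {thr:.1%}: {np.searchsorted(cum, thr) + 1} components")
```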
Willem W.S. van Hees
2002-01-01
Comparisons of estimated standard error for a ratio-of-means (ROM) estimator are presented for forest resource inventories conducted in southeast Alaska between 1995 and 2000. Estimated standard errors for the ROM were generated by using a traditional variance estimator and also approximated by bootstrap methods. Estimates of standard error generated by both...
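A minimal sketch contrasting a traditional (Taylor-series) standard error for a ratio-of-means estimator with a bootstrap approximation; x and y are hypothetical plot-level quantities, not the Alaska inventory data:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.gamma(shape=2.0, scale=10.0, size=80)    # e.g. volume per plot (hypothetical)
x = y * 0.4 + rng.normal(0, 2, size=80)          # e.g. basal area per plot (hypothetical)

n = len(y)
ratio = y.mean() / x.mean()                      # ratio-of-means estimate

# Linearized (Taylor-series) standard error of the ratio estimator
resid = y - ratio * x
se_taylor = np.sqrt(np.sum(resid ** 2) / (n * (n - 1))) / x.mean()

# Bootstrap standard error
boots = []
for _ in range(2000):
    idx = rng.integers(0, n, n)                  # resample plots with replacement
    boots.append(y[idx].mean() / x[idx].mean())
se_boot = np.std(boots, ddof=1)

print(se_taylor, se_boot)
```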
Taking the Error Term of the Factor Model into Account: The Factor Score Predictor Interval
ERIC Educational Resources Information Center
Beauducel, Andre
2013-01-01
The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…
On the Fallibility of Principal Components in Research
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong
2017-01-01
The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…
Estimating riparian understory vegetation cover with beta regression and copula models
Eskelson, Bianca N.I.; Madsen, Lisa; Hagar, Joan C.; Temesgen, Hailemariam
2011-01-01
Understory vegetation communities are critical components of forest ecosystems. As a result, the importance of modeling understory vegetation characteristics in forested landscapes has become more apparent. Abundance measures such as shrub cover are bounded between 0 and 1, exhibit heteroscedastic error variance, and are often subject to spatial dependence. These distributional features tend to be ignored when shrub cover data are analyzed. The beta distribution has been used successfully to describe the frequency distribution of vegetation cover. Beta regression models ignoring spatial dependence (BR) and accounting for spatial dependence (BRdep) were used to estimate percent shrub cover as a function of topographic conditions and overstory vegetation structure in riparian zones in western Oregon. The BR models showed poor explanatory power (pseudo-R2 ≤ 0.34) but outperformed ordinary least-squares (OLS) and generalized least-squares (GLS) regression models with logit-transformed response in terms of mean square prediction error and absolute bias. We introduce a copula (COP) model that is based on the beta distribution and accounts for spatial dependence. A simulation study was designed to illustrate the effects of incorrectly assuming normality, equal variance, and spatial independence. It showed that BR, BRdep, and COP models provide unbiased parameter estimates, whereas OLS and GLS models result in slightly biased estimates for two of the three parameters. On the basis of the simulation study, 93–97% of the GLS, BRdep, and COP confidence intervals covered the true parameters, whereas OLS and BR only resulted in 84–88% coverage, which demonstrated the superiority of GLS, BRdep, and COP over OLS and BR models in providing standard errors for the parameter estimates in the presence of spatial dependence.
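A minimal sketch of a beta regression fitted by maximum likelihood in the mean/precision parameterization commonly used for cover data bounded in (0, 1); the design matrix, response, and parameter values are simulated placeholders, not the riparian data or the authors' copula model:

```python
import numpy as np
from scipy import optimize, special, stats

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])       # intercept + one covariate
true_mu = special.expit(X @ np.array([-0.5, 0.8]))
phi_true = 20.0                                              # precision parameter
y = rng.beta(true_mu * phi_true, (1 - true_mu) * phi_true)   # simulated cover in (0, 1)

def negloglik(params):
    """Negative log-likelihood with logit link for the mean and log link for precision."""
    beta, log_phi = params[:-1], params[-1]
    mu = special.expit(X @ beta)
    phi = np.exp(log_phi)
    return -np.sum(stats.beta.logpdf(y, mu * phi, (1 - mu) * phi))

fit = optimize.minimize(negloglik, x0=np.zeros(X.shape[1] + 1), method="BFGS")
print("coefficients:", fit.x[:-1], "precision:", np.exp(fit.x[-1]))
```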
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, A.; Künsch, H. R.; Schwierz, C.; Stahel, W. A.
2012-04-01
Most of the geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are rather the rule than the exception, in particular in environmental data sets. Outlying observations may result from errors (e.g. in data transcription) or from local perturbations in the processes that are responsible for a given pattern of spatial variation. As an example, the spatial distribution of some trace metal in the soils of a region may be distorted by emissions of local anthropogenic sources. Outliers affect the modelling of the large-scale spatial variation, the so-called external drift or trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Former studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) [2] proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) [1] for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of the estimating equations for Gaussian REML estimation. Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and unsampled locations and kriging variances. The method has been implemented in an R package. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of the Tarrawarra soil moisture data set [3].
Generating highly accurate prediction hypotheses through collaborative ensemble learning
NASA Astrophysics Data System (ADS)
Arsov, Nino; Pavlovski, Martin; Basnarkov, Lasko; Kocarev, Ljupco
2017-03-01
Ensemble generation is a natural and convenient way of achieving better generalization performance of learning algorithms by gathering their predictive capabilities. Here, we nurture the idea of ensemble-based learning by combining bagging and boosting for the purpose of binary classification. Since the former improves stability through variance reduction, while the latter ameliorates overfitting, the outcome of a multi-model that combines both strives toward a comprehensive net-balancing of the bias-variance trade-off. To further improve this, we alter the bagged-boosting scheme by introducing collaboration between the multi-model’s constituent learners at various levels. This novel stability-guided classification scheme is delivered in two flavours: during or after the boosting process. Applied among a crowd of Gentle Boost ensembles, the ability of the two suggested algorithms to generalize is inspected by comparing them against Subbagging and Gentle Boost on various real-world datasets. In both cases, our models obtained a 40% generalization error decrease. But their true ability to capture details in data was revealed through their application for protein detection in texture analysis of gel electrophoresis images. They achieve improved performance of approximately 0.9773 AUROC when compared to the AUROC of 0.9574 obtained by an SVM based on recursive feature elimination.
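A minimal sketch of a plain bagged-boosting baseline (bagging over gradient-boosting ensembles), not the collaborative scheme proposed in the paper; the dataset is synthetic, and older scikit-learn versions name the first argument base_estimator rather than estimator:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging reduces variance; each bagged member is itself a boosting ensemble.
bagged_boost = BaggingClassifier(
    estimator=GradientBoostingClassifier(n_estimators=50),
    n_estimators=10,           # number of bagged boosting ensembles
    random_state=0,
)
print(cross_val_score(bagged_boost, X, y, scoring="roc_auc", cv=5).mean())
```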
Comparison of Grouping Schemes for Exposure to Total Dust in Cement Factories in Korea.
Koh, Dong-Hee; Kim, Tae-Woo; Jang, Seung Hee; Ryu, Hyang-Woo; Park, Donguk
2015-08-01
The purpose of this study was to evaluate grouping schemes for exposure to total dust in cement industry workers using non-repeated measurement data. In total, 2370 total dust measurements taken from nine Portland cement factories in 1995-2009 were analyzed. Various grouping schemes were generated based on work process, job, factory, or average exposure. To characterize variance components of each grouping scheme, we developed mixed-effects models with a B-spline time trend incorporated as fixed effects and a grouping variable incorporated as a random effect. Using the estimated variance components, elasticity was calculated. To compare the prediction performances of different grouping schemes, 10-fold cross-validation tests were conducted, and root mean squared errors and pooled correlation coefficients were calculated for each grouping scheme. The five exposure groups created a posteriori by ranking job and factory combinations according to average dust exposure showed the best prediction performance and highest elasticity among various grouping schemes. Our findings suggest a grouping method based on ranking of job, and factory combinations would be the optimal choice in this population. Our grouping method may aid exposure assessment efforts in similar occupational settings, minimizing the misclassification of exposures. © The Author 2015. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
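A minimal sketch of the variance-component step: a mixed-effects model with a B-spline time trend as fixed effects and an exposure group as a random effect. The column names, group definitions, and the between-group variance share reported at the end are illustrative assumptions, not the paper's data or its exact elasticity definition:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
groups = rng.integers(0, 5, 600)                       # hypothetical exposure groups
year = rng.uniform(1995, 2009, 600)
log_dust = 0.5 * groups - 0.02 * (year - 1995) + rng.normal(0, 0.8, 600)
df = pd.DataFrame({"log_dust": log_dust, "group": groups.astype(str), "year": year})

# B-spline time trend as fixed effects, group as a random intercept
model = smf.mixedlm("log_dust ~ bs(year, df=4)", df, groups=df["group"]).fit()

between = float(model.cov_re.iloc[0, 0])   # between-group variance component
within = model.scale                       # within-group (residual) variance component
print("between:", between, "within:", within,
      "between-group share of variance:", between / (between + within))
```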
Importance of Geosat orbit and tidal errors in the estimation of large-scale Indian Ocean variations
NASA Technical Reports Server (NTRS)
Perigaud, Claire; Zlotnicki, Victor
1992-01-01
To improve the estimation accuracy of large-scale meridional sea-level variations, Geosat ERM data over the Indian Ocean for a 26-month period were processed using two different techniques of orbit error reduction. The first technique removes an along-track polynomial of degree 1 over about 5000 km and the second technique removes an along-track once-per-revolution sine wave over about 40,000 km. Results obtained show that the polynomial technique produces stronger attenuation of both the tidal error and the large-scale oceanic signal. After filtering, the residual difference between the two methods represents 44 percent of the total variance and 23 percent of the annual variance. The sine-wave method yields a larger estimate of annual and interannual meridional variations.
Comparative test on several forms of background error covariance in 3DVar
NASA Astrophysics Data System (ADS)
Shao, Aimei
2013-04-01
The background error covariance matrix (hereinafter referred to as the B matrix) plays an important role in the three-dimensional variational (3DVar) data assimilation method. However, it is difficult to estimate the B matrix accurately because the true atmospheric state is unknown. Therefore, several methods have been developed to estimate it (e.g. the NMC method, the innovation analysis method, recursive filters, and ensemble methods such as EnKF). Prior to further development and application of these methods, it is worth studying and evaluating how the B matrices estimated by these methods perform in 3DVar. For this purpose, NCEP reanalysis and forecast data are used to test the effectiveness of several B matrices with the VAF method (Huang, 1999). Here the NCEP analysis is treated as the truth, so the forecast error is known. Data from 2006 to 2007 are used as the samples to estimate the B matrix, and data from 2008 are used to verify the assimilation effects. The 48-h and 24-h forecasts valid at the same time are used to estimate the B matrix with the NMC method. The B matrix can be represented by a correlation part (a non-diagonal matrix) and a variance part (a diagonal matrix of variances). In numerous 3DVar systems, a Gaussian filter function is used as an approximation to represent the decay of correlation coefficients with distance. On this basis, the following forms of the B matrix are designed and tested with VAF in comparative experiments: (1) the error variance and the characteristic lengths are fixed and set to their mean values averaged over the analysis domain; (2) as in (1), but the mean characteristic lengths are reduced to 50 percent (for height) and 60 percent (for temperature) of the original; (3) as in (2), but the error variance, calculated directly from the historical data, is space-dependent; (4) the error variance and characteristic lengths are both calculated directly from the historical data; (5) the B matrix is estimated directly from the historical data; (6) as in (5), but a localization step is performed; (7) the B matrix is estimated by the NMC method but the error variance is reduced by a factor of 1.7 so that its value is close to that calculated from the true forecast error samples; (8) as in (7), but with the localization of (6). Experimental results with the different B matrices show that, for the Gaussian-type B matrix, the characteristic lengths calculated from the true error samples do not yield good analysis results, whereas reduced characteristic lengths (about half of the original) do. If the B matrix estimated directly from the historical data is used in 3DVar, the assimilation does not reach its best performance; better results are obtained with reduced characteristic lengths and localization. Even so, this offers no obvious advantage over a Gaussian-type B matrix with optimal characteristic lengths. This implies that the Gaussian-type B matrix widely used in operational 3DVar systems can produce a good analysis with appropriate characteristic lengths; the crucial problem is how to determine them. (This work is supported by the National Natural Science Foundation of China (41275102, 40875063) and the Fundamental Research Funds for the Central Universities (lzujbky-2010-9).)
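A minimal sketch of building a Gaussian-type background error covariance matrix from an error variance field and a characteristic length, as assumed in many 3DVar systems; the 1-D grid, variances, and length scale are arbitrary:

```python
import numpy as np

x = np.arange(0, 2000, 50.0)                 # 1-D grid of analysis points (km)
sigma = np.full_like(x, 1.5)                 # background error std dev (could be space-dependent)
L = 300.0                                    # characteristic length (km)

dist = np.abs(x[:, None] - x[None, :])
corr = np.exp(-0.5 * (dist / L) ** 2)        # Gaussian correlation part
B = np.outer(sigma, sigma) * corr            # variance part times correlation part

# Halving the characteristic length sharpens the correlations, mimicking the
# reduced-length experiments described in the abstract above.
B_half = np.outer(sigma, sigma) * np.exp(-0.5 * (dist / (0.5 * L)) ** 2)
```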
ERIC Educational Resources Information Center
Oranje, Andreas
2006-01-01
A multitude of methods has been proposed to estimate the sampling variance of ratio estimates in complex samples (Wolter, 1985). Hansen and Tepping (1985) studied some of those variance estimators and found that a high coefficient of variation (CV) of the denominator of a ratio estimate is indicative of a biased estimate of the standard error of a…
Ockhuijsen, Henrietta D L; van Smeden, Maarten; van den Hoogen, Agnes; Boivin, Jacky
2017-06-01
To examine construct and criterion validity of the Dutch SCREENIVF among women and men undergoing a fertility treatment. A prospective longitudinal study nested in a randomized controlled trial. University hospital. Couples, 468 women and 383 men, undergoing an IVF/intracytoplasmic sperm injection (ICSI) treatment in a fertility clinic, completed the SCREENIVF. Construct and criterion validity of the SCREENIVF. The comparative fit index and root mean square error of approximation for women and men show a good fit of the factor model. Across time, the sensitivity for the Hospital Anxiety and Depression Scale subscale in women ranged from 61%-98%, specificity 53%-65%, predictive value of a positive test (PVP) 13%-56%, and predictive value of a negative test (PVN) 70%-99%. The sensitivity scores for men ranged from 38%-100%, specificity 71%-75%, PVP 9%-27%, PVN 92%-100%. A prediction model revealed that for women, 68.7% of the variance in the Hospital Anxiety and Depression Scale at time 1, 42.5% at time 2, and 38.9% at time 3 was explained by the predictors, the sum score scales of the SCREENIVF. For men, 58.1% of the variance in the Hospital Anxiety and Depression Scale at time 1, 46.5% at time 2, and 37.3% at time 3 was explained by the predictors, the sum score scales of the SCREENIVF. The SCREENIVF has good construct validity, but the concurrent validity is better than the predictive validity. SCREENIVF will be most effectively used in fertility clinics at the start of treatment and should not be used as a predictive tool. Copyright © 2017 American Society for Reproductive Medicine. All rights reserved.
Accuracy of genomic predictions in Gyr (Bos indicus) dairy cattle.
Boison, S A; Utsunomiya, A T H; Santos, D J A; Neves, H H R; Carvalheiro, R; Mészáros, G; Utsunomiya, Y T; do Carmo, A S; Verneque, R S; Machado, M A; Panetto, J C C; Garcia, J F; Sölkner, J; da Silva, M V G B
2017-07-01
Genomic selection may accelerate genetic progress in breeding programs of indicine breeds when compared with traditional selection methods. We present results of genomic predictions in Gyr (Bos indicus) dairy cattle of Brazil for milk yield (MY), fat yield (FY), protein yield (PY), and age at first calving using information from bulls and cows. Four different single nucleotide polymorphism (SNP) chips were studied. Additionally, the effect of the use of imputed data on genomic prediction accuracy was studied. A total of 474 bulls and 1,688 cows were genotyped with the Illumina BovineHD (HD; San Diego, CA) and BovineSNP50 (50K) chip, respectively. Genotypes of cows were imputed to HD using FImpute v2.2. After quality check of data, 496,606 markers remained. The HD markers present on the GeneSeek SGGP-20Ki (15,727; Lincoln, NE), 50K (22,152), and GeneSeek GGP-75Ki (65,018) were subset and used to assess the effect of lower SNP density on accuracy of prediction. Deregressed breeding values were used as pseudophenotypes for model training. Data were split into reference and validation sets to mimic a forward prediction scheme. The reference population consisted of animals whose birth year was ≤2004 and comprised either only bulls (TR1) or a combination of bulls and dams (TR2), whereas the validation set consisted of younger bulls (born after 2004). Genomic BLUP was used to estimate genomic breeding values (GEBV), and reliability of GEBV (R²PEV) was based on the prediction error variance approach. Reliability of GEBV ranged from ∼0.46 (FY and PY) to 0.56 (MY) with TR1 and from 0.51 (PY) to 0.65 (MY) with TR2. When averaged across all traits, R²PEV was substantially higher (0.50 for TR1 and 0.57 for TR2) than the reliabilities of parent averages (0.35) computed from pedigree data and based on diagonals of the coefficient matrix (prediction error variance approach). Reliability was similar for all four marker panels using either TR1 or TR2, except that the imputed HD cow data set led to an inflation of reliability. Reliability of GEBV could be increased by enlarging the limited bull reference population with cow information, and a reduced panel of ∼15K markers resulted in reliabilities similar to those obtained with HD markers. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
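A minimal sketch of reliability from prediction error variance (PEV) in a GBLUP-type mixed model, with a toy genomic relationship matrix; dimensions and variance components are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
n_animals, n_markers = 50, 500
M = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)   # genotypes 0/1/2
p = M.mean(axis=0) / 2
Z_g = M - 2 * p
G = Z_g @ Z_g.T / (2 * np.sum(p * (1 - p))) + 1e-3 * np.eye(n_animals)  # genomic relationships

sigma2_a, sigma2_e = 0.3, 0.7               # assumed variance components
lam = sigma2_e / sigma2_a
y = rng.normal(size=n_animals)              # deregressed pseudophenotypes (placeholder)
X = np.ones((n_animals, 1))                 # overall mean as the only fixed effect
Z = np.eye(n_animals)

# Mixed model equations for GBLUP
C = np.block([[X.T @ X,          X.T @ Z],
              [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(G)]])
rhs = np.concatenate([X.T @ y, Z.T @ y])
sol = np.linalg.solve(C, rhs)
gebv = sol[1:]

# PEV from the random-effect block of the inverse coefficient matrix
Cinv = np.linalg.inv(C)
pev = np.diag(Cinv)[1:] * sigma2_e
reliability = 1 - pev / sigma2_a            # PEV-based reliability of each GEBV
```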
He, Jianbo; Li, Jijie; Huang, Zhongwen; Zhao, Tuanjie; Xing, Guangnan; Gai, Junyi; Guan, Rongzhan
2015-01-01
Experimental error control is very important in quantitative trait locus (QTL) mapping. Although numerous statistical methods have been developed for QTL mapping, a QTL detection model based on an appropriate experimental design that emphasizes error control has not been developed. Lattice design is very suitable for experiments with large sample sizes, which is usually required for accurate mapping of quantitative traits. However, the lack of a QTL mapping method based on lattice design dictates that the arithmetic mean or adjusted mean of each line of observations in the lattice design had to be used as a response variable, resulting in low QTL detection power. As an improvement, we developed a QTL mapping method termed composite interval mapping based on lattice design (CIMLD). In the lattice design, experimental errors are decomposed into random errors and block-within-replication errors. Four levels of block-within-replication errors were simulated to show the power of QTL detection under different error controls. The simulation results showed that the arithmetic mean method, which is equivalent to a method under random complete block design (RCBD), was very sensitive to the size of the block variance and with the increase of block variance, the power of QTL detection decreased from 51.3% to 9.4%. In contrast to the RCBD method, the power of CIMLD and the adjusted mean method did not change for different block variances. The CIMLD method showed 1.2- to 7.6-fold higher power of QTL detection than the arithmetic or adjusted mean methods. Our proposed method was applied to real soybean (Glycine max) data as an example and 10 QTLs for biomass were identified that explained 65.87% of the phenotypic variation, while only three and two QTLs were identified by arithmetic and adjusted mean methods, respectively.
Heavner, Karyn; Burstyn, Igor
2015-08-24
Variation in the odds ratio (OR) resulting from selection of cutoffs for categorizing continuous variables is rarely discussed. We present results for the effect of varying cutoffs used to categorize a mismeasured exposure in a simulated population in the context of autism spectrum disorders research. Simulated cohorts were created with three distinct exposure-outcome curves and three measurement error variances for the exposure. ORs were calculated using logistic regression for 61 cutoffs (mean ± 3 standard deviations) used to dichotomize the observed exposure. ORs were calculated for five categories with a wide range for the cutoffs. For each scenario and cutoff, the OR, sensitivity, and specificity were calculated. The three exposure-outcome relationships had distinctly shaped OR (versus cutoff) curves, but increasing measurement error obscured the shape. At extreme cutoffs, there was non-monotonic oscillation in the ORs that cannot be attributed to "small numbers." Exposure misclassification following categorization of the mismeasured exposure was differential, as predicted by theory. Sensitivity was higher among cases and specificity among controls. Cutoffs chosen for categorizing continuous variables can have profound effects on study results. When measurement error is not too great, the shape of the OR curve may provide insight into the true shape of the exposure-disease relationship.
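A minimal sketch of the simulation idea: add measurement error to a continuous exposure, dichotomize it at a sliding cutoff, and track how the estimated OR moves. The exposure-outcome model and error variance are arbitrary choices, not the scenarios simulated in the paper:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 5000
true_x = rng.normal(0, 1, n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.7 * true_x))))   # logistic outcome model
observed_x = true_x + rng.normal(0, 0.5, n)                    # mismeasured exposure

for cutoff in np.linspace(-1.5, 1.5, 7):
    x_bin = (observed_x > cutoff).astype(float)                # dichotomized exposure
    fit = sm.Logit(y, sm.add_constant(x_bin)).fit(disp=0)
    print(f"cutoff={cutoff:+.2f}  OR={np.exp(fit.params[1]):.2f}")
```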
Saviane, Chiara; Silver, R Angus
2006-06-15
Synapses play a crucial role in information processing in the brain. Amplitude fluctuations of synaptic responses can be used to extract information about the mechanisms underlying synaptic transmission and its modulation. In particular, multiple-probability fluctuation analysis can be used to estimate the number of functional release sites, the mean probability of release and the amplitude of the mean quantal response from fits of the relationship between the variance and mean amplitude of postsynaptic responses, recorded at different probabilities. To determine these quantal parameters, calculate their uncertainties and the goodness-of-fit of the model, it is important to weight the contribution of each data point in the fitting procedure. We therefore investigated the errors associated with measuring the variance by determining the best estimators of the variance of the variance and have used simulations of synaptic transmission to test their accuracy and reliability under different experimental conditions. For central synapses, which generally have a low number of release sites, the amplitude distribution of synaptic responses is not normal, thus the use of a theoretical variance of the variance based on the normal assumption is not a good approximation. However, appropriate estimators can be derived for the population and for limited sample sizes using a more general expression that involves higher moments and introducing unbiased estimators based on the h-statistics. Our results are likely to be relevant for various applications of fluctuation analysis when few channels or release sites are present.
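A minimal sketch of estimating the variance of the sample variance for i.i.d. data from the standard moment expression Var(s²) = μ₄/n − (n−3)σ⁴/(n(n−1)) with plug-in sample moments; the unbiased h-statistic estimators discussed in the paper refine this and are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(7)
amplitudes = rng.gamma(shape=2.0, scale=5.0, size=40)   # e.g. response amplitudes at one release probability

n = len(amplitudes)
s2 = np.var(amplitudes, ddof=1)                          # sample variance
m4 = np.mean((amplitudes - amplitudes.mean()) ** 4)      # plug-in fourth central moment (biased)

# Approximate variance of the sample variance; its inverse can weight this
# point in a variance-mean fit.
var_of_var = m4 / n - (n - 3) / (n * (n - 1)) * s2 ** 2
weight = 1.0 / var_of_var
```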
Repeatability and reproducibility of ribotyping and its computer interpretation.
Lefresne, Gwénola; Latrille, Eric; Irlinger, Françoise; Grimont, Patrick A D
2004-04-01
Many molecular typing methods are difficult to interpret because their repeatability (within-laboratory variance) and reproducibility (between-laboratory variance) have not been thoroughly studied. In the present work, ribotyping of coryneform bacteria was the basis of a study involving within-gel and between-gel repeatability and between-laboratory reproducibility (two laboratories involved). The effect of different technical protocols, different algorithms, and different software for fragment size determination was studied. Analysis of variance (ANOVA) showed, within a laboratory, that there was no significant added variance between gels. However, between-laboratory variance was significantly higher than within-laboratory variance. This may be due to the use of different protocols. An experimental function was calculated to transform the data and make them compatible (i.e., erase the between-laboratory variance). The use of different interpolation algorithms (spline, Schaffer and Sederoff) was a significant source of variation in one laboratory only. The use of either Taxotron (Institut Pasteur) or GelCompar (Applied Maths) was not a significant source of added variation when the same algorithm (spline) was used. However, the use of Bio-Gene (Vilber Lourmat) dramatically increased the error (within laboratory, within gel) in one laboratory, while decreasing the error in the other laboratory; this might be due to automatic normalization attempts. These results were taken into account for building a database and performing automatic pattern identification using Taxotron. Conversion of the data considerably improved the identification of patterns irrespective of the laboratory in which the data were obtained.
Evaluation of assumptions in soil moisture triple collocation analysis
USDA-ARS?s Scientific Manuscript database
Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually-independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and ort...
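A minimal sketch of covariance-based triple collocation, estimating the error variance of each of three collocated products that observe the same variable with mutually independent errors; the series are synthetic:

```python
import numpy as np

rng = np.random.default_rng(8)
truth = rng.normal(0.25, 0.05, 1000)                      # "true" soil moisture (unobserved)
x = truth + rng.normal(0, 0.02, truth.size)               # product 1
y = 0.9 * truth + rng.normal(0, 0.03, truth.size)         # product 2 (different gain)
z = 1.1 * truth + rng.normal(0, 0.04, truth.size)         # product 3 (different gain)

C = np.cov(np.vstack([x, y, z]))
err_var_x = C[0, 0] - C[0, 1] * C[0, 2] / C[1, 2]
err_var_y = C[1, 1] - C[0, 1] * C[1, 2] / C[0, 2]
err_var_z = C[2, 2] - C[0, 2] * C[1, 2] / C[0, 1]
print(err_var_x, err_var_y, err_var_z)
```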
NASA Astrophysics Data System (ADS)
Roberts, A.; Bench, K.; Maslowski, W.; Farrell, S. L.; Richter-Menge, J.
2016-12-01
We have developed a method to quantitatively assess the skill of predictive sea ice models using freeboard measurements from spaceborne laser altimeters. The method evaluates freeboards from the Regional Arctic System Model (RASM) against those derived from NASA ICESat and Operation IceBridge (OIB) missions along individual ground tracks, and assesses the variance- and correlation-weighted model skill. This allows the accuracy of sea ice volume simulations to be quantified while taking measurement error into account. As part of this work, we inter-compare simulations with two different sea ice rheologies: one using Elastic-Viscous-Plastic (EVP), and the other using Elastic-Anisotropic-Plastic (EAP) ice mechanics. Both are simulated for 2004 and 2007, during which ICESat was in operation. RASM variance skill scores ranged from 0.712 to 0.824 and correlation skill scores were between 0.319 and 0.511, with EAP providing a better estimate of spatial ice volume variance, but with a larger bias in the central Arctic relative to EVP. The skill scores were calculated for monthly periods and require little adaptation to rate short-term operational forecasts of the Arctic. This work will help quantify model limitations and facilitate optimal use of ICESat-2 freeboard measurements after that satellite is launched next year.
NASA Astrophysics Data System (ADS)
Codis, Sandrine; Bernardeau, Francis; Pichon, Christophe
2016-08-01
In order to quantify the error budget in the measured probability distribution functions of cell densities, the two-point statistics of cosmic densities in concentric spheres is investigated. Bias functions are introduced as the ratio of their two-point correlation function to the two-point correlation of the underlying dark matter distribution. They describe how cell densities are spatially correlated. They are computed here via the so-called large deviation principle in the quasi-linear regime. Their large-separation limit is presented and successfully compared to simulations for density and density slopes: this regime is shown to be rapidly reached, allowing sub-percent precision for a wide range of densities and variances. The corresponding asymptotic limit provides an estimate of the cosmic variance of standard concentric cell statistics applied to finite surveys. More generally, no assumption on the separation is required for some specific moments of the two-point statistics, for instance when predicting the generating function of cumulants containing any powers of concentric densities in one location and one power of density at some arbitrary distance from the rest. This exact `one external leg' cumulant generating function is used in particular to probe the rate of convergence of the large-separation approximation.
The effect of interacting dark energy on local measurements of the Hubble constant
DOE Office of Scientific and Technical Information (OSTI.GOV)
Odderskov, Io; Baldi, Marco; Amendola, Luca, E-mail: isho07@phys.au.dk, E-mail: marco.baldi5@unibo.it, E-mail: l.amendola@thphys.uni-heidelberg.de
2016-05-01
In the current state of cosmology, where cosmological parameters are being measured to percent accuracy, it is essential to understand all sources of error to high precision. In this paper we present the results of a study of the local variations in the Hubble constant measured at the distance scale of the Coma Cluster, and test the validity of correcting for the peculiar velocities predicted by gravitational instability theory. The study is based on N-body simulations, and includes models featuring a coupling between dark energy and dark matter, as well as two ΛCDM simulations with different values of σ8. It is found that the variance in the local flows is significantly larger in the coupled models, which increases the uncertainty in the local measurements of the Hubble constant in these scenarios. By comparing the results from the different simulations, it is found that most of the effect is caused by the higher value of σ8 in the coupled cosmologies, though this cannot account for all of the additional variance. Given the discrepancy between different estimates of the Hubble constant in the universe today, we should be aware of cosmological models that cause a greater cosmic variance.
The Importance of Relying on the Manual: Scoring Error Variance in the WISC-IV Vocabulary Subtest
ERIC Educational Resources Information Center
Erdodi, Laszlo A.; Richard, David C. S.; Hopwood, Christopher
2009-01-01
Classical test theory assumes that ability level has no effect on measurement error. Newer test theories, however, argue that the precision of a measurement instrument changes as a function of the examinee's true score. Research has shown that administration errors are common in the Wechsler scales and that subtests requiring subjective scoring…
Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS)
NASA Astrophysics Data System (ADS)
Zhang, Yun; He, Yong
2006-09-01
The traditional uniform herbicide application often results in excessive chemical residues in soil, crop plants, and agricultural produce, which imperil the environment and food safety. Near-infrared reflectance spectroscopy (NIRS) offers a promising means for weed detection and site-specific herbicide application. In the laboratory, a total of 90 samples (30 per species) of detached leaves of two weeds, threeseeded mercury (Acalypha australis L.) and fourleafed duckweed (Marsilea quadrifolia L.), and one crop, soybean (Glycine max), were investigated by NIRS over 325-1075 nm using a field spectroradiometer. Twenty absorbance samples of each species, after pretreatment, were exported, and the missing Y variables were assigned independent values for partial least squares (PLS) analysis. In a combined principal component analysis (PCA) over 400-1000 nm, PC1 and PC2 together explained over 91% of the total variance and discriminated the three plant species with 98.3% accuracy. The full cross-validation results of PLS, i.e., a standard error of prediction (SEP) of 0.247, a correlation coefficient (r) of 0.954, and a root mean square error of prediction (RMSEP) of 0.245, indicated a sound model for weed identification. Predicting the remaining 10 samples of each species with the PLS model yielded a 100% crop/weed detection rate. Thus, PLS appears to be a viable alternative for qualitative weed discrimination with NIRS.
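A minimal sketch of PLS modelling with cross-validated prediction error, using scikit-learn in place of the chemometrics workflow implied above; the spectra and class codes are random stand-ins for the 400-1000 nm absorbance data:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(9)
spectra = rng.normal(size=(60, 601))            # 60 samples x 601 wavelengths (placeholder)
y = np.repeat([1.0, 2.0, 3.0], 20)              # dummy class codes for the three species

pls = PLSRegression(n_components=5)
y_cv = cross_val_predict(pls, spectra, y, cv=10).ravel()   # cross-validated predictions

rmsep = np.sqrt(np.mean((y - y_cv) ** 2))       # root mean square error of prediction
r = np.corrcoef(y, y_cv)[0, 1]                  # correlation coefficient
print(f"RMSEP={rmsep:.3f}, r={r:.3f}")
```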
Mixed model approaches for diallel analysis based on a bio-model.
Zhu, J; Weir, B S
1996-12-01
A MINQUE(1) procedure, which is minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE theta which has parameter values for the prior values. MINQUE(1) is almost as efficient as MINQUE theta for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.
Integrating mean and variance heterogeneities to identify differentially expressed genes.
Ouyang, Weiwei; An, Qiang; Zhao, Jinying; Qin, Huaizhen
2016-12-06
In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (aka, the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration; and variance heterogeneity induced by condition change may reflect another aspect. Change in condition may alter both mean and some higher-order characteristics of the distributions of expression levels of susceptible genes. In this report, we put forth a conception of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existent mean heterogeneity tests and variance heterogeneity tests. Based on the independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normality and Laplace settings. For moderate samples, the IMVT well controlled type I error rates, and so did existent mean heterogeneity test (i.e., the Welch t test (WT), the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B raised solid evidence of informative variance heterogeneity. After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment-wide significant MVDE genes. Our results indicate tremendous potential gain of integrating informative variance heterogeneity after adjusting for global confounders and background data structure. The proposed informative integration test better summarizes the impacts of condition change on expression distributions of susceptible genes than do the existent competitors. Therefore, particular attention should be paid to explicitly exploit the variance heterogeneity induced by condition change in functional genomics analysis.
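A minimal sketch of jointly testing mean and variance heterogeneity for one gene across two conditions by combining a Welch t test and a variance test via Fisher's method; this is a generic stand-in, not the authors' IMVT statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
expr_a = rng.normal(5.0, 1.0, 30)     # expression under condition A (simulated)
expr_b = rng.normal(5.5, 2.0, 30)     # shifted mean and inflated variance under B

p_mean = stats.ttest_ind(expr_a, expr_b, equal_var=False).pvalue   # Welch t test (mean heterogeneity)
p_var = stats.levene(expr_a, expr_b).pvalue                        # variance heterogeneity test

# Fisher's method assumes the two p-values are independent under the null
stat, p_combined = stats.combine_pvalues([p_mean, p_var], method="fisher")
print(p_mean, p_var, p_combined)
```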
Murphy, Alistair P; Duffield, Rob; Kellett, Aaron; Reid, Machar
2014-09-01
To investigate the discrepancy between coach and athlete perceptions of internal load and notational analysis of external load in elite junior tennis. Fourteen elite junior tennis players and 6 international coaches were recruited. Ratings of perceived exertion (RPEs) were recorded for individual drills and whole sessions, along with a rating of mental exertion, coach rating of intended session exertion, and athlete heart rate (HR). Furthermore, total stroke count and unforced-error count were notated using video coding after each session, alongside coach and athlete estimations of shots and errors made. Finally, regression analyses explained the variance in the criterion variables of athlete and coach RPE. Repeated-measures analyses of variance and interclass correlation coefficients revealed that coaches significantly (P < .01) underestimated athlete session RPE, with only moderate correlation (r = .59) demonstrated between coach and athlete. However, athlete drill RPE (P = .14; r = .71) and mental exertion (P = .44; r = .68) were comparable and substantially correlated. No significant differences in estimated stroke count were evident between athlete and coach (P = .21), athlete notational analysis (P = .06), or coach notational analysis (P = .49). Coaches estimated significantly greater unforced errors than either athletes or notational analysis (P < .01). Regression analyses found that 54.5% of variance in coach RPE was explained by intended session exertion and coach drill RPE, while drill RPE and peak HR explained 45.3% of the variance in athlete session RPE. Coaches misinterpreted session RPE but not drill RPE, while inaccurately monitoring error counts. Improved understanding of external- and internal-load monitoring may help coach-athlete relationships in individual sports like tennis avoid maladaptive training.
Zhang, Yun-jian; Li, Qiang; Zhang, Yu-xiu; Wang, Dan; Xing, Jian-min
2012-01-01
Succinic acid is considered as an important platform chemical. Succinic acid fermentation with Actinobacillus succinogenes strain BE-1 was optimized by central composite design (CCD) using a response surface methodology (RSM). The optimized production of succinic acid was predicted and the interactive effects between glucose, yeast extract, and magnesium carbonate were investigated. As a result, a model for predicting the concentration of succinic acid production was developed. The accuracy of the model was confirmed by the analysis of variance (ANOVA), and the validity was further proved by verification experiments showing that percentage errors between actual and predicted values varied from 3.02% to 6.38%. In addition, it was observed that the interactive effect between yeast extract and magnesium carbonate was statistically significant. In conclusion, RSM is an effective and useful method for optimizing the medium components and investigating the interactive effects, and can provide valuable information for succinic acid scale-up fermentation using A. succinogenes strain BE-1. PMID:22302423
Xu, Hang; Merryweather, Andrew; Bloswick, Donald; Mao, Qi; Wang, Tong
2015-01-01
Marker placement can be a significant source of error in biomechanical studies of human movement. The toe marker placement error is amplified by footwear since the toe marker placement on the shoe only relies on an approximation of underlying anatomical landmarks. Three total knee replacement subjects were recruited and three self-speed gait trials per subject were collected. The height variation between toe and heel markers of four types of footwear was evaluated from the results of joint kinematics and muscle forces using OpenSim. The reference condition was considered as the same vertical height of toe and heel markers. The results showed that the residual variances for joint kinematics had an approximately linear relationship with toe marker placement error for lower limb joints. Ankle dorsiflexion/plantarflexion is most sensitive to toe marker placement error. The influence of toe marker placement error is generally larger for hip flexion/extension and rotation than hip abduction/adduction and knee flexion/extension. The muscle forces responded to the residual variance of joint kinematics to various degrees based on the muscle function for specific joint kinematics. This study demonstrates the importance of evaluating marker error for joint kinematics and muscle forces when explaining relative clinical gait analysis and treatment intervention.
Linear error analysis of slope-area discharge determinations
Kirby, W.H.
1987-01-01
The slope-area method can be used to calculate peak flood discharges when current-meter measurements are not possible. This calculation depends on several quantities, such as water-surface fall, that are subject to large measurement errors. Other critical quantities, such as Manning's n, are not even amenable to direct measurement but can only be estimated. Finally, scour and fill may cause gross discrepancies between the observed condition of the channel and the hydraulic conditions during the flood peak. The effects of these potential errors on the accuracy of the computed discharge have been estimated by statistical error analysis using a Taylor-series approximation of the discharge formula and the well-known formula for the variance of a sum of correlated random variates. The resultant error variance of the computed discharge is a weighted sum of covariances of the various observational errors. The weights depend on the hydraulic and geometric configuration of the channel. The mathematical analysis confirms the rule of thumb that relative errors in computed discharge increase rapidly when velocity heads exceed the water-surface fall, when the flow field is expanding and when lateral velocity variation (alpha) is large. It also confirms the extreme importance of accurately assessing the presence of scour or fill. © 1987.
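A minimal sketch of first-order (Taylor-series) error propagation, in which the variance of a computed quantity is a weighted sum of the covariances of its inputs; the function Q and the covariance values are placeholders, not the slope-area discharge formula itself:

```python
import numpy as np

def Q(params):
    h1, h2, n = params                       # e.g. stage at two sections and a roughness coefficient
    return (h1 - h2) ** 0.5 / n              # stand-in for the discharge formula

p0 = np.array([10.0, 9.2, 0.035])            # nominal observed values
cov = np.diag([0.05**2, 0.05**2, 0.005**2])  # observational error covariance matrix (assumed)

# Numerical gradient: these are the "weights" in the weighted sum of covariances
eps = 1e-6
grad = np.array([(Q(p0 + eps * e) - Q(p0 - eps * e)) / (2 * eps) for e in np.eye(3)])

var_Q = grad @ cov @ grad                    # first-order variance of the computed discharge
rel_error = np.sqrt(var_Q) / Q(p0)           # relative standard error
```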
Tabachnick, W J; Mecham, J O
1991-03-01
An enzyme-linked immunoassay for detecting bluetongue virus in infected Culicoides variipennis was evaluated using a nested analysis of variance to determine sources of experimental error in the procedure. The major source of variation was differences among individual insects (84% of the total variance). Storing insects at -70 degrees C for two months contributed to experimental variation in the ELISA reading (14% of the total variance) and should be avoided. Replicate assays of individual insects were shown to be unnecessary, since variation among replicate wells and plates was minor (2% of the total variance).
Triple collocation based merging of satellite soil moisture retrievals
USDA-ARS?s Scientific Manuscript database
We propose a method for merging soil moisture retrievals from space borne active and passive microwave instruments based on weighted averaging taking into account the error characteristics of the individual data sets. The merging scheme is parameterized using error variance estimates obtained from u...
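A minimal sketch of merging two retrievals by weighted averaging with weights taken from their error variances (assumed independent), e.g. as estimated by triple collocation; the variances and series are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(11)
active = 0.25 + rng.normal(0, np.sqrt(0.0004), 365)    # active retrieval, error variance 0.0004
passive = 0.25 + rng.normal(0, np.sqrt(0.0009), 365)   # passive retrieval, error variance 0.0009

err_var = np.array([0.0004, 0.0009])
w = (1.0 / err_var) / np.sum(1.0 / err_var)            # inverse-error-variance weights

merged = w[0] * active + w[1] * passive
merged_err_var = 1.0 / np.sum(1.0 / err_var)           # error variance of the merged product (independent errors)
```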
Preference uncertainty, preference learning, and paired comparison experiments
David C. Kingsley; Thomas C. Brown
2010-01-01
Results from paired comparison experiments suggest that as respondents progress through a sequence of binary choices they become more consistent, apparently fine-tuning their preferences. Consistency may be indicated by the variance of the estimated valuation distribution measured by the error term in the random utility model. A significant reduction in the variance is...
Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.
2015-01-01
Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
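A minimal sketch of a ridge-regularized linear model fitted in the singular value decomposition basis of the predictor matrix, as a stand-in for the Bayesian formulation described above; the data dimensions and regularization strength are invented:

```python
import numpy as np

rng = np.random.default_rng(12)
n_plots, n_metrics = 50, 80                      # fewer field plots than LiDAR metrics
X = rng.normal(size=(n_plots, n_metrics))
beta_true = np.zeros(n_metrics)
beta_true[:5] = 1.0                              # only a few metrics truly matter
y = X @ beta_true + rng.normal(0, 0.5, n_plots)  # e.g. field-measured volume (simulated)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
lam = 10.0                                       # regularization strength (would be tuned/estimated)
d = s / (s ** 2 + lam)                           # shrunken inverse singular values
beta_hat = Vt.T @ (d * (U.T @ y))                # ridge solution expressed in the SVD basis

y_hat = X @ beta_hat                             # predictions for the inventory variable
```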
Hayiou-Thomas, Marianna E; Carroll, Julia M; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J
2017-02-01
This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were assessed at the start of formal reading instruction (age 5½), using measures of phoneme awareness, word-level reading and spelling; and 3 years later (age 8), using measures of word-level reading, spelling and reading comprehension. The presence of early SSD conferred a small but significant risk of poor phonemic skills and spelling at the age of 5½ and of poor word reading at the age of 8. Furthermore, within the group with SSD, the persistence of speech difficulties to the point of school entry was associated with poorer emergent literacy skills, and children with 'disordered' speech errors had poorer word reading skills than children whose speech errors indicated 'delay'. In contrast, the initial severity of SSD was not a significant predictor of reading development. Beyond the domain of speech, the presence of a co-occurring language impairment was strongly predictive of literacy skills and having a family risk of dyslexia predicted additional variance in literacy at both time-points. Early SSD alone has only modest effects on literacy development but when additional risk factors are present, these can have serious negative consequences, consistent with the view that multiple risks accumulate to predict reading disorders. © 2016 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health.
Two proposed convergence criteria for Monte Carlo solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forster, R.A.; Pederson, S.P.; Booth, T.E.
1992-01-01
The central limit theorem (CLT) can be applied to a Monte Carlo solution if two requirements are satisfied: (1) The random variable has a finite mean and a finite variance; and (2) the number N of independent observations grows large. When these two conditions are satisfied, a confidence interval (CI) based on the normal distribution with a specified coverage probability can be formed. The first requirement is generally satisfied by the knowledge of the Monte Carlo tally being used. The Monte Carlo practitioner has a limited number of marginal methods to assess the fulfillment of the second requirement, such as statistical error reduction proportional to 1/√N with error magnitude guidelines. Two proposed methods are discussed in this paper to assist in deciding if N is large enough: estimating the relative variance of the variance (VOV) and examining the empirical history score probability density function (pdf).
NASA Technical Reports Server (NTRS)
Li, Rongsheng (Inventor); Kurland, Jeffrey A. (Inventor); Dawson, Alec M. (Inventor); Wu, Yeong-Wei A. (Inventor); Uetrecht, David S. (Inventor)
2004-01-01
Methods and structures are provided that enhance attitude control during gyroscope substitutions by ensuring that a spacecraft's attitude control system does not drive its absolute-attitude sensors out of their capture ranges. In a method embodiment, an operational process-noise covariance Q of a Kalman filter is temporarily replaced with a substantially greater interim process-noise covariance Q. This replacement increases the weight given to the most recent attitude measurements and hastens the reduction of attitude errors and gyroscope bias errors. The error effect of the substituted gyroscopes is reduced and the absolute-attitude sensors are not driven out of their capture range. In another method embodiment, this replacement is preceded by the temporary replacement of an operational measurement-noise variance R with a substantially larger interim measurement-noise variance R to reduce transients during the gyroscope substitutions.
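A minimal sketch of the idea in the abstract: temporarily inflating the process-noise covariance Q of a scalar Kalman filter so that recent measurements get more weight and the estimate re-converges quickly after a sensor change; the system and numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(13)
R = 0.5                       # operational measurement-noise variance
Q_op, Q_interim = 1e-4, 1e-1  # operational and (larger) interim process-noise covariances

x_hat, P = 0.0, 1.0           # state estimate (e.g. gyro bias) and its variance
truth = 2.0                   # true bias after the substitution

for k in range(50):
    Q = Q_interim if k < 10 else Q_op      # interim Q right after the gyro substitution
    # Predict (random-walk state model)
    P = P + Q
    # Update with a noisy absolute-attitude-derived measurement
    z = truth + rng.normal(0, np.sqrt(R))
    K = P / (P + R)                        # Kalman gain grows with inflated P
    x_hat = x_hat + K * (z - x_hat)
    P = (1 - K) * P
```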
Diallel analysis for sex-linked and maternal effects.
Zhu, J; Weir, B S
1996-01-01
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
Harris, Peter R; Sillence, Elizabeth; Briggs, Pam
2011-07-27
How do people decide which sites to use when seeking health advice online? We can assume, from related work in e-commerce, that general design factors known to affect trust in the site are important, but in this paper we also address the impact of factors specific to the health domain. The current study aimed to (1) assess the factorial structure of a general measure of Web trust, (2) model how the resultant factors predicted trust in, and readiness to act on, the advice found on health-related websites, and (3) test whether adding variables from social cognition models to capture elements of the response to threatening, online health-risk information enhanced the prediction of these outcomes. Participants were asked to recall a site they had used to search for health-related information and to think of that site when answering an online questionnaire. The questionnaire consisted of a general Web trust questionnaire plus items assessing appraisals of the site, including threat appraisals, information checking, and corroboration. It was promoted on the hungersite.com website. The URL was distributed via Yahoo and local print media. We assessed the factorial structure of the measures using principal components analysis and modeled how well they predicted the outcome measures using structural equation modeling (SEM) with EQS software. We report an analysis of the responses of participants who searched for health advice for themselves (N = 561). Analysis of the general Web trust questionnaire revealed 4 factors: information quality, personalization, impartiality, and credible design. In the final SEM model, information quality and impartiality were direct predictors of trust. However, variables specific to eHealth (perceived threat, coping, and corroboration) added substantially to the ability of the model to predict variance in trust and readiness to act on advice on the site. The final model achieved a satisfactory fit: χ²(5) = 10.8 (P = .21), comparative fit index = .99, root mean square error of approximation = .052. The model accounted for 66% of the variance in trust and 49% of the variance in readiness to act on the advice. Adding variables specific to eHealth enhanced the ability of a model of trust to predict trust and readiness to act on advice.
Kofman, Rianne; Beekman, Anna M; Emmelot, Cornelis H; Geertzen, Jan H B; Dijkstra, Pieter U
2018-06-01
Non-contact scanners may have potential for measurement of residual limb volume. Different non-contact scanners have been introduced during the last decades. Reliability and usability (practicality and user friendliness) should be assessed before introducing these systems in clinical practice. The aim of this study was to analyze the measurement properties and usability of four non-contact scanners (TT Design, Omega Scanner, BioSculptor Bioscanner, and Rodin4D Scanner). Quasi-experimental. Nine (geometric and residual limb) models were measured on two occasions, each consisting of two sessions (four sessions in total). In each session, four observers used the four systems for volume measurement. The mean for each model, repeatability coefficients for each system, variance components, and their two-way interactions with measurement conditions were calculated. User satisfaction was evaluated with the Post-Study System Usability Questionnaire. Systematic differences between the systems were found in volume measurements. Most of the variance was explained by the model (97%), while error variance was 3%. Measurement system and the interaction between system and model explained 44% of the error variance. Repeatability coefficients of the systems ranged from 0.101 L (Omega Scanner) to 0.131 L (Rodin4D). Differences in Post-Study System Usability Questionnaire scores between the systems were small and not significant. The systems were reliable in determining residual limb volume. Measurement systems and the interaction between system and residual limb model explained most of the error variance. The differences in repeatability coefficient and usability between the four CAD/CAM systems were small. Clinical relevance: If accurate measurements of residual limb volume are required (as in research), modern non-contact scanners should be taken into consideration.
Noncommuting observables in quantum detection and estimation theory
NASA Technical Reports Server (NTRS)
Helstrom, C. W.
1972-01-01
Basing decisions and estimates on simultaneous approximate measurements of noncommuting observables in a quantum receiver is shown to be equivalent to measuring commuting projection operators on a larger Hilbert space than that of the receiver itself. The quantum-mechanical Cramer-Rao inequalities derived from right logarithmic derivatives and symmetrized logarithmic derivatives of the density operator are compared, and it is shown that the latter give superior lower bounds on the error variances of individual unbiased estimates of arrival time and carrier frequency of a coherent signal. For a suitably weighted sum of the error variances of simultaneous estimates of these, the former yield the superior lower bound under some conditions.
Williams, Larry J; O'Boyle, Ernest H
2015-09-01
A persistent concern in the management and applied psychology literature is the effect of common method variance on observed relations among variables. Recent work (i.e., Richardson, Simmering, & Sturman, 2009) evaluated 3 analytical approaches to controlling for common method variance, including the confirmatory factor analysis (CFA) marker technique. Their findings indicated significant problems with this technique, especially with nonideal marker variables (those with theoretical relations with substantive variables). Based on their simulation results, Richardson et al. concluded that not correcting for method variance provides more accurate estimates than using the CFA marker technique. We reexamined the effects of using marker variables in a simulation study and found the degree of error in estimates of a substantive factor correlation was relatively small in most cases, and much smaller than error associated with making no correction. Further, in instances in which the error was large, the correlations between the marker and substantive scales were higher than that found in organizational research with marker variables. We conclude that in most practical settings, the CFA marker technique yields parameter estimates close to their true values, and the criticisms made by Richardson et al. are overstated. (c) 2015 APA, all rights reserved).
Moss, Marshall E.; Gilroy, Edward J.
1980-01-01
This report describes the theoretical developments and illustrates the applications of techniques that recently have been assembled to analyze the cost-effectiveness of federally funded stream-gaging activities in support of the Colorado River compact and subsequent adjudications. The cost effectiveness of 19 stream gages in terms of minimizing the sum of the variances of the errors of estimation of annual mean discharge is explored by means of a sequential-search optimization scheme. The search is conducted over a set of decision variables that describes the number of times that each gaging route is traveled in a year. A gage route is defined as the most expeditious circuit that is made from a field office to visit one or more stream gages and return to the office. The error variance is defined as a function of the frequency of visits to a gage by using optimal estimation theory. Currently a minimum of 12 visits per year is made to any gage. By changing to a six-visit minimum, the same total error variance can be attained for the 19 stations with a budget of 10% less than the current one. Other strategies are also explored. (USGS)
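The following toy sketch is not the report's optimization scheme; it only illustrates, under assumed numbers, the kind of budget-constrained trade-off described above, with each gage's error variance modeled (by assumption) as inversely proportional to its annual visit frequency and a simple greedy search standing in for the sequential-search procedure.

```python
import numpy as np

# Toy illustration (not the report's method): error variance at gage i is
# assumed to behave like a_i / f_i for f_i visits per year, and each visit
# on route i costs cost_i.  All numbers are made up.
a = np.array([40.0, 25.0, 60.0, 15.0])        # assumed variance scale per gage
cost = np.array([300.0, 450.0, 250.0, 500.0]) # assumed cost per visit
budget = 20000.0
f = np.full(len(a), 6)                        # start at a six-visit minimum
spent = (f * cost).sum()

# Greedy sequential search: repeatedly add the visit that buys the largest
# variance reduction per unit cost; stop when the best candidate no longer fits.
while True:
    reduction = a / f - a / (f + 1)           # marginal variance reduction
    best = int(np.argmax(reduction / cost))
    if spent + cost[best] > budget:
        break
    f[best] += 1
    spent += cost[best]

print("visits per gage:", f, "total error variance:", (a / f).sum())
```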
Interruption Practice Reduces Errors
2014-01-01
dangers of errors at the PCS. Electronic health record systems are used to reduce certain errors related to poor handwriting and dosage ... 10.16, MSE = .31, p < .05, η2 = .18. A significant interaction between the number of interruptions and interrupted trials suggests that trials ... the variance when calculating whether a memory has a higher signal than interference. If something in addition to activation contributes to goal
Non-linear matter power spectrum covariance matrix errors and cosmological parameter uncertainties
NASA Astrophysics Data System (ADS)
Blot, L.; Corasaniti, P. S.; Amendola, L.; Kitching, T. D.
2016-06-01
The covariance of the matter power spectrum is a key element of the analysis of galaxy clustering data. Independent realizations of observational measurements can be used to sample the covariance; nevertheless, statistical sampling errors will propagate into the cosmological parameter inference, potentially limiting the capabilities of the upcoming generation of galaxy surveys. The impact of these errors as a function of the number of realizations has been previously evaluated for Gaussian distributed data. However, non-linearities in the late-time clustering of matter cause departures from Gaussian statistics. Here, we address the impact of non-Gaussian errors on the sample covariance and precision matrix errors using a large ensemble of N-body simulations. In the range of modes where finite volume effects are negligible (0.1 ≲ k [h Mpc-1] ≲ 1.2), we find deviations of the variance of the sample covariance with respect to Gaussian predictions above ~10 per cent at k > 0.3 h Mpc-1. Over the entire range these reduce to about ~5 per cent for the precision matrix. Finally, we perform a Fisher analysis to estimate the effect of covariance errors on the cosmological parameter constraints. In particular, assuming Euclid-like survey characteristics we find that a number of independent realizations larger than 5000 is necessary to reduce the contribution of sampling errors to the cosmological parameter uncertainties to the subpercent level. We also show that restricting the analysis to large scales k ≲ 0.2 h Mpc-1 results in a considerable loss in constraining power, while using the linear covariance to include smaller scales leads to an underestimation of the errors on the cosmological parameters.
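For readers who want to see how covariance sampling noise scales with the number of realizations in the Gaussian regime used as a baseline above, the sketch below simulates sample covariances and compares the empirical scatter of one element with the standard Gaussian expectation Var(Ĉij) = (Cij² + Cii·Cjj)/(n_s − 1); the dimensions and the true covariance are arbitrary choices, not the paper's simulation setup.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_s, n_trials = 5, 100, 2000           # dimension, realizations per covariance, MC repeats

# A fixed "true" covariance (arbitrary, positive definite).
A = rng.normal(size=(p, p))
C = A @ A.T + p * np.eye(p)

est = np.empty(n_trials)
for t in range(n_trials):
    x = rng.multivariate_normal(np.zeros(p), C, size=n_s)
    C_hat = np.cov(x, rowvar=False)       # sample covariance, divisor n_s - 1
    est[t] = C_hat[0, 1]

empirical_var = est.var(ddof=1)
gaussian_var = (C[0, 1] ** 2 + C[0, 0] * C[1, 1]) / (n_s - 1)
print(f"Var(C_01): empirical {empirical_var:.3f} vs Gaussian prediction {gaussian_var:.3f}")
```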
The development and evaluation of accident predictive models
NASA Astrophysics Data System (ADS)
Maleck, T. L.
1980-12-01
A mathematical model that will predict the incremental change in the dependent variables (accident types) resulting from changes in the independent variables is developed. The end product is a tool for estimating the expected number and type of accidents for a given highway segment. The data segments (accidents) are separated into mutually exclusive groups via a branching process, and variance is further reduced using stepwise multiple regression. The standard error of the estimate is calculated for each model. The dependent variables are the frequency, density, and rate of 18 types of accidents; the independent variables include district, county, highway geometry, land use, type of zone, speed limit, signal code, type of intersection, number of intersection legs, number of turn lanes, left-turn control, all-red interval, average daily traffic, and outlier code. Models for nonintersectional accidents did not fit or validate as well as models for intersectional accidents.
NASA Technical Reports Server (NTRS)
Colwell, R. N. (Principal Investigator)
1984-01-01
The geometric quality of TM film and digital products is evaluated by making selective photomeasurements and by measuring the coordinates of known features on both the TM products and map products. These paired observations are related using a standard linear least squares regression approach. Using regression equations and coefficients developed from 225 (TM film product) and 20 (TM digital product) control points, map coordinates of test points are predicted. Residual error vectors were computed, and an analysis of variance (ANOVA) was performed on the east and north residuals using nine image segments (blocks) as treatments. Based on the root mean square error of the 223 (TM film product) and 22 (TM digital product) test points, users of TM data can expect the planimetric accuracy of mapped points to be within 91 meters and within 117 meters for the film products, and to be within 12 meters and within 14 meters for the digital products.
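A minimal, hedged sketch of the general procedure described above (not the report's exact regression): fit an affine image-to-map transform to control points by least squares and report the planimetric root mean square error over independent test points, with all coordinates invented.

```python
import numpy as np

def fit_affine(img_xy, map_xy):
    """Least-squares affine transform: map ~ [x, y, 1] @ coeffs (per axis)."""
    A = np.column_stack([img_xy, np.ones(len(img_xy))])
    coeffs, *_ = np.linalg.lstsq(A, map_xy, rcond=None)
    return coeffs

def planimetric_rmse(img_xy, map_xy, coeffs):
    A = np.column_stack([img_xy, np.ones(len(img_xy))])
    resid = map_xy - A @ coeffs
    return np.sqrt((resid ** 2).sum(axis=1).mean())

# Made-up control and test points (image pixels vs map metres).
rng = np.random.default_rng(1)
img_ctrl = rng.uniform(0, 6000, size=(225, 2))
map_ctrl = img_ctrl * 28.5 + [500000, 4000000] + rng.normal(0, 40, size=(225, 2))
img_test = rng.uniform(0, 6000, size=(223, 2))
map_test = img_test * 28.5 + [500000, 4000000] + rng.normal(0, 40, size=(223, 2))

coeffs = fit_affine(img_ctrl, map_ctrl)
print(f"test-point RMSE: {planimetric_rmse(img_test, map_test, coeffs):.1f} m")
```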
Bootstrap Estimates of Standard Errors in Generalizability Theory
ERIC Educational Resources Information Center
Tong, Ye; Brennan, Robert L.
2007-01-01
Estimating standard errors of estimated variance components has long been a challenging task in generalizability theory. Researchers have speculated about the potential applicability of the bootstrap for obtaining such estimates, but they have identified problems (especially bias) in using the bootstrap. Using Brennan's bias-correcting procedures…
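As a hedged sketch of the idea being discussed (not Brennan's bias-correcting procedure), the code below estimates variance components for a persons-by-items design from ANOVA mean squares and applies a naive person-level bootstrap to obtain a standard error for the person component; this simple resampling is exactly the kind of approach known to be biased for some components.

```python
import numpy as np

def g_study_pxi(X):
    """Variance components for a persons x items (p x i) design, one score per cell."""
    n_p, n_i = X.shape
    grand = X.mean()
    p_means, i_means = X.mean(axis=1), X.mean(axis=0)
    ms_p = n_i * ((p_means - grand) ** 2).sum() / (n_p - 1)
    ms_i = n_p * ((i_means - grand) ** 2).sum() / (n_i - 1)
    resid = X - p_means[:, None] - i_means[None, :] + grand
    ms_res = (resid ** 2).sum() / ((n_p - 1) * (n_i - 1))
    return {"person": (ms_p - ms_res) / n_i,
            "item": (ms_i - ms_res) / n_p,
            "residual": ms_res}

rng = np.random.default_rng(2)
n_p, n_i = 100, 20
X = (rng.normal(0, 1.0, (n_p, 1))        # person effects
     + rng.normal(0, 0.5, (1, n_i))      # item effects
     + rng.normal(0, 1.5, (n_p, n_i)))   # residual

# Naive bootstrap over persons (biased for some components, which is the
# problem that motivates bias-correcting procedures).
boot = [g_study_pxi(X[rng.integers(0, n_p, n_p)])["person"] for _ in range(500)]
print("sigma^2(p) =", round(g_study_pxi(X)["person"], 3),
      "bootstrap SE =", round(np.std(boot, ddof=1), 3))
```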
Analysis of Wind Tunnel Polar Replicates Using the Modern Design of Experiments
NASA Technical Reports Server (NTRS)
Deloach, Richard; Micol, John R.
2010-01-01
The role of variance in a Modern Design of Experiments analysis of wind tunnel data is reviewed, with distinctions made between explained and unexplained variance. The partitioning of unexplained variance into systematic and random components is illustrated, with examples of the elusive systematic component provided for various types of real-world tests. The importance of detecting and defending against systematic unexplained variance in wind tunnel testing is discussed, and the random and systematic components of unexplained variance are examined for a representative wind tunnel data set acquired in a test in which a missile is used as a test article. The adverse impact of correlated (non-independent) experimental errors is described, and recommendations are offered for replication strategies that facilitate the quantification of random and systematic unexplained variance.
Estimating the Magnitude and Frequency of Floods in Small Urban Streams in South Carolina, 2001
Feaster, Toby D.; Guimaraes, Wladimir B.
2004-01-01
The magnitude and frequency of floods at 20 streamflow-gaging stations on small, unregulated urban streams in or near South Carolina were estimated by fitting the measured water-year peak flows to a log-Pearson Type-III distribution. The period of record (through September 30, 2001) for the measured water-year peak flows ranged from 11 to 25 years with a mean and median length of 16 years. The drainage areas of the streamflow-gaging stations ranged from 0.18 to 41 square miles. Based on the flood-frequency estimates from the 20 streamflow-gaging stations (13 in South Carolina; 4 in North Carolina; and 3 in Georgia), generalized least-squares regression was used to develop regional regression equations. These equations can be used to estimate the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence-interval flows for small urban streams in the Piedmont, upper Coastal Plain, and lower Coastal Plain physiographic provinces of South Carolina. The most significant explanatory variables from this analysis were main-channel length, percent impervious area, and basin development factor. Mean standard errors of prediction for the regression equations ranged from -25 to 33 percent for the 10-year recurrence-interval flows and from -35 to 54 percent for the 100-year recurrence-interval flows. The U.S. Geological Survey has developed a Geographic Information System application called StreamStats that makes the process of computing streamflow statistics at ungaged sites faster and more consistent than manual methods. This application was developed in the Massachusetts District and ongoing work is being done in other districts to develop a similar application using streamflow statistics relative to those respective States. Considering the future possibility of implementing StreamStats in South Carolina, an alternative set of regional regression equations was developed using only main-channel length and impervious area. This was done because no digital coverages are currently available for basin development factor and, therefore, it could not be included in the StreamStats application. The average mean standard error of prediction for the alternative equations was 2 to 5 percent larger than the standard errors for the equations that contained basin development factor. For the urban streamflow-gaging stations in South Carolina, measured water-year peak flows were compared with those from an earlier urban flood-frequency investigation. The peak flows from the earlier investigation were computed using a rainfall-runoff model. At many of the sites, graphical comparisons indicated that the variance of the measured data was much less than the variance of the simulated data. Several statistical tests were applied to compare the variances and the means of the measured and simulated data for each site. The results indicated that the variances were significantly different for 11 of the 13 South Carolina streamflow-gaging stations. For one streamflow-gaging station, the test for normality, which is one of the assumptions of the data when comparing variances, indicated that neither the measured data nor the simulated data were distributed normally; therefore, the test for differences in the variances was not used for that streamflow-gaging station. Another statistical test was used to test for statistically significant differences in the means of the measured and simulated data.
The results indicated that for 5 of the 13 urban streamflow-gaging stations in South Carolina there was a statistically significant difference in the means of the two data sets. For comparison purposes, and to test the hypothesis that there may have been climatic differences between the period in which the peak-flow data were measured and the period for which historic rainfall data were used to compute the simulated peak flows, 16 rural streamflow-gaging stations with long-term records were reviewed using similar techniques as those used for the measured an
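As a hedged illustration of the basic fitting step described above (method-of-moments log-Pearson Type III on annual peaks, without the regional-skew weighting, historical adjustments, or low-outlier handling a USGS analysis would include), a short Python sketch with made-up peak flows:

```python
import numpy as np
from scipy import stats

# Made-up annual peak flows (cfs) for one gaged site.
rng = np.random.default_rng(3)
peaks = np.exp(rng.normal(7.0, 0.6, size=16))

# Log-Pearson Type III by method of moments on the log10 flows.
logq = np.log10(peaks)
mean, sd = logq.mean(), logq.std(ddof=1)
skew = stats.skew(logq, bias=False)

for T in (2, 5, 10, 25, 50, 100, 200, 500):
    q_log = stats.pearson3.ppf(1 - 1 / T, skew, loc=mean, scale=sd)
    print(f"{T:4d}-year flow: {10 ** q_log:,.0f} cfs")
```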
Inference of reactive transport model parameters using a Bayesian multivariate approach
NASA Astrophysics Data System (ADS)
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
A Minimum Variance Algorithm for Overdetermined TOA Equations with an Altitude Constraint.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romero, Louis A; Mason, John J.
We present a direct (non-iterative) method for solving for the location of a radio frequency (RF) emitter, or an RF navigation receiver, using four or more time of arrival (TOA) measurements and an assumed altitude above an ellipsoidal earth. Both the emitter tracking problem and the navigation application are governed by the same equations, but with slightly different interpretations of several variables. We treat the assumed altitude as a soft constraint, with a specified noise level, just as the TOA measurements are handled, with their respective noise levels. With 4 or more TOA measurements and the assumed altitude, the problem is overdetermined and is solved in the weighted least squares sense for the 4 unknowns, the 3-dimensional position and time. We call the new technique the TAQMV (TOA Altitude Quartic Minimum Variance) algorithm, and it achieves the minimum possible error variance for given levels of TOA and altitude estimate noise. The method algebraically produces four solutions: the least-squares solution, and potentially three other low-residual solutions, if they exist. In the lightly overdetermined cases where multiple local minima in the residual error surface are more likely to occur, this algebraic approach can produce all of the minima even when an iterative approach fails to converge. Algorithm performance in terms of solution error variance and divergence rate for the baseline (iterative) and proposed approaches is given in tables.
Sangnawakij, Patarawan; Böhning, Dankmar; Adams, Stephen; Stanton, Michael; Holling, Heinz
2017-04-30
Statistical inference for analyzing the results from several independent studies on the same quantity of interest has been investigated frequently in recent decades. Typically, any meta-analytic inference requires that the quantity of interest is available from each study together with an estimate of its variability. The current work is motivated by a meta-analysis on comparing two treatments (thoracoscopic and open) of congenital lung malformations in young children. Quantities of interest include continuous end-points such as length of operation or number of chest tube days. As studies only report mean values (and no standard errors or confidence intervals), the question arises how meta-analytic inference can be developed. We suggest two methods to estimate study-specific variances in such a meta-analysis, where only sample means and sample sizes are available in the treatment arms. A general likelihood ratio test is derived for testing equality of variances in two groups. By means of simulation studies, the bias and estimated standard error of the overall mean difference from both methodologies are evaluated and compared with two existing approaches: complete study analysis only and partial variance information. The performance of the test is evaluated in terms of type I error. Additionally, we illustrate these methods in the meta-analysis on comparing thoracoscopic and open surgery for congenital lung malformations and in a meta-analysis on the change in renal function after kidney donation. Copyright © 2017 John Wiley & Sons, Ltd.
Pataky, Todd C; Vanrenterghem, Jos; Robinson, Mark A
2016-06-14
A false positive is the mistake of inferring an effect when none exists, and although α controls the false positive (Type I error) rate in classical hypothesis testing, a given α value is accurate only if the underlying model of randomness appropriately reflects experimentally observed variance. Hypotheses pertaining to one-dimensional (1D) (e.g. time-varying) biomechanical trajectories are most often tested using a traditional zero-dimensional (0D) Gaussian model of randomness, but variance in these datasets is clearly 1D. The purpose of this study was to determine the likelihood that analyzing smooth 1D data with a 0D model of variance will produce false positives. We first used random field theory (RFT) to predict the probability of false positives in 0D analyses. We then validated RFT predictions via numerical simulations of smooth Gaussian 1D trajectories. Results showed that, across a range of public kinematic, force/moment and EMG datasets, the median false positive rate was 0.382 and not the assumed α=0.05, even for a simple two-sample t test involving N=10 trajectories per group. The median false positive rate for experiments involving three-component vector trajectories was p=0.764. This rate increased to p=0.945 for two three-component vector trajectories, and to p=0.999 for six three-component vectors. This implies that experiments involving vector trajectories have a high probability of yielding 0D statistical significance when there is, in fact, no 1D effect. Either (a) explicit a priori identification of 0D variables or (b) adoption of 1D methods can more tightly control α. Copyright © 2016 Elsevier Ltd. All rights reserved.
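A minimal simulation in the spirit of the study's validation step (assumed parameters, not the authors' datasets): smooth 1D Gaussian noise with no true effect is tested pointwise against a 0D critical t value, and a "false positive" is recorded whenever any node crosses that threshold.

```python
import numpy as np
from scipy import stats
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(4)
n, Q, smooth_sd, n_sims = 10, 101, 10.0, 2000   # trajectories/group, nodes, smoothness, repeats

# 0D critical threshold for a two-sample t test, alpha = 0.05, two-tailed.
t_crit = stats.t.ppf(1 - 0.025, df=2 * n - 2)

false_pos = 0
for _ in range(n_sims):
    # Smooth 1D Gaussian noise, no true group difference anywhere.
    a = gaussian_filter1d(rng.normal(size=(n, Q)), smooth_sd, axis=1)
    b = gaussian_filter1d(rng.normal(size=(n, Q)), smooth_sd, axis=1)
    t = (a.mean(0) - b.mean(0)) / np.sqrt(a.var(0, ddof=1) / n + b.var(0, ddof=1) / n)
    false_pos += np.any(np.abs(t) > t_crit)     # "significant" anywhere along the trajectory

print("false positive rate with a 0D threshold:", false_pos / n_sims)
```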
Bammann, K; Huybrechts, I; Vicente-Rodriguez, G; Easton, C; De Vriendt, T; Marild, S; Mesana, M I; Peeters, M W; Reilly, J J; Sioen, I; Tubic, B; Wawro, N; Wells, J C; Westerterp, K; Pitsiladis, Y; Moreno, L A
2013-04-01
To compare different field methods for estimating body fat mass with a reference value derived by a three-component (3C) model in pre-school and school children across Europe. Multicentre validation study. Seventy-eight preschool/school children aged 4-10 years from four different European countries. A standard measurement protocol was carried out in all children by trained field workers. A 3C model was used as the reference method. The field methods included height and weight measurement, circumferences measured at four sites, skinfold measured at two-six sites and foot-to-foot bioelectrical resistance (BIA) via TANITA scales. With the exception of height and neck circumference, all single measurements were able to explain at least 74% of the fat-mass variance in the sample. In combination, circumference models were superior to skinfold models and height-weight models. The best predictions were given by trunk models (combining skinfold and circumference measurements) that explained 91% of the observed fat-mass variance. The optimal data-driven model for our sample includes hip circumference, triceps skinfold and total body mass minus resistance index, and explains 94% of the fat-mass variance with 2.44 kg fat mass limits of agreement. In all investigated models, prediction errors were associated with fat mass, although to a lesser degree in the investigated skinfold models, arm models and the data-driven models. When studying total body fat in childhood populations, anthropometric measurements will give biased estimations as compared to gold standard measurements. Nevertheless, our study shows that when combining circumference and skinfold measurements, estimations of fat mass can be obtained with a limit of agreement of 1.91 kg in normal weight children and of 2.94 kg in overweight or obese children.
Garcia, Tanya P; Ma, Yanyuan
2017-10-01
We develop consistent and efficient estimation of parameters in general regression models with mismeasured covariates. We assume the model error and covariate distributions are unspecified, and the measurement error distribution is a general parametric distribution with unknown variance-covariance. We construct root-n consistent, asymptotically normal and locally efficient estimators using the semiparametric efficient score. We do not estimate any unknown distribution or model error heteroskedasticity. Instead, we form the estimator under possibly incorrect working distribution models for the model error, error-prone covariate, or both. Empirical results demonstrate robustness to different incorrect working models in homoscedastic and heteroskedastic models with error-prone covariates.
Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test
ERIC Educational Resources Information Center
Lee, Yi-Hsuan; Zhang, Jinming
2017-01-01
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
An Investigation of the Raudenbush (1988) Test for Studying Variance Heterogeneity.
ERIC Educational Resources Information Center
Harwell, Michael
1997-01-01
The meta-analytic method proposed by S. W. Raudenbush (1988) for studying variance heterogeneity was studied. Results of a Monte Carlo study indicate that the Type I error rate of the test is sensitive to even modestly platykurtic score distributions and to the ratio of study sample size to the number of studies. (SLD)
Stabilizing Conditional Standard Errors of Measurement in Scale Score Transformations
ERIC Educational Resources Information Center
Moses, Tim; Kim, YoungKoung
2017-01-01
The focus of this article is on scale score transformations that can be used to stabilize conditional standard errors of measurement (CSEMs). Three transformations for stabilizing the estimated CSEMs are reviewed, including the traditional arcsine transformation, a recently developed general variance stabilization transformation, and a new method…
New Statistical Techniques for Evaluating Longitudinal Models.
ERIC Educational Resources Information Center
Murray, James R.; Wiley, David E.
A basic methodological approach in developmental studies is the collection of longitudinal data. Behavioral data can take at least two forms, qualitative (or discrete) and quantitative. Both types are fallible. Measurement errors can occur in quantitative data and measures of these are based on error variance. Qualitative or discrete data can…
Determinants of Standard Errors of MLEs in Confirmatory Factor Analysis
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Cheng, Ying; Zhang, Wei
2010-01-01
This paper studies changes of standard errors (SE) of the normal-distribution-based maximum likelihood estimates (MLE) for confirmatory factor models as model parameters vary. Using logical analysis, simplified formulas and numerical verification, monotonic relationships between SEs and factor loadings as well as unique variances are found.…
The error structure of the SMAP single and dual channel soil moisture retrievals
USDA-ARS?s Scientific Manuscript database
Knowledge of the temporal error structure for remotely-sensed surface soil moisture retrievals can improve our ability to exploit them for hydrology and climate studies. This study employs a triple collocation type analysis to investigate both the total variance and temporal auto-correlation of erro...
Running Speed Can Be Predicted from Foot Contact Time during Outdoor over Ground Running.
de Ruiter, Cornelis J; van Oeveren, Ben; Francke, Agnieta; Zijlstra, Patrick; van Dieen, Jaap H
2016-01-01
The number of validation studies of commercially available foot pods that provide estimates of running speed is limited, and these studies have been conducted under laboratory conditions. Moreover, internal data handling and algorithms used to derive speed from these pods are proprietary and thereby unclear. The present study investigates the use of foot contact time (CT) for running speed estimation, which potentially can be used in addition to the global positioning system (GPS) in situations where GPS performance is limited. CT was measured with tri-axial inertial sensors attached to the feet of 14 runners during natural overground outdoor running, under optimized conditions for GPS. The individual relationships between running speed and CT were established during short runs at different speeds on two days. These relations were subsequently used to predict instantaneous speed during a straight-line 4 km run with a single turning point halfway. Stopwatch-derived speed, measured for each of 32 consecutive 125 m intervals during the 4 km runs, was used as the reference. Individual speed-CT relations were strong (r2 > 0.96 for all trials) and consistent between days. During the 4 km runs, the median error (range) in speed predicted from CT, 2.5% (5.2), was higher (P < 0.05) than that for GPS, 1.6% (0.8). However, around the turning point and during the first and last 125 m intervals, the error for GPS speed increased to 5.0% (4.5) and became greater (P < 0.05) than the error of speed predicted from CT: 2.7% (4.4). Small speed fluctuations during the 4 km runs were adequately monitored with both methods: CT and GPS explained 85% and 73%, respectively, of the total speed variance during the 4 km runs. In conclusion, running speed estimates based on speed-CT relations have acceptable accuracy and could serve as a backup or substitute for GPS during tarmac running on flat terrain whenever GPS performance is limited.
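As a hedged sketch of the calibration-then-prediction idea described above (a simple linear speed-CT fit is assumed here; the study's individual relations may take a different functional form, and all numbers are invented):

```python
import numpy as np

# Hypothetical calibration runs for one runner: contact time (s) vs speed (m/s).
ct_cal = np.array([0.30, 0.27, 0.25, 0.23, 0.21])
speed_cal = np.array([2.8, 3.3, 3.8, 4.4, 5.0])

# Individual speed-CT relation (simple linear fit, one plausible choice).
b, a = np.polyfit(ct_cal, speed_cal, 1)   # slope, intercept

# Predict speed for each 125 m interval of a longer run and compare with a
# stopwatch-style reference speed.
ct_run = np.array([0.262, 0.258, 0.255, 0.259, 0.261])
ref_speed = np.array([3.55, 3.65, 3.72, 3.62, 3.57])
pred_speed = a + b * ct_run
pct_error = 100 * np.abs(pred_speed - ref_speed) / ref_speed
print("median % error:", round(np.median(pct_error), 2))
```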
NASA Astrophysics Data System (ADS)
Cowdery, E.; Dietze, M.
2017-12-01
As atmospheric carbon dioxide levels continue to increase, it is critical that terrestrial ecosystem models can accurately predict ecological responses to the changing environment. Current predictions of net primary productivity (NPP) in response to elevated atmospheric CO2 concentration are highly variable and contain a considerable amount of uncertainty. Benchmarking model predictions against data is necessary to assess their ability to replicate observed patterns, but also to identify and evaluate the assumptions causing inter-model differences. We have implemented a novel benchmarking workflow as part of the Predictive Ecosystem Analyzer (PEcAn) that is automated, repeatable, and generalized to incorporate different sites and ecological models. Building on the recent Free-Air CO2 Enrichment Model Data Synthesis (FACE-MDS) project, we used observational data from the FACE experiments to test this flexible, extensible benchmarking approach aimed at providing repeatable tests of model process representation that can be performed quickly and frequently. Model performance assessments are often limited to traditional residual error analysis; however, this can result in a loss of critical information. Models that fail tests of relative measures of fit may still perform well under measures of absolute fit and mathematical similarity. This implies that models that are discounted as poor predictors of ecological productivity may still be capturing important patterns. Conversely, models that have been found to be good predictors of productivity may be hiding errors in their sub-processes that produce the right answers for the wrong reasons. Our suite of tests has not only highlighted process-based sources of uncertainty in model productivity calculations, it has also quantified the patterns and scale of this error. Combining these findings with PEcAn's model sensitivity analysis and variance decomposition strengthens our ability to identify which processes need further study and additional data constraints. This can be used to inform future experimental design and in turn can provide an informative starting point for data assimilation.
Strategies for Selecting Crosses Using Genomic Prediction in Two Wheat Breeding Programs.
Lado, Bettina; Battenfield, Sarah; Guzmán, Carlos; Quincke, Martín; Singh, Ravi P; Dreisigacker, Susanne; Peña, R Javier; Fritz, Allan; Silva, Paula; Poland, Jesse; Gutiérrez, Lucía
2017-07-01
The single most important decision in plant breeding programs is the selection of appropriate crosses. The ideal cross would provide superior predicted progeny performance and enough diversity to maintain genetic gain. The aim of this study was to compare the best crosses predicted using combinations of mid-parent value and progeny variance predicted either accounting for linkage disequilibrium or assuming linkage equilibrium. After predicting the mean and the variance of each cross, we selected crosses based on mid-parent value, the top 10% of the progeny, and weighted mean and variance within progenies for grain yield, grain protein content, mixing time, and loaf volume in two applied wheat (Triticum aestivum L.) breeding programs: Instituto Nacional de Investigación Agropecuaria (INIA) Uruguay and CIMMYT Mexico. Although the variance of the progeny is important to increase the chances of finding superior individuals from transgressive segregation, we observed that the mid-parent values of the crosses drove the genetic gain, but the variance of the progeny had a small impact on genetic gain for grain yield. However, the relative importance of the variance of the progeny was larger for quality traits. Overall, the genomic resources and the statistical models are now available to plant breeders to predict both the performance of breeding lines per se as well as the value of progeny from any potential crosses. Copyright © 2017 Crop Science Society of America.
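The sketch below is a rough, hedged illustration of cross mean and progeny variance prediction and is not the paper's formulation: it assumes additive marker effects, fully inbred parents, doubled-haploid progeny, and linkage equilibrium, under which the expected cross mean is the mid-parent genomic value and each locus where the parents differ contributes the square of its effect to the progeny variance.

```python
import numpy as np

def predict_cross(p1, p2, effects):
    """Mean and variance of DH progeny from two inbred parents (genotypes coded +/-1),
    assuming additive marker effects and linkage equilibrium."""
    mid_parent = 0.5 * (p1 + p2) @ effects
    segregating = p1 != p2
    progeny_var = (effects[segregating] ** 2).sum()   # each segregating locus adds a_i^2
    return mid_parent, progeny_var

rng = np.random.default_rng(6)
m = 500                                   # markers
effects = rng.normal(0, 0.05, m)          # assumed (estimated) additive effects
p1 = rng.choice([-1, 1], m)
p2 = rng.choice([-1, 1], m)

mu, var = predict_cross(p1, p2, effects)
print(f"predicted cross mean = {mu:.2f}, progeny SD = {np.sqrt(var):.2f}")
```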
Trattner, Sigal; Prinsen, Peter; Wiegert, Jens; Gerland, Elazar-Lars; Shefer, Efrat; Morton, Tom; Thompson, Carla M; Yagil, Yoad; Cheng, Bin; Jambawalikar, Sachin; Al-Senan, Rani; Amurao, Maxwell; Halliburton, Sandra S; Einstein, Andrew J
2017-12-01
Metal-oxide-semiconductor field-effect transistors (MOSFETs) serve as a helpful tool for organ radiation dosimetry and their use has grown in computed tomography (CT). While different approaches have been used for MOSFET calibration, those using the commonly available 100 mm pencil ionization chamber have not incorporated measurements performed throughout its length, and moreover, no previous work has rigorously evaluated the multiple sources of error involved in MOSFET calibration. In this paper, we propose a new MOSFET calibration approach to translate MOSFET voltage measurements into absorbed dose from CT, based on serial measurements performed throughout the length of a 100-mm ionization chamber, and perform an analysis of the errors of MOSFET voltage measurements and four sources of error in calibration. MOSFET calibration was performed at two sites, to determine single calibration factors for tube potentials of 80, 100, and 120 kVp, using a 100-mm-long pencil ion chamber and a cylindrical computed tomography dose index (CTDI) phantom of 32 cm diameter. The dose profile along the 100-mm ion chamber axis was sampled at 5 mm intervals by nine MOSFETs in the nine holes of the CTDI phantom. Variance of the absorbed dose was modeled as a sum of the MOSFET voltage measurement variance and the calibration factor variance, the latter comprising three main subcomponents: ionization chamber reading variance, MOSFET-to-MOSFET variation, and a contribution related to the fact that the average calibration factor of a few MOSFETs was used as an estimate for the average value of all MOSFETs. MOSFET voltage measurement error was estimated based on sets of repeated measurements. The overall error in the calibration factor was calculated from the above analysis. Calibration factors determined were close to those reported in the literature and by the manufacturer (~3 mV/mGy), ranging from 2.87 to 3.13 mV/mGy. The error σV of a MOSFET voltage measurement was shown to be proportional to the square root of the voltage V: σV = c√V, where c = 0.11 mV^1/2. A main contributor to the error in the calibration factor was the ionization chamber reading error, with a 5% error. The use of a single calibration factor for all MOSFETs introduced an additional error of about 5-7%, depending on the number of MOSFETs that were used to determine the single calibration factor. The expected overall error in a high-dose region (~30 mGy) was estimated to be about 8%, compared to 6% when an individual MOSFET calibration was performed. For a low-dose region (~3 mGy), these values were 13% and 12%. A MOSFET calibration method was developed using a 100-mm pencil ion chamber and a CTDI phantom, accompanied by an absorbed dose error analysis reflecting multiple sources of measurement error. When using a single calibration factor, per tube potential, for different MOSFETs, only a small error was introduced into absorbed dose determinations, thus supporting the use of a single calibration factor for experiments involving many MOSFETs, such as those required to accurately estimate radiation effective dose. © 2017 American Association of Physicists in Medicine.
Eberhard, Wynn L
2017-04-01
The maximum likelihood estimator (MLE) is derived for retrieving the extinction coefficient and zero-range intercept in the lidar slope method in the presence of random and independent Gaussian noise. Least-squares fitting, weighted by the inverse of the noise variance, is equivalent to the MLE. Monte Carlo simulations demonstrate that two traditional least-squares fitting schemes, which use different weights, are less accurate. Alternative fitting schemes that have some positive attributes are introduced and evaluated. The principal factors governing accuracy of all these schemes are elucidated. Applying these schemes to data with Poisson rather than Gaussian noise alters accuracy little, even when the signal-to-noise ratio is low. Methods to estimate optimum weighting factors in actual data are presented. Even when the weighting estimates are coarse, retrieval accuracy declines only modestly. Mathematical tools are described for predicting retrieval accuracy. Least-squares fitting with inverse variance weighting has optimum accuracy for retrieval of parameters from single-wavelength lidar measurements when noise, errors, and uncertainties are Gaussian distributed, or close to optimum when only approximately Gaussian.
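A hedged sketch of the slope method with inverse-variance weighting on synthetic data (the noise model, system constant, and extinction value are assumptions): the log range-corrected signal ln(r²P) is fit as a straight line in range, and the retrieved extinction is minus half the slope.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic single-wavelength lidar return over a homogeneous path.
r = np.linspace(300.0, 3000.0, 200)               # range (m)
sigma_true, C = 2e-4, 1e9                         # extinction (1/m), system constant
P = C / r**2 * np.exp(-2 * sigma_true * r)
noise_sd = 0.05 * P + 1e-3                        # assumed Gaussian noise level per gate
P_noisy = P + rng.normal(0, noise_sd)

# Slope method: S(r) = ln(r^2 P) is linear in r with slope -2*sigma.
S = np.log(r**2 * P_noisy)
var_S = (noise_sd / P_noisy) ** 2                 # first-order propagated variance of S
w = 1.0 / var_S                                   # inverse-variance weights

# Weighted least-squares line fit via the normal equations.
W = np.diag(w)
A = np.column_stack([r, np.ones_like(r)])
slope, intercept = np.linalg.solve(A.T @ W @ A, A.T @ W @ S)
print(f"retrieved extinction = {-slope / 2:.2e} 1/m (true {sigma_true:.1e})")
```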
Smoothing of the bivariate LOD score for non-normal quantitative traits.
Buil, Alfonso; Dyer, Thomas D; Almasy, Laura; Blangero, John
2005-12-30
Variance component analysis provides an efficient method for performing linkage analysis for quantitative traits. However, type I error of variance components-based likelihood ratio testing may be affected when phenotypic data are non-normally distributed (especially with high values of kurtosis). This results in inflated LOD scores when the normality assumption does not hold. Even though different solutions have been proposed to deal with this problem with univariate phenotypes, little work has been done in the multivariate case. We present an empirical approach to adjust the inflated LOD scores obtained from a bivariate phenotype that violates the assumption of normality. Using the Collaborative Study on the Genetics of Alcoholism data available for the Genetic Analysis Workshop 14, we show how bivariate linkage analysis with leptokurtotic traits gives an inflated type I error. We perform a novel correction that achieves acceptable levels of type I error.
Inference for dynamics of continuous variables: the extended Plefka expansion with hidden nodes
NASA Astrophysics Data System (ADS)
Bravi, B.; Sollich, P.
2017-06-01
We consider the problem of a subnetwork of observed nodes embedded into a larger bulk of unknown (i.e. hidden) nodes, where the aim is to infer these hidden states given information about the subnetwork dynamics. The biochemical networks underlying many cellular and metabolic processes are important realizations of such a scenario as typically one is interested in reconstructing the time evolution of unobserved chemical concentrations starting from the experimentally more accessible ones. We present an application to this problem of a novel dynamical mean field approximation, the extended Plefka expansion, which is based on a path integral description of the stochastic dynamics. As a paradigmatic model we study the stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings. The resulting joint distribution is known to be Gaussian and this allows us to fully characterize the posterior statistics of the hidden nodes. In particular the equal-time hidden-to-hidden variance—conditioned on observations—gives the expected error at each node when the hidden time courses are predicted based on the observations. We assess the accuracy of the extended Plefka expansion in predicting these single node variances as well as error correlations over time, focussing on the role of the system size and the number of observed nodes.
Clasey, Jody L; Gater, David R
2005-11-01
To compare (1) total body volume (V(b)) and density (D(b)) measurements obtained by hydrostatic weighing (HW) and air displacement plethysmography (ADP) in adults with spinal cord injury (SCI); (2) measured and predicted thoracic gas volume (V(TG)); and (3) differences in percentage of fat measurements using ADP-obtained D(b) and HW-obtained D(b) measures that were interchanged in a 4-compartment body composition model (4-comp %fat). Twenty adults with SCI underwent ADP and V(TG), and HW testing. In a subgroup (n=13) of subjects, 4-comp %fat procedures were computed. Research laboratories in a university setting. Twenty adults with SCI below the T3 vertebrae and motor complete paraplegia. Not applicable. Statistical analyses, including determination of group mean differences, shared variance, total error, and 95% confidence intervals. The 2 methods yielded small yet significantly different V(b) and D(b). The groups' mean V(TG) did not differ significantly, but the large relative differences indicated an unacceptable amount of individual error. When the 4-comp %fat measurements were compared, there was a trend toward significant differences (P=.08). ADP is a valid alternative method of determining the V(b) and D(b) in adults with SCI; however, the predicted V(TG) should be used with caution.
Jin, Sen; Liu, Bo-Fei; Di, Xue-Ying; Chu, Teng-Fei; Zhang, Ji-Li
2012-01-01
To understand the fire behavior of Mongolian oak leaf fuel-beds under field conditions, leaves from a secondary Mongolian oak forest at the Northeast Forestry University experimental forest farm were collected and brought into the laboratory to construct fuel-beds with varied loading, height, and moisture content, and a total of 100 experimental fires were burned under no-wind and zero-slope conditions. It was observed that the fire spread rate of the fuel-beds was less than 0.5 m x min(-1). Fuel-bed loading, height, and moisture content all had significant effects on the fire spread rate. The effect of fuel-bed moisture content on fire spread was not significantly related to fuel-bed loading or height, but the effect of fuel-bed height was related to fuel-bed loading. The packing ratio of the fuel-beds had less effect on the fire spread rate. Taking fuel-bed loading, height, and moisture content as predictive variables, a prediction model for the fire spread rate of Mongolian oak leaf fuel-beds was established, which could explain 83% of the variance of the fire spread rate, with a mean absolute error of 0.04 m x min(-1) and a mean relative error of less than 17%.
Forecast models for suicide: Time-series analysis with data from Italy.
Preti, Antonio; Lentini, Gianluca
2016-01-01
The prediction of suicidal behavior is a complex task. To fine-tune targeted preventative interventions, predictive analytics (i.e. forecasting future risk of suicide) is more important than exploratory data analysis (pattern recognition, e.g. detection of seasonality in suicide time series). This study sets out to investigate the accuracy of forecasting models of suicide for men and women. A total of 101,499 male suicides and 39,681 female suicides that occurred in Italy from 1969 to 2003 were investigated. In order to apply the forecasting model and test its accuracy, the time series were split into a training set (1969 to 1996; 336 months) and a test set (1997 to 2003; 84 months). The main outcome was the accuracy of forecasting models on the monthly number of suicides. These measures of accuracy were used: mean absolute error; root mean squared error; mean absolute percentage error; mean absolute scaled error. In both male and female suicides a change in the trend pattern was observed, with an increase from 1969 onward to a maximum around 1990 and a decrease thereafter. The variances attributable to the seasonal and trend components were, respectively, 24% and 64% in male suicides, and 28% and 41% in female ones. Both annual and seasonal historical trends of monthly data contributed to forecasting future trends of suicide with a margin of error of around 10%. The finding is clearer in male than in female time series of suicide. The main conclusion of the study is that models taking seasonality into account seem to be able to derive information on deviation from the mean when this occurs as a zenith, but they fail to reproduce it when it occurs as a nadir. Preventative efforts should concentrate on the factors that influence the occurrence of increases above the main trend in both seasonal and cyclic patterns of suicides.
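A hedged, generic illustration of the accuracy measures listed above on a train/test split of a synthetic monthly series (a seasonal-naive forecast stands in for the authors' models; all data are simulated):

```python
import numpy as np

def accuracy(y_true, y_pred, y_train, m=12):
    """MAE, RMSE, MAPE and MASE; MASE is scaled by the in-sample seasonal-naive MAE."""
    e = y_true - y_pred
    mae = np.mean(np.abs(e))
    rmse = np.sqrt(np.mean(e ** 2))
    mape = 100 * np.mean(np.abs(e / y_true))
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))   # seasonal-naive in-sample MAE
    return mae, rmse, mape, mae / scale

# Made-up monthly counts with trend + seasonality; split into training and test sets.
rng = np.random.default_rng(8)
t = np.arange(420)                          # 35 years of monthly data
y = 300 + 0.2 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 15, t.size)
y_train, y_test = y[:336], y[336:]

# Seasonal-naive forecast: repeat the last observed year over the test horizon.
y_pred = np.tile(y_train[-12:], len(y_test) // 12)

mae, rmse, mape, mase = accuracy(y_test, y_pred, y_train)
print(f"MAE={mae:.1f}  RMSE={rmse:.1f}  MAPE={mape:.1f}%  MASE={mase:.2f}")
```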
Statistical image quantification toward optimal scan fusion and change quantification
NASA Astrophysics Data System (ADS)
Potesil, Vaclav; Zhou, Xiang Sean
2007-03-01
Recent advances in imaging technology have brought new challenges and opportunities for automatic and quantitative analysis of medical images. With broader accessibility of more imaging modalities for more patients, fusion of modalities/scans from one time point and longitudinal analysis of changes across time points have become the two most critical differentiators to support more informed, more reliable and more reproducible diagnosis and therapy decisions. Unfortunately, scan fusion and longitudinal analysis are both inherently plagued with increased levels of statistical errors. A lack of comprehensive analysis by imaging scientists and a lack of full awareness by physicians pose potential risks in clinical practice. In this paper, we discuss several key error factors affecting imaging quantification, study their interactions, and introduce a simulation strategy to establish general error bounds for change quantification across time. We quantitatively show that image resolution, voxel anisotropy, lesion size, eccentricity, and orientation are all contributing factors to quantification error; and there is an intricate relationship between voxel anisotropy and lesion shape in affecting quantification error. Specifically, when two or more scans are to be fused at feature level, optimal linear fusion analysis reveals that scans with voxel anisotropy aligned with lesion elongation should receive a higher weight than other scans. As a result of such optimal linear fusion, we achieve a lower variance than naïve averaging. Simulated experiments are used to validate theoretical predictions. Future work based on the proposed simulation methods may lead to general guidelines and error lower bounds for quantitative image analysis and change detection.
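A minimal numeric illustration of the optimal linear fusion point made above, with assumed error variances: inverse-variance weights give a fused estimate whose variance is lower than that of a plain average.

```python
import numpy as np

# Two noisy measurements of the same quantity with different error variances
# (e.g. two scans with different voxel anisotropy relative to the lesion).
var1, var2 = 4.0, 1.0                          # assumed error variances
w1 = (1 / var1) / (1 / var1 + 1 / var2)
w2 = 1 - w1

fused_var = 1 / (1 / var1 + 1 / var2)          # variance of the optimal linear fusion
naive_var = (var1 + var2) / 4                  # variance of the plain average

print(f"optimal weights: {w1:.2f}, {w2:.2f}")
print(f"fused variance {fused_var:.2f} < naive-average variance {naive_var:.2f}")
```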
Measurement error associated with surveys of fish abundance in Lake Michigan
Krause, Ann E.; Hayes, Daniel B.; Bence, James R.; Madenjian, Charles P.; Stedman, Ralph M.
2002-01-01
In fisheries, imprecise measurements in catch data from surveys add uncertainty to the results of fishery stock assessments. The USGS Great Lakes Science Center (GLSC) began to survey the fall fish community of Lake Michigan in 1962 with bottom trawls. The measurement error was evaluated at the level of individual tows for nine fish species collected in this survey by applying a measurement-error regression model to replicated trawl data. It was found that the estimates of measurement-error variance ranged from 0.37 (deepwater sculpin, Myoxocephalus thompsoni) to 1.23 (alewife, Alosa pseudoharengus) on a logarithmic scale, corresponding to a coefficient of variation of 66% to 156%. The estimates appeared to increase with the range of temperature occupied by the fish species. This association may be a result of the variability in the fall thermal structure of the lake. The estimates may also be influenced by other factors, such as pelagic behavior and schooling. Measurement error might be reduced by surveying the fish community during other seasons and/or by using additional technologies, such as acoustics. Measurement-error estimates should be considered when interpreting results of assessments that use abundance information from USGS-GLSC surveys of Lake Michigan and could be used if the survey design was altered. This study is the first to report estimates of measurement-error variance associated with this survey.
Predicting vertical jump height from bar velocity.
García-Ramos, Amador; Štirn, Igor; Padial, Paulino; Argüelles-Cienfuegos, Javier; De la Fuente, Blanca; Strojnik, Vojko; Feriche, Belén
2015-06-01
The objective of the study was to assess the use of maximum (Vmax) and final propulsive phase (FPV) bar velocity to predict jump height in the weighted jump squat. FPV was defined as the velocity reached just before bar acceleration was lower than gravity (-9.81 m·s(-2)). Vertical jump height was calculated from the take-off velocity (Vtake-off) provided by a force platform. Thirty swimmers belonging to the National Slovenian swimming team performed a jump squat incremental loading test, lifting 25%, 50%, 75% and 100% of body weight in a Smith machine. Jump performance was simultaneously monitored using an AMTI portable force platform and a linear velocity transducer attached to the barbell. Simple linear regression was used to estimate jump height from the Vmax and FPV recorded by the linear velocity transducer. Vmax (y = 16.577x - 16.384) was able to explain 93% of jump height variance with a standard error of the estimate of 1.47 cm. FPV (y = 12.828x - 6.504) was able to explain 91% of jump height variance with a standard error of the estimate of 1.66 cm. Although both variables proved to be good predictors, heteroscedasticity in the differences between FPV and Vtake-off was observed (r(2) = 0.307), while the differences between Vmax and Vtake-off were homogenously distributed (r(2) = 0.071). These results suggest that Vmax is a valid tool for estimating vertical jump height in a loaded jump squat test performed in a Smith machine. Key points: Vertical jump height in the loaded jump squat can be estimated with acceptable precision from the maximum bar velocity recorded by a linear velocity transducer. The relationship between the point at which bar acceleration is less than -9.81 m·s(-2) and the real take-off is affected by the velocity of movement. Mean propulsive velocity recorded by a linear velocity transducer does not appear to be optimal to monitor ballistic exercise performance.
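As a hedged sketch of how such a regression, its explained variance, and the standard error of the estimate are computed from paired data (the velocities and heights below are invented; the coefficients quoted in the abstract are the study's own):

```python
import numpy as np

# Hypothetical paired observations: maximum bar velocity (m/s) and jump height (cm).
v_max = np.array([1.10, 1.35, 1.52, 1.70, 1.94, 2.10, 2.31, 2.55])
height = np.array([2.5, 6.0, 8.9, 12.1, 16.0, 18.4, 22.0, 25.8])

slope, intercept = np.polyfit(v_max, height, 1)
pred = intercept + slope * v_max

ss_res = np.sum((height - pred) ** 2)
ss_tot = np.sum((height - height.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
see = np.sqrt(ss_res / (len(height) - 2))      # standard error of the estimate

print(f"height = {slope:.2f}*Vmax + {intercept:.2f}, R^2 = {r2:.3f}, SEE = {see:.2f} cm")
```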
NASA Astrophysics Data System (ADS)
Bouhaj, M.; von Estorff, O.; Peiffer, A.
2017-09-01
In the application of Statistical Energy Analysis (SEA) to complex assembled structures, a purely predictive model often exhibits errors. These errors are mainly due to a lack of accurate modelling of the power transmission mechanism described through the Coupling Loss Factors (CLF). Experimental SEA (ESEA) is practically used by the automotive and aerospace industry to verify and update the model or to derive the CLFs for use in an SEA predictive model when analytical estimates cannot be made. This work is particularly motivated by the lack of procedures that allow an estimate to be made of the variance and confidence intervals of the statistical quantities when using the ESEA technique. The aim of this paper is to introduce procedures enabling a statistical description of measured power input, vibration energies and the derived SEA parameters. Particular emphasis is placed on the identification of structural CLFs of complex built-up structures, comparing different methods. By adopting a Stochastic Energy Model (SEM), the ensemble average in ESEA is also addressed. For this purpose, expressions are obtained to randomly perturb the energy matrix elements and generate individual samples for the Monte Carlo (MC) technique applied to derive the ensemble averaged CLF. From results of ESEA tests conducted on an aircraft fuselage section, the SEM approach provides better performance for the estimated CLFs than classical matrix inversion methods. The expected range of CLF values and the synthesized energy are used as quality criteria of the matrix inversion, allowing critical SEA subsystems to be assessed, which might require a more refined statistical description of the excitation and the response fields. Moreover, the impact of the variance of the normalized vibration energy on the uncertainty of the derived CLFs is outlined.
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r2-value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. Copyright by the American Fisheries Society 2006.
Training set optimization under population structure in genomic selection.
Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E
2015-01-01
Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, capturing as much phenotypic variation as possible in the TRS is desirable. The wheat dataset showed mild population structure, and the CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between the TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.
An internal pilot design for prospective cancer screening trials with unknown disease prevalence.
Brinton, John T; Ringham, Brandy M; Glueck, Deborah H
2015-10-13
For studies that compare the diagnostic accuracy of two screening tests, the sample size depends on the prevalence of disease in the study population, and on the variance of the outcome. Both parameters may be unknown during the design stage, which makes finding an accurate sample size difficult. To solve this problem, we propose adapting an internal pilot design. In this adapted design, researchers will accrue some percentage of the planned sample size, then estimate both the disease prevalence and the variances of the screening tests. The updated estimates of the disease prevalence and variance are used to conduct a more accurate power and sample size calculation. We demonstrate that in large samples, the adapted internal pilot design produces no Type I inflation. For small samples (N less than 50), we introduce a novel adjustment of the critical value to control the Type I error rate. We apply the method to two proposed prospective cancer screening studies: 1) a small oral cancer screening study in individuals with Fanconi anemia and 2) a large oral cancer screening trial. Conducting an internal pilot study without adjusting the critical value can cause Type I error rate inflation in small samples, but not in large samples. An internal pilot approach usually achieves goal power and, for most studies with sample size greater than 50, requires no Type I error correction. Further, we have provided a flexible and accurate approach to bound Type I error below a goal level for studies with small sample size.
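To make the internal-pilot idea concrete, the sketch below re-estimates a required sample size from an interim estimate of the outcome standard deviation using the standard two-group normal approximation. This is a generic illustration under assumed inputs, not the authors' calculation for paired screening tests with unknown disease prevalence, and it omits the small-sample critical-value adjustment they propose.

```python
import numpy as np
from scipy.stats import norm

def reestimate_n(sd_hat, delta, alpha=0.05, power=0.90):
    """Per-group sample size for comparing two means, recomputed from an
    interim (pilot) estimate of the outcome standard deviation, using the
    two-sided two-sample normal approximation."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return int(np.ceil(2 * (z_a + z_b) ** 2 * sd_hat ** 2 / delta ** 2))

# e.g. after accruing some fraction of the planned sample (illustrative values):
# n_updated = reestimate_n(sd_hat=1.3, delta=0.5)
```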
Statistical methods for biodosimetry in the presence of both Berkson and classical measurement error
NASA Astrophysics Data System (ADS)
Miller, Austin
In radiation epidemiology, the true dose received by those exposed cannot be assessed directly. Physical dosimetry uses a deterministic function of the source term, distance and shielding to estimate dose. For the atomic bomb survivors, the physical dosimetry system is well established. The classical measurement errors plaguing the location and shielding inputs to the physical dosimetry system are well known. Adjusting for the associated biases requires an estimate for the classical measurement error variance, for which no data-driven estimate exists. In this case, an instrumental variable solution is the most viable option to overcome the classical measurement error indeterminacy. Biological indicators of dose may serve as instrumental variables. Specification of the biodosimeter dose-response model requires identification of the radiosensitivity variables, for which we develop statistical definitions and variables. More recently, researchers have recognized Berkson error in the dose estimates, introduced by averaging assumptions for many components in the physical dosimetry system. We show that Berkson error induces a bias in the instrumental variable estimate of the dose-response coefficient, and then address the estimation problem. This model is specified by developing an instrumental variable mixed measurement error likelihood function, which is then maximized using a Monte Carlo EM Algorithm. These methods produce dose estimates that incorporate information from both physical and biological indicators of dose, as well as the first instrumental variable based data-driven estimate for the classical measurement error variance.
Modeling conflict and error in the medial frontal cortex.
Mayer, Andrew R; Teshiba, Terri M; Franco, Alexandre R; Ling, Josef; Shane, Matthew S; Stephen, Julia M; Jung, Rex E
2012-12-01
Despite intensive study, the role of the dorsal medial frontal cortex (dMFC) in error monitoring and conflict processing remains actively debated. The current experiment manipulated conflict type (stimulus conflict only or stimulus and response selection conflict) and utilized a novel modeling approach to isolate error and conflict variance during a multimodal numeric Stroop task. Specifically, hemodynamic response functions resulting from two statistical models that either included or isolated variance arising from relatively few error trials were directly contrasted. Twenty-four participants completed the task while undergoing event-related functional magnetic resonance imaging on a 1.5-Tesla scanner. Response times monotonically increased based on the presence of pure stimulus or stimulus and response selection conflict. Functional results indicated that dMFC activity was present during trials requiring response selection and inhibition of competing motor responses, but absent during trials involving pure stimulus conflict. A comparison of the different statistical models suggested that relatively few error trials contributed to a disproportionate amount of variance (i.e., activity) throughout the dMFC, but particularly within the rostral anterior cingulate gyrus (rACC). Finally, functional connectivity analyses indicated that an empirically derived seed in the dorsal ACC/pre-SMA exhibited strong connectivity (i.e., positive correlation) with prefrontal and inferior parietal cortex but was anti-correlated with the default-mode network. An empirically derived seed from the rACC exhibited the opposite pattern, suggesting that sub-regions of the dMFC exhibit different connectivity patterns with other large scale networks implicated in internal mentations such as daydreaming (default-mode) versus the execution of top-down attentional control (fronto-parietal). Copyright © 2011 Wiley Periodicals, Inc.
On the assimilation set-up of ASCAT soil moisture data for improving streamflow catchment simulation
NASA Astrophysics Data System (ADS)
Loizu, Javier; Massari, Christian; Álvarez-Mozos, Jesús; Tarpanelli, Angelica; Brocca, Luca; Casalí, Javier
2018-01-01
Assimilation of remotely sensed surface soil moisture (SSM) data into hydrological catchment models has been identified as a means to improve streamflow simulations, but reported results vary markedly depending on the particular model, catchment and assimilation procedure used. In this study, the influence of key aspects, such as the type of model, re-scaling technique and SSM observation error considered, was evaluated. For this aim, Advanced SCATterometer ASCAT-SSM observations were assimilated through the ensemble Kalman filter into two hydrological models of different complexity (namely MISDc and TOPLATS) run on two Mediterranean catchments of similar size (750 km²). Three different re-scaling techniques were evaluated (linear re-scaling, variance matching and cumulative distribution function matching), and SSM observation error values ranging from 0.01% to 20% were considered. Four different efficiency measures were used for evaluating the results. Increases in Nash-Sutcliffe efficiency (0.03-0.15) and efficiency indices (10-45%) were obtained, especially when linear re-scaling and observation errors within 4-6% were considered. This study found that there is potential to improve streamflow prediction through data assimilation of remotely sensed SSM in catchments of different characteristics and with hydrological models of different conceptualization schemes, but for that, a careful evaluation of the observation error and re-scaling technique set-up utilized is required.
A Negative Binomial Regression Model for Accuracy Tests
ERIC Educational Resources Information Center
Hung, Lai-Fa
2012-01-01
Rasch used a Poisson model to analyze errors and speed in reading tests. An important property of the Poisson distribution is that the mean and variance are equal. However, in social science research, it is very common for the variance to be greater than the mean (i.e., the data are overdispersed). This study embeds the Rasch model within an…
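The overdispersion described above (variance exceeding the mean) is easy to see in simulation; the sketch below contrasts Poisson counts, whose mean and variance coincide, with negative binomial counts of the same mean. The parameter values are arbitrary and this is not the Rasch-embedded model of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Poisson counts: mean and variance coincide (up to sampling noise).
pois = rng.poisson(lam=4.0, size=100_000)

# Negative binomial counts with roughly the same mean but extra dispersion.
# NumPy parameterization: mean = n*(1-p)/p, variance = mean/p > mean.
n, p = 2.0, 0.333
nb = rng.negative_binomial(n, p, size=100_000)

print("Poisson  mean %.2f  var %.2f" % (pois.mean(), pois.var()))
print("NegBin   mean %.2f  var %.2f" % (nb.mean(), nb.var()))
```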
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over the northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of the GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
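For readers unfamiliar with the underlying machinery, the sketch below shows the closed-form Bayesian linear regression update for coefficients under a Gaussian prior and Gaussian noise. The prior and noise variances and the variable names are assumptions for illustration; this is not the GBLR implementation described above.

```python
import numpy as np

def blr_posterior(X, y, noise_var=1.0, prior_var=10.0):
    """Posterior mean and covariance of regression coefficients under
    beta ~ N(0, prior_var * I) and y | X, beta ~ N(X beta, noise_var * I)."""
    n_features = X.shape[1]
    A = X.T @ X / noise_var + np.eye(n_features) / prior_var   # posterior precision
    cov = np.linalg.inv(A)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

# A corrected forecast at a new predictor vector x_new is x_new @ mean, with
# predictive variance x_new @ cov @ x_new + noise_var.
```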
Estimating the Autocorrelated Error Model with Trended Data: Further Results,
1979-11-01
Perhaps the most serious deficiency of OLS in the presence of autocorrelation is not inefficiency but bias in its estimated standard errors--a bias... x_t = k for all t has variance var(b) = σ²/(Tk²). This refutes Maeshiro's (1976) conjecture that "an estimator utilizing relevant extraneous information
Comment on 3PL IRT Adjustment for Guessing
ERIC Educational Resources Information Center
Chiu, Ting-Wei; Camilli, Gregory
2013-01-01
Guessing behavior is an issue discussed widely with regard to multiple choice tests. Its primary effect is on number-correct scores for examinees at lower levels of proficiency. This is a systematic error or bias, which increases observed test scores. Guessing also can inflate random error variance. Correction or adjustment for guessing formulas…
Does the Assessment of Recovery Capital scale reflect a single or multiple domains?
Arndt, Stephan; Sahker, Ethan; Hedden, Suzy
2017-01-01
The goal of this study was to determine whether the 50-item Assessment of Recovery Capital scale represents a single general measure or whether multiple domains might be psychometrically useful for research or clinical applications. Data are from a cross-sectional de-identified existing program evaluation information data set with 1,138 clients entering substance use disorder treatment. Principal components and iterated factor analysis were used on the domain scores. Multiple group factor analysis provided a quasi-confirmatory factor analysis. The solution accounted for 75.24% of the total variance, suggesting that 10 factors provide a reasonably good fit. However, Tucker's congruence coefficients between the factor structure and defining weights (0.41-0.52) suggested a poor fit to the hypothesized 10-domain structure. Principal components of the 10-domain scores yielded one factor whose eigenvalue was greater than one (5.93), accounting for 75.8% of the common variance. A few domains had perceptible but small unique variance components suggesting that a few of the domains may warrant enrichment. Our findings suggest that there is one general factor, with a caveat. Using the 10 measures inflates the chance for Type I errors. Using one general measure avoids this issue, is simple to interpret, and could reduce the number of items. However, those seeking to maximally predict later recovery success may need to use the full instrument and all 10 domains.
NASA Astrophysics Data System (ADS)
Henze, D. K.; Guerrette, J.; Bousserez, N.
2016-12-01
Wildfires contribute significantly to regional haze events globally, and they are potentially becoming more commonplace with increasing droughts due to climate change. Aerosol emissions from wildfires are highly uncertain, with global annual totals varying by a factor of 2 to 3 and regional rates varying by up to a factor of 10. At the high resolution required to predict PM2.5 exposure events, this variance is attributable to differences in methodology, differing land cover datasets, spatial variation in fire locations, and limited understanding of fast transient fire behavior. Here we apply an adjoint-based online chemical inverse modeling tool, WRFDA-Chem, to constrain black carbon aerosol (BC) emissions from fires during the 2008 ARCTAS-CARB field campaign. We identify several weaknesses in the prior diurnal distribution of emissions, including a missing early morning emission peak associated with local, persistent, large-scale forest fires. On 22 June, 2008, aircraft observations are able to reduce the spread between FINNv1.0 and QFEDv2.4r8 from ×3.5 to ×2.1. On 23 and 24 June, the spread is reduced from ×3.4 to ×1.4. Using posterior error estimates, we found that emission variance improvements are limited to a small footprint surrounding the measurements. Relative BB emission variances are reduced by up to 35% near aircraft flight paths and up to 60% near IMPROVE surface sites. Due to the spatial variation of observations on multiple days, and the heterogeneous biomass burning errors on daily scales, cross-validation was not successful. Future high-resolution measurements need to be carefully planned to characterize biomass burning emission errors and control for day-to-day variation. In general, the 4D-Var inversion framework would benefit from reduced wall-time. For the problem presented, incremental 4D-Var requires 20 hours on 96 cores to reach practical optimization convergence and generate the posterior covariance matrix for a 24-hour assimilation window. We will present initial computational comparisons with a recently developed method to parallelize those calculations, which will reduce wall-time by a factor of 5 or more for all WRFDA 4D-Var applications.
Counting OCR errors in typeset text
NASA Astrophysics Data System (ADS)
Sandberg, Jonathan S.
1995-03-01
Frequently, object recognition accuracy is a key component in the performance analysis of pattern matching systems. In the past three years, the results of numerous excellent and rigorous studies of OCR system typeset-character accuracy (henceforth OCR accuracy) have been published, encouraging performance comparisons between a variety of OCR products and technologies. These published figures are important; OCR vendor advertisements in the popular trade magazines lead readers to believe that published OCR accuracy figures affect market share in the lucrative OCR market. Curiously, a detailed review of many of these OCR error occurrence counting results reveals that they are not reproducible as published and they are not strictly comparable due to larger variances in the counts than would be expected by the sampling variance. Naturally, since OCR accuracy is based on a ratio of the number of OCR errors over the size of the text searched for errors, imprecise OCR error accounting leads to similar imprecision in OCR accuracy. Some published papers use informal, non-automatic, or intuitively correct OCR error accounting. Still other published results present OCR error accounting methods based on string matching algorithms such as dynamic programming using Levenshtein (edit) distance but omit critical implementation details (such as the existence of suspect markers in the OCR generated output or the weights used in the dynamic programming minimization procedure). The problem with not specifically revealing the accounting method is that the numbers of errors found by different methods are significantly different. This paper identifies the basic accounting methods used to measure OCR errors in typeset text and offers an evaluation and comparison of the various accounting methods.
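A minimal version of the dynamic-programming edit-distance accounting discussed above is sketched below, with unit weights. As the paper emphasizes, real OCR error accounting must also specify substitution/insertion/deletion weights and the treatment of suspect markers, which this sketch omits.

```python
def levenshtein(ref: str, ocr: str) -> int:
    """Unit-cost edit distance between reference text and OCR output:
    the minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(ocr) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, o in enumerate(ocr, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != o)))  # substitution (0 if match)
        prev = curr
    return prev[-1]

# OCR "accuracy" is then 1 - errors / len(ref), which is why the chosen error
# accounting method directly drives the published accuracy figure.
print(levenshtein("measurement", "rneasurement"))  # 2 ("rn" misread for "m")
```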
Bezdjian, Serena; Tuvblad, Catherine; Wang, Pan; Raine, Adrian; Baker, Laura A
2014-11-01
In the present study, we investigated genetic and environmental effects on motor impulsivity from childhood to late adolescence using a longitudinal sample of twins from ages 9 to 18 years. Motor impulsivity was assessed using errors of commission (no-go errors) in a visual go/no-go task at 4 time points: ages 9-10, 11-13, 14-15, and 16-18 years. Significant genetic and nonshared environmental effects on motor impulsivity were found at each of the 4 waves of assessment with genetic factors explaining 22%-41% of the variance within each of the 4 waves. Phenotypically, children's average performance improved across age (i.e., fewer no-go errors during later assessments). Multivariate biometric analyses revealed that common genetic factors influenced 12%-40% of the variance in motor impulsivity across development, whereas nonshared environmental factors common to all time points contributed to 2%-52% of the variance. Nonshared environmental influences specific to each time point also significantly influenced motor impulsivity. Overall, results demonstrated that although genetic factors were critical to motor impulsivity across development, both common and specific nonshared environmental factors played a strong role in the development of motor impulsivity across age. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats
Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.
2012-01-01
This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.
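A hedged sketch of the kind of relation the abstract describes: if successive discharge samples can be treated as uncorrelated (because the sampled field's integral time scale is on the order of the sampling interval), the variance of the time-averaged estimate falls as one over the number of samples. The numbers and function name are illustrative only, not the paper's calibrated equation.

```python
def discharge_variance(sigma2_q, sampling_time, sampling_interval):
    """Variance of a time-averaged discharge estimate when successive samples
    are treated as uncorrelated (sampled-field integral time scale on the
    order of the sampling interval)."""
    n_samples = sampling_time / sampling_interval
    return sigma2_q / n_samples

# Illustrative numbers only: instantaneous discharge variance of 4 (m^3/s)^2,
# 1 s sampling interval, 720 s total exposure time.
print(discharge_variance(4.0, 720.0, 1.0))  # ~0.0056 (m^3/s)^2
```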
Essays in financial economics and econometrics
NASA Astrophysics Data System (ADS)
La Spada, Gabriele
Chapter 1 (my job market paper) asks the following question: Do asset managers reach for yield because of competitive pressures in a low rate environment? I propose a tournament model of money market funds (MMFs) to study this issue. I show that funds with different costs of default respond differently to changes in interest rates, and that it is important to distinguish the role of risk-free rates from that of risk premia. An increase in the risk premium leads funds with lower default costs to increase risk-taking, while funds with higher default costs reduce risk-taking. Without changes in the premium, low risk-free rates reduce risk-taking. My empirical analysis shows that these predictions are consistent with the risk-taking of MMFs during the 2006--2008 period. Chapter 2, co-authored with Fabrizio Lillo and published in Studies in Nonlinear Dynamics and Econometrics (2014), studies the effect of round-off error (or discretization) on stationary Gaussian long-memory process. For large lags, the autocovariance is rescaled by a factor smaller than one, and we compute this factor exactly. Hence, the discretized process has the same Hurst exponent as the underlying one. We show that in presence of round-off error, two common estimators of the Hurst exponent, the local Whittle (LW) estimator and the detrended fluctuation analysis (DFA), are severely negatively biased in finite samples. We derive conditions for consistency and asymptotic normality of the LW estimator applied to discretized processes and compute the asymptotic properties of the DFA for generic long-memory processes that encompass discretized processes. Chapter 3, co-authored with Fabrizio Lillo, studies the effect of round-off error on integrated Gaussian processes with possibly correlated increments. We derive the variance and kurtosis of the realized increment process in the limit of both "small" and "large" round-off errors, and its autocovariance for large lags. We propose novel estimators for the variance and lag-one autocorrelation of the underlying, unobserved increment process. We also show that for fractionally integrated processes, the realized increments have the same Hurst exponent as the underlying ones, but the LW estimator applied to the realized series is severely negatively biased in medium-sized samples.
Dexter, Franklin; Bayman, Emine O; Dexter, Elisabeth U
2017-12-01
We examined type I and II error rates for analysis of (1) mean hospital length of stay (LOS) versus (2) percentage of hospital LOS that are overnight. These 2 end points are suitable for when LOS is treated as a secondary economic end point. We repeatedly resampled LOS for 5052 discharges of thoracoscopic wedge resections and lung lobectomy at 26 hospitals. Unequal variances t test (Welch method) and Fisher exact test both were conservative (ie, type I error rate less than nominal level). The Wilcoxon rank sum test was included as a comparator; the type I error rates did not differ from the nominal level of 0.05 or 0.01. Fisher exact test was more powerful than the unequal variances t test at detecting differences among hospitals; estimated odds ratio for obtaining P < .05 with Fisher exact test versus unequal variances t test = 1.94, with 95% confidence interval, 1.31-3.01. Fisher exact test and Wilcoxon-Mann-Whitney had comparable statistical power in terms of differentiating LOS between hospitals. For studies with LOS to be used as a secondary end point of economic interest, there is currently considerable interest in the planned analysis being for the percentage of patients suitable for ambulatory surgery (ie, hospital LOS equals 0 or 1 midnight). Our results show that there need not be a loss of statistical power when groups are compared using this binary end point, as compared with either Welch method or Wilcoxon rank sum test.
Santin-Janin, Hugues; Hugueny, Bernard; Aubry, Philippe; Fouchet, David; Gimenez, Olivier; Pontier, Dominique
2014-01-01
Data collected to inform time variations in natural population size are tainted by sampling error. Ignoring sampling error in population dynamics models induces bias in parameter estimators, e.g., density-dependence. In particular, when sampling errors are independent among populations, the classical estimator of the synchrony strength (zero-lag correlation) is biased downward. However, this bias is rarely taken into account in synchrony studies although it may lead to overemphasizing the role of intrinsic factors (e.g., dispersal) with respect to extrinsic factors (the Moran effect) in generating population synchrony as well as to underestimating the extinction risk of a metapopulation. The aim of this paper was first to illustrate the extent of the bias that can be encountered in empirical studies when sampling error is neglected. Second, we presented a state-space modelling approach that explicitly accounts for sampling error when quantifying population synchrony. Third, we exemplified our approach with datasets for which sampling variance (i) has been previously estimated, and (ii) has to be jointly estimated with population synchrony. Finally, we compared our results to those of a standard approach neglecting sampling variance. We showed that ignoring sampling variance can mask a synchrony pattern whatever its true value and that the common practice of averaging a few replicates of population size estimates performed poorly at decreasing the bias of the classical estimator of the synchrony strength. The state-space model used in this study provides a flexible way of accurately quantifying the strength of synchrony patterns from most population size data encountered in field studies, including over-dispersed count data. We provided a user-friendly R-program and a tutorial example to encourage further studies aiming at quantifying the strength of population synchrony to account for uncertainty in population size estimates.
Xiao, Qiang; Gao, Yang; Hu, Dan; Tan, Hong; Wang, Tianxiang
2011-07-01
We have investigated the interactions between economic growth and industrial wastewater discharge from 1978 to 2007 in China's Hunan Province using co-integration theory and an error-correction model. Two main economic growth indicators and four representative industrial wastewater pollutants were selected to demonstrate the interaction mechanism. We found a long-term equilibrium relationship between economic growth and the discharge of industrial pollutants in wastewater between 1978 and 2007 in Hunan Province. The error-correction mechanism prevented the variable expansion for long-term relationship at quantity and scale, and the size of the error-correction parameters reflected short-term adjustments that deviate from the long-term equilibrium. When economic growth changes within a short term, the discharge of pollutants will constrain growth because the values of the parameters in the short-term equation are smaller than those in the long-term co-integrated regression equation, indicating that a remarkable long-term influence of economic growth on the discharge of industrial wastewater pollutants and that increasing pollutant discharge constrained economic growth. Economic growth is the main driving factor that affects the discharge of industrial wastewater pollutants in Hunan Province. On the other hand, the discharge constrains economic growth by producing external pressure on growth, although this feedback mechanism has a lag effect. Economic growth plays an important role in explaining the predicted decomposition of the variance in the discharge of industrial wastewater pollutants, but this discharge contributes less to predictions of the variations in economic growth.
Prediction of oxygen consumption in cardiac rehabilitation patients performing leg ergometry
NASA Astrophysics Data System (ADS)
Alvarez, John Gershwin
The purpose of this study was two-fold. First, to determine the validity of the ACSM leg ergometry equation in the prediction of steady-state oxygen consumption (VO2) in a heterogeneous population of cardiac patients. Second, to determine whether a more accurate prediction equation could be developed for use in the cardiac population. Thirty-one cardiac rehabilitation patients participated in the study, of which 24 were men and 7 were women. Biometric variables (mean ± SD) of the participants were as follows: age = 61.9 ± 9.5 years; height = 172.6 ± 1.6 cm; and body mass = 82.3 ± 10.6 kg. Subjects exercised on a Monarch™ cycle ergometer at 0, 180, 360, 540 and 720 kgm·min⁻¹. The length of each stage was five minutes. Heart rate, ECG, and VO2 were continuously monitored. Blood pressure and heart rate were collected at the end of each stage. Steady-state VO2 was calculated for each stage using the average of the last two minutes. Correlation coefficients, standard error of estimate, coefficient of determination, total error, and mean bias were used to determine the accuracy of the ACSM equation (1995). The analysis found the ACSM equation to be a valid means of estimating VO2 in cardiac patients. Simple linear regression was used to develop a new equation. Regression analysis found workload to be a significant predictor of VO2. The following equation is the result: VO2 = (1.6 × kgm·min⁻¹) + 444 ml·min⁻¹. The r of the equation was .78 (p < .05) and the standard error of estimate was 211 ml·min⁻¹. Analysis of variance was used to determine significant differences between means for actual and predicted VO2 values for each equation. The analysis found the ACSM and new equation to significantly (p < .05) underpredict VO2 during unloaded pedaling. Furthermore, the ACSM equation was found to significantly (p < .05) underpredict VO2 during the first loaded stage of exercise. When the accuracy of the ACSM and new equations were compared based on correlation coefficients, coefficients of determination, SEEs, total error, and mean bias, the new equation was found to have equal or better accuracy at all workloads. The final form of the new equation is: VO2 (ml·min⁻¹) = (kgm·min⁻¹ × 1.6 ml·kgm⁻¹) + (3.5 ml·kg⁻¹·min⁻¹ × body mass in kg) + 156 ml·min⁻¹.
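The final regression equation reported above can be transcribed directly into code; this is simply a restatement of the equation in the abstract (units as stated there), not an implementation of the full ACSM protocol.

```python
def predicted_vo2_ml_per_min(workload_kgm_per_min: float, body_mass_kg: float) -> float:
    """VO2 (ml/min) from the study's final regression equation:
    VO2 = 1.6 * workload + 3.5 * body mass + 156."""
    return 1.6 * workload_kgm_per_min + 3.5 * body_mass_kg + 156.0

# Example: 360 kgm/min workload for an 82.3 kg patient (the sample mean mass).
print(predicted_vo2_ml_per_min(360, 82.3))  # ~1020 ml/min
```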
Effectiveness of basic display augmentation in vehicular control by visual field cues
NASA Technical Reports Server (NTRS)
Grunwald, A. J.; Merhav, S. J.
1978-01-01
The paper investigates the effectiveness of different basic display augmentation concepts - fixed reticle, velocity vector, and predicted future vehicle path - for RPVs controlled by a vehicle-mounted TV camera. The task is lateral manual control of a low flying RPV along a straight reference line in the presence of random side gusts. The man-machine system and the visual interface are modeled as a linear time-invariant system. Minimization of a quadratic performance criterion is assumed to underlie the control strategy of a well-trained human operator. The solution for the optimal feedback matrix enables the explicit computation of the variances of lateral deviation and directional error of the vehicle and of the control force that are used as performance measures.
Ensemble Kalman filters for dynamical systems with unresolved turbulence
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grooms, Ian, E-mail: grooms@cims.nyu.edu; Lee, Yoonsang; Majda, Andrew J.
Ensemble Kalman filters are developed for turbulent dynamical systems where the forecast model does not resolve all the active scales of motion. Coarse-resolution models are intended to predict the large-scale part of the true dynamics, but observations invariably include contributions from both the resolved large scales and the unresolved small scales. The error due to the contribution of unresolved scales to the observations, called ‘representation’ or ‘representativeness’ error, is often included as part of the observation error, in addition to the raw measurement error, when estimating the large-scale part of the system. It is here shown how stochastic superparameterization (a multiscale method for subgridscale parameterization) can be used to provide estimates of the statistics of the unresolved scales. In addition, a new framework is developed wherein small-scale statistics can be used to estimate both the resolved and unresolved components of the solution. The one-dimensional test problem from dispersive wave turbulence used here is computationally tractable yet is particularly difficult for filtering because of the non-Gaussian extreme event statistics and substantial small scale turbulence: a shallow energy spectrum proportional to k^(−5/6) (where k is the wavenumber) results in two-thirds of the climatological variance being carried by the unresolved small scales. Because the unresolved scales contain so much energy, filters that ignore the representation error fail utterly to provide meaningful estimates of the system state. Inclusion of a time-independent climatological estimate of the representation error in a standard framework leads to inaccurate estimates of the large-scale part of the signal; accurate estimates of the large scales are only achieved by using stochastic superparameterization to provide evolving, large-scale dependent predictions of the small-scale statistics. Again, because the unresolved scales contain so much energy, even an accurate estimate of the large-scale part of the system does not provide an accurate estimate of the true state. By providing simultaneous estimates of both the large- and small-scale parts of the solution, the new framework is able to provide accurate estimates of the true system state.
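For orientation, a generic perturbed-observation ensemble Kalman filter analysis step is sketched below; representation error would simply be folded into the observation error covariance R here. This is a textbook update under assumed array shapes, not the stochastic superparameterization framework of the paper.

```python
import numpy as np

def enkf_update(ensemble, H, y, R, rng):
    """Perturbed-observation EnKF analysis step.

    ensemble -- (n_members, n_state) forecast ensemble
    H        -- (n_obs, n_state) observation operator
    y        -- (n_obs,) observation vector
    R        -- (n_obs, n_obs) observation error covariance
                (measurement error plus any representation error)
    """
    X = ensemble - ensemble.mean(axis=0)              # ensemble anomalies
    P = X.T @ X / (ensemble.shape[0] - 1)             # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
    analysis = np.empty_like(ensemble)
    for i, x in enumerate(ensemble):
        y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R)
        analysis[i] = x + K @ (y_pert - H @ x)
    return analysis
```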
Branscum, Paul; Sharma, Manoj
2014-01-01
The purpose of this study was to use the theory of planned behavior to explain two types of snack food consumption among boys and girls (girls n = 98; boys n = 69), which may have implications for future theory-based health promotion interventions. Between genders, there was a significant difference for calorie-dense/nutrient-poor snacks (p = .002), but no difference for fruit and vegetable snacks. Using stepwise multiple regression, attitudes, perceived behavioral control, and subjective norms accounted for a large amount of the variance of intentions (girls = 43.3%; boys = 55.9%); however, for girls, subjective norms accounted for the most variance, whereas for boys, attitudes accounted for the most variance. Calories from calorie-dense/nutrient-poor snacks and fruit and vegetable snacks were also predicted by intentions. For boys, intentions predicted 6.4% of the variance for fruit and vegetable snacks (p = .03) but was not significant for calorie-dense/nutrient-poor snacks, whereas for girls, intentions predicted 6.0% of the variance for fruit and vegetable snacks (p = .007), and 7.2% of the variance for calorie-dense/nutrient-poor snacks (p = .004). Results suggest that the theory of planned behavior is a useful framework for predicting snack foods among children; however, there are important differences between genders that should be considered in future health promotion interventions.
Liu, Xian; Engel, Charles C
2012-12-20
Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
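A hedged sketch of the delta-method step mentioned above, applied to the variance of one softmax-transformed category probability given the covariance of the estimated linear predictors. The retransformation over between- and within-individual random effects described in the paper is more involved; the inputs below are illustrative assumptions.

```python
import numpy as np

def softmax(eta):
    e = np.exp(eta - eta.max())
    return e / e.sum()

def delta_method_prob_var(eta_hat, cov_eta, k):
    """Approximate variance of the k-th predicted category probability
    p_k = softmax(eta)_k, given Cov(eta_hat), via the delta method:
    Var(p_k) ~ g' Cov g, where g is the gradient of p_k w.r.t. eta."""
    p = softmax(eta_hat)
    grad = -p[k] * p                 # d p_k / d eta_j = -p_k * p_j for j != k
    grad[k] = p[k] * (1.0 - p[k])    # d p_k / d eta_k = p_k * (1 - p_k)
    return grad @ cov_eta @ grad

eta_hat = np.array([0.2, -0.1, 0.0])   # estimated linear predictors (illustrative)
cov_eta = 0.05 * np.eye(3)             # their estimated covariance (illustrative)
print(delta_method_prob_var(eta_hat, cov_eta, k=0))
```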
Igne, Benoit; Shi, Zhenqi; Drennen, James K; Anderson, Carl A
2014-02-01
The impact of raw material variability on the prediction ability of a near-infrared calibration model was studied. Calibrations, developed from a quaternary mixture design comprising theophylline anhydrous, lactose monohydrate, microcrystalline cellulose, and soluble starch, were challenged by intentional variation of raw material properties. A design with two theophylline physical forms, three lactose particle sizes, and two starch manufacturers was created to test model robustness. Further challenges to the models were accomplished through environmental conditions. Along with full-spectrum partial least squares (PLS) modeling, variable selection by dynamic backward PLS and genetic algorithms was utilized in an effort to mitigate the effects of raw material variability. In addition to evaluating models based on their prediction statistics, prediction residuals were analyzed by analyses of variance and model diagnostics (Hotelling's T(2) and Q residuals). Full-spectrum models were significantly affected by lactose particle size. Models developed by selecting variables gave lower prediction errors and proved to be a good approach to limit the effect of changing raw material characteristics. Hotelling's T(2) and Q residuals provided valuable information that was not detectable when studying only prediction trends. Diagnostic statistics were demonstrated to be critical in the appropriate interpretation of the prediction of quality parameters. © 2013 Wiley Periodicals, Inc. and the American Pharmacists Association.
Hammer, Eva M.; Kaufmann, Tobias; Kleih, Sonja C.; Blankertz, Benjamin; Kübler, Andrea
2014-01-01
Modulation of sensorimotor rhythms (SMR) was suggested as a control signal for brain-computer interfaces (BCI). Yet, there is a population of users, estimated at between 10 and 50%, who are not able to achieve reliable control, and only about 20% of users achieve high (80–100%) performance. Predicting performance prior to BCI use would facilitate selection of the most feasible system for an individual, thus constituting a practical benefit for the user and increasing our knowledge about the correlates of BCI control. In a recent study, we predicted SMR-BCI performance from psychological variables that were assessed prior to the BCI sessions and BCI control was supported with machine-learning techniques. We described two significant psychological predictors, namely the visuo-motor coordination ability and the ability to concentrate on the task. The purpose of the current study was to replicate these results, thereby validating these predictors within a neurofeedback based SMR-BCI that involved no machine learning. Thirty-three healthy BCI novices participated in a calibration session and three further neurofeedback training sessions. Two variables were related with mean SMR-BCI performance: (1) a measure for the accuracy of fine motor skills, i.e., a trade for a person’s visuo-motor control ability; and (2) subject’s “attentional impulsivity”. In a linear regression they accounted for almost 20% of the variance in SMR-BCI performance, but predictor (1) failed significance. Nevertheless, on the basis of our prior regression model for sensorimotor control ability we could predict current SMR-BCI performance with an average prediction error of M = 12.07%. In more than 50% of the participants, the prediction error was smaller than 10%. Hence, psychological variables played a moderate role in predicting SMR-BCI performance in a neurofeedback approach that involved no machine learning. Future studies are needed to further consolidate (or reject) the present predictors. PMID:25147518
ERIC Educational Resources Information Center
Neel, John H.; Stallings, William M.
An influential statistics text recommends a Levene test for homogeneity of variance. A recent note suggests that Levene's test is upwardly biased for small samples. Another report shows inflated Alpha estimates and low power. Neither study utilized more than two sample sizes. This Monte Carlo study involved sampling from a normal population for…
Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis
ERIC Educational Resources Information Center
Marin-Martinez, Fulgencio; Sanchez-Meca, Julio
2010-01-01
Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
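The fixed-effect inverse-variance weighting referred to above is sketched below; as the abstract notes, in practice the within-study variances are themselves estimated with sampling error, and a random-effects analysis would add a between-studies variance component that this sketch omits.

```python
import numpy as np

def inverse_variance_pool(effects, variances):
    """Fixed-effect pooled estimate and its variance using
    inverse-variance weights w_i = 1 / v_i."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(weights * effects) / np.sum(weights)
    pooled_var = 1.0 / np.sum(weights)
    return pooled, pooled_var

# Example with three primary-study effect sizes and their (estimated) variances.
print(inverse_variance_pool([0.30, 0.45, 0.10], [0.02, 0.05, 0.04]))
```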
The microcomputer scientific software series 3: general linear model--analysis of variance.
Harold M. Rauscher
1985-01-01
A BASIC language set of programs, designed for use on microcomputers, is presented. This set of programs will perform the analysis of variance for any statistical model describing either balanced or unbalanced designs. The program computes and displays the degrees of freedom, Type I sum of squares, and the mean square for the overall model, the error, and each factor...
Baeza-Baeza, J J; Ruiz-Angel, M J; García-Alvarez-Coque, M C
2007-09-07
A simple model is proposed that relates the parameters describing the peak width with the retention time, which can be easily predicted as a function of mobile phase composition. This allows the further prediction of peak shape with global errors below 5%, using a modified Gaussian model with a parabolic variance. The model is useful in the optimisation of chromatographic resolution to assess an eventual overlapping of close peaks. The dependence of peak shape with mobile phase composition was studied for mobile phases containing acetonitrile in the presence and absence of micellised surfactant (micellar-organic and hydro-organic reversed-phase liquid chromatography, RPLC). In micellar RPLC, both modifiers (surfactant and acetonitrile) were observed to decrease or improve the efficiencies in the same percentage, at least in the studied concentration ranges. The study also revealed that the problem of achieving smaller efficiencies in this chromatographic mode, compared to hydro-organic RPLC, is not only related to the presence of surfactant covering the stationary phase, but also to the smaller concentration of organic solvent in the mobile phase.
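One common way to write a Gaussian-type peak whose variance varies parabolically with distance from the retention time is sketched below; the exact parameterization and parameter names used by the authors may differ, so this is an illustrative assumption rather than their model.

```python
import numpy as np

def parabolic_variance_gaussian(t, H, t_r, s0, s1, s2):
    """Gaussian-like peak whose local 'variance' is a parabola in (t - t_r):
    h(t) = H * exp(-0.5 * (t - t_r)**2 / (s0 + s1*(t - t_r) + s2*(t - t_r)**2)).
    The coefficients must keep the denominator positive over the fitted region."""
    d = t - t_r
    var_t = s0 + s1 * d + s2 * d ** 2
    return H * np.exp(-0.5 * d ** 2 / var_t)

# Illustrative peak: retention time 5 min, slight asymmetry via s1.
t = np.linspace(0.0, 10.0, 501)
peak = parabolic_variance_gaussian(t, H=1.0, t_r=5.0, s0=0.20, s1=0.02, s2=0.01)
```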
Use of vegetation health data for estimation of aus rice yield in Bangladesh.
Rahman, Atiqur; Roytman, Leonid; Krakauer, Nir Y; Nizamuddin, Mohammad; Goldberg, Mitch
2009-01-01
Rice is a vital staple crop for Bangladesh and surrounding countries, with interannual variation in yields depending on climatic conditions. We compared Bangladesh yield of aus rice, one of the main varieties grown, from official agricultural statistics with Vegetation Health (VH) Indices [Vegetation Condition Index (VCI), Temperature Condition Index (TCI) and Vegetation Health Index (VHI)] computed from Advanced Very High Resolution Radiometer (AVHRR) data covering a period of 15 years (1991-2005). A strong correlation was found between aus rice yield and VCI and VHI during the critical period of aus rice development that occurs during March-April (weeks 8-13 of the year), several months in advance of the rice harvest. Stepwise principal component regression (PCR) was used to construct a model to predict yield as a function of critical-period VHI. The model reduced the yield prediction error variance by 62% compared with a prediction of average yield for each year. Remote sensing is a valuable tool for estimating rice yields well in advance of harvest and at a low cost.
Optical injection phase-lock loops
NASA Astrophysics Data System (ADS)
Bordonalli, Aldario Chrestani
Locking techniques have been widely applied for frequency synchronisation of semiconductor lasers used in coherent communication and microwave signal generation systems. Two main locking techniques, the optical phase-lock loop (OPLL) and optical injection locking (OIL), are analysed in this thesis. The principal limitations on OPLL performance result from the loop propagation delay, which makes it difficult to implement high-gain, wide-bandwidth loops, leading to poor phase noise suppression performance and requiring the linewidths of the semiconductor laser sources to be less than a few megahertz for practical values of loop delay. The OIL phase noise suppression is controlled by the injected power. The principal limitations of the OIL implementation are the finite phase error under locked conditions and the narrow stable locking range the system provides at injected power levels required to reduce the phase noise output of semiconductor lasers significantly. This thesis demonstrates theoretically and experimentally that it is possible to overcome the limitations of OPLL and OIL systems by combining them, to form an optical injection phase-lock loop (OIPLL). The modelling of an OIPLL system is presented and compared with the equivalent OPLL and OIL results. Optical and electrical design of a homodyne OIPLL is detailed. Experimental results are given which verify the theoretical prediction that the OIPLL would keep the phase noise suppression as high as that of the OIL system over a much wider stable locking range, even with wide linewidth lasers and long loop delays. The experimental results for lasers with summed linewidth of 36 MHz and a loop delay of 15 ns showed measured phase error variances as low as 0.006 rad² (500 MHz bandwidth) for locking bandwidths greater than 26 GHz, compared with the equivalent OPLL phase error variance of around 1 rad² (500 MHz bandwidth) and the equivalent OIL locking bandwidth of less than 1.2 GHz.
Linear and Order Statistics Combiners for Pattern Classification
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)
2001-01-01
Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order-statistics-based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
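The factor-of-N variance reduction claimed above for unbiased, uncorrelated classifiers is easy to reproduce in a toy simulation; the sketch below treats each classifier's estimated decision boundary as the true boundary plus independent zero-mean noise. Purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
true_boundary = 0.0
n_classifiers, n_trials = 10, 20_000

# Each classifier's estimated boundary = truth + uncorrelated zero-mean noise.
single = true_boundary + rng.normal(0.0, 1.0, size=n_trials)
combined = (true_boundary
            + rng.normal(0.0, 1.0, size=(n_trials, n_classifiers)).mean(axis=1))

print("single classifier boundary variance :", single.var())    # ~1.0
print("averaged (N=10) boundary variance   :", combined.var())  # ~0.1
```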
Impact of Damping Uncertainty on SEA Model Response Variance
NASA Technical Reports Server (NTRS)
Schiller, Noah; Cabell, Randolph; Grosveld, Ferdinand
2010-01-01
Statistical Energy Analysis (SEA) is commonly used to predict high-frequency vibroacoustic levels. This statistical approach provides the mean response over an ensemble of random subsystems that share the same gross system properties such as density, size, and damping. Recently, techniques have been developed to predict the ensemble variance as well as the mean response. However these techniques do not account for uncertainties in the system properties. In the present paper uncertainty in the damping loss factor is propagated through SEA to obtain more realistic prediction bounds that account for both ensemble and damping variance. The analysis is performed on a floor-equipped cylindrical test article that resembles an aircraft fuselage. Realistic bounds on the damping loss factor are determined from measurements acquired on the sidewall of the test article. The analysis demonstrates that uncertainties in damping have the potential to significantly impact the mean and variance of the predicted response.
Crop area estimation based on remotely-sensed data with an accurate but costly subsample
NASA Technical Reports Server (NTRS)
Gunst, R. F.
1983-01-01
Alternatives to sampling-theory stratified and regression estimators of crop production and timber biomass were examined. An alternative estimator that is viewed as especially promising is the errors-in-variables regression estimator. Investigations established the need for caution with this estimator when the ratio of two error variances is not precisely known.
A General Approach to Defining Latent Growth Components
ERIC Educational Resources Information Center
Mayer, Axel; Steyer, Rolf; Mueller, Horst
2012-01-01
We present a 3-step approach to defining latent growth components. In the first step, a measurement model with at least 2 indicators for each time point is formulated to identify measurement error variances and obtain latent variables that are purged from measurement error. In the second step, we use contrast matrices to define the latent growth…
Utility functions predict variance and skewness risk preferences in monkeys
Genest, Wilfried; Stauffer, William R.; Schultz, Wolfram
2016-01-01
Utility is the fundamental variable thought to underlie economic choices. In particular, utility functions are believed to reflect preferences toward risk, a key decision variable in many real-life situations. To assess the validity of utility representations, it is therefore important to examine risk preferences. In turn, this approach requires formal definitions of risk. A standard approach is to focus on the variance of reward distributions (variance-risk). In this study, we also examined a form of risk related to the skewness of reward distributions (skewness-risk). Thus, we tested the extent to which empirically derived utility functions predicted preferences for variance-risk and skewness-risk in macaques. The expected utilities calculated for various symmetrical and skewed gambles served to define formally the direction of stochastic dominance between gambles. In direct choices, the animals’ preferences followed both second-order (variance) and third-order (skewness) stochastic dominance. Specifically, for gambles with different variance but identical expected values (EVs), the monkeys preferred high-variance gambles at low EVs and low-variance gambles at high EVs; in gambles with different skewness but identical EVs and variances, the animals preferred positively over symmetrical and negatively skewed gambles in a strongly transitive fashion. Thus, the utility functions predicted the animals’ preferences for variance-risk and skewness-risk. Using these well-defined forms of risk, this study shows that monkeys’ choices conform to the internal reward valuations suggested by their utility functions. This result implies a representation of utility in monkeys that accounts for both variance-risk and skewness-risk preferences. PMID:27402743
Prediction of true test scores from observed item scores and ancillary data.
Haberman, Shelby J; Yao, Lili; Sinharay, Sandip
2015-05-01
In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
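For a single observed score, the best linear predictor of the true score described above reduces to the classical Kelley regression estimate, shown here as a minimal special case; the paper's multivariate predictor generalizes this by using the full variance-covariance structure of measurement errors and ancillary scores.

```latex
\hat{T} \;=\; \mu_X + \rho_{XX'}\,(X-\mu_X) \;=\; \rho_{XX'}\,X + (1-\rho_{XX'})\,\mu_X ,
\qquad
\rho_{XX'} \;=\; \frac{\sigma^2_T}{\sigma^2_T+\sigma^2_E}.
```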
Paul, David R; McGrath, Ryan; Vella, Chantal A; Kramer, Matthew; Baer, David J; Moshfegh, Alanna J
2018-03-26
The National Health and Nutrition Examination Survey physical activity questionnaire (PAQ) is used to estimate activity energy expenditure (AEE) and moderate to vigorous physical activity (MVPA). Bias and variance in estimates of AEE and MVPA from the PAQ have not been described, nor has the impact of measurement error when utilizing the PAQ to predict biomarkers and categorize individuals. The PAQ was administered to 385 adults to estimate AEE (AEE:PAQ) and MVPA (MVPA:PAQ), while simultaneously measuring AEE with doubly labeled water (DLW; AEE:DLW) and MVPA with an accelerometer (MVPA:A). Although AEE:PAQ [3.4 (2.2) MJ·d⁻¹] was not significantly different from AEE:DLW [3.6 (1.6) MJ·d⁻¹; P > .14], MVPA:PAQ [36.2 (24.4) min·d⁻¹] was significantly higher than MVPA:A [8.0 (10.4) min·d⁻¹; P < .0001]. AEE:PAQ regressed on AEE:DLW and MVPA:PAQ regressed on MVPA:A yielded not only significant positive relationships but also large residual variances. The relationships between AEE and MVPA, and 10 of the 12 biomarkers were underestimated by the PAQ. When compared with accelerometers, the PAQ overestimated the number of participants who met the Physical Activity Guidelines for Americans. Group-level bias in AEE:PAQ was small, but large for MVPA:PAQ. Poor within-participant estimates of AEE:PAQ and MVPA:PAQ lead to attenuated relationships with biomarkers and misclassifications of participants who met or who did not meet the Physical Activity Guidelines for Americans.

Effects of low sampling rate in the digital data-transition tracking loop
NASA Technical Reports Server (NTRS)
Mileant, A.; Million, S.; Hinedi, S.
1994-01-01
This article describes the performance of the all-digital data-transition tracking loop (DTTL) with coherent and noncoherent sampling using nonlinear theory. The effects of few samples per symbol and of noncommensurate sampling and symbol rates are addressed and analyzed. Their impact on the probability density and variance of the phase error are quantified through computer simulations. It is shown that the performance of the all-digital DTTL approaches its analog counterpart when the sampling and symbol rates are noncommensurate (i.e., the number of samples per symbol is an irrational number). The loop signal-to-noise ratio (SNR) (inverse of phase error variance) degrades when the number of samples per symbol is an odd integer but degrades even further for even integers.
A new stratification of mourning dove call-count routes
Blankenship, L.H.; Humphrey, A.B.; MacDonald, D.
1971-01-01
The mourning dove (Zenaidura macroura) call-count survey is a nationwide audio-census of breeding mourning doves. Recent analyses of the call-count routes have utilized a stratification based upon physiographic regions of the United States. An analysis of 5 years of call-count data, based upon stratification using potential natural vegetation, has demonstrated that this new stratification results in strata with greater homogeneity than the physiographic strata, provides lower error variance, and hence generates greater precision in the analysis without an increase in call-count routes. Error variance was reduced approximately 30 percent for the contiguous United States. This indicates that future analysis based upon the new stratification will result in an increased ability to detect significant year-to-year changes.
Technical note: Application of the Box-Cox data transformation to animal science experiments.
Peltier, M R; Wilcox, C J; Sharp, D C
1998-03-01
In the use of ANOVA for hypothesis testing in animal science experiments, the assumption of homogeneity of errors often is violated because of scale effects and the nature of the measurements. We demonstrate a method for transforming data so that the assumptions of ANOVA are met (or violated to a lesser degree) and apply it in analysis of data from a physiology experiment. Our study examined whether melatonin implantation would affect progesterone secretion in cycling pony mares. Overall treatment variances were greater in the melatonin-treated group, and several common transformation procedures failed. Application of the Box-Cox transformation algorithm reduced the heterogeneity of error and permitted the assumption of equal variance to be met.
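A minimal sketch of such a workflow, assuming made-up lognormal-like treatment groups rather than the progesterone data from the study: estimate the Box-Cox lambda by maximum likelihood, transform, then re-check error variance homogeneity with Levene's test.

```python
# Hedged sketch (not the authors' analysis code) of a Box-Cox work flow.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.lognormal(mean=1.0, sigma=0.4, size=20)    # hypothetical hormone data
melatonin = rng.lognormal(mean=1.3, sigma=0.8, size=20)  # larger spread, as in the study

print(stats.levene(control, melatonin))                  # heterogeneous on the raw scale

pooled, lam = stats.boxcox(np.concatenate([control, melatonin]))
t_control, t_melatonin = pooled[:20], pooled[20:]
print("lambda =", lam)
print(stats.levene(t_control, t_melatonin))              # variances closer after transform
```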
Decorrelation of the true and estimated classifier errors in high-dimensional settings.
Hanczar, Blaise; Hua, Jianping; Dougherty, Edward R
2007-01-01
The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity which refers to the precision of error estimation is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular, the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated, so that natural questions arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three error estimators commonly used (leave-one-out cross-validation, k-fold cross-validation, and .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) known-feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We will observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection or using all features, with the better correlation between the latter two showing no general trend, but differing for different models.
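The decomposition referred to above rests on the standard identity for the variance of the deviation between estimated and true error, which makes the role of their correlation explicit:

```latex
\operatorname{Var}(\hat{\varepsilon}-\varepsilon)
 \;=\; \operatorname{Var}(\hat{\varepsilon}) + \operatorname{Var}(\varepsilon)
 \;-\; 2\,\rho\,\sigma_{\hat{\varepsilon}}\,\sigma_{\varepsilon}.
```

Even an estimator with small variance can therefore yield imprecise deviations when the correlation ρ between true and estimated error collapses, as reported for the high-dimensional settings above.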
NASA Astrophysics Data System (ADS)
Hansen, S. K.; Haslauer, C. P.; Cirpka, O. A.; Vesselinov, V. V.
2016-12-01
It is desirable to predict the shape of breakthrough curves downgradient of a solute source from subsurface structural parameters (as in the small-perturbation macrodispersion theory) both for realistically heterogeneous fields, and at early time, before any sort of Fickian model is applicable. Using a combination of a priori knowledge, large-scale Monte Carlo simulation, and regression techniques, we have developed closed-form predictive expressions for pre- and post-Fickian flux-weighted solute breakthrough curves as a function of distance from the source (in integral scales) and variance of the log hydraulic conductivity field. Using the ensemble of Monte Carlo realizations, we have simultaneously computed error envelopes for the estimated flux-weighted breakthrough, and for the divergence of point breakthrough curves from the flux-weighted average, as functions of the predictive parameters. We have also obtained implied late-time macrodispersion coefficients for highly heterogeneous environments from the breakthrough statistics. This analysis is relevant for the modelling of reactive as well as conservative transport, since for many kinetic sorption and decay reactions, Laplace-domain modification of the breakthrough curve for conservative solute produces the correct curve for the reactive system.
Zhou, Yan; Cao, Hui
2013-01-01
We propose an augmented classical least squares (ACLS) calibration method for quantitative Raman spectral analysis against component information loss. The Raman spectral signals with low analyte concentration correlations were selected and used as the substitutes for unknown quantitative component information during the CLS calibration procedure. The number of selected signals was determined by using the leave-one-out root-mean-square error of cross-validation (RMSECV) curve. An ACLS model was built based on the augmented concentration matrix and the reference spectral signal matrix. The proposed method was compared with partial least squares (PLS) and principal component regression (PCR) using one example: a data set recorded from an experiment of analyte concentration determination using Raman spectroscopy. A 2-fold cross-validation with a Venetian blinds strategy was exploited to evaluate the predictive power of the proposed method. One-way analysis of variance (ANOVA) was used to assess the predictive power difference between the proposed method and existing methods. Results indicated that the proposed method is effective at increasing the robust predictive power of the traditional CLS model against component information loss and its predictive power is comparable to that of PLS or PCR.
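The model-size selection step (choosing how many signals to retain from a leave-one-out RMSECV curve) can be sketched as below; the spectra and concentrations are synthetic stand-ins, not the paper's Raman data, and plain least squares stands in for the full ACLS construction.

```python
# Hedged sketch of a leave-one-out RMSECV curve for model-size selection.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 50))                  # 30 spectra x 50 wavenumbers (synthetic)
y = X[:, :5] @ rng.normal(size=5) + 0.1 * rng.normal(size=30)

rmsecv = []
for n_signals in range(1, 11):                 # candidate numbers of retained signals
    pred = cross_val_predict(LinearRegression(), X[:, :n_signals], y, cv=LeaveOneOut())
    rmsecv.append(np.sqrt(np.mean((y - pred) ** 2)))

best = int(np.argmin(rmsecv)) + 1
print("selected number of signals:", best)
```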
Dopamine reward prediction error coding.
Schultz, Wolfram
2016-03-01
Reward prediction errors consist of the differences between received and predicted rewards. They are crucial for basic forms of learning about rewards and make us strive for more rewards-an evolutionary beneficial trait. Most dopamine neurons in the midbrain of humans, monkeys, and rodents signal a reward prediction error; they are activated by more reward than predicted (positive prediction error), remain at baseline activity for fully predicted rewards, and show depressed activity with less reward than predicted (negative prediction error). The dopamine signal increases nonlinearly with reward value and codes formal economic utility. Drugs of addiction generate, hijack, and amplify the dopamine reward signal and induce exaggerated, uncontrolled dopamine effects on neuronal plasticity. The striatum, amygdala, and frontal cortex also show reward prediction error coding, but only in subpopulations of neurons. Thus, the important concept of reward prediction errors is implemented in neuronal hardware.
Lu, Xinjiang; Liu, Wenbo; Zhou, Chuang; Huang, Minghui
2017-06-13
The least-squares support vector machine (LS-SVM) is a popular data-driven modeling method and has been successfully applied to a wide range of applications. However, it has some disadvantages, including being ineffective at handling non-Gaussian noise as well as being sensitive to outliers. In this paper, a robust LS-SVM method is proposed and is shown to have more reliable performance when modeling a nonlinear system under conditions where Gaussian or non-Gaussian noise is present. The construction of a new objective function allows for a reduction of the mean of the modeling error as well as the minimization of its variance, and it does not constrain the mean of the modeling error to zero. This differs from the traditional LS-SVM, which uses a worst-case scenario approach in order to minimize the modeling error and constrains the mean of the modeling error to zero. In doing so, the proposed method takes the modeling error distribution information into consideration and is thus less conservative and more robust in regards to random noise. A solving method is then developed in order to determine the optimal parameters for the proposed robust LS-SVM. An additional analysis indicates that the proposed LS-SVM gives a smaller weight to a large-error training sample and a larger weight to a small-error training sample, and is thus more robust than the traditional LS-SVM. The effectiveness of the proposed robust LS-SVM is demonstrated using both artificial and real life cases.
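One schematic way to write a mean-variance objective of the kind described, shown only to fix ideas; the weights γ₁, γ₂ and the exact functional form are assumptions, not the paper's formulation.

```latex
\min_{w,b}\;\; \tfrac{1}{2}\lVert w\rVert^{2}
 \;+\; \gamma_1\,\bar{e}^{\,2}
 \;+\; \gamma_2\,\frac{1}{N}\sum_{i=1}^{N}\left(e_i-\bar{e}\right)^{2},
\qquad
e_i = y_i - w^{\top}\phi(x_i) - b,\quad
\bar{e} = \frac{1}{N}\sum_{i=1}^{N} e_i .
```

Unlike the standard LS-SVM loss \(\sum_i e_i^2\), a form like this penalizes the error mean and the error variance separately and does not force \(\bar{e}=0\).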
Makeyev, Oleksandr; Joe, Cody; Lee, Colin; Besio, Walter G
2017-07-01
Concentric ring electrodes have shown promise in non-invasive electrophysiological measurement demonstrating their superiority to conventional disc electrodes, in particular, in accuracy of Laplacian estimation. Recently, we have proposed novel variable inter-ring distances concentric ring electrodes. Analytic and finite element method modeling results for linearly increasing distances electrode configurations suggested they may decrease the truncation error resulting in more accurate Laplacian estimates compared to currently used constant inter-ring distances configurations. This study assesses statistical significance of Laplacian estimation accuracy improvement due to novel variable inter-ring distances concentric ring electrodes. Full factorial design of analysis of variance was used with one categorical and two numerical factors: the inter-ring distances, the electrode diameter, and the number of concentric rings in the electrode. The response variables were the Relative Error and the Maximum Error of Laplacian estimation computed using a finite element method model for each of the combinations of levels of three factors. Effects of the main factors and their interactions on Relative Error and Maximum Error were assessed and the obtained results suggest that all three factors have statistically significant effects in the model confirming the potential of using inter-ring distances as a means of improving accuracy of Laplacian estimation.
Location tests for biomarker studies: a comparison using simulations for the two-sample case.
Scheinhardt, M O; Ziegler, A
2013-01-01
Gene, protein, or metabolite expression levels are often non-normally distributed, heavy tailed and contain outliers. Standard statistical approaches may fail as location tests in this situation. In three Monte-Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O'Gorman, Can J Stat 1997; 25: 269-279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267-293; Szymczak et al., Stat Med 2013; 32: 524-537] for a wide range of distributions. We simulated two-sample scenarios using the g-and-k-distribution family to systematically vary tail length and skewness with identical and varying variability between groups. All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. The standard U-test was a powerful and robust location test for most of the simulated scenarios except for very heavy tailed or heavy skewed data, and it is thus to be recommended except for these cases. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy tailed distributions.
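The kind of Monte-Carlo check described above can be sketched as follows; lognormal draws are a simple stand-in for the g-and-k family, and only the equal-variability null case (type I error) is shown.

```python
# Illustrative Monte-Carlo check (not the authors' g-and-k set-up): type I error
# of the t-test and the U-test for a skewed, heavy-tailed distribution with
# identical variability in both groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_sim, n, alpha = 2000, 25, 0.05
rej_t = rej_u = 0
for _ in range(n_sim):
    a = rng.lognormal(0.0, 1.0, n)             # same distribution in both groups,
    b = rng.lognormal(0.0, 1.0, n)             # so every rejection is a type I error
    rej_t += stats.ttest_ind(a, b).pvalue < alpha
    rej_u += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha

print("t-test type I error:", rej_t / n_sim)
print("U-test type I error:", rej_u / n_sim)
```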
Gonçalves, Fabio; Treuhaft, Robert; Law, Beverly; ...
2017-01-07
Mapping and monitoring of forest carbon stocks across large areas in the tropics will necessarily rely on remote sensing approaches, which in turn depend on field estimates of biomass for calibration and validation purposes. Here, we used field plot data collected in a tropical moist forest in the central Amazon to gain a better understanding of the uncertainty associated with plot-level biomass estimates obtained specifically for the calibration of remote sensing measurements. In addition to accounting for sources of error that would be normally expected in conventional biomass estimates (e.g., measurement and allometric errors), we examined two sources of uncertainty that are specific to the calibration process and should be taken into account in most remote sensing studies: the error resulting from spatial disagreement between field and remote sensing measurements (i.e., co-location error), and the error introduced when accounting for temporal differences in data acquisition. We found that the overall uncertainty in the field biomass was typically 25% for both secondary and primary forests, but ranged from 16 to 53%. Co-location and temporal errors accounted for a large fraction of the total variance (>65%) and were identified as important targets for reducing uncertainty in studies relating tropical forest biomass to remotely sensed data. Although measurement and allometric errors were relatively unimportant when considered alone, combined they accounted for roughly 30% of the total variance on average and should not be ignored. Lastly, our results suggest that a thorough understanding of the sources of error associated with field-measured plot-level biomass estimates in tropical forests is critical to determine confidence in remote sensing estimates of carbon stocks and fluxes, and to develop strategies for reducing the overall uncertainty of remote sensing approaches.
Suboptimal schemes for atmospheric data assimilation based on the Kalman filter
NASA Technical Reports Server (NTRS)
Todling, Ricardo; Cohn, Stephen E.
1994-01-01
This work is directed toward approximating the evolution of forecast error covariances for data assimilation. The performance of different algorithms based on simplification of the standard Kalman filter (KF) is studied. These are suboptimal schemes (SOSs) when compared to the KF, which is optimal for linear problems with known statistics. The SOSs considered here are several versions of optimal interpolation (OI), a scheme for height error variance advection, and a simplified KF in which the full height error covariance is advected. To employ a methodology for exact comparison among these schemes, a linear environment is maintained, in which a beta-plane shallow-water model linearized about a constant zonal flow is chosen for the test-bed dynamics. The results show that constructing dynamically balanced forecast error covariances rather than using conventional geostrophically balanced ones is essential for successful performance of any SOS. A posteriori initialization of SOSs to compensate for model - data imbalance sometimes results in poor performance. Instead, properly constructed dynamically balanced forecast error covariances eliminate the need for initialization. When the SOSs studied here make use of dynamically balanced forecast error covariances, the difference among their performances progresses naturally from conventional OI to the KF. In fact, the results suggest that even modest enhancements of OI, such as including an approximate dynamical equation for height error variances while leaving height error correlation structure homogeneous, go a long way toward achieving the performance of the KF, provided that dynamically balanced cross-covariances are constructed and that model errors are accounted for properly. The results indicate that such enhancements are necessary if unconventional data are to have a positive impact.
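The quantity that the suboptimal schemes approximate is the dynamically propagated forecast error covariance of the standard linear Kalman filter, written here in its textbook form for reference:

```latex
P^{f}_{k} \;=\; M_{k}\,P^{a}_{k-1}\,M_{k}^{\top} + Q_{k},
\qquad
P^{a}_{k} \;=\; \left(I - K_{k}H_{k}\right)P^{f}_{k},
\qquad
K_{k} \;=\; P^{f}_{k}H_{k}^{\top}\!\left(H_{k}P^{f}_{k}H_{k}^{\top}+R_{k}\right)^{-1}.
```

Conventional OI freezes P^f to a prescribed, geostrophically balanced form, and the intermediate schemes advect only the height error variances; the results above indicate that even this partial dynamics recovers much of the KF's performance, provided the cross-covariances are dynamically balanced.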
Rawlins, B G; Scheib, C; Tyler, A N; Beamish, D
2012-12-01
Regulatory authorities need ways to estimate natural terrestrial gamma radiation dose rates (nGy h⁻¹) across the landscape accurately, to assess its potential deleterious health effects. The primary method for estimating outdoor dose rate is to use an in situ detector supported 1 m above the ground, but such measurements are costly and cannot capture the landscape-scale variation in dose rates which are associated with changes in soil and parent material mineralogy. We investigate the potential for improving estimates of terrestrial gamma dose rates across Northern Ireland (13,542 km²) using measurements from 168 sites and two sources of ancillary data: (i) a map based on a simplified classification of soil parent material, and (ii) dose estimates from a national-scale, airborne radiometric survey. We used the linear mixed modelling framework in which the two ancillary variables were included in separate models as fixed effects, plus a correlation structure which captures the spatially correlated variance component. We used a cross-validation procedure to determine the magnitude of the prediction errors for the different models. We removed a random subset of 10 terrestrial measurements and formed the model from the remainder (n = 158), and then used the model to predict values at the other 10 sites. We repeated this procedure 50 times. The measurements of terrestrial dose vary between 1 and 103 (nGy h⁻¹). The median absolute model prediction errors (nGy h⁻¹) for the three models declined in the following order: no ancillary data (10.8) > simple geological classification (8.3) > airborne radiometric dose (5.4) as a single fixed effect. Estimates of airborne radiometric gamma dose rate can significantly improve the spatial prediction of terrestrial dose rate.
Bai, Mingsian R; Hsieh, Ping-Ju; Hur, Kur-Nan
2009-02-01
The performance of the minimum mean-square error noise reduction (MMSE-NR) algorithm in conjunction with time-recursive averaging (TRA) for noise estimation is found to be very sensitive to the choice of two recursion parameters. To address this problem in a more systematic manner, this paper proposes an optimization method to efficiently search the optimal parameters of the MMSE-TRA-NR algorithms. The objective function is based on a regression model, whereas the optimization process is carried out with the simulated annealing algorithm that is well suited for problems with many local optima. Another NR algorithm proposed in the paper employs linear prediction coding as a preprocessor for extracting the correlated portion of human speech. Objective and subjective tests were undertaken to compare the optimized MMSE-TRA-NR algorithm with several conventional NR algorithms. The results of subjective tests were processed by using analysis of variance to justify the statistical significance. A post hoc test, Tukey's Honestly Significant Difference, was conducted to further assess the pairwise difference between the NR algorithms.
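A minimal sketch of the optimization step, assuming a synthetic stand-in objective over the two recursion parameters (a real objective would be the paper's regression model of noise-reduction quality); SciPy's dual_annealing is used here as a generic simulated-annealing routine.

```python
# Hedged sketch: simulated annealing over two recursion parameters of a
# noise-reduction algorithm, with a placeholder objective function.
import numpy as np
from scipy.optimize import dual_annealing

def nr_quality(params):
    # Placeholder objective: pretend quality degrades away from an (unknown)
    # optimum near (0.98, 0.80); a real objective would score the denoised speech.
    a, b = params
    return (a - 0.98) ** 2 + 0.5 * (b - 0.80) ** 2 + 0.01 * np.sin(40 * a)

result = dual_annealing(nr_quality, bounds=[(0.90, 0.999), (0.5, 0.95)], seed=3)
print(result.x, result.fun)
```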
On the reach of perturbative descriptions for dark matter displacement fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baldauf, Tobias; Zaldarriaga, Matias; Schaan, Emmanuel, E-mail: baldauf@ias.edu, E-mail: eschaan@astro.princeton.edu, E-mail: matiasz@ias.edu
We study Lagrangian Perturbation Theory (LPT) and its regularization in the Effective Field Theory (EFT) approach. We evaluate the LPT displacement with the same phases as a corresponding N-body simulation, which allows us to compare perturbation theory to the non-linear simulation with significantly reduced cosmic variance, and provides a more stringent test than simply comparing power spectra. We reliably detect a non-vanishing leading order EFT coefficient and a stochastic displacement term, uncorrelated with the LPT terms. This stochastic term is expected in the EFT framework, and, to the best of our understanding, is not an artifact of numerical errors or transients in our simulations. This term constitutes a limit to the accuracy of perturbative descriptions of the displacement field and its phases, corresponding to a 1% error on the non-linear power spectrum at k = 0.2 h⁻¹ Mpc at z = 0. Predicting the displacement power spectrum to higher accuracy or larger wavenumbers thus requires a model for the stochastic displacement.
2013-01-01
Background Measures used for medical student selection should predict future performance during training. A problem for any selection study is that predictor-outcome correlations are known only in those who have been selected, whereas selectors need to know how measures would predict in the entire pool of applicants. That problem of interpretation can be solved by calculating construct-level predictive validity, an estimate of true predictor-outcome correlation across the range of applicant abilities. Methods Construct-level predictive validities were calculated in six cohort studies of medical student selection and training (student entry, 1972 to 2009) for a range of predictors, including A-levels, General Certificates of Secondary Education (GCSEs)/O-levels, and aptitude tests (AH5 and UK Clinical Aptitude Test (UKCAT)). Outcomes included undergraduate basic medical science and finals assessments, as well as postgraduate measures of Membership of the Royal Colleges of Physicians of the United Kingdom (MRCP(UK)) performance and entry in the Specialist Register. Construct-level predictive validity was calculated with the method of Hunter, Schmidt and Le (2006), adapted to correct for right-censorship of examination results due to grade inflation. Results Meta-regression analyzed 57 separate predictor-outcome correlations (POCs) and construct-level predictive validities (CLPVs). Mean CLPVs are substantially higher (.450) than mean POCs (.171). Mean CLPVs for first-year examinations were high for A-levels (.809; CI: .501 to .935), and lower for GCSEs/O-levels (.332; CI: .024 to .583) and UKCAT (mean = .245; CI: .207 to .276). A-levels had higher CLPVs for all undergraduate and postgraduate assessments than did GCSEs/O-levels and intellectual aptitude tests. CLPVs of educational attainment measures decline somewhat during training, but continue to predict postgraduate performance. Intellectual aptitude tests have lower CLPVs than A-levels or GCSEs/O-levels. Conclusions Educational attainment has strong CLPVs for undergraduate and postgraduate performance, accounting for perhaps 65% of true variance in first year performance. Such CLPVs justify the use of educational attainment measures in selection, but also raise a key theoretical question concerning the remaining 35% of variance (even after measurement error, range restriction and right-censorship have been taken into account). Just as in astrophysics, ‘dark matter’ and ‘dark energy’ are posited to balance various theoretical equations, so medical student selection must also have its ‘dark variance’, whose nature is not yet properly characterized, but explains a third of the variation in performance during training. Some variance probably relates to factors which are unpredictable at selection, such as illness or other life events, but some is probably also associated with factors such as personality, motivation or study skills. PMID:24229353
Systems Engineering Programmatic Estimation Using Technology Variance
NASA Technical Reports Server (NTRS)
Mog, Robert A.
2000-01-01
Unique and innovative system programmatic estimation is conducted using the variance of the packaged technologies. Covariance analysis is performed on the subsystems and components comprising the system of interest. Technological "return" and "variation" parameters are estimated. These parameters are combined with the model error to arrive at a measure of system development stability. The resulting estimates provide valuable information concerning the potential cost growth of the system under development.
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
Beasley, T. Mark
2013-01-01
Increasing the correlation between the independent variable and the mediator (a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation due to increases in a at some point outweighs the increase of the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
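The mechanism can be seen from the first-order (Sobel) standard error of the product together with the variance inflation of the b path when the variables are standardized; this is a schematic summary, not the paper's derivation:

```latex
SE_{\hat{a}\hat{b}} \;\approx\; \sqrt{\hat{b}^{2}\,s_{\hat{a}}^{2} + \hat{a}^{2}\,s_{\hat{b}}^{2}},
\qquad
s_{\hat{b}}^{2} \;\propto\; \frac{1}{1-a^{2}} .
```

As a grows, the gain in the product ab competes with the 1/(1-a²) inflation of s_b, and beyond some point the standard error wins, which is the paradoxical power decline described above.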
Rast, Philippe; Hofer, Scott M.
2014-01-01
We investigated the power to detect variances and covariances in rates of change in the context of existing longitudinal studies using linear bivariate growth curve models. Power was estimated by means of Monte Carlo simulations. Our findings show that typical longitudinal study designs have substantial power to detect both variances and covariances among rates of change in a variety of cognitive, physical functioning, and mental health outcomes. We performed simulations to investigate the interplay among number and spacing of occasions, total duration of the study, effect size, and error variance on power and required sample size. The relation of growth rate reliability (GRR) and effect size to the sample size required to detect power ≥ .80 was non-linear, with rapidly decreasing sample sizes needed as GRR increases. The results presented here stand in contrast to previous simulation results and recommendations (Hertzog, Lindenberger, Ghisletta, & von Oertzen, 2006; Hertzog, von Oertzen, Ghisletta, & Lindenberger, 2008; von Oertzen, Ghisletta, & Lindenberger, 2010), which are limited due to confounds between study length and number of waves, error variance with GRR, and parameter values which are largely out of bounds of actual study values. Power to detect change is generally low in the early phases (i.e., first years) of longitudinal studies but can substantially increase if the design is optimized. We recommend additional assessments, including embedded intensive measurement designs, to improve power in the early phases of long-term longitudinal studies. PMID:24219544
Stress in junior enlisted air force women with and without children.
Hopkins-Chadwick, Denise L; Ryan-Wenger, Nancy
2009-04-01
The objective was to determine if there are differences between young enlisted military women with and without preschool children on role strain, stress, health, and military career aspiration and to identify the best predictors of these variables. The study used a cross-sectional descriptive design of 50 junior Air Force women with preschool children and 50 women without children. There were no differences between women with and without children in role strain, stress, health, and military career aspiration. In all women, higher stress was moderately predictive of higher role strain (39.9% of variance explained) but a poor predictor of career aspiration (3.8% of variance explained). Lower mental health scores were predicted by high stress symptoms (27.9% of variance explained), low military career aspiration (4.1% of variance explained), high role strain (4.0% of variance explained), and being non-White (3.9% of variance explained). Aspiration for a military career was predicted by high perceived availability of military resources (16.8% of variance explained), low family of origin socioeconomic status (4.5% of variance explained), and better mental health status (3.3% of variance explained). Contrary to theoretical expectations, in this sample, motherhood was not a significant variable. Increased role strain, stress, and decreased health as well as decreased military career aspiration were evident in both groups and may have more to do with individual coping skills and other unmeasured resources. More research is needed to determine what nursing interventions are needed to best support both groups of women.
Increasing point-count duration increases standard error
Smith, W.P.; Twedt, D.J.; Hamel, P.B.; Ford, R.P.; Wiedenfeld, D.A.; Cooper, R.J.
1998-01-01
We examined data from point counts of varying duration in bottomland forests of west Tennessee and the Mississippi Alluvial Valley to determine if counting interval influenced sampling efficiency. Estimates of standard error increased as point count duration increased both for cumulative number of individuals and species in both locations. Although point counts appear to yield data with standard errors proportional to means, a square root transformation of the data may stabilize the variance. Using long (>10 min) point counts may reduce sample size and increase sampling error, both of which diminish statistical power and thereby the ability to detect meaningful changes in avian populations.
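A small numerical illustration of the suggested square-root transformation, using hypothetical Poisson counts rather than the survey data: the raw variance grows with the mean, while the transformed variance stays roughly constant.

```python
# Illustration only: for Poisson-like point-count data the variance tracks the
# mean, and a square-root transform roughly stabilizes it across count durations.
import numpy as np

rng = np.random.default_rng(4)
for mean_count in (2, 5, 10, 20):              # longer counts -> larger mean counts
    counts = rng.poisson(mean_count, size=500)
    print(mean_count,
          round(counts.var(), 2),              # grows with the mean
          round(np.sqrt(counts).var(), 2))     # stays near 0.25
```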
Bolandzadeh, Niousha; Kording, Konrad; Salowitz, Nicole; Davis, Jennifer C; Hsu, Liang; Chan, Alison; Sharma, Devika; Blohm, Gunnar; Liu-Ambrose, Teresa
2015-01-01
Current research suggests that the neuropathology of dementia-including brain changes leading to memory impairment and cognitive decline-is evident years before the onset of this disease. Older adults with cognitive decline have reduced functional independence and quality of life, and are at greater risk for developing dementia. Therefore, identifying biomarkers that can be easily assessed within the clinical setting and predict cognitive decline is important. Early recognition of cognitive decline could promote timely implementation of preventive strategies. We included 89 community-dwelling adults aged 70 years and older in our study, and collected 32 measures of physical function, health status and cognitive function at baseline. We utilized an L1-L2 regularized regression model (elastic net) to identify which of the 32 baseline measures were strongly predictive of cognitive function after one year. We built three linear regression models: 1) based on baseline cognitive function, 2) based on variables consistently selected in every cross-validation loop, and 3) a full model based on all the 32 variables. Each of these models was carefully tested with nested cross-validation. Our model with the six variables consistently selected in every cross-validation loop had a mean squared prediction error of 7.47. This number was smaller than that of the full model (115.33) and the model with baseline cognitive function (7.98). Our model explained 47% of the variance in cognitive function after one year. We built a parsimonious model based on a selected set of six physical function and health status measures strongly predictive of cognitive function after one year. In addition to reducing the complexity of the model without changing the model significantly, our model with the top variables improved the mean prediction error and R-squared. These six physical function and health status measures can be easily implemented in a clinical setting.
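A hedged sketch of the modelling strategy described above (an elastic net tuned inside a nested cross-validation loop); the data are simulated and the 32 predictors are placeholders, not the study's physical function and health status measures.

```python
# Hedged sketch: elastic net with nested cross-validation on simulated data.
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(5)
X = rng.normal(size=(89, 32))                  # 89 participants x 32 baseline measures
y = X[:, :6] @ rng.normal(size=6) + rng.normal(size=89)

inner = KFold(n_splits=5, shuffle=True, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=1)
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=inner, max_iter=10000)

mse = -cross_val_score(model, X, y, cv=outer, scoring="neg_mean_squared_error")
print("nested-CV mean squared prediction error:", mse.mean())

model.fit(X, y)
print("variables retained:", np.flatnonzero(model.coef_))
```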
NASA Technical Reports Server (NTRS)
Amling, G. E.; Holms, A. G.
1973-01-01
A computer program is described that performs a statistical multiple-decision procedure called chain pooling. It uses a number of mean squares assigned to error variance that is conditioned on the relative magnitudes of the mean squares. The model selection is done according to user-specified levels of type 1 or type 2 error probabilities.
ERIC Educational Resources Information Center
Longford, Nicholas T.
Large scale surveys usually employ a complex sampling design and as a consequence, no standard methods for estimation of the standard errors associated with the estimates of population means are available. Resampling methods, such as jackknife or bootstrap, are often used, with reference to their properties of robustness and reduction of bias. A…
Design of a breath analysis system for diabetes screening and blood glucose level prediction.
Yan, Ke; Zhang, David; Wu, Darong; Wei, Hua; Lu, Guangming
2014-11-01
It has been reported that concentrations of several biomarkers in diabetics' breath show significant difference from those in healthy people's breath. Concentrations of some biomarkers are also correlated with the blood glucose levels (BGLs) of diabetics. Therefore, it is possible to screen for diabetes and predict BGLs by analyzing one's breath. In this paper, we describe the design of a novel breath analysis system for this purpose. The system uses carefully selected chemical sensors to detect biomarkers in breath. Common interferential factors, including humidity and the ratio of alveolar air in breath, are compensated or handled in the algorithm. Considering the intersubject variance of the components in breath, we build subject-specific prediction models to improve the accuracy of BGL prediction. A total of 295 breath samples from healthy subjects and 279 samples from diabetic subjects were collected to evaluate the performance of the system. The sensitivity and specificity of diabetes screening are 91.51% and 90.77%, respectively. The mean relative absolute error for BGL prediction is 21.7%. Experiments show that the system is effective and that the strategies adopted in the system can improve its accuracy. The system potentially provides a noninvasive and convenient method for diabetes screening and BGL monitoring as an adjunct to the standard criteria.
Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation
NASA Astrophysics Data System (ADS)
Räisänen, Petri; Barker, W. Howard
2004-07-01
The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeated sampling, and averaging those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of ~3.
Wang, Li-Pen; Ochoa-Rodríguez, Susana; Simões, Nuno Eduardo; Onof, Christian; Maksimović, Cedo
2013-01-01
The applicability of the operational radar and raingauge networks for urban hydrology is insufficient. Radar rainfall estimates provide a good description of the spatiotemporal variability of rainfall; however, their accuracy is in general insufficient. It is therefore necessary to adjust radar measurements using raingauge data, which provide accurate point rainfall information. Several gauge-based radar rainfall adjustment techniques have been developed and mainly applied at coarser spatial and temporal scales; however, their suitability for small-scale urban hydrology is seldom explored. In this paper a review of gauge-based adjustment techniques is first provided. After that, two techniques, respectively based upon the ideas of mean bias reduction and error variance minimisation, were selected and tested using as case study an urban catchment (∼8.65 km²) in North-East London. The radar rainfall estimates of four historical events (2010-2012) were adjusted using in situ raingauge estimates and the adjusted rainfall fields were applied to the hydraulic model of the study area. The results show that both techniques can effectively reduce mean bias; however, the technique based upon error variance minimisation can in general better reproduce the spatial and temporal variability of rainfall, which proved to have a significant impact on the subsequent hydraulic outputs. This suggests that error variance minimisation based methods may be more appropriate for urban-scale hydrological applications.
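The two adjustment ideas being compared can be sketched on made-up numbers as follows; the single bias factor and the assumed error variances are illustrative, not values from the study.

```python
# Simplified sketch of the two adjustment ideas: a single mean-field bias factor
# versus a merge that weights radar and gauge by assumed error variances.
import numpy as np

gauge = np.array([4.2, 6.1, 3.3, 5.0])         # point rainfall at gauge sites (mm/h)
radar = np.array([3.0, 4.4, 2.6, 3.9])         # collocated radar estimates (mm/h)

# (1) Mean bias reduction: one multiplicative factor for the whole field.
bias_factor = gauge.sum() / radar.sum()
radar_mfb = radar * bias_factor

# (2) Error-variance-minimising merge at the gauge pixels: weights inversely
#     proportional to the assumed error variance of each source.
var_radar, var_gauge = 1.0, 0.2                # hypothetical error variances
w = var_gauge / (var_gauge + var_radar)        # weight given to the radar estimate
merged = w * radar_mfb + (1 - w) * gauge

print(round(bias_factor, 3), radar_mfb.round(2), merged.round(2))
```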
NASA Astrophysics Data System (ADS)
Hartanto, R.; Jantra, M. A. C.; Santosa, S. A. B.; Purnomoadi, A.
2018-01-01
The purpose of this research was to find an appropriate relationship model between the feed energy and protein ratio and the amount of production and quality of milk proteins. This research was conducted at Getasan Sub-district, Semarang Regency, Central Java Province, Indonesia, using 40 samples (Holstein Friesian cattle, lactation period II-III and lactation month 3-4). Data were analyzed using linear and quadratic regressions to predict the production and quality of milk protein from the feed energy and protein ratio that describes the diet. The significance of each model was tested using analysis of variance. Coefficient of determination (R2), residual variance (RV) and root mean square prediction error (RMSPE) were reported for the developed equations as indicators of the goodness of model fit. The results showed no relationship for milk protein (kg), milk casein (%), milk casein (kg) and milk urea N (mg/dl) as a function of CP/TDN. A significant relationship was observed for milk production (L or kg) and milk protein (%) as a function of CP/TDN, in both linear and quadratic models. In addition, a quadratic change in milk production (L) (P = 0.003), milk production (kg) (P = 0.003) and milk protein concentration (%) (P = 0.026) was observed with increasing CP/TDN. It can be concluded that the quadratic equation was the better-fitting model for this research, because the quadratic equation has a larger R2, smaller RV and smaller RMSPE than the linear equation.
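A minimal sketch of the model comparison on synthetic data (the CP/TDN range, coefficients and noise level are assumptions, not the study's measurements): fit linear and quadratic regressions and report R2, residual variance and RMSPE.

```python
# Hedged sketch: compare linear and quadratic fits of milk yield on CP/TDN.
import numpy as np

rng = np.random.default_rng(6)
cp_tdn = rng.uniform(0.18, 0.32, size=40)                    # hypothetical diet ratios
milk = 8 + 60 * cp_tdn - 90 * cp_tdn ** 2 + rng.normal(0, 0.6, size=40)

for degree in (1, 2):
    coefs = np.polyfit(cp_tdn, milk, degree)
    pred = np.polyval(coefs, cp_tdn)
    resid = milk - pred
    r2 = 1 - resid.var() / milk.var()
    rv = resid @ resid / (len(milk) - degree - 1)            # residual variance
    rmspe = np.sqrt(np.mean(resid ** 2))
    print(degree, round(r2, 3), round(rv, 3), round(rmspe, 3))
```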
NASA Astrophysics Data System (ADS)
Brokamp, Cole; Jandarov, Roman; Rao, M. B.; LeMasters, Grace; Ryan, Patrick
2017-02-01
Exposure assessment for elemental components of particulate matter (PM) using land use modeling is a complex problem due to the high spatial and temporal variations in pollutant concentrations at the local scale. Land use regression (LUR) models may fail to capture complex interactions and non-linear relationships between pollutant concentrations and land use variables. The increasing availability of big spatial data and machine learning methods presents an opportunity for improvement in PM exposure assessment models. In this manuscript, our objective was to develop a novel land use random forest (LURF) model and compare its accuracy and precision to a LUR model for elemental components of PM in the urban city of Cincinnati, Ohio. PM smaller than 2.5 μm (PM2.5) and eleven elemental components were measured at 24 sampling stations from the Cincinnati Childhood Allergy and Air Pollution Study (CCAAPS). Over 50 different predictors associated with transportation, physical features, community socioeconomic characteristics, greenspace, land cover, and emission point sources were used to construct LUR and LURF models. Cross validation was used to quantify and compare model performance. LURF and LUR models were created for aluminum (Al), copper (Cu), iron (Fe), potassium (K), manganese (Mn), nickel (Ni), lead (Pb), sulfur (S), silicon (Si), vanadium (V), zinc (Zn), and total PM2.5 in the CCAAPS study area. LURF utilized a more diverse and greater number of predictors than LUR, and LURF models for Al, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all showed a decrease in fractional predictive error of at least 5% compared to their LUR models. LURF models for Al, Cu, Fe, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all had a cross validated fractional predictive error less than 30%. Furthermore, LUR models showed a differential exposure assessment bias and had a higher prediction error variance. Random forest and other machine learning methods may provide more accurate exposure assessment.
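A hedged sketch contrasting the two model families under leave-one-out cross-validation; the 24 "sites" and 10 predictors are simulated placeholders, not the CCAAPS variables, and the fractional-error summary is only meant to mirror the comparison described above.

```python
# Hedged sketch: land-use regression vs land-use random forest on simulated data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(7)
X = rng.normal(size=(24, 10))                                # 24 sites x 10 land-use predictors
y = 10 + X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + 0.2 * rng.normal(size=24)

loo = LeaveOneOut()
for name, model in [("LUR ", LinearRegression()),
                    ("LURF", RandomForestRegressor(n_estimators=500, random_state=0))]:
    pred = cross_val_predict(model, X, y, cv=loo)
    frac_err = np.mean(np.abs(pred - y) / np.abs(y))         # fractional predictive error
    print(name, round(frac_err, 3))
```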
NASA Technical Reports Server (NTRS)
Daniels, Janet L.; Smith, G. Louis; Priestley, Kory J.; Thomas, Susan
2014-01-01
Validation of in-orbit instrument performance is a function of stability in both instrument and calibration source. This paper describes a method using lunar observations scanning near full moon by the Clouds and Earth Radiant Energy System (CERES) instruments. The Moon offers an external source whose signal variance is predictable and non-degrading. From 2006 to present, these in-orbit observations have become standardized and compiled for Flight Models 1 and 2 aboard the Terra satellite, for Flight Models 3 and 4 aboard the Aqua satellite, and, beginning 2012, for Flight Model 5 aboard Suomi-NPP. Instrument performance measurements studied are detector sensitivity stability, pointing accuracy and static detector point response function. This validation method also shows trends per CERES data channel of 0.8% per decade or less for Flight Models 1-4. Using instrument gimbal data and the computed lunar position, the pointing error of each detector telescope and the accuracy and consistency of the alignment between the detectors can be determined. The maximum pointing error was 0.2 Deg. in azimuth and 0.17 Deg. in elevation, which corresponds to an error in geolocation near nadir of 2.09 km. With the exception of one detector, all instruments were found to have consistent detector alignment from 2006 to present. All alignment error was within 0.1 Deg., with most detector telescopes showing a consistent alignment offset of less than 0.02 Deg.
Anticipatory synergy adjustments reflect individual performance of feedforward force control.
Togo, Shunta; Imamizu, Hiroshi
2016-10-06
We grasp and dexterously manipulate an object through multi-digit synergy. In the framework of the uncontrolled manifold (UCM) hypothesis, multi-digit synergy is defined as the coordinated control mechanism of fingers to stabilize a variable important for task success, e.g., total force. Previous studies reported anticipatory synergy adjustments (ASAs) that correspond to a drop of the synergy index before a quick change of the total force. The present study compared ASA's properties with individual performances of feedforward force control to investigate the relationship between them. Subjects performed a total finger force production task that consisted of a phase in which subjects tracked a target line with visual information and a phase in which subjects produced a total force pulse without visual information. We quantified their multi-digit synergy through UCM analysis and observed significant ASAs before the production of the total force pulse. The time of the ASA initiation and the magnitude of the drop of the synergy index were significantly correlated with the error of the force pulse, but not with the tracking error. Almost all subjects showed a significant increase of the variance that affected the total force. Our study directly showed that ASA reflects the individual performance of feedforward force control independently of target-tracking performance and suggests that the multi-digit synergy was weakened to adjust the multi-digit movements based on a prediction error so as to reduce the future error. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Kim, Hyunji; Schimmack, Ulrich; Oishi, Shigehiro
2012-04-01
Anusic, Schimmack, Pinkus, and Lockwood (2009) developed the halo-alpha-beta (HAB) model to separate halo variance from variance due to valid personality traits and other sources of measurement error in self-ratings of personality. The authors used a twin-HAB model of self-ratings and ratings of a partner (friend or dating partner) to test several hypotheses about culture, evaluative biases in self- and other-perceptions, and well-being. Participants were friends or dating partners who reported on their own and their partner's personality and well-being (N = 906 students). European Canadians had higher general evaluative biases (GEB) than Asian Canadians. There were no cultural differences in self-enhancement or other-enhancement. GEB significantly predicted self-ratings of life satisfaction, but not informant ratings of well-being. GEB fully mediated the effect of culture on self-ratings of life satisfaction. The results suggest that North American culture encourages positive biases in self- and other-perceptions. These biases also influence self-ratings of life satisfaction but have a much weaker effect on informant ratings of life satisfaction. The implications of these findings for cultural differences in well-being are discussed. (c) 2012 APA, all rights reserved.
Zhang, Yongsheng; Wei, Heng; Zheng, Kangning
2017-01-01
Considering that metro network expansion brings us more alternative routes, it is attractive to integrate the impacts of the route set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; following a multivariate normal distribution, the covariance of the error component is structured into three parts, representing the correlation among routes, the transfer variance of the route, and the unobserved variance, respectively. Considering the multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten as a Hierarchical Bayes formula and an M-H sampling algorithm based Monte Carlo Markov Chain approach is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are gained. Furthermore, the proposed CMNP model also shows a good forecasting performance for the route choice probabilities calculation and a good application performance for transfer flow volume prediction. PMID:28591188
Relationship between consonant recognition in noise and hearing threshold.
Yoon, Yang-soo; Allen, Jont B; Gooler, David M
2012-04-01
Although the poorer understanding of speech in noise by listeners who are hearing-impaired (HI) is known not to be directly related to the audiometric hearing threshold, HT(f), grouping HI listeners by HT(f) is widely practiced. In this article, the relationship between consonant recognition and HT(f) is considered over a range of signal-to-noise ratios (SNRs). Confusion matrices (CMs) from 25 HI ears were generated in response to 16 consonant-vowel syllables presented at 6 different SNRs. Individual differences scaling (INDSCAL) was applied to both feature-based matrices and CMs in order to evaluate the relationship between HT(f) and consonant recognition among HI listeners. The results showed no predictive relationship between the percent error scores (Pe) and HT(f) across SNRs. The multiple regression models showed that HT(f) accounted for 39% of the total variance of the slopes of the Pe. Feature-based INDSCAL analysis showed consistent grouping of listeners across SNRs, but not in terms of HT(f). A systematic relationship between the measures was also not identified by the CM-based INDSCAL analysis across SNRs. HT(f) did not account for the majority of the variance (39%) in consonant recognition in noise when the complete body of the CM was considered.
A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM.
Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei; Song, Houbing
2018-01-15
Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machines (SVMs) are often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were set manually, which cannot ensure the model's performance. In this paper, an SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added to the PSO to improve its ability to escape local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM's parameters: the particle swarm optimization algorithm (PSO), the improved PSO algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models' performances. The experimental results show that, among the three tested algorithms, the NAPSO-SVM method has better prediction precision and smaller prediction errors, and it is an effective method for predicting the dynamic measurement errors of sensors.
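As a rough illustration of tuning SVM hyperparameters with particle swarm search, the sketch below uses plain PSO (without the natural-selection and annealing modifications of NAPSO) to pick C and gamma of an RBF support vector regressor by cross-validated RMSE. The synthetic data, search bounds, and swarm settings are all arbitrary assumptions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(120, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 120)       # stand-in for sensor error data

def fitness(p):  # p = (log10 C, log10 gamma); lower cross-validated RMSE is better
    model = SVR(C=10**p[0], gamma=10**p[1])
    mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    return np.sqrt(mse)

# Basic PSO over the 2-D hyperparameter space
n_part, n_iter = 12, 30
lo, hi = np.array([-2.0, -3.0]), np.array([3.0, 1.0])
pos = rng.uniform(lo, hi, (n_part, 2)); vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()
for _ in range(n_iter):
    r1, r2 = rng.uniform(size=(2, n_part, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    f = np.array([fitness(p) for p in pos])
    better = f < pbest_f
    pbest[better], pbest_f[better] = pos[better], f[better]
    gbest = pbest[pbest_f.argmin()].copy()
print("best (log10 C, log10 gamma):", gbest, "CV RMSE:", pbest_f.min())
```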
Estimation of genetic parameters for milk yield in Murrah buffaloes by Bayesian inference.
Breda, F C; Albuquerque, L G; Euclydes, R F; Bignardi, A B; Baldi, F; Torres, R A; Barbosa, L; Tonhati, H
2010-02-01
Random regression models were used to estimate genetic parameters for test-day milk yield in Murrah buffaloes using Bayesian inference. Data comprised 17,935 test-day milk records from 1,433 buffaloes. Twelve models were tested using different combinations of third-, fourth-, fifth-, sixth-, and seventh-order orthogonal polynomials of weeks of lactation for additive genetic and permanent environmental effects. All models included the fixed effects of contemporary group and number of daily milkings, and age of cow at calving as a covariate (linear and quadratic effects). In addition, residual variances were considered to be heterogeneous with 6 classes of variance. Models were selected based on the residual mean square error, weighted average of residual variance estimates, and estimates of variance components, heritabilities, correlations, eigenvalues, and eigenfunctions. Results indicated that changes in the order of fit for additive genetic and permanent environmental random effects influenced the estimation of genetic parameters. Heritability estimates ranged from 0.19 to 0.31. Genetic correlation estimates were close to unity between adjacent test-day records, but decreased gradually as the interval between test-days increased. Results from mean squared error and weighted averages of residual variance estimates suggested that a model considering sixth- and seventh-order Legendre polynomials for additive and permanent environmental effects, respectively, and 6 classes for residual variances, provided the best fit. Nevertheless, this model presented the largest degree of complexity. A more parsimonious model, with fourth- and sixth-order polynomials, respectively, for these same effects, yielded very similar genetic parameter estimates. Therefore, this last model is recommended for routine applications. Copyright 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
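Random regression models of this kind expand week of lactation on Legendre polynomials. The snippet below is an illustrative sketch of building the normalized Legendre covariates that would enter the design matrices for the additive genetic and permanent environmental effects; the polynomial order and week range shown are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(weeks, order):
    """Normalized Legendre polynomial covariates for weeks of lactation.

    Weeks are rescaled to [-1, 1]; column j holds the j-th order polynomial."""
    w = np.asarray(weeks, dtype=float)
    x = 2 * (w - w.min()) / (w.max() - w.min()) - 1          # standardize to [-1, 1]
    # Normalization constant sqrt((2j+1)/2) gives the commonly used scaled basis
    cols = [np.sqrt((2 * j + 1) / 2) * legendre.legval(x, np.eye(order + 1)[j])
            for j in range(order + 1)]
    return np.column_stack(cols)

Z = legendre_covariates(np.arange(1, 45), order=4)   # e.g. 4th-order fit, 44 weekly test days
print(Z.shape)                                       # (44, 5): intercept + 4 polynomial terms
```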
Harasym, Peter H; Woloschuk, Wayne; Cunning, Leslie
2008-12-01
Physician-patient communication is a clinical skill that can be learned and has a positive impact on patient satisfaction and health outcomes. A concerted effort at all medical schools is now directed at teaching and evaluating this core skill. Student communication skills are often assessed by an Objective Structured Clinical Examination (OSCE). However, it is unknown what sources of error variance are introduced into examinee communication scores by various OSCE components. This study primarily examined the effect different examiners had on the evaluation of students' communication skills assessed at the end of a family medicine clerkship rotation. The communication performance of clinical clerks from the Classes of 2005 and 2006 was assessed using six OSCE stations. Performance was rated at each station using the 28-item Calgary-Cambridge guide. Item Response Theory analysis using a multifaceted Rasch model was used to partition the various sources of error variance and generate a "true" communication score from which the effects of examiner, case, and items are removed. Variance and reliability of scores were as follows: communication scores (.20 and .87), examiner stringency/leniency (.86 and .91), case (.03 and .96), and item (.86 and .99), respectively. All facet scores were reliable (.87-.99). Examiner variance (.86) was more than four times the examinee variance (.20). About 11% of the clerks' outcome status shifted using "true" rather than observed/raw scores. There was large variability in examinee scores due to variation in examiner stringency/leniency behaviors that may impact pass-fail decisions. Exploring the benefits of examiner training and employing "true" scores generated using Item Response Theory analyses prior to making pass/fail decisions are recommended.
Validating Variance Similarity Functions in the Entrainment Zone
NASA Astrophysics Data System (ADS)
Osman, M.; Turner, D. D.; Heus, T.; Newsom, R. K.
2017-12-01
In previous work, the water vapor variance in the entrainment zone was proposed to be proportional to the convective velocity scale, the gradient of the water vapor mixing ratio, and the Brunt-Vaisala frequency in the interfacial layer, while the variance of the vertical wind in the entrainment zone was defined in terms of the convective velocity scale. The variances in the entrainment zone have been hypothesized to depend on two distinct functions, which also depend on the Richardson number. To the best of our knowledge, these hypotheses have never been tested observationally. Simultaneous measurements of the eddy-correlation surface flux, wind shear profiles from wind profilers, and variance profiles of vertical motion and water vapor from Doppler and Raman lidars, respectively, provide a unique opportunity to thoroughly examine and validate the functions used in defining the variances. These observations were made over the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) site. We have identified about 30 cases from 2016 during which the convective boundary layer (CBL) was quasi-stationary and well mixed for at least 2 hours. The vertical profiles of turbulent fluctuations of the vertical wind and water vapor have been derived for a set of 2-h time series using an autocovariance technique to separate out the instrument random error. The error analysis of the lidar observations demonstrates that the lidars are capable of resolving the vertical structure of turbulence around the entrainment zone. Therefore, utilizing this unique combination of observations, this study focuses on extensively testing the hypotheses that the second-order moments are indeed proportional to the functions that also depend on the Richardson number. The coefficients used in defining the functions will also be determined observationally and compared against the values suggested by large-eddy simulation (LES) studies.
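The "autocovariance technique" referred to is commonly implemented by extrapolating the autocovariance function from non-zero lags back to lag zero, since uncorrelated instrument noise only inflates the lag-0 value. A minimal sketch of that idea is below; the linear extrapolation and the choice of lags 1-4 are assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

def noise_free_variance(x, lags=(1, 2, 3, 4)):
    """Estimate signal variance of a time series after removing white instrument noise.

    Autocovariances at lags >= 1 are unaffected by uncorrelated noise, so a fit
    through those lags extrapolated to lag 0 approximates the true signal variance."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    acov = np.array([np.sum(x[:n - k] * x[k:]) / n for k in (0,) + tuple(lags)])
    coef = np.polyfit(list(lags), acov[1:], 1)     # straight-line fit over lags 1..4
    signal_var = np.polyval(coef, 0.0)             # extrapolate back to lag 0
    noise_var = acov[0] - signal_var               # remainder attributed to instrument noise
    return signal_var, noise_var

rng = np.random.default_rng(2)
w = np.convolve(rng.normal(size=7200), np.ones(20) / 20, mode="same")  # correlated "turbulence"
print(noise_free_variance(w + rng.normal(0, 0.1, size=w.size)))
```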
Optimal design criteria - prediction vs. parameter estimation
NASA Astrophysics Data System (ADS)
Waldl, Helmut
2014-05-01
G-optimality is a popular design criterion for optimal prediction; it tries to minimize the kriging variance over the whole design region. A G-optimal design minimizes the maximum variance of all predicted values. If we use kriging methods for prediction, it is natural to use the kriging variance as a measure of uncertainty for the estimates. However, computing the kriging variance, and even more so the empirical kriging variance, is computationally very costly, and finding the maximum kriging variance in high-dimensional regions can be so time-demanding that the G-optimal design cannot really be found in practice with currently available computing equipment. We cannot always avoid this problem by using space-filling designs, because small designs that minimize the empirical kriging variance are often non-space-filling. D-optimality is the design criterion related to parameter estimation. A D-optimal design maximizes the determinant of the information matrix of the estimates. D-optimality in terms of trend parameter estimation and D-optimality in terms of covariance parameter estimation yield fundamentally different designs. The Pareto frontier of these two competing determinant criteria corresponds to designs that perform well under both criteria. Under certain conditions, searching for the G-optimal design on this Pareto frontier yields almost as good results as searching for the G-optimal design in the whole design region, while the maximum of the empirical kriging variance has to be computed only a few times. The method is demonstrated by means of a computer simulation experiment based on data provided by the Belgian institute Management Unit of the North Sea Mathematical Models (MUMM) that describe the evolution of inorganic and organic carbon and nutrients, phytoplankton, bacteria and zooplankton in the Southern Bight of the North Sea.
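For readers unfamiliar with the G-criterion, the sketch below evaluates the simple-kriging prediction variance on a candidate grid and returns its maximum for a given design. The Gaussian covariance model, its parameters, and the random design are placeholders for illustration, not the study's setup.

```python
import numpy as np

def kriging_variance(design, candidates, sill=1.0, corr_len=0.3):
    """Simple-kriging prediction variance at candidate points for a given design."""
    def cov(a, b):  # Gaussian covariance model (placeholder choice)
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return sill * np.exp(-(d / corr_len) ** 2)
    K = cov(design, design) + 1e-10 * np.eye(len(design))    # design covariance matrix
    k = cov(design, candidates)                              # design-to-candidate covariances
    return sill - np.sum(k * np.linalg.solve(K, k), axis=0)  # sigma^2 - k' K^-1 k

def g_criterion(design, candidates):
    return kriging_variance(design, candidates).max()        # G-criterion: worst-case variance

rng = np.random.default_rng(3)
cand = rng.uniform(size=(400, 2))     # dense grid over the design region
design = rng.uniform(size=(15, 2))    # a 15-point candidate design
print(g_criterion(design, cand))
```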
An Empirical State Error Covariance Matrix for Batch State Estimation
NASA Technical Reports Server (NTRS)
Frisbee, Joseph H., Jr.
2011-01-01
State estimation techniques serve effectively to provide mean state estimates. However, the state error covariance matrices provided as part of these techniques are often not fully trusted to adequately describe the uncertainty in the estimated states. A specific problem with the traditional form of state error covariance matrices is that they represent only a mapping of the assumed observation error characteristics into the state space. Any errors that arise from other sources (environment modeling, precision, etc.) are not directly represented in a traditional, theoretical state error covariance matrix. Consider that an actual observation contains only measurement error and that an estimated observation contains all other errors, known and unknown. It then follows that a measurement residual (the difference between expected and observed measurements) contains all errors for that measurement. Therefore, a direct and appropriate inclusion of the actual measurement residuals in the state error covariance matrix will result in an empirical state error covariance matrix. This empirical state error covariance matrix will fully account for the error in the state estimate. By way of a literal reinterpretation of the equations involved in the weighted least squares estimation algorithm, it is possible to arrive at an appropriate, and formally correct, empirical state error covariance matrix. The first specific step of the method is to use the average form of the weighted measurement residual variance performance index rather than its usual total weighted residual form. Next, it is helpful to interpret the solution to the normal equations as the average of a collection of sample vectors drawn from a hypothetical parent population. From here, using a standard statistical analysis approach, it follows directly how to determine the standard empirical state error covariance matrix. This matrix will contain the total uncertainty in the state estimate, regardless of the source of the uncertainty. Also, in its most straightforward form, the technique only requires supplemental calculations to be added to existing batch algorithms. The generation of this direct, empirical form of the state error covariance matrix is independent of the dimensionality of the observations. Mixed degrees of freedom for an observation set are allowed. As is the case with any simple, empirical sample variance problem, the presented approach offers an opportunity (at least in the case of weighted least squares) to investigate confidence interval estimates for the error covariance matrix elements. The diagonal, or variance, terms of the error covariance matrix have a particularly simple form to associate with either a multiple-degree-of-freedom chi-square distribution (more approximate) or with a gamma distribution (less approximate). The off-diagonal, or covariance, terms of the matrix are less clear in their statistical behavior. However, the off-diagonal covariance matrix elements still lend themselves to standard confidence interval error analysis. The distributional forms associated with the off-diagonal terms are more varied and, perhaps, more approximate than those associated with the diagonal terms. Using a simple weighted least squares sample problem, results obtained through use of the proposed technique are presented. The example consists of a simple, two-observer triangulation problem with range-only measurements. Variations of this problem reflect an ideal case (perfect knowledge of the range errors) and a mismodeled case (incorrect knowledge of the range errors).
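One simple way to realize the idea of folding actual residuals into the covariance, sketched below for a linear measurement model, is to scale the theoretical weighted-least-squares covariance by the average weighted residual variance. This follows the spirit of the report but is not necessarily its exact algebra; the degrees-of-freedom scaling and the toy data are assumptions.

```python
import numpy as np

def wls_with_empirical_covariance(H, y, W):
    """Weighted least squares with a residual-based ("empirical") covariance scaling.

    H: (m, n) measurement matrix, y: (m,) observations, W: (m, m) weight matrix."""
    N = H.T @ W @ H                           # normal matrix
    x_hat = np.linalg.solve(N, H.T @ W @ y)
    P_theory = np.linalg.inv(N)               # traditional covariance: assumed obs errors only
    r = y - H @ x_hat                         # residuals carry *all* error sources
    m, n = H.shape
    scale = (r @ W @ r) / (m - n)             # average weighted residual variance (an assumption)
    return x_hat, P_theory, scale * P_theory  # empirical covariance

# Toy 2-parameter example where the assumed noise level is too optimistic
rng = np.random.default_rng(4)
H = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])
y = H @ np.array([1.0, 2.0]) + rng.normal(0, 0.5, 50)   # true noise sigma = 0.5
W = np.eye(50) / 0.1**2                                  # weights assume sigma = 0.1
x_hat, P_th, P_emp = wls_with_empirical_covariance(H, y, W)
print(np.sqrt(np.diag(P_th)), np.sqrt(np.diag(P_emp)))   # empirical sigmas are realistically larger
```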
Data assimilation method based on the constraints of confidence region
NASA Astrophysics Data System (ADS)
Li, Yong; Li, Siming; Sheng, Yao; Wang, Luheng
2018-03-01
The ensemble Kalman filter (EnKF) is a distinguished data assimilation method that is widely used and studied in various fields, including meteorology and oceanography. However, due to the limited sample size or an imprecise dynamics model, the forecast error variance is often underestimated, which further leads to the phenomenon of filter divergence. Additionally, the assimilation results in the initial stage are poor if the initial condition settings differ greatly from the true initial state. To address these problems, a variance inflation procedure is usually adopted. In this paper, we propose a new method, called EnCR, based on the constraints of a confidence region constructed from the observations, to estimate the inflation parameter of the forecast error variance in the EnKF method. In the new method, the state estimate is more robust to both inaccurate forecast models and initial condition settings. The new method is compared with other adaptive data assimilation methods in the Lorenz-63 and Lorenz-96 models under various model parameter settings. The simulation results show that the new method performs better than the competing methods.
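A compact stochastic-EnKF analysis step with multiplicative inflation of the forecast spread is sketched below. The inflation factor is fixed here for illustration, whereas the EnCR method of the paper would estimate it from a confidence region built from the observations; the toy state and observation operator are assumptions.

```python
import numpy as np

def enkf_analysis(ens, y_obs, H, obs_var, inflation=1.1, seed=0):
    """Stochastic EnKF analysis step with multiplicative covariance inflation.

    ens: (n_state, n_members) forecast ensemble; H: (n_obs, n_state) observation operator."""
    rng = np.random.default_rng(seed)
    mean = ens.mean(axis=1, keepdims=True)
    ens = mean + inflation * (ens - mean)             # inflate forecast spread
    X = ens - ens.mean(axis=1, keepdims=True)
    Pf = X @ X.T / (ens.shape[1] - 1)                 # sample forecast error covariance
    R = obs_var * np.eye(len(y_obs))
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # Kalman gain
    Y = y_obs[:, None] + rng.normal(0, np.sqrt(obs_var), (len(y_obs), ens.shape[1]))
    return ens + K @ (Y - H @ ens)                    # perturbed-observation update

# Toy example: 3-variable state, 20 members, observe the first two components
rng = np.random.default_rng(5)
ens = rng.normal(size=(3, 20))
H = np.array([[1., 0., 0.], [0., 1., 0.]])
print(enkf_analysis(ens, np.array([0.5, -0.2]), H, obs_var=0.04).mean(axis=1))
```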
NASA Astrophysics Data System (ADS)
Dilla, Shintia Ulfa; Andriyana, Yudhie; Sudartianto
2017-03-01
Acid rain causes many harmful effects. It is formed by two strong acids, sulfuric acid (H2SO4) and nitric acid (HNO3), where sulfuric acid is derived from SO2 and nitric acid from NOx (x = 1, 2). The purpose of this research is to determine the influence of the SO4 and NO3 levels contained in rain on the acidity (pH) of rainwater. The data are incomplete panel data with a two-way error component model. Panel data are a collection of observations on the same units observed repeatedly over time; the panel is said to be incomplete if individuals have different numbers of observations. The model used in this research is a random effects model (REM). Minimum variance quadratic unbiased estimation (MIVQUE) is used to estimate the variance of the error components, while maximum likelihood estimation is used to estimate the parameters. As a result, we obtain the following model: Ŷ* = 0.41276446 - 0.00107302X1 + 0.00215470X2.
Harris, Alexandre M.; DeGiorgio, Michael
2016-01-01
Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H̃_BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H̃_BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H̃_BLUE leads to improved estimates of the population differentiation statistic, FST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
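For context, the classical unbiased estimator that corrects only for sample size is easy to state; a sketch is below. It uses plain sample allele proportions, which is exactly what H̃_BLUE replaces with best linear unbiased frequency estimates when relatives or inbreeding are present. The toy genotype matrix is an assumption.

```python
import numpy as np

def expected_heterozygosity(genotypes):
    """Nei's unbiased gene diversity from diploid genotype calls at one locus.

    genotypes: (n_individuals, 2) array of allele labels. Uses sample proportions,
    so it remains downwardly biased when individuals are related or inbred."""
    alleles = np.asarray(genotypes).ravel()
    n = alleles.size                                    # number of allele copies (2N)
    p = np.unique(alleles, return_counts=True)[1] / n   # sample allele frequencies
    return (n / (n - 1)) * (1.0 - np.sum(p**2))

geno = np.array([[0, 1], [1, 1], [0, 2], [2, 2], [0, 1]])   # hypothetical 5 individuals
print(expected_heterozygosity(geno))
```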
Can, Dilara Deniz; Ginsburg-Block, Marika; Golinkoff, Roberta Michnick; Hirsh-Pasek, Kathryn
2013-09-01
This longitudinal study examined the predictive validity of the MacArthur Communicative Developmental Inventories-Short Form (CDI-SF), a parent report questionnaire about children's language development (Fenson, Pethick, Renda, Cox, Dale & Reznick, 2000). Data were first gathered from parents on the CDI-SF vocabulary scores for seventy-six children (mean age = 1;10). Four years later (mean age = 6;1), children were assessed on language outcomes (expressive vocabulary, syntax, semantics and pragmatics) and code-related skills, including phonemic awareness, word recognition and decoding skills. Hierarchical regression analyses revealed that early expressive vocabulary accounted for 17% of the variance in picture vocabulary, 11% of the variance in syntax, and 7% of the variance in semantics, while not accounting for any variance in pragmatics in kindergarten. CDI-SF scores did not predict code-related skills in kindergarten. The importance of early vocabulary skills for later language development and the value of the CDI-SF as a research tool are discussed.
Estimation of sampling error uncertainties in observed surface air temperature change in China
NASA Astrophysics Data System (ADS)
Hua, Wei; Shen, Samuel S. P.; Weithmann, Alexander; Wang, Huijun
2017-08-01
This study examines the sampling error uncertainties in the monthly surface air temperature (SAT) change in China over recent decades, focusing on the uncertainties of gridded data, national averages, and linear trends. Results indicate that large sampling error variances appear in the station-sparse areas of northern and western China, with maximum values exceeding 2.0 K2, while small sampling error variances are found in the station-dense areas of southern and eastern China, with most grid values being less than 0.05 K2. In general, negative temperature anomalies existed in each month prior to the 1980s, and warming began thereafter, accelerating in the early and mid-1990s. An increasing trend in the SAT series was observed for each month of the year, with the largest temperature increase and highest uncertainty of 0.51 ± 0.29 K (10 year)-1 occurring in February and the weakest trend and smallest uncertainty of 0.13 ± 0.07 K (10 year)-1 in August. The sampling error uncertainties in the national average annual mean SAT series are not sufficiently large to alter the conclusion of persistent warming in China. In addition, the sampling error uncertainties in the SAT series differ clearly from those obtained with other uncertainty estimation methods, which is a plausible reason for the inconsistencies between our estimate and other studies during this period.
Systematic Error Study for ALICE charged-jet v2 Measurement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heinz, M.; Soltz, R.
We study the treatment of systematic errors in the determination of v2 for charged jets in √sNN = 2.76 TeV Pb-Pb collisions by the ALICE Collaboration. Working with the reported values and errors for the 0-5% centrality data, we evaluate the χ2 according to the formulas given for the statistical and systematic errors, where the latter are separated into correlated and shape contributions. We reproduce both the χ2 and p-values relative to a null (zero) result. We then re-cast the systematic errors into an equivalent covariance matrix and obtain identical results, demonstrating that the two methods are equivalent.
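Recasting correlated systematic errors into a covariance matrix amounts to evaluating χ² = rᵀC⁻¹r with C the sum of the statistical and systematic covariance pieces. A minimal sketch follows; a fully correlated systematic component and the toy v2 points are assumptions made purely for illustration.

```python
import numpy as np

def chi2_with_covariance(data, model, stat_err, sys_err):
    """Chi-square of data vs. model with statistical plus correlated systematic errors.

    stat_err: uncorrelated (diagonal) errors; sys_err: systematic errors treated as
    fully correlated across points (an illustrative assumption)."""
    r = np.asarray(data) - np.asarray(model)
    C = np.diag(np.asarray(stat_err) ** 2) + np.outer(sys_err, sys_err)
    return r @ np.linalg.solve(C, r)

# Toy v2 points tested against a null (zero) result
v2   = np.array([0.03, 0.05, 0.04, 0.02])
stat = np.array([0.010, 0.010, 0.012, 0.015])
sys  = np.array([0.008, 0.008, 0.010, 0.010])
print(chi2_with_covariance(v2, np.zeros(4), stat, sys))
```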
Nelson, Lindsay D.; Patrick, Christopher J.; Bernat, Edward M.
2010-01-01
The externalizing dimension is viewed as a broad dispositional factor underlying risk for numerous disinhibitory disorders. Prior work has documented deficits in event-related brain potential (ERP) responses in individuals prone to externalizing problems. Here, we constructed a direct physiological index of externalizing vulnerability from three ERP indicators and evaluated its validity in relation to criterion measures in two distinct domains: psychometric and physiological. The index was derived from three ERP measures that covaried in their relations with externalizing proneness: the error-related negativity and two variants of the P3. Scores on this ERP composite predicted psychometric criterion variables and accounted for externalizing-related variance in the P3 response from a separate task. These findings illustrate how a diagnostic construct can be operationalized as a composite (multivariate) psychophysiological variable (phenotype). PMID:20573054
Perfectionism and Personality Disorders as Predictors of Symptoms and Interpersonal Problems.
Dimaggio, Giancarlo; Lysaker, Paul H; Calarco, Teresa; Pedone, Roberto; Marsigli, Nicola; Riccardi, Ilaria; Sabatelli, Beatrice; Carcione, Antonino; Paviglianiti, Alessandra
2015-01-01
Maladaptive perfectionism is a common factor in many disorders and is correlated with some personality dysfunctions. Less clear is how dimensions such as concern over mistakes, doubts about actions, and parental criticism are linked to overall suffering. Additionally, correlations between perfectionism and personality disorders are poorly explored in clinical samples. In this study we compared treatment-seeking individuals (n = 93) and a community sample (n = 100) on dimensions of maladaptive perfectionism, personality disorders, symptoms, and interpersonal problems. Results in both samples revealed that maladaptive perfectionism was strongly associated with general suffering, interpersonal problems, and a broad range of personality disordered traits. Excessive concern over one's errors, and to some extent doubts about actions, predicted unique additional variance beyond the presence of personality pathology in explaining symptoms and interpersonal problems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gonçalves, Fabio; Treuhaft, Robert; Law, Beverly
Mapping and monitoring of forest carbon stocks across large areas in the tropics will necessarily rely on remote sensing approaches, which in turn depend on field estimates of biomass for calibration and validation purposes. Here, we used field plot data collected in a tropical moist forest in the central Amazon to gain a better understanding of the uncertainty associated with plot-level biomass estimates obtained specifically for the calibration of remote sensing measurements. In addition to accounting for sources of error that would normally be expected in conventional biomass estimates (e.g., measurement and allometric errors), we examined two sources of uncertainty that are specific to the calibration process and should be taken into account in most remote sensing studies: the error resulting from spatial disagreement between field and remote sensing measurements (i.e., co-location error), and the error introduced when accounting for temporal differences in data acquisition. We found that the overall uncertainty in the field biomass was typically 25% for both secondary and primary forests, but ranged from 16 to 53%. Co-location and temporal errors accounted for a large fraction of the total variance (>65%) and were identified as important targets for reducing uncertainty in studies relating tropical forest biomass to remotely sensed data. Although measurement and allometric errors were relatively unimportant when considered alone, combined they accounted for roughly 30% of the total variance on average and should not be ignored. Lastly, our results suggest that a thorough understanding of the sources of error associated with field-measured plot-level biomass estimates in tropical forests is critical to determine confidence in remote sensing estimates of carbon stocks and fluxes, and to develop strategies for reducing the overall uncertainty of remote sensing approaches.
Can Family Planning Service Statistics Be Used to Track Population-Level Outcomes?
Magnani, Robert J; Ross, John; Williamson, Jessica; Weinberger, Michelle
2018-03-21
The need for annual family planning program tracking data under the Family Planning 2020 (FP2020) initiative has contributed to renewed interest in family planning service statistics as a potential data source for annual estimates of the modern contraceptive prevalence rate (mCPR). We sought to assess (1) how well a set of commonly recorded data elements in routine service statistics systems could, with some fairly simple adjustments, track key population-level outcome indicators, and (2) whether some data elements performed better than others. We used data from 22 countries in Africa and Asia to analyze 3 data elements collected from service statistics: (1) number of contraceptive commodities distributed to clients, (2) number of family planning service visits, and (3) number of current contraceptive users. Data quality was assessed via analysis of mean square errors, using the United Nations Population Division World Contraceptive Use annual mCPR estimates as the "gold standard." We also examined the magnitude of several components of measurement error: (1) variance, (2) level bias, and (3) slope (or trend) bias. Our results indicate modest levels of tracking error for data on commodities to clients (7%) and service visits (10%), and somewhat higher error rates for data on current users (19%). Variance and slope bias were relatively small for all data elements. Level bias was by far the largest contributor to tracking error. Paired comparisons of data elements in countries that collected at least 2 of the 3 data elements indicated a modest advantage of data on commodities to clients. None of the data elements considered was sufficiently accurate to be used to produce reliable stand-alone annual estimates of mCPR. However, the relatively low levels of variance and slope bias indicate that trends calculated from these 3 data elements can be productively used in conjunction with the Family Planning Estimation Tool (FPET) currently used to produce annual mCPR tracking estimates for FP2020. © Magnani et al.
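The decomposition of tracking error into variance, level bias, and slope (trend) bias can be illustrated by regressing the annual errors of a data element on time, as sketched below. This is a generic decomposition written for illustration; it is not necessarily the exact algebra used by the authors, and the numbers shown are invented.

```python
import numpy as np

def error_components(estimates, gold_standard, years):
    """Split tracking error into level bias, slope (trend) bias, and residual variance."""
    e = np.asarray(estimates) - np.asarray(gold_standard)   # annual tracking errors
    t = np.asarray(years) - np.mean(years)                  # centered time
    slope, level = np.polyfit(t, e, 1)                      # e_t ~ level + slope * t
    resid = e - (level + slope * t)
    return {"level_bias": level, "slope_bias": slope,
            "variance": resid.var(ddof=1), "mse": np.mean(e**2)}

mcpr_gold = np.array([20.1, 21.0, 22.2, 23.1, 24.0])   # "gold standard" mCPR (%), illustrative
mcpr_svc  = np.array([23.0, 23.8, 25.3, 26.0, 27.2])   # service-statistics-based estimate
print(error_components(mcpr_svc, mcpr_gold, years=np.arange(2012, 2017)))
```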
Jacob, Benjamin G; Griffith, Daniel A; Muturi, Ephantus J; Caamano, Erick X; Githure, John I; Novak, Robert J
2009-01-01
Background Autoregressive regression coefficients for Anopheles arabiensis aquatic habitat models are usually assessed using global error techniques and are reported as error covariance matrices. A global statistic, however, will summarize error estimates from multiple habitat locations. This makes it difficult to identify where there are clusters of An. arabiensis aquatic habitats of acceptable prediction. It is therefore useful to conduct some form of spatial error analysis to detect clusters of An. arabiensis aquatic habitats based on uncertainty residuals from individual sampled habitats. In this research, a method of error estimation for spatial simulation models was demonstrated using autocorrelation indices and eigenfunction spatial filters to distinguish among the effects of parameter uncertainty on a stochastic simulation of ecologically sampled Anopheles aquatic habitat covariates. A test for diagnostic checking of error residuals in an An. arabiensis aquatic habitat model may enable intervention efforts targeting productive habitat clusters, based on larval/pupal productivity, by using the asymptotic distribution of parameter estimates from a residual autocovariance matrix. The models considered in this research extend a normal regression analysis previously considered in the literature. Methods Field and remote-sampled data were collected from July 2006 to December 2007 in the Karima rice-village complex in Mwea, Kenya. SAS 9.1.4® was used to explore univariate statistics, correlations, and distributions, and to generate global autocorrelation statistics from the ecologically sampled datasets. A local autocorrelation index was also generated using spatial covariance parameters (i.e., Moran's Indices) in a SAS/GIS® database. The Moran's statistic was decomposed into orthogonal and uncorrelated synthetic map pattern components using a Poisson model with a gamma-distributed mean (i.e., negative binomial regression). The eigenfunction values from the spatial configuration matrices were then used to define expectations for prior distributions using a Markov chain Monte Carlo (MCMC) algorithm. A set of posterior means was defined in WinBUGS 1.4.3®. After the model had converged, samples from the conditional distributions were used to summarize the posterior distribution of the parameters. Thereafter, a spatial residual trend analysis was used to evaluate variance uncertainty propagation in the model using an autocovariance error matrix. Results By specifying coefficient estimates in a Bayesian framework, the covariate number of tillers was found to be a significant predictor, positively associated with An. arabiensis aquatic habitats. The spatial filter models accounted for approximately 19% redundant locational information in the ecologically sampled An. arabiensis aquatic habitat data. In the residual error estimation model there was significant positive autocorrelation (i.e., clustering of habitats in geographic space) based on log-transformed larval/pupal data and the sampled covariate depth of habitat. Conclusion An autocorrelation error covariance matrix and a spatial filter analysis can prioritize mosquito control strategies by providing a computationally attractive and feasible description of variance uncertainty estimates for correctly identifying clusters of prolific An. arabiensis aquatic habitats based on larval/pupal productivity. PMID:19772590
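A minimal computation of the global Moran's I used in such spatial autocorrelation diagnostics is sketched below. The inverse-distance weight matrix and the simulated habitat data are assumptions for illustration, not the study's SAS/GIS specification.

```python
import numpy as np

def morans_i(values, coords):
    """Global Moran's I with inverse-distance spatial weights (zero diagonal)."""
    z = np.asarray(values, float) - np.mean(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    with np.errstate(divide="ignore"):
        w = np.where(d > 0, 1.0 / d, 0.0)              # inverse-distance weights
    n = len(z)
    return (n / w.sum()) * (z @ w @ z) / np.sum(z**2)

rng = np.random.default_rng(6)
coords = rng.uniform(size=(40, 2))                      # sampled habitat locations
larvae = coords[:, 0] * 3 + rng.normal(0, 0.5, 40)      # spatially structured counts
print(morans_i(larvae, coords))                         # > 0 indicates clustering
```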
Analytic variance estimates of Swank and Fano factors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gutierrez, Benjamin; Badano, Aldo; Samuelson, Frank, E-mail: frank.samuelson@fda.hhs.gov
Purpose: Variance estimates for detector energy resolution metrics can be used as stopping criteria in Monte Carlo simulations for the purpose of ensuring a small uncertainty of those metrics and for the design of variance reduction techniques. Methods: The authors derive an estimate for the variance of two energy resolution metrics, the Swank factor and the Fano factor, in terms of statistical moments that can be accumulated without significant computational overhead. The authors examine the accuracy of these two estimators and demonstrate how the estimates of the coefficient of variation of the Swank and Fano factors behave with data from a Monte Carlo simulation of an indirect x-ray imaging detector. Results: The authors' analyses suggest that the accuracy of their variance estimators is appropriate for estimating the actual variances of the Swank and Fano factors for a variety of distributions of detector outputs. Conclusions: The variance estimators derived in this work provide a computationally convenient way to estimate the error or coefficient of variation of the Swank and Fano factors during Monte Carlo simulations of radiation imaging systems.
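For reference, the two metrics themselves are simple moment ratios of the detector's per-event output (pulse-height) distribution; a sketch of computing them from simulated outputs is below. The variance estimators derived in the paper are not reproduced here, and the gamma-distributed outputs are a placeholder.

```python
import numpy as np

def swank_factor(outputs):
    """Swank factor I = M1^2 / (M0 * M2) from per-event detector outputs."""
    x = np.asarray(outputs, float)
    m0, m1, m2 = 1.0, x.mean(), np.mean(x**2)   # raw moments (M0 normalized to 1)
    return m1**2 / (m0 * m2)

def fano_factor(outputs):
    """Fano factor: variance-to-mean ratio of the detector output."""
    x = np.asarray(outputs, float)
    return x.var(ddof=1) / x.mean()

pulses = np.random.default_rng(7).gamma(shape=50, scale=20, size=100_000)  # simulated outputs
print(swank_factor(pulses), fano_factor(pulses))
```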
An application of the LC-LSTM framework to the self-esteem instability case.
Alessandri, Guido; Vecchione, Michele; Donnellan, Brent M; Tisak, John
2013-10-01
The present research evaluates the stability of self-esteem as assessed by a daily version of the Rosenberg (Society and the adolescent self-image, Princeton University Press, Princeton, 1965) general self-esteem scale (RGSE). The scale was administered to 391 undergraduates for five consecutive days. The longitudinal data were analyzed using the integrated LC-LSTM framework that allowed us to evaluate: (1) the measurement invariance of the RGSE, (2) its stability and change across the 5-day assessment period, (3) the amount of variance attributable to stable and transitory latent factors, and (4) the criterion-related validity of these factors. Results provided evidence for measurement invariance, mean-level stability, and rank-order stability of daily self-esteem. Latent state-trait analyses revealed that variances in scores of the RGSE can be decomposed into six components: stable self-esteem (40 %), ephemeral (or temporal-state) variance (36 %), stable negative method variance (9 %), stable positive method variance (4 %), specific variance (1 %) and random error variance (10 %). Moreover, latent factors associated with daily self-esteem were associated with measures of depression, implicit self-esteem, and grade point average.
Analysis of conditional genetic effects and variance components in developmental genetics.
Zhu, J
1995-12-01
A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects.
Adaptive use of research aircraft data sets for hurricane forecasts
NASA Astrophysics Data System (ADS)
Biswas, M. K.; Krishnamurti, T. N.
2008-02-01
This study uses an adaptive observational strategy for hurricane forecasting. It shows the impacts of Lidar Atmospheric Sensing Experiment (LASE) and dropsonde data sets from Convection and Moisture Experiment (CAMEX) field campaigns on hurricane track and intensity forecasts. The following cases are used in this study: Bonnie, Danielle and Georges of 1998 and Erin, Gabrielle and Humberto of 2001. A single model run for each storm is carried out using the Florida State University Global Spectral Model (FSUGSM) with the European Center for Medium Range Weather Forecasts (ECMWF) analysis as initial conditions, in addition to 50 other model runs where the analysis is randomly perturbed for each storm. The centers of maximum variance of the DLM heights are located from the forecast error variance fields at the 84-hr forecast. Back correlations are then performed using the centers of these maximum variances and the fields at the 36-hr forecast. The regions having the highest correlations in the vicinity of the hurricanes are indicative of regions from where the error growth emanates and suggests the need for additional observations. Data sets are next assimilated in those areas that contain high correlations. Forecasts are computed using the new initial conditions for the storm cases, and track and intensity skills are then examined with respect to the control forecast. The adaptive strategy is capable of identifying sensitive areas where additional observations can help in reducing the hurricane track forecast errors. A reduction of position error by approximately 52% for day 3 of forecast (averaged over 7 storm cases) over the control runs is observed. The intensity forecast shows only a slight positive impact due to the model’s coarse resolution.
Newell, Felicity L.; Sheehan, James; Wood, Petra Bohall; Rodewald, Amanda D.; Buehler, David A.; Keyser, Patrick D.; Larkin, Jeffrey L.; Beachy, Tiffany A.; Bakermans, Marja H.; Boves, Than J.; Evans, Andrea; George, Gregory A.; McDermott, Molly E.; Perkins, Kelly A.; White, Matthew; Wigley, T. Bently
2013-01-01
Point counts are commonly used to assess changes in bird abundance, including analytical approaches such as distance sampling that estimate density. Point-count methods have come under increasing scrutiny because effects of detection probability and field error are difficult to quantify. For seven forest songbirds, we compared fixed-radii counts (50 m and 100 m) and density estimates obtained from distance sampling to known numbers of birds determined by territory mapping. We applied point-count analytic approaches to a typical forest management question and compared results to those obtained by territory mapping. We used a before–after control impact (BACI) analysis with a data set collected across seven study areas in the central Appalachians from 2006 to 2010. Using a 50-m fixed radius, variance in error was at least 1.5 times that of the other methods, whereas a 100-m fixed radius underestimated actual density by >3 territories per 10 ha for the most abundant species. Distance sampling improved accuracy and precision compared to fixed-radius counts, although estimates were affected by birds counted outside 10-ha units. In the BACI analysis, territory mapping detected an overall treatment effect for five of the seven species, and effects were generally consistent each year. In contrast, all point-count methods failed to detect two treatment effects due to variance and error in annual estimates. Overall, our results highlight the need for adequate sample sizes to reduce variance, and skilled observers to reduce the level of error in point-count data. Ultimately, the advantages and disadvantages of different survey methods should be considered in the context of overall study design and objectives, allowing for trade-offs among effort, accuracy, and power to detect treatment effects.
Gebreyesus, Grum; Lund, Mogens S; Buitenhuis, Bart; Bovenhuis, Henk; Poulsen, Nina A; Janss, Luc G
2017-12-05
Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in prediction reliability compared to the univariate versions. Substantial improvement in prediction reliability was possible for most of the traits related to milk protein composition using our novel BayesAS models. Grouping adjacent SNPs into segments provided enhanced information to estimate parameters and allowing the segments to have different (co)variances helped disentangle heterogeneous (co)variances across the genome.
Evaluating concentration estimation errors in ELISA microarray experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daly, Don S.; White, Amanda M.; Varnum, Susan M.
Enzyme-linked immunosorbent assay (ELISA) is a standard immunoassay for predicting a protein concentration in a sample. Deploying ELISA in a microarray format permits simultaneous prediction of the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Evaluating prediction error is critical to interpreting biological significance and improving the ELISA microarray process, and this evaluation must be automated to realize a reliable high-throughput ELISA microarray system. Methods: In this paper, we present a statistical method based on propagation of error to evaluate prediction errors in the ELISA microarray process. Although propagation of error is central to this method, it is effective only when comparable data are available. Therefore, we briefly discuss the roles of experimental design, data screening, normalization and statistical diagnostics when evaluating ELISA microarray prediction errors. We use an ELISA microarray investigation of breast cancer biomarkers to illustrate the evaluation of prediction errors. The illustration begins with a description of the design and resulting data, followed by a brief discussion of data screening and normalization. In our illustration, we fit a standard curve to the screened and normalized data, review the modeling diagnostics, and apply propagation of error.
An Expert System for the Evaluation of Cost Models
1990-09-01
Surviving fragments of the system's hypertext help screens contrast unequal error variance (heteroscedasticity) with the condition of equal error variance, called homoscedasticity, and note that error terms correlated over time are said to be autocorrelated or serially correlated (Reference: Applied Linear Regression Models by John Neter).
NASA Astrophysics Data System (ADS)
Ma, Yuanxu; Huang, He Qing
2016-07-01
Accurate estimation of flow resistance is crucial for flood routing, flow discharge and velocity estimation, and engineering design. Various empirical and semiempirical flow resistance models have been developed during the past century; however, a universal flow resistance model for varying types of rivers has remained difficult to achieve to date. In this study, hydrometric data sets from six stations in the lower Yellow River during 1958-1959 are used to calibrate three empirical flow resistance models (Eqs. (5)-(7)) and evaluate their predictability. A group of statistical measures have been used to evaluate the goodness of fit of these models, including root mean square error (RMSE), coefficient of determination (CD), the Nash coefficient (NA), mean relative error (MRE), mean symmetry error (MSE), percentage of data with a relative error ≤ 50% and 25% (P50, P25), and percentage of data with overestimated error (POE). Three model selection criteria are also employed to assess the model predictability: the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and a modified model selection criterion (MSC). The results show that mean flow depth (d) and water surface slope (S) can only explain a small proportion of the variance in flow resistance. When channel width (w) and suspended sediment concentration (SSC) are involved, the new model (7) achieves a better performance than the previous ones. The MRE of model (7) is generally < 20%, which is apparently better than that reported by previous studies. This model is validated using the data sets from the corresponding stations during 1965-1966, and the results show larger uncertainties than in calibration. This probably resulted from a temporal shift of the dominant controls, caused by channel change under a varying flow regime. With the advancement of earth observation techniques, information about channel width, mean flow depth, and suspended sediment concentration can be effectively extracted from multisource satellite images. We expect that the empirical methods developed in this study can be used as an effective surrogate in the estimation of flow resistance in large sand-bed rivers like the lower Yellow River.
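The goodness-of-fit measures listed can be computed directly from observed and predicted resistance values; a sketch of a few of them (RMSE, Nash coefficient, mean relative error, P50, and POE) is given below. The sample arrays are invented for illustration.

```python
import numpy as np

def fit_metrics(obs, pred):
    """A few goodness-of-fit measures used to compare flow-resistance models."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    err = pred - obs
    rel = np.abs(err) / np.abs(obs)
    return {
        "RMSE": np.sqrt(np.mean(err**2)),
        "NA":   1 - np.sum(err**2) / np.sum((obs - obs.mean())**2),  # Nash coefficient
        "MRE":  np.mean(rel),
        "P50":  100 * np.mean(rel <= 0.5),   # % of data with relative error <= 50%
        "POE":  100 * np.mean(err > 0),      # % of data overestimated
    }

obs  = np.array([0.020, 0.025, 0.030, 0.028, 0.022])   # illustrative resistance values
pred = np.array([0.022, 0.024, 0.027, 0.030, 0.021])
print(fit_metrics(obs, pred))
```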
Non-stationary internal tides observed with satellite altimetry
NASA Astrophysics Data System (ADS)
Ray, R. D.; Zaron, E. D.
2011-09-01
Temporal variability of the internal tide is inferred from a 17-year combined record of Topex/Poseidon and Jason satellite altimeters. A global sampling of along-track sea-surface height wavenumber spectra finds that non-stationary variance is generally 25% or less of the average variance at wavenumbers characteristic of mode-1 tidal internal waves. With some exceptions the non-stationary variance does not exceed 0.25 cm2. The mode-2 signal, where detectable, contains a larger fraction of non-stationary variance, typically 50% or more. Temporal subsetting of the data reveals interannual variability barely significant compared with tidal estimation error from 3-year records. Comparison of summer vs. winter conditions shows only one region of noteworthy seasonal changes, the northern South China Sea. Implications for the anticipated SWOT altimeter mission are briefly discussed.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper, expressions are derived for those regressions that relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Hudson, Nathan W.; Lucas, Richard E.; Donnellan, M. Brent; Kushlev, Kostadin
2017-01-01
Kushlev, Dunn, and Lucas (2015) found that income predicts less daily sadness—but not greater happiness—among Americans. The present study used longitudinal data from an approximately representative German sample to replicate and extend these findings. Our results largely replicated Kushlev and colleagues’: income predicted less daily sadness (albeit with a smaller effect size), but was unrelated to happiness. Moreover, the association between income and sadness could not be explained by demographics, stress, or daily time-use. Extending Kushlev and colleagues’ findings, new analyses indicated that only between-persons variance in income (but not within-persons variance) predicted daily sadness—perhaps because there was relatively little within-persons variance in income. Finally, income predicted less daily sadness and worry, but not less anger or frustration—potentially suggesting that income predicts less “internalizing” but not less “externalizing” negative emotions. Together, our study and Kushlev and colleagues’ provide evidence that income robustly predicts select daily negative emotions—but not positive ones. PMID:29250303
NASA Astrophysics Data System (ADS)
Duan, Wansuo; Zhao, Peng
2017-04-01
Within the Zebiak-Cane model, the nonlinear forcing singular vector (NFSV) approach is used to investigate the role of model errors in the "Spring Predictability Barrier" (SPB) phenomenon within ENSO predictions. NFSV-related errors have the largest negative effect on the uncertainties of El Niño predictions. NFSV errors can be classified into two types: the first is characterized by a zonal dipolar pattern of SST anomalies (SSTA), with the western poles centered in the equatorial central-western Pacific exhibiting positive anomalies and the eastern poles in the equatorial eastern Pacific exhibiting negative anomalies; the second is characterized by a pattern almost opposite to the first type. The first type of error tends to have the worst effects on El Niño growth-phase predictions, whereas the latter often yields the largest negative effects on decaying-phase predictions. The evolution of prediction errors caused by NFSV-related errors exhibits prominent seasonality, with the fastest error growth in the spring and/or summer seasons; hence, these errors result in a significant SPB related to El Niño events. The linear counterpart of NFSVs, the (linear) forcing singular vector (FSV), induces a less significant SPB because it contains smaller prediction errors. Random errors cannot generate an SPB for El Niño events. These results show that the occurrence of an SPB is related to the spatial patterns of tendency errors, and the NFSV tendency errors cause the most significant SPB for El Niño events. In addition, NFSVs often concentrate these large-value errors in a few areas within the equatorial eastern and central-western Pacific, which likely represent the areas sensitive to El Niño predictions associated with model errors. Meanwhile, these areas are also consistent with the sensitive areas related to initial errors determined by previous studies. This implies that additional observations in the sensitive areas would not only improve the accuracy of the initial field but also promote the reduction of model errors to greatly improve ENSO forecasts.
Increasing potential predictability of Indian Summer monsoon active and break spells
NASA Astrophysics Data System (ADS)
Mani, N. J.; Goswami, B.
2009-12-01
An understanding of the limit on potential predictability is crucial for developing appropriate tools for extended range prediction of active/break spells of the Indian summer monsoon (ISM). Global low-frequency changes in climate modulate the annual cycle of the ISM and can influence the intrinsic predictability limit of the ISM intraseasonal oscillations (ISOs). Using 104 years (1901-2004) of daily rainfall data, the change in potential predictability of active and break spells is estimated by an empirical method. Using an ISO index based on 10-90 day filtered precipitation, Goswami and Xavier (2003) showed that monsoon breaks are intrinsically more predictable (20-25 days) than active conditions (10-15 days). In the present study, employing the same method in 15-year sliding windows, we found that the potential predictability of both active and break spells has undergone a rapid increase during the recent three decades. The potential predictability of active spells has increased from 1 week to 2 weeks, while that for break spells has increased from 2 weeks to 3 weeks. This result is interesting and intriguing in the backdrop of the recent finding that the potential predictability of monsoon weather has decreased substantially over the same period compared to earlier decades due to increased potential instability of the atmosphere. The possible roles of internal dynamics and external forcing in producing this change have been explored. The variance among peak active/break conditions shows a steady decrease over the years, indicating less event-to-event variability in the magnitude of ISO peak phases in recent years. The ISO predictability may be closely linked to the error energy cascading from the synoptic scales and the interaction between these scales. Computation of the nonlinear kinetic energy exchange between synoptic and ISO scales in the frequency domain also supports the notion of an ineffectual influence of synoptic-scale errors on the ISO scale. Ref: Goswami, B. N. and P. K. Xavier, 2003, GRL, 30(18), 1966, doi:10.1029/2003GL017810. Fig. 1. Change in potential predictability of rainfall ISOs through a 15-year sliding window: (a) potential predictability for evolution from active to break; (b) potential predictability for evolution from break to active.
Jensen's Inequality Predicts Effects of Environmental Variation
Jonathan J. Ruel; Matthew P. Ayres
1999-01-01
Many biologists now recognize that environmental variance can exert important effects on patterns and processes in nature that are independent of average conditions. Jensen's inequality is a mathematical proof that is seldom mentioned in the ecological literature but which provides a powerful tool for predicting some direct effects of environmental variance in...
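The core of Jensen's inequality is easy to verify numerically: for a convex response function f, E[f(X)] ≥ f(E[X]), so adding environmental variance raises the mean response even when the mean condition is unchanged. The Python sketch below uses a hypothetical convex temperature-response curve and an assumed standard deviation of 5; both are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def performance(temp):
    """Hypothetical convex response curve (e.g. a rate accelerating with temperature)."""
    return 0.05 * temp ** 2

mean_temp = 20.0
constant_env = np.full(100_000, mean_temp)                 # no environmental variance
variable_env = rng.normal(mean_temp, 5.0, size=100_000)    # same mean, sd = 5 (assumed)

# Jensen's inequality: for a convex f, E[f(X)] >= f(E[X]).
print("response at the mean temperature :", performance(mean_temp))
print("mean response, constant climate  :", performance(constant_env).mean())
print("mean response, variable climate  :", performance(variable_env).mean())
# The variable environment yields a higher mean response even though the
# average temperature is identical -- a pure effect of the variance.
```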
DOE Office of Scientific and Technical Information (OSTI.GOV)
Almasbahi, M.S.
In a world of generalized floating exchange rates, it is not enough to solve the problem of exchange rate policy by determining whether to peg or float the currency under consideration. It is also necessary to choose which major currency to peg to. The main purpose of this study is to investigate and determine empirically the optimum currency peg for the Saudi riyal. To accomplish this goal, a simple conventional trade model that includes variables found in many other studies of import and export demand was used. In addition, an exchange rate term was added as a separate independent variable in the import and export demand equations in order to assess the effect of the exchange rate on trade flows. The criteria for the optimal currency peg in this study were based on two factors: first, the error statistics for projected imports and exports using alternative exchange rate regimes; second, the variances of projected imports, exports and trade balance using alternative exchange rate regimes. The exchange rate has a significant impact on Saudi Arabian trade flows, which implies that changes in the riyal's value affect the Saudi trade deficit. Moreover, the exchange rate has a more powerful effect on aggregate imports than on the world demand for Saudi exports. There is also strong support for the hypothesis that the exchange rate affects the value of Saudi bilateral trade with its five major trade partners. On the aggregate level, the SDR peg seems to be the best currency peg for the Saudi riyal since it provides the best prediction errors and the lowest variance for the trade balance. Finally, on the disaggregate level, the US dollar provides the best performance and yields the best results among all six currency pegs considered in this study.
Waxman, Justin P; Schmitz, Randy J; Shultz, Sandra J
2015-10-01
Hamstring stiffness (K(HAM)) and leg stiffness (K(LEG)) are commonly examined relative to athletic performance and injury risk. Given that these may be modifiable, it is important to understand the day-to-day variations inherent in these measures before using them in training studies. In addition, the extent to which K(HAM) and K(LEG) measure similar active stiffness characteristics has not been established. We investigated the interday measurement consistency of K(HAM) and K(LEG), and examined the extent to which K(LEG) predicted K(HAM) in 6 males and 9 females. K(HAM) was moderately consistent day-to-day (ICC(2,5) = .71; SEM = 76.3 N·m(-1)), and 95% limits of agreement (95% LOA) revealed a systematic bias with considerable absolute measurement error (95% LOA = 89.6 ± 224.8 N·m(-1)). Day-to-day differences in procedural factors explained 59.4% of the variance in day-to-day differences in K(HAM). Bilateral and unilateral K(LEG) were more consistent (ICC(2,3) range = .87-.94; SEM range = 1.0-2.91 kN·m(-1)), with lower absolute error (95% LOA: bilateral = -2.0 ± 10.3; left leg = -0.36 ± 3.82; right leg = -1.05 ± 3.61 kN·m(-1)). K(LEG) explained 44% of the variance in K(HAM) (P < .01). Findings suggest that procedural factors must be carefully controlled to yield consistent and precise K(HAM) measures. The ease and consistency of K(LEG), and its moderate correlation with K(HAM), may steer clinicians toward K(LEG) when measuring lower-extremity stiffness for screening studies and for monitoring the effectiveness of training interventions over time.
Two Enhancements of the Logarithmic Least-Squares Method for Analyzing Subjective Comparisons
1989-03-25
error term. For this model, the total sum of squares (SSTO), defined as SSTO = Σ_{i=1}^{n} (y_i − ȳ)², can be partitioned into error and regression sums ... of the regression line around the mean value. Mathematically, for the model given by equation A.4, SSTO = SSE + SSR (A.6), where SSTO is the total sum of squares (i.e., the variance of the y_i's), SSE is the error sum of squares, and SSR is the regression sum of squares. SSTO, SSE, and SSR are given
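The partition described above can be checked numerically. The following Python sketch fits an ordinary least-squares line to simulated data and confirms that SSTO = SSE + SSR; the data and coefficients are arbitrary illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 1.5 * x + rng.normal(0, 2.0, x.size)   # illustrative data

b1, b0 = np.polyfit(x, y, 1)                     # ordinary least-squares line
y_hat = b0 + b1 * x

ssto = np.sum((y - y.mean()) ** 2)               # total sum of squares
sse = np.sum((y - y_hat) ** 2)                   # error (residual) sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)            # regression sum of squares

print(f"SSTO      = {ssto:.3f}")
print(f"SSE + SSR = {sse + ssr:.3f}")            # equals SSTO up to rounding
print(f"R^2 = SSR/SSTO = {ssr / ssto:.3f}")
```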
Effect of non-normality on test statistics for one-way independent groups designs.
Cribbie, Robert A; Fiksenbaum, Lisa; Keselman, H J; Wilcox, Rand R
2012-02-01
The data obtained from one-way independent groups designs are typically non-normal in form and rarely equally variable across treatment populations (i.e., population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e., the analysis of variance F test) typically provides invalid results (e.g., too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied to the usual least squares estimators of central tendency and variability, and the Welch test applied to robust estimators (i.e., trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal. © 2011 The British Psychological Society.
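As a point of reference for the procedures discussed above, the sketch below implements the plain Welch (1951) heteroscedastic one-way test in Python; the robust variant examined in the paper would additionally substitute trimmed means and Winsorized variances, which is omitted here for brevity. The simulated lognormal groups are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's (1951) heteroscedastic one-way test; a sketch, not the robust
    trimmed-means/Winsorized-variances variant discussed in the abstract."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                   # precision weights
    W = w.sum()
    grand_mean = np.sum(w * m) / W
    a = np.sum(w * (m - grand_mean) ** 2) / (k - 1)
    tmp = np.sum((1 - w / W) ** 2 / (n - 1))
    b = 1 + 2 * (k - 2) / (k ** 2 - 1) * tmp
    f_stat = a / b
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * tmp)
    return f_stat, df1, df2, stats.f.sf(f_stat, df1, df2)

rng = np.random.default_rng(1)
g1 = rng.lognormal(0.0, 0.5, 20)                # skewed, heteroscedastic toy groups
g2 = rng.lognormal(0.2, 1.0, 25)
g3 = rng.lognormal(0.2, 1.5, 30)
f_stat, df1, df2, p = welch_anova(g1, g2, g3)
print(f"Welch F = {f_stat:.2f}, df = ({df1}, {df2:.1f}), p = {p:.3f}")
```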
Physiological correlates of 2-mile run performance as determined using a novel on-demand treadmill.
Tolfrey, Keith; Hansen, Simon A; Dutton, Katie; McKee, Tom; Jones, Andrew M
2009-08-01
The purpose of this study was to assess the reproducibility of an on-demand motorised treadmill to measure 2-mile (3.2 km) race performance and to examine the physiological variables that best predict this free-running performance in active men. Twelve men (mean (SD): age, 28 (9) years; stature, 1.79 (0.05) m; body mass, 72 (9) kg) completed the study, in which maximum oxygen uptake (VO2 max), running economy, and running speed at VO2 max (vVO2 max), lactate threshold (vLT), and 4 mmol·L-1 fixed blood lactate concentration (v4) were measured. Subsequently, the maximal lactate steady state (MLSS) was identified using a series of 30-min treadmill runs. Finally, each participant completed a 2-mile running performance trial on 2 separate occasions, using an on-demand treadmill that adjusts belt speed according to the participant's position on the moving belt. The average 2-mile run speed was 15.7 (SD, 1.9) km·h-1, with small individual differences between repeat-performance trials (intraclass correlation coefficient = 0.99, 95% CI 0.953 to 0.996; standard error of measurement as coefficient of variation = 1.5%, 95% CI 1.0% to 2.5%). Bivariate regression analyses identified VO2 max, vVO2 max, VO2 (mL·kg-1·min-1) at MLSS, vLT, v4, and velocity at MLSS (vMLSS) as the strongest individual predictor variables (r2 = 0.69 to 0.87; standard error of the estimate = 1.08 to 0.72 km·h-1) for 2-mile running performance. The vLT and vMLSS explained 85% and 87% of the variance in running performance, respectively, suggesting that there is considerable shared variance between these parameters. In conclusion, the on-demand treadmill system provided a reliable measure of distance running performance. Both vLT and vMLSS were strong predictors of 2-mile running performance, with vMLSS explaining marginally more of the variance.
NASA Technical Reports Server (NTRS)
Natarajan, Suresh; Gardner, C. S.
1987-01-01
Receiver timing synchronization of an optical Pulse-Position Modulation (PPM) communication system can be achieved using a phase-locked loop (PLL), provided the photodetector output is suitably processed. The magnitude of the PLL phase error is a good indicator of the timing error at the receiver decoder. The statistics of the phase error are investigated while varying several key system parameters such as PPM order, signal and background strengths, and PLL bandwidth. A practical optical communication system utilizing a laser diode transmitter and an avalanche photodiode in the receiver is described, and the sampled phase error data are presented. A linear regression analysis is applied to the data to obtain estimates of the relational constants involving the phase error variance and incident signal power.
Some New Results on Grubbs’ Estimators.
1983-06-01
Grubbs' Estimators. Dennis A. Brindley and Ralph A. Bradley. Consider a two-way classification with n rows and r columns and the usual model of analysis of variance ... except that the error components of the model may have heterogeneous variances, by columns. Grubbs provided unbiased estimators of the column error variances that depend ... of observations y_ij, i = 1, ..., n, j = 1, ..., r, and the model y_ij = μ_i + β_j + ε_ij, (1) where μ_i represents the mean response of row i, β_j represents
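For the simplest case of r = 2 columns (instruments), Grubbs' moment estimators follow directly from the covariance structure of the two measurement series; the Python sketch below illustrates this two-column case only, with simulated true values and error standard deviations chosen as assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

n = 200
true_row = rng.normal(50, 4, n)                  # row (item) values
x1 = true_row + rng.normal(0, 1.0, n)            # column 1, error sd = 1.0
x2 = true_row + rng.normal(0, 2.0, n)            # column 2, error sd = 2.0

# Moment structure: Var(x1) = sigma_row^2 + sigma_1^2, Var(x2) = sigma_row^2 + sigma_2^2,
# Cov(x1, x2) = sigma_row^2, so each column's error variance is its variance minus the covariance.
cov = np.cov(x1, x2)
sigma1_sq = cov[0, 0] - cov[0, 1]
sigma2_sq = cov[1, 1] - cov[0, 1]
print("estimated error variances:", round(sigma1_sq, 2), round(sigma2_sq, 2))  # ~1.0 and ~4.0
```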
Performance of Language-Coordinated Collective Systems: A Study of Wine Recognition and Description
Zubek, Julian; Denkiewicz, Michał; Dębska, Agnieszka; Radkowska, Alicja; Komorowska-Mach, Joanna; Litwin, Piotr; Stępień, Magdalena; Kucińska, Adrianna; Sitarska, Ewa; Komorowska, Krystyna; Fusaroli, Riccardo; Tylén, Kristian; Rączaszek-Leonardi, Joanna
2016-01-01
Most of our perceptions of and engagements with the world are shaped by our immersion in social interactions, cultural traditions, tools and linguistic categories. In this study we experimentally investigate the impact of two types of language-based coordination on the recognition and description of a complex sensory stimulus: red wine. Participants were asked to taste, remember and successively recognize samples of wines within a larger set in a two-by-two experimental design: (1) either individually or in pairs, and (2) with or without the support of a sommelier card—a cultural linguistic tool designed for wine description. Both the effectiveness of recognition and the kinds of errors in the four conditions were analyzed. While our experimental manipulations did not impact recognition accuracy, bias-variance decomposition of error revealed non-trivial differences in how participants solved the task. Pairs generally displayed reduced bias and increased variance compared to individuals; however, the variance dropped significantly when pairs used the sommelier card. This variance-reducing effect of the sommelier card was observed only in pairs; individuals did not seem to benefit from the cultural linguistic tool. Analysis of descriptions generated with the aid of sommelier cards shows that pairs were more coherent and discriminative than individuals. The findings are discussed in terms of global properties and dynamics of collective systems when constrained by different types of cultural practices. PMID:27729875
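The bias-variance decomposition invoked above rests on the standard identity MSE = bias² + variance. The Python sketch below illustrates that identity on simulated judgments; the "individuals" and "pairs" error profiles are hypothetical numbers chosen only to echo the qualitative pattern reported, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(3)

true_value = 10.0
# Hypothetical judges: "individuals" with larger bias but smaller spread, and
# "pairs" with smaller bias but larger spread, echoing the pattern described above.
individuals = true_value + 1.5 + rng.normal(0, 1.0, 10_000)
pairs = true_value + 0.3 + rng.normal(0, 2.0, 10_000)

def decompose(estimates, truth):
    bias = estimates.mean() - truth
    variance = estimates.var()
    mse = np.mean((estimates - truth) ** 2)
    return bias ** 2, variance, mse     # mse = bias**2 + variance (exact identity)

for name, est in [("individuals", individuals), ("pairs", pairs)]:
    b2, v, m = decompose(est, true_value)
    print(f"{name:11s}  bias^2 = {b2:.2f}  variance = {v:.2f}  MSE = {m:.2f}")
```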
Hühn, M; Lotito, S; Piepho, H P
1993-09-01
Multilocation trials in plant breeding lead to cross-classified data sets with rows = genotypes and columns = environments, where the breeder is particularly interested in the rank orders of the genotypes in the different environments. Non-identical rank orders are the result of genotype x environment interactions. Not every interaction, however, causes rank changes among the genotypes (rank-interaction). From a breeder's point of view, interaction is tolerable only as long as it does not affect the rank orders. Therefore, the question arises under which circumstances interaction becomes rank-interaction. This paper contributes to our understanding of this topic. In our study we emphasized the detection of relationships between the similarity of the rank orders (measured by Kendall's coefficient of concordance W) and functions of the various variance components (genotypes, environments, interaction, error). On the basis of extensive data sets on different agricultural crops (faba bean, fodder beet, sugar beet, oats, winter rape) obtained from registration trials (1985-1989) carried out in the Federal Republic of Germany, we obtained the following main result: W ≅ σ²(g)/(σ²(g) + σ²(v)), where σ²(g) = genotypic variance and σ²(v) = σ²(ge) + σ²(o)/L, with σ²(ge) = interaction variance, σ²(o) = error variance, and L = number of replications.
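The reported approximation can be probed with a small simulation: generate genotype, environment, interaction and error effects with known variance components, compute Kendall's W across environments, and compare it with σ²(g)/(σ²(g) + σ²(v)). The Python sketch below does this rough check; all variance components and dimensions are assumed values, not those of the registration trials.

```python
import numpy as np

rng = np.random.default_rng(11)

n_gen, n_env, L = 20, 8, 3            # genotypes, environments, replications (assumed)
sg2, sge2, se2 = 4.0, 1.0, 2.0        # genotypic, interaction and error variances (assumed)

g = rng.normal(0, np.sqrt(sg2), n_gen)[:, None]          # genotype main effects
e = rng.normal(0, 1.0, n_env)[None, :]                   # environment main effects
ge = rng.normal(0, np.sqrt(sge2), (n_gen, n_env))        # interaction effects
y = g + e + ge + rng.normal(0, np.sqrt(se2 / L), (n_gen, n_env))  # plot means over L reps

# Kendall's coefficient of concordance W: the environments "rank" the genotypes.
ranks = y.argsort(axis=0).argsort(axis=0) + 1            # ranks within each environment
R = ranks.sum(axis=1)                                    # rank sums per genotype
S = np.sum((R - R.mean()) ** 2)
W = 12 * S / (n_env ** 2 * (n_gen ** 3 - n_gen))

W_approx = sg2 / (sg2 + sge2 + se2 / L)                  # sigma^2(g) / (sigma^2(g) + sigma^2(v))
print(f"Kendall's W (simulated) = {W:.3f}, approximation = {W_approx:.3f}")
```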
Using Scaling to Understand, Model and Predict Global Scale Anthropogenic and Natural Climate Change
NASA Astrophysics Data System (ADS)
Lovejoy, S.; del Rio Amador, L.
2014-12-01
The atmosphere is variable over twenty orders of magnitude in time (≈10⁻³ to 10¹⁷ s) and almost all of the variance is in the spectral "background", which we show can be divided into five scaling regimes: weather, macroweather, climate, macroclimate and megaclimate. We illustrate this with instrumental and paleo data. Based on the signs of the fluctuation exponent H, we argue that while the weather is "what you get" (H>0: fluctuations increasing with scale), it is macroweather (H<0: fluctuations decreasing with scale) - not climate - "that you expect". The conventional framework that treats the background as close to white noise and focuses on quasi-periodic variability assumes a spectrum that is in error by a factor of a quadrillion (≈ 10¹⁵). Using this scaling framework, we can quantify the natural variability, distinguish it from anthropogenic variability, test various statistical hypotheses and make stochastic climate forecasts. For example, we estimate the probability that the warming is simply a giant century-long natural fluctuation to be less than 1%, most likely less than 0.1%, and estimate return periods for natural warming events of different strengths and durations, including the slowdown ("pause") in the warming since 1998. The return period for the pause was found to be 20-50 years, i.e. not very unusual; however, it immediately follows a 6-year "pre-pause" warming event of almost the same magnitude with a similar return period (30-40 years). To improve on these unconditional estimates, we can use scaling models to exploit the long-range memory of the climate process to make accurate stochastic forecasts of the climate, including the pause. We illustrate stochastic forecasts on monthly and annual scale series of global and northern hemisphere surface temperatures. We obtain forecast skill nearly as high as the theoretical (scaling) predictability limits allow: for example, using hindcasts we find that at 10-year forecast horizons we can still explain ≈ 15% of the anomaly variance. These scaling hindcasts have comparable - or smaller - RMS errors than existing GCMs. We discuss how these could be further improved by going beyond time series forecasts to space-time forecasts.
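The sign convention for the fluctuation exponent H can be illustrated with Haar fluctuations, i.e., the difference between the means of the second and first halves of windows of increasing length, whose typical size scales as Δt^H. The Python sketch below estimates H for two toy series, white noise (macroweather-like, H ≈ -0.5) and its running sum (weather-like, H ≈ +0.5); both series and the estimator are illustrative assumptions, not the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(5)

def haar_exponent(series, lags):
    """Estimate the fluctuation exponent H: the mean absolute Haar fluctuation
    (second-half mean minus first-half mean of windows of length dt) scales as dt**H."""
    flucts = []
    for dt in lags:
        half = dt // 2
        n_win = len(series) // dt
        x = series[: n_win * dt].reshape(n_win, dt)
        f = np.abs(x[:, half:].mean(axis=1) - x[:, :half].mean(axis=1))
        flucts.append(f.mean())
    slope, _ = np.polyfit(np.log(lags), np.log(flucts), 1)
    return slope

lags = np.array([4, 8, 16, 32, 64, 128, 256])
noise = rng.normal(size=100_000)          # white-noise toy series
walk = np.cumsum(noise)                   # random-walk toy series

print("H, white noise (macroweather-like):", round(haar_exponent(noise, lags), 2))  # about -0.5
print("H, random walk (weather-like)     :", round(haar_exponent(walk, lags), 2))   # about +0.5
```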
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, Andreas; Künsch, Hans Rudolf; Schwierz, Cornelia; Stahel, Werner A.
2013-04-01
Most geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are the rule rather than the exception, in particular in environmental data sets. Outliers affect the modelling of the large-scale spatial trend, the estimation of the spatial dependence of the residual variation, and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise, because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Former studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on a robustification of the estimating equations for Gaussian REML estimation (Welsh and Richardson, 1997). Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and non-sampled locations, and kriging variances. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of a data set on heavy metal contamination of the soil in the vicinity of a metal smelter. References: Marchant, B.P. and Lark, R.M. 2007. Robust estimation of the variogram by residual maximum likelihood. Geoderma 140: 62-72. Richardson, A.M. and Welsh, A.H. 1995. Robust restricted maximum likelihood in mixed linear models. Biometrics 51: 1429-1439. Welsh, A.H. and Richardson, A.M. 1997. Approaches to the robust estimation of mixed models. In: Handbook of Statistics, Vol. 15, Elsevier, pp. 343-384.
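The guiding idea of downweighting rather than rejecting outliers can be shown with a much simpler robust fit than the proposed robust REML: an iteratively reweighted least-squares line with Huber weights. The Python sketch below is only a minimal illustration of that principle; the data, contamination and tuning constant are assumptions, and the method is not the one developed in the abstract.

```python
import numpy as np

rng = np.random.default_rng(8)

x = np.linspace(0, 1, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 0.3, x.size)
y[::17] += 5.0                                      # inject a few gross outliers

def huber_line(x, y, c=1.345, n_iter=50):
    """Iteratively reweighted least squares for a straight line with Huber weights."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]     # ordinary least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12   # robust scale estimate (MAD)
        u = np.abs(r / s)
        w = np.where(u <= c, 1.0, c / u)            # Huber weights: outliers get w < 1
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta, w

beta_ols = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
beta_rob, w = huber_line(x, y)
print("OLS fit   :", np.round(beta_ols, 2))
print("Huber fit :", np.round(beta_rob, 2))         # close to the true (2, 3)
print("smallest weights (the injected outliers):", np.round(np.sort(w)[:5], 2))
```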
Bias correction for selecting the minimal-error classifier from many machine learning models.
Ding, Ying; Tang, Shaowu; Liao, Serena G; Jia, Jia; Oesterreich, Steffi; Lin, Yan; Tseng, George C
2014-11-15
Supervised machine learning is commonly applied in genomic research to construct a classifier from the training data that is generalizable to predict independent testing data. When test datasets are not available, cross-validation is commonly used to estimate the error rate. Many machine learning methods are available, and it is well known that no universally best method exists in general. It has been a common practice to apply many machine learning methods and report the method that produces the smallest cross-validation error rate. Theoretically, such a procedure produces a selection bias. Consequently, many clinical studies with moderate sample sizes (e.g. n = 30-60) risk reporting a falsely small cross-validation error rate that could not be validated later in independent cohorts. In this article, we illustrated the probabilistic framework of the problem and explored the statistical and asymptotic properties. We proposed a new bias correction method based on learning curve fitting by an inverse power law (IPL) and compared it with three existing methods: nested cross-validation, weighted mean correction, and the Tibshirani-Tibshirani procedure. All methods were compared in simulation datasets, five moderate-size real datasets and two large breast cancer datasets. The results showed that IPL outperforms the other methods in bias correction with smaller variance, and it has the additional advantage of extrapolating error estimates for larger sample sizes, a practical feature for deciding whether more samples should be recruited to improve the classifier and its accuracy. An R package 'MLbias' and all source files are publicly available at tsenglab.biostat.pitt.edu/software.htm. Contact: ctseng@pitt.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
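The learning-curve idea behind the IPL correction can be sketched generically: fit error(n) = a + b·n^(−c) to error rates observed at a few training sizes and extrapolate to larger n. The Python sketch below does this with scipy's curve_fit on hypothetical error rates; it is not the MLbias implementation, and the numbers are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def inverse_power_law(n, a, b, c):
    """Learning curve: the error rate decays toward an asymptote a as n grows."""
    return a + b * n ** (-c)

# Hypothetical cross-validation error rates observed at a few training sizes
n_obs = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
err_obs = np.array([0.38, 0.33, 0.30, 0.285, 0.275])

params, _ = curve_fit(inverse_power_law, n_obs, err_obs, p0=[0.2, 2.0, 1.0], maxfev=10_000)
a, b, c = params
print(f"fitted curve: err(n) = {a:.3f} + {b:.3f} * n^(-{c:.3f})")
for n_new in (100, 200, 500):
    print(f"extrapolated error at n = {n_new}: {inverse_power_law(n_new, *params):.3f}")
```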
Testing a Dynamic Field Account of Interactions between Spatial Attention and Spatial Working Memory
Johnson, Jeffrey S.; Spencer, John P.
2016-01-01
Studies examining the relationship between spatial attention and spatial working memory (SWM) have shown that discrimination responses are faster for targets appearing at locations that are being maintained in SWM, and that location memory is impaired when attention is withdrawn during the delay. These observations support the proposal that sustained attention is required for successful retention in SWM: if attention is withdrawn, memory representations are likely to fail, increasing errors. In the present study, this proposal is reexamined in light of a neural process model of SWM. On the basis of the model's functioning, we propose an alternative explanation for the observed decline in SWM performance when a secondary task is performed during retention: SWM representations drift systematically toward the location of targets appearing during the delay. To test this explanation, participants completed a color-discrimination task during the delay interval of a spatial recall task. In the critical shifting attention condition, the color stimulus could appear either toward or away from the memorized location relative to a midline reference axis. We hypothesized that if shifting attention during the delay leads to the failure of SWM representations, there should be an increase in the variance of recall errors but no change in directional error, regardless of the direction of the shift. Conversely, if shifting attention induces drift of SWM representations—as predicted by the model—there should be systematic changes in the pattern of spatial recall errors depending on the direction of the shift. Results were consistent with the latter possibility—recall errors were biased toward the location of discrimination targets appearing during the delay. PMID:26810574
Risk prediction and aversion by anterior cingulate cortex.
Brown, Joshua W; Braver, Todd S
2007-12-01
The recently proposed error-likelihood hypothesis suggests that the anterior cingulate cortex (ACC) and surrounding areas will become active in proportion to the perceived likelihood of an error. The hypothesis was originally derived from a computational model prediction. The same computational model now makes a further prediction that ACC will be sensitive not only to predicted error likelihood, but also to the predicted magnitude of the consequences, should an error occur. The product of error likelihood and predicted error consequence magnitude collectively defines the general "expected risk" of a given behavior, in a manner analogous but orthogonal to subjective expected utility theory. New fMRI results from an incentive change signal task now replicate the error-likelihood effect, validate the further predictions of the computational model, and suggest why some segments of the population may fail to show an error-likelihood effect. In particular, error-likelihood effects and expected risk effects in general indicate greater sensitivity to earlier predictors of errors and are seen in risk-averse but not risk-tolerant individuals. Taken together, the results are consistent with an expected-risk model of ACC and suggest that ACC may generally contribute to cognitive control by recruiting brain activity to avoid risk.
Vallejo, Guillermo; Ato, Manuel; Fernández García, Paula; Livacic Rojas, Pablo E; Tuero Herrero, Ellián
2016-08-01
S. Usami (2014) describes a method to realistically determine sample size in longitudinal research using a multilevel model. The present research extends that work to situations in which the assumption of homogeneity of the errors across groups is likely not met and the error term does not follow a scaled identity covariance structure. For this purpose, we followed a procedure based on transforming the variance components of the linear growth model and the parameter related to the treatment effect into specific and easily understandable indices. At the same time, we provide the appropriate statistical machinery for researchers to use when data loss is unavoidable and changes in the expected value of the observed responses are not linear. The empirical powers based on unknown variance components were virtually the same as the theoretical powers derived from the use of the statistically processed indices. The main conclusion of the study is the accuracy of the proposed method for calculating sample size in the described situations with the stipulated power criteria.
Mullan, Barbara; Wong, Cara; Kothe, Emily
2013-03-01
The aim of this study was to investigate whether the theory of planned behaviour (TPB), with the addition of risk awareness, could predict breakfast consumption in a sample of adolescents from the UK and Australia. It was hypothesised that the TPB variables of attitudes, subjective norm and perceived behavioural control (PBC) would significantly predict intentions, and that the inclusion of risk perception would increase the proportion of variance explained. Second, it was hypothesised that intention and PBC would predict behaviour. Participants were recruited from secondary schools in Australia and the UK. A total of 613 participants completed the study (448 females, 165 males; mean age = 14 years ± 1.1). The TPB predicted 42.2% of the variance in intentions to eat breakfast. All variables significantly predicted intention, with PBC as the strongest component. The addition of risk made a small but significant contribution to the prediction of intention. Together, intention and PBC predicted 57.8% of the variance in breakfast consumption. Copyright © 2012 Elsevier Ltd. All rights reserved.
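The hierarchical-regression logic (TPB variables first, risk perception added in a second step, comparing R²) can be sketched on simulated data as below; the variable names, effect sizes and sample size are assumptions, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 600

# Simulated predictors: attitude, subjective norm, PBC and risk perception
att, subj_norm, pbc, risk = rng.normal(size=(4, n))
intention = 0.4 * att + 0.2 * subj_norm + 0.5 * pbc + 0.1 * risk + rng.normal(0, 1.0, n)

def r_squared(predictors, y):
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

r2_tpb = r_squared([att, subj_norm, pbc], intention)          # step 1: TPB variables only
r2_full = r_squared([att, subj_norm, pbc, risk], intention)   # step 2: add risk perception
print(f"R^2 (TPB only)   = {r2_tpb:.3f}")
print(f"R^2 (TPB + risk) = {r2_full:.3f}  (Delta R^2 = {r2_full - r2_tpb:.3f})")
```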
Basalekou, M.; Pappas, C.; Kotseridis, Y.; Tarantilis, P. A.; Kontaxakis, E.
2017-01-01
Color, phenolic content, and chemical age values of red wines made from Cretan grape varieties (Kotsifali, Mandilari) were evaluated over nine months of maturation in different containers for two vintages. The wines differed greatly in their anthocyanin profiles. Mid-IR spectra were also recorded with the use of a Fourier Transform Infrared Spectrophotometer in ZnSe disk mode. Analysis of Variance was used to explore each parameter's dependence on time. Determination models were developed for the chemical age indexes using Partial Least Squares (PLS) (TQ Analyst software), considering the spectral region 1830–1500 cm−1. The correlation coefficients (r) for chemical age index i were 0.86 for Kotsifali (Root Mean Square Error of Calibration (RMSEC) = 0.067, Root Mean Square Error of Prediction (RMSEP) = 0.115, and Root Mean Square Error of Cross-Validation (RMSECV) = 0.164) and 0.90 for Mandilari (RMSEC = 0.050, RMSEP = 0.040, and RMSECV = 0.089). For chemical age index ii, the correlation coefficients (r) were 0.86 and 0.97 for Kotsifali (RMSEC = 0.044, RMSEP = 0.087, and RMSECV = 0.214) and Mandilari (RMSEC = 0.024, RMSEP = 0.033, and RMSECV = 0.078), respectively. The proposed method is simpler, less time consuming, and more economical, and does not require chemical reagents. PMID:29225994
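The PLS calibration workflow (fit on spectra, report r, RMSEC and a cross-validated error) can be sketched with scikit-learn in place of TQ Analyst; the synthetic "spectra" and chemical-age values below are assumptions used only to show the mechanics.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)

# Synthetic stand-in for mid-IR spectra (samples x wavenumber points) and a chemical age index
n_samples, n_points = 60, 170
X = rng.normal(size=(n_samples, n_points))
coefs = np.zeros(n_points)
coefs[40:60] = 0.3                                   # a band of "informative" wavenumbers
y = X @ coefs + rng.normal(0, 0.2, n_samples)

pls = PLSRegression(n_components=5)
pls.fit(X, y)
y_cal = pls.predict(X).ravel()                       # calibration predictions
y_cv = cross_val_predict(pls, X, y, cv=10).ravel()   # cross-validated predictions

rmsec = np.sqrt(np.mean((y - y_cal) ** 2))
rmsecv = np.sqrt(np.mean((y - y_cv) ** 2))
r = np.corrcoef(y, y_cal)[0, 1]
print(f"r = {r:.2f}, RMSEC = {rmsec:.3f}, RMSECV = {rmsecv:.3f}")
```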
Situation awareness measures for simulated submarine track management.
Loft, Shayne; Bowden, Vanessa; Braithwaite, Janelle; Morrell, Daniel B; Huf, Samuel; Durso, Francis T
2015-03-01
The aim of this study was to examine whether the Situation Present Assessment Method (SPAM) and the Situation Awareness Global Assessment Technique (SAGAT) predict incremental variance in performance on a simulated submarine track management task and to measure the potential disruptive effect of these situation awareness (SA) measures. Submarine track managers use various displays to localize and track contacts detected by own-ship sensors. The measurement of SA is crucial for designing effective submarine display interfaces and training programs. Participants monitored a tactical display and sonar bearing-history display to track the cumulative behaviors of contacts in relationship to own-ship position and landmarks. SPAM (or SAGAT) and the Air Traffic Workload Input Technique (ATWIT) were administered during each scenario, and the NASA Task Load Index (NASA-TLX) and Situation Awareness Rating Technique were administered postscenario. SPAM and SAGAT predicted variance in performance after controlling for subjective measures of SA and workload, and SA for past information was a stronger predictor than SA for current/future information. The NASA-TLX predicted performance on some tasks. Only SAGAT predicted variance in performance on all three tasks but marginally increased subjective workload. SPAM, SAGAT, and the NASA-TLX can predict unique variance in submarine track management performance. SAGAT marginally increased subjective workload, but this increase did not lead to any performance decrement. Defense researchers have identified SPAM as an alternative to SAGAT because it would not require field exercises involving submarines to be paused. SPAM was not disruptive, but it is potentially problematic that SPAM did not predict variance in all three performance tasks. © 2014, Human Factors and Ergonomics Society.
Transforming RNA-Seq data to improve the performance of prognostic gene signatures.
Zwiener, Isabella; Frisch, Barbara; Binder, Harald
2014-01-01
Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
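A rank-based transformation of the kind favoured above can be sketched as a Blom-type rank inverse normal transform applied per gene before a penalized fit. The Python sketch below compares it with a log transform on simulated overdispersed counts and a lasso; the simulation settings, the choice of LassoCV, and the five "true" genes are assumptions, not the paper's setup.

```python
import numpy as np
from scipy.stats import rankdata, norm
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(4)
n_samples, n_genes = 100, 300

# Simulated skewed, overdispersed "RNA-Seq-like" counts with a mean-variance dependency
means = rng.lognormal(2.0, 1.5, n_genes)
counts = rng.negative_binomial(n=2, p=2 / (2 + means), size=(n_samples, n_genes)).astype(float)

signal_idx = np.arange(5)                      # five hypothetical outcome-related genes
y = np.log1p(counts[:, signal_idx]).sum(axis=1) + rng.normal(0, 1.0, n_samples)

def rank_inverse_normal(X):
    """Blom-type rank-based inverse normal transform, applied to each column."""
    ranks = np.apply_along_axis(rankdata, 0, X)
    return norm.ppf((ranks - 0.375) / (X.shape[0] + 0.25))

for name, Xt in [("log2(count + 1)", np.log2(counts + 1)),
                 ("rank-based     ", rank_inverse_normal(counts))]:
    model = LassoCV(cv=5, random_state=0, max_iter=5000).fit(Xt, y)
    selected = np.flatnonzero(model.coef_)
    hits = len(set(selected) & set(signal_idx))
    print(f"{name}: {len(selected)} genes selected, {hits}/5 true signals recovered")
```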